Compare commits
No commits in common. "master" and "v0.2.3" have entirely different histories.
@@ -5,13 +6,6 @@
# Docker
.docker

# Backend development
backend/static
backend/staticfiles

# Frontend development
frontend/node_modules

# Python
tubearchivist/__pycache__/
tubearchivist/*/__pycache__/
@@ -24,5 +17,8 @@ venv/
# Unneeded graphics
assets/*

# Unneeded docs
docs/*

# for local testing only
testing.sh
.gitattributes (1 change)
@@ -1 +0,0 @@
docker_assets\run.sh eol=lf
.github/ISSUE_TEMPLATE/BUG-REPORT.yml (6 changes)
@@ -6,17 +6,15 @@ body:
  - type: markdown
    attributes:
      value: |
        Thanks for taking the time to help improve this project! Please read the [how to open an issue](https://github.com/tubearchivist/tubearchivist/blob/master/CONTRIBUTING.md#how-to-open-an-issue) guide carefully before continuing.
        Thanks for taking the time to help improve this project!

  - type: checkboxes
    id: latest
    attributes:
      label: "I've read the documentation"
      label: Latest and Greatest
      options:
        - label: I'm running the latest version of Tube Archivist and have read the [release notes](https://github.com/tubearchivist/tubearchivist/releases/latest).
          required: true
        - label: I have read the [how to open an issue](https://github.com/tubearchivist/tubearchivist/blob/master/CONTRIBUTING.md#how-to-open-an-issue) guide, particularly the [bug report](https://github.com/tubearchivist/tubearchivist/blob/master/CONTRIBUTING.md#bug-report) section.
          required: true

  - type: input
    id: os
.github/ISSUE_TEMPLATE/FEATURE-REQUEST.yml (35 changes)
@@ -1,14 +1,37 @@
name: Feature Request
description: This Project currently doesn't take any new feature requests.
description: Create a new feature request
title: "[Feature Request]: "

body:
  - type: checkboxes
    id: block
  - type: markdown
    attributes:
      label: "This project doesn't accept any new feature requests for the foreseeable future. There is no shortage of ideas and the next development steps are clear for years to come."
      value: |
        Thanks for taking the time to help improve this project!

  - type: checkboxes
    id: already
    attributes:
      label: Already implemented?
      options:
        - label: I understand that this issue will be closed without comment.
        - label: I have read through the [wiki](https://github.com/tubearchivist/tubearchivist/wiki).
          required: true
        - label: I will resist the temptation and I will not submit this issue. If I submit this, I understand I might get blocked from this repo.
        - label: I understand the [scope](https://github.com/tubearchivist/tubearchivist/wiki/FAQ) of this project and am aware of the [known limitations](https://github.com/tubearchivist/tubearchivist#known-limitations) and my idea is not already on the [roadmap](https://github.com/tubearchivist/tubearchivist#roadmap).
          required: true

  - type: textarea
    id: description
    attributes:
      label: Your Feature Request
      value: "## Is your feature request related to a problem? Please describe.\n\n## Describe the solution you'd like\n\n## Additional context"
      placeholder: Tell us what you see!
    validations:
      required: true

  - type: checkboxes
    id: help
    attributes:
      label: Your help is needed!
      description: This project is ambitious as it is, please contribute.
      options:
        - label: Yes I can help with this feature request!
          required: false
.github/ISSUE_TEMPLATE/FRONTEND-MIGRATION.yml (23 changes)
@@ -1,23 +0,0 @@
name: Frontend Migration
description: Tracking our new React based frontend
title: "[Frontend Migration]: "
labels: ["react migration"]

body:
  - type: dropdown
    id: domain
    attributes:
      label: Domain
      options:
        - Frontend
        - Backend
        - Combined
    validations:
      required: true
  - type: textarea
    id: description
    attributes:
      label: Description
      placeholder: Organizing our React frontend migration
    validations:
      required: true
.github/ISSUE_TEMPLATE/INSTALLATION-HELP.yml (6 changes)
@@ -13,7 +13,9 @@ body:
    attributes:
      label: Installation instructions
      options:
        - label: I have read the [how to open an issue](https://github.com/tubearchivist/tubearchivist/blob/master/CONTRIBUTING.md#how-to-open-an-issue) guide, particularly the [installation help](https://github.com/tubearchivist/tubearchivist/blob/master/CONTRIBUTING.md#installation-help) section.
        - label: I have read and understand the [installation instructions](https://github.com/tubearchivist/tubearchivist#installing-and-updating).
          required: true
        - label: My issue is not described in the [potential pitfalls](https://github.com/tubearchivist/tubearchivist#potential-pitfalls) section.
          required: true

  - type: input
@@ -38,6 +40,6 @@ body:
    attributes:
      label: Relevant log output
      description: Please copy and paste any relevant Docker logs. This will be automatically formatted into code, so no need for backticks.
      render: Shell
      render: shell
    validations:
      required: true
.github/ISSUE_TEMPLATE/config.yml (1 change)
@@ -1 +0,0 @@
blank_issues_enabled: false
.github/pull_request_template.md (3 changes)
@@ -1,3 +0,0 @@
Thank you for taking the time to improve this project. Please take a look at the [How to make a Pull Request](https://github.com/tubearchivist/tubearchivist/blob/master/CONTRIBUTING.md#how-to-make-a-pull-request) section to help get your contribution merged.

You can delete this text before submitting.
.github/workflows/lint_python.yml (22 changes)
@@ -0,0 +1,22 @@
name: lint_python
on: [pull_request, push]
jobs:
  lint_python:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - run: pip install --upgrade pip wheel
      - run: pip install bandit black codespell flake8 flake8-bugbear
          flake8-comprehensions isort
      - run: black --check --diff --line-length 79 .
      - run: codespell
      - run: flake8 . --count --max-complexity=10 --max-line-length=79
          --show-source --statistics
      - run: isort --check-only --line-length 79 --profile black .
      # - run: pip install -r tubearchivist/requirements.txt
      # - run: mkdir --parents --verbose .mypy_cache
      # - run: mypy --ignore-missing-imports --install-types --non-interactive .
      # - run: python3 tubearchivist/manage.py test || true
      # - run: shopt -s globstar && pyupgrade --py36-plus **/*.py || true
      # - run: safety check
.github/workflows/pre_commit.yml (47 changes)
@@ -1,47 +0,0 @@
name: Lint, Test, Build, and Push Docker Image

on:
  push:
    branches:
      - '**'
    tags:
      - '**'
  pull_request:
    branches:
      - '**'

jobs:
  lint:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '23'

      - name: Install frontend dependencies
        run: |
          cd frontend
          npm install

      - name: Cache pre-commit environment
        uses: actions/cache@v3
        with:
          path: |
            ~/.cache/pre-commit
          key: ${{ runner.os }}-pre-commit-${{ hashFiles('**/.pre-commit-config.yaml') }}
          restore-keys: |
            ${{ runner.os }}-pre-commit-

      - name: Install dependencies
        run: |
          pip install pre-commit
          pre-commit install

      - name: Run pre-commit
        run: |
          pre-commit run --all-files
.github/workflows/unit_tests.yml (43 changes)
@@ -1,43 +0,0 @@
name: python_unit_tests

on:
  push:
    paths:
      - '**/*.py'
  pull_request:
    paths:
      - '**/*.py'

jobs:
  unit-tests:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Install system dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y gcc libldap2-dev libsasl2-dev libssl-dev

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Cache pip
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r backend/requirements-dev.txt

      - name: Run unit tests
        run: pytest backend
.gitignore (12 changes)
@@ -1,16 +1,8 @@
# python testing cache
__pycache__
.venv

# django testing
backend/static
backend/staticfiles
backend/.env
# django testing db
db.sqlite3

# vscode custom conf
.vscode

# JavaScript stuff
node_modules

.editorconfig
@@ -1,49 +0,0 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: end-of-file-fixer
  - repo: https://github.com/psf/black
    rev: 25.1.0
    hooks:
      - id: black
        alias: python
        files: ^backend/
        args: ["--line-length=79"]
  - repo: https://github.com/pycqa/isort
    rev: 6.0.1
    hooks:
      - id: isort
        name: isort (python)
        alias: python
        files: ^backend/
        args: ["--profile", "black", "-l 79"]
  - repo: https://github.com/pycqa/flake8
    rev: 7.1.2
    hooks:
      - id: flake8
        alias: python
        files: ^backend/
        args: ["--max-complexity=10", "--max-line-length=79"]
  - repo: https://github.com/codespell-project/codespell
    rev: v2.4.1
    hooks:
      - id: codespell
        exclude: ^frontend/package-lock.json
  - repo: https://github.com/pre-commit/mirrors-eslint
    rev: v9.22.0
    hooks:
      - id: eslint
        name: eslint
        files: \.[jt]sx?$
        types: [file]
        entry: npm run --prefix ./frontend lint
        pass_filenames: false
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v4.0.0-alpha.8
    hooks:
      - id: prettier
        entry: npm run --prefix ./frontend format
        pass_filenames: false

exclude: '.*(\.svg|/migrations/).*'
CONTRIBUTING.md (258 changes)
@@ -1,207 +1,27 @@
# Contributing to Tube Archivist
## Contributing to Tube Archivist

Welcome, and thanks for showing interest in improving Tube Archivist!
If you haven't already, the best place to start is the README. This will give you an overview on what the project is all about.

## Table of Content
- [Beta Testing](#beta-testing)
- [How to open an issue](#how-to-open-an-issue)
  - [Bug Report](#bug-report)
  - [Feature Request](#feature-request)
  - [Installation Help](#installation-help)
- [How to make a Pull Request](#how-to-make-a-pull-request)
- [Contributions beyond the scope](#contributions-beyond-the-scope)
- [User Scripts](#user-scripts)
- [Improve the Documentation](#improve-the-documentation)
- [Development Environment](#development-environment)
---
## Report a bug

## Beta Testing
Be the first to help test new features/improvements and provide feedback! Regular `:unstable` builds are available for early access. These are for the tinkerers and the brave. Ideally, use a testing environment first, before upgrading your main installation.
If you notice something is not working as expected, check to see if it has been previously reported in the [open issues](https://github.com/tubearchivist/tubearchivist/issues).
If it has not yet been disclosed, go ahead and create an issue.
If the issue doesn't move forward due to a lack of response, I assume it's solved and will close it after some time to keep the list fresh.

There is always something that can get missed during development. Look at the commit messages tagged with `#build` - these are the unstable builds and give a quick overview of what has changed.
## Wiki

- Test the features mentioned, play around, try to break it.
- Test the update path by installing the `:latest` release first, then upgrade to `:unstable` to check for any errors.
- Test the unstable build on a fresh install.

Then provide feedback - even if you don't encounter any issues! You can do this in the `#beta-testing` channel on the [Discord](https://tubearchivist.com/discord) server.

This helps ensure a smooth update for the stable release. Plus you get to test things out early!

## How to open an issue
Please read this carefully before opening any [issue](https://github.com/tubearchivist/tubearchivist/issues) on GitHub.

**Do**:
- Do provide details and context, this matters a lot and makes it easier for people to help.
- Do familiarize yourself with the project first, some questions answer themselves when using the project for some time. Familiarize yourself with the [Readme](https://github.com/tubearchivist/tubearchivist) and the [documentation](https://docs.tubearchivist.com/), this covers a lot of the common questions, particularly the [FAQ](https://docs.tubearchivist.com/faq/).
- Do respond to questions within a day or two so issues can progress. If the issue doesn't move forward due to a lack of response, we'll assume it's solved and we'll close it after some time to keep the list fresh.

**Don't**:
- Don't open *duplicates*, that includes open and closed issues.
- Don't open an issue for something that's already on the [roadmap](https://github.com/tubearchivist/tubearchivist#roadmap), this needs your help to implement it, not another issue.
- Don't open an issue for something that's a [known limitation](https://github.com/tubearchivist/tubearchivist#known-limitations). These are *known* by definition and don't need another reminder. Some limitations may be solved in the future, maybe by you?
- Don't overwrite the *issue templates*, they are there for a reason. Overwriting them shows that you don't really care about this project. It shows that you have a misunderstanding of how open source collaboration works and just want to push your ideas through. Overwriting the template may result in a ban.

### Bug Report
Bug reports are highly welcome! This project has improved a lot due to your help by providing feedback when something doesn't work as expected. The developers can't possibly cover all edge cases in an ever changing environment like YouTube and yt-dlp.

Please keep in mind:
- Docker logs are the easiest way to understand what's happening when something goes wrong, *always* provide the logs upfront.
- Set the environment variable `DJANGO_DEBUG=True` for Tube Archivist and reproduce the bug for better log output. Don't forget to remove that variable again after.
- A bug that can't be reproduced is difficult or sometimes even impossible to fix. Provide very clear steps on *how to reproduce* it.

### Feature Request
This project doesn't take any new feature requests. This project doesn't lack ideas, see the currently open tasks and roadmap. New feature requests aren't helpful at this point in time. Thank you for your understanding.

### Installation Help
GitHub is most likely not the best place to ask for installation help. That's inherently individual and one on one.
1. The first step is always to help yourself. Start at the [Readme](https://github.com/tubearchivist/tubearchivist) or the additional platform specific installation pages in the [docs](https://docs.tubearchivist.com/).
2. If that doesn't answer your question, open a `#support` thread on [Discord](https://www.tubearchivist.com/discord).
3. Only if that is not an option, open an issue here.

IMPORTANT: When receiving help, contribute back to the community by improving the installation instructions with your newly gained knowledge.

---

## How to make a Pull Request

Thank you for contributing and helping improve this project. Focus for the foreseeable future is on improving and building on existing functionality, *not* on adding and expanding the application.

This is a quick checklist to help streamline the process:

- For **code changes**, make your PR against the [testing branch](https://github.com/tubearchivist/tubearchivist/tree/testing). That's where all active development happens. This simplifies the later merging into *master*, minimizes any conflicts and usually allows for easy and convenient *fast-forward* merging.
- Show off your progress, even if not yet complete, by creating a [draft](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests#draft-pull-requests) PR first and switch it to *ready* when you are ready.
- Make sure all your code is linted and formatted correctly, see below.

### Documentation Changes

All documentation is intended to represent the state of the [latest](https://github.com/tubearchivist/tubearchivist/releases/latest) release.

- If your PR with code changes also requires changes to documentation *.md files here in this repo, create a separate PR for that, so it can be merged separately at release.
- You can make the PR directly against the *master* branch.
- If your PR requires changes on [tubearchivist/docs](https://github.com/tubearchivist/docs), make the PR over there.
- Prepare your documentation updates at the same time as the code changes, so people testing your PR can consult the prepared docs if needed.

### Code formatting and linting

This project uses the excellent [pre-commit](https://github.com/pre-commit/pre-commit) library. The [.pre-commit-config.yaml](https://github.com/tubearchivist/tubearchivist/blob/master/.pre-commit-config.yaml) file is part of this repo.

**Quick Start**
- Run `pre-commit install` from the root of the repo.
- Next time you commit to your local git repo, the defined hooks will run.
- On first run, this will download and install the needed environments to your local machine, that can take some time. But that will be reused on subsequent commits.

These checks also run as a GitHub Action; a local sketch is shown below.
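A minimal sketch of that local setup (assuming Python and pip are already available on your machine):

```bash
# install the pre-commit tool and register the hooks defined in .pre-commit-config.yaml
pip install pre-commit
pre-commit install

# optionally run all hooks against the whole repo instead of waiting for the next commit
pre-commit run --all-files
```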

---

## Contributions beyond the scope

As you have read the [FAQ](https://docs.tubearchivist.com/faq/) and the [known limitations](https://github.com/tubearchivist/tubearchivist#known-limitations) and have gotten an idea what this project tries to do, there will be some obvious shortcomings that stand out, that have been explicitly excluded from the scope of this project, at least for the time being.

Extending the scope of this project will only be feasible with more [regular contributors](https://github.com/tubearchivist/tubearchivist/graphs/contributors) that are willing to help improve this project in the long run. Contributors that have an overall improvement of the project in mind and not just about implementing this *one* thing.

Small additions, or making a PR for a documented feature request or bug, even if that was and will be your only contribution to this project, are always welcome and are *not* what this is about.

Beyond that, general rules to consider:

- Maintainability is key: It's not just about implementing something and being done with it, it's about maintaining it, fixing bugs as they occur, improving on it and supporting it in the long run.
- Others can do it better: Some problems have been solved by very talented developers. These things don't need to be reinvented again here in this project.
- Develop for the 80%: New features and additions *should* be beneficial for 80% of the users. If you are trying to solve your own problem that only applies to you, maybe that would be better to do in your own fork or if possible by a standalone implementation using the API.
- If all of that sounds too strict for you, as stated above, start becoming a regular contributor to this project.

---

## User Scripts
Some of you might have created useful scripts or API integrations around this project. Sharing is caring! Please add a link to your script to the Readme [here](https://github.com/tubearchivist/tubearchivist#user-scripts).
- Your repo should have a `LICENSE` file with one of the common open source licenses. People are expected to fork, adapt and build upon your great work.
- Your script should not modify the *official* files of Tube Archivist. E.g. your symlink script should build links *outside* of your `/youtube` folder. Or your fancy script that creates a beautiful artwork gallery should do that *outside* of the `/cache` folder. Modifying the *official* files and folders of TA is probably not supported.
- At the top of the repo you should have a mention and a link back to the Tube Archivist repo. Clearly state **not** to open any issues on the main TA repo regarding your script.
- Example template:
  - `[<user>/<repo>](https://linktoyourrepo.com)`: A short one line description.

---

## Improve the Documentation

The documentation is available at [docs.tubearchivist.com](https://docs.tubearchivist.com/) and is built from a separate repo, [tubearchivist/docs](https://github.com/tubearchivist/docs). The Readme there has additional instructions on how to make changes.

---
The wiki is where all user functions are documented in detail. These pages are mirrored into the **docs** folder of the repo. This allows for pull requests and all other features like regular code. Make any changes there, and I'll sync them with the wiki tab.

## Development Environment

This codebase is set up to be developed natively outside of docker as well as in a docker container. Developing outside of a docker container can be convenient, as IDE and hot reload usually works out of the box. But testing inside of a container is still essential, as there are subtle differences, especially when working with the filesystem and networking between containers.
I have learned the hard way that working on a dockerized application outside of docker is very error prone and in general not a good idea. So if you want to test your changes, it's best to run them in a docker testing environment.

Note:
- Subtitles currently fail to load with `DJANGO_DEBUG=True`, that is due to an incorrect `Content-Type` header set by Django's static file implementation. That only applies if you run the Django dev server; Nginx sets the correct headers.

### Native Instruction

For convenience, it's recommended to still run Redis and ES in a docker container. Make sure both containers are reachable over the network.

Set up your virtual environment and install the requirements defined in `requirements-dev.txt`.
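One possible way to do that, assuming the development requirements live at `backend/requirements-dev.txt` as referenced by the unit test workflow:

```bash
# create and activate a virtual environment in the repo root
python -m venv .venv
source .venv/bin/activate

# install the backend development requirements
pip install -r backend/requirements-dev.txt
```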

There are options built in to load environment variables from a file using `load_dotenv`. Example `.env` file to place in the same folder as `manage.py`:

```
TA_HOST="localhost"
TA_USERNAME=tubearchivist
TA_PASSWORD=verysecret
TA_MEDIA_DIR="static/volume/media"
TA_CACHE_DIR="static"
TA_APP_DIR="."
REDIS_CON=redis://localhost:6379
ES_URL="http://localhost:9200"
ELASTIC_PASSWORD=verysecret
TZ=America/New_York
DJANGO_DEBUG=True
```

Then look at the container startup script `run.sh`, make sure all needed migrations and startup checks ran. To start the dev backend server from the same folder as `manage.py` run:

```bash
python manage.py runserver
```

The backend will be available on [localhost:8000/api/](localhost:8000/api/).

You'll probably also want to have a Celery worker instance running, refer to `run.sh` for that. The Beat Scheduler might not be needed.

Then from the frontend folder, install the dependencies with:

```bash
npm install
```

Then to start the frontend development server:

```bash
npm run dev
```

And the frontend should be available at [localhost:3000](localhost:3000).

### Docker Instructions

Set up docker on your development machine.

Clone this repository.

Functional changes should be made against the unstable `testing` branch, so check that branch out, then make a new branch for your work.

Edit the `docker-compose.yml` file and replace the [`image: bbilly1/tubearchivist` line](https://github.com/tubearchivist/tubearchivist/blob/4af12aee15620e330adf3624c984c3acf6d0ac8b/docker-compose.yml#L7) with `build: .`. Also make any other changes to the environment variables and so on necessary to run the application, just like you're launching the application as normal.
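A minimal sketch of that edit, assuming the service is named `tubearchivist` as in the example compose file; everything else in the service definition stays unchanged:

```yml
services:
  tubearchivist:
    # replace the published image with a local build of this repository
    # image: bbilly1/tubearchivist
    build: .
```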

Run `docker compose up --build`. This will bring up the application. Kill it with `ctrl-c` or by running `docker compose down` from a new terminal window in the same directory.

Make your changes locally and re-run `docker compose up --build`. The `Dockerfile` is structured in a way that the actual application code is in the last layer so rebuilding the image with only code changes utilizes the build cache for everything else and will just take a few seconds.

### Development environment inside a VM

You may find it nice to run everything inside of a VM for complete environment snapshots and encapsulation, though this is not strictly necessary. There's a `deploy.sh` script which has some helpers for this use case:

- This assumes a standard Ubuntu Server VM with docker and docker compose already installed.
- Configure your local DNS to resolve `tubearchivist.local` to the IP of the VM.
- To deploy the latest changes and rebuild the application to the testing VM run:
This is the setup I have landed on, YMMV:
- Clone the repo, work on it with your favorite code editor in your local filesystem. The *testing* branch is where all the changes are happening; it might be unstable and is WIP.
- Then I have a VM running standard Ubuntu Server LTS with docker installed. The VM keeps my projects separate and offers convenient snapshot functionality. The VM also offers ways to simulate low-end environments by limiting CPU cores and memory. You can use this [Ansible Docker Ubuntu](https://github.com/bbilly1/ansible-playbooks) playbook to get started quickly. But you could also just run docker on your host system.
- The `Dockerfile` is structured in a way that the actual application code is in the last layer so rebuilding the image with only code changes utilizes the build cache for everything else and will just take a few seconds.
- Take a look at the `deploy.sh` file. I have my local DNS resolve `tubearchivist.local` to the IP of the VM for convenience. To deploy the latest changes and rebuild the application to the testing VM run:
```bash
./deploy.sh test
```
@@ -209,7 +29,7 @@ You may find it nice to run everything inside of a VM for complete environment s
- The `test` argument takes another optional argument to build for a specific architecture; valid options are: `amd64`, `arm64` and `multi`, default is `amd64`.
- This `deploy.sh` script is not meant to be universally usable for every possible environment but could serve as an idea on how to automatically rebuild containers to test changes - customize to your liking.

### Working with Elasticsearch
## Working with Elasticsearch
In addition to the required services as listed in the example docker-compose file, the **Dev Tools** of [Kibana](https://www.elastic.co/guide/en/kibana/current/docker.html) are invaluable for running and testing Elasticsearch queries.

**Quick start**
@@ -220,31 +40,41 @@ bin/elasticsearch-service-tokens create elastic/kibana kibana

Example docker compose, use the same version as for Elasticsearch:
```yml
services:
  kibana:
    image: docker.elastic.co/kibana/kibana:0.0.0
    container_name: kibana
    environment:
kibana:
  image: docker.elastic.co/kibana/kibana:0.0.0
  container_name: kibana
  environment:
    - "ELASTICSEARCH_HOSTS=http://archivist-es:9200"
    - "ELASTICSEARCH_SERVICEACCOUNTTOKEN=<your-token-here>"
    ports:
  ports:
    - "5601:5601"
```

If you want to run queries on the Elasticsearch container directly from your host with for example `curl` or something like *postman*, you might want to **publish** the port 9200 instead of just **exposing** it.
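A sketch of what publishing that port could look like, assuming the Elasticsearch service is named `archivist-es` like the container referenced above:

```yml
  archivist-es:
    ports:
      - "9200:9200"
```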

**Persist Token**
The token will get stored in ES in the `config` folder, and not in the `data` folder. To persist the token between ES container rebuilds, you'll need to persist the config folder as an additional volume:
## Implementing a new feature

1. Create the token as described above
2. While the container is running, copy the current config folder out of the container, e.g.:
```
docker cp archivist-es:/usr/share/elasticsearch/config/ volume/es_config
```
3. Then stop all containers and mount this folder into the container as an additional volume:
```yml
- ./volume/es_config:/usr/share/elasticsearch/config
```
4. Start all containers back up.
Do you see anything on the roadmap that you would like to take a closer look at, but you are not sure what's the best way to tackle it? Or anything not on there yet that you'd like to implement but are not sure how? Reach out on Discord and we'll look into it together.

Now your token will persist between ES container rebuilds.
## Making changes

To fix a bug or implement a feature, fork the repository and make all changes to the testing branch. When ready, create a pull request.

## Releases

There are three different docker tags:
- **latest**: As the name implies is the latest multiarch release for regular usage.
- **unstable**: Intermediate amd64 builds for quick testing and improved collaboration. Don't mix with a *latest* installation, for your testing environment only. This is untested and WIP and will have breaking changes between commits that might require a reset to resolve.
- **semantic versioning**: There will be a handful of named version tags that will also have a matching release and tag on GitHub.

If you want to see what's in your container, check out the matching release tag. A merge to **master** usually means a *latest* or *unstable* release. If you want to preview changes in your testing environment, pull the *unstable* tag or clone the repository and build the docker container with the Dockerfile from the **testing** branch.

## Code formatting and linting

To keep things clean and consistent for everybody, there is a GitHub Action set up to lint and check the changes. You can test your code locally first if you want. For example if you made changes in the **video** module, run

```shell
./deploy.sh validate tubearchivist/home/src/index/video.py
```

to validate your changes. If you omit the path, all the project files will get checked. This is subject to change as the codebase improves.
Dockerfile (77 changes)
@@ -1,66 +1,56 @@
# multi stage to build tube archivist
# build python wheel, download and extract ffmpeg, copy into final image
# first stage to build python wheel, copy into final image

FROM node:lts-alpine AS npm-builder
COPY frontend/package.json frontend/package-lock.json /
RUN npm i

FROM node:lts-alpine AS node-builder

# RUN npm config set registry https://registry.npmjs.org/

COPY --from=npm-builder ./node_modules /frontend/node_modules
COPY ./frontend /frontend
WORKDIR /frontend

RUN npm run build:deploy

WORKDIR /

# First stage to build python wheel
FROM python:3.11.13-slim-bookworm AS builder

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential gcc libldap2-dev libsasl2-dev libssl-dev git

# install requirements
COPY ./backend/requirements.txt /requirements.txt
RUN pip install --user -r requirements.txt

# build ffmpeg
FROM python:3.11.13-slim-bookworm AS ffmpeg-builder

FROM python:3.10.8-slim-bullseye AS builder
ARG TARGETPLATFORM

COPY docker_assets/ffmpeg_download.py ffmpeg_download.py
RUN python ffmpeg_download.py $TARGETPLATFORM
RUN apt-get update
RUN apt-get install -y --no-install-recommends build-essential gcc libldap2-dev libsasl2-dev libssl-dev

# install requirements
COPY ./tubearchivist/requirements.txt /requirements.txt
RUN pip install --user -r requirements.txt

# build final image
FROM python:3.11.13-slim-bookworm AS tubearchivist
FROM python:3.10.8-slim-bullseye as tubearchivist

ARG TARGETPLATFORM
ARG INSTALL_DEBUG

ENV PYTHONUNBUFFERED=1
ENV PYTHONUNBUFFERED 1

# copy build requirements
COPY --from=builder /root/.local /root/.local
ENV PATH=/root/.local/bin:$PATH

# copy ffmpeg
COPY --from=ffmpeg-builder ./ffmpeg/ffmpeg /usr/bin/ffmpeg
COPY --from=ffmpeg-builder ./ffprobe/ffprobe /usr/bin/ffprobe

# install distro packages needed
RUN apt-get clean && apt-get -y update && apt-get -y install --no-install-recommends \
    nginx \
    atomicparsley \
    curl && rm -rf /var/lib/apt/lists/*
    curl \
    xz-utils && rm -rf /var/lib/apt/lists/*

# get newest patched ffmpeg and ffprobe builds for amd64 fall back to repo ffmpeg for arm64
RUN if [ "$TARGETPLATFORM" = "linux/amd64" ] ; then \
    curl -s https://api.github.com/repos/yt-dlp/FFmpeg-Builds/releases/latest \
    | grep browser_download_url \
    | grep ".*master.*linux64.*tar.xz" \
    | cut -d '"' -f 4 \
    | xargs curl -L --output ffmpeg.tar.xz && \
    tar -xf ffmpeg.tar.xz --strip-components=2 --no-anchored -C /usr/bin/ "ffmpeg" && \
    tar -xf ffmpeg.tar.xz --strip-components=2 --no-anchored -C /usr/bin/ "ffprobe" && \
    rm ffmpeg.tar.xz \
    ; elif [ "$TARGETPLATFORM" = "linux/arm64" ] ; then \
    apt-get -y update && apt-get -y install --no-install-recommends ffmpeg && rm -rf /var/lib/apt/lists/* \
    ; fi

# install debug tools for testing environment
RUN if [ "$INSTALL_DEBUG" ] ; then \
    apt-get -y update && apt-get -y install --no-install-recommends \
    vim htop bmon net-tools iputils-ping procps lsof \
    && pip install --user ipython pytest pytest-django \
    apt-get -y update && apt-get -y install --no-install-recommends \
    vim htop bmon net-tools iputils-ping procps \
    && pip install --user ipython \
    ; fi

# make folders
@@ -71,12 +61,9 @@ COPY docker_assets/nginx.conf /etc/nginx/sites-available/default
RUN sed -i 's/^user www\-data\;$/user root\;/' /etc/nginx/nginx.conf

# copy application into container
COPY ./backend /app
COPY ./tubearchivist /app
COPY ./docker_assets/run.sh /app
COPY ./docker_assets/backend_start.py /app
COPY ./docker_assets/beat_auto_spawn.sh /app

COPY --from=node-builder ./frontend/dist /app/static
COPY ./docker_assets/uwsgi.ini /app

# volumes
VOLUME /cache
README.md (295 changes)
@@ -1,184 +1,195 @@

[*more screenshots and video*](SHOWCASE.MD)


<h1 align="center">Your self hosted YouTube media server</h1>
<div align="center">
<a href="https://hub.docker.com/r/bbilly1/tubearchivist" target="_blank"><img src="https://tiles.tilefy.me/t/tubearchivist-docker.png" alt="tubearchivist-docker" title="Tube Archivist Docker Pulls" height="50" width="190"/></a>
<a href="https://github.com/tubearchivist/tubearchivist/stargazers" target="_blank"><img src="https://tiles.tilefy.me/t/tubearchivist-github-star.png" alt="tubearchivist-github-star" title="Tube Archivist GitHub Stars" height="50" width="190"/></a>
<a href="https://github.com/tubearchivist/tubearchivist/forks" target="_blank"><img src="https://tiles.tilefy.me/t/tubearchivist-github-forks.png" alt="tubearchivist-github-forks" title="Tube Archivist GitHub Forks" height="50" width="190"/></a>
<a href="https://www.tubearchivist.com/discord" target="_blank"><img src="https://tiles.tilefy.me/t/tubearchivist-discord.png" alt="tubearchivist-discord" title="TA Discord Server Members" height="50" width="190"/></a>
<a href="https://github.com/bbilly1/tilefy" target="_blank"><img src="https://tiles.tilefy.me/t/tubearchivist-docker.png" alt="tubearchivist-docker" title="Tube Archivist Docker Pulls" height="50" width="200"/></a>
<a href="https://github.com/bbilly1/tilefy" target="_blank"><img src="https://tiles.tilefy.me/t/tubearchivist-github-star.png" alt="tubearchivist-github-star" title="Tube Archivist GitHub Stars" height="50" width="200"/></a>
<a href="https://github.com/bbilly1/tilefy" target="_blank"><img src="https://tiles.tilefy.me/t/tubearchivist-github-forks.png" alt="tubearchivist-github-forks" title="Tube Archivist GitHub Forks" height="50" width="200"/></a>
</div>

## Table of contents:
* [Docs](https://docs.tubearchivist.com/) with [FAQ](https://docs.tubearchivist.com/faq/), and API documentation
* [Wiki](https://github.com/tubearchivist/tubearchivist/wiki) with [FAQ](https://github.com/tubearchivist/tubearchivist/wiki/FAQ)
* [Core functionality](#core-functionality)
* [Resources](#resources)
* [Installing](#installing)
* [Screenshots](#screenshots)
* [Problem Tube Archivist tries to solve](#problem-tube-archivist-tries-to-solve)
* [Connect](#connect)
* [Extended Universe](#extended-universe)
* [Installing and updating](#installing-and-updating)
* [Getting Started](#getting-started)
* [Known limitations](#known-limitations)
* [Port Collisions](#port-collisions)
* [Common Errors](#common-errors)
* [Potential pitfalls](#potential-pitfalls)
* [Roadmap](#roadmap)
* [Known limitations](#known-limitations)
* [Donate](#donate)

------------------------

## Core functionality
Once your YouTube video collection grows, it becomes hard to search and find a specific video. That's where Tube Archivist comes in: By indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos without hassle offline through a convenient web interface. This includes:
* Subscribe to your favorite YouTube channels
* Download Videos using **[yt-dlp](https://github.com/yt-dlp/yt-dlp)**
* Download Videos using **yt-dlp**
* Index and make videos searchable
* Play videos
* Keep track of viewed and unviewed videos

## Resources
- [Discord](https://www.tubearchivist.com/discord): Connect with us on our Discord server.
## Tube Archivist on YouTube
[](https://www.youtube.com/watch?v=O8H8Z01c0Ys)

## Screenshots

*Home Page*


*All Channels*


*Single Channel*


*Video Page*


*Downloads Page*

## Problem Tube Archivist tries to solve
Once your YouTube video collection grows, it becomes hard to search and find a specific video. That's where Tube Archivist comes in: By indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos without hassle offline through a convenient web interface.

## Connect
- [Discord](https://discord.gg/AFwz8nE7BK): Connect with us on our Discord server.
- [r/TubeArchivist](https://www.reddit.com/r/TubeArchivist/): Join our Subreddit.

## Extended Universe
- [Browser Extension](https://github.com/tubearchivist/browser-extension) Tube Archivist Companion, for [Firefox](https://addons.mozilla.org/addon/tubearchivist-companion/) and [Chrome](https://chrome.google.com/webstore/detail/tubearchivist-companion/jjnkmicfnfojkkgobdfeieblocadmcie)
- [Jellyfin Plugin](https://github.com/tubearchivist/tubearchivist-jf-plugin): Add your videos to Jellyfin
- [Plex Plugin](https://github.com/tubearchivist/tubearchivist-plex): Add your videos to Plex
- [Tube Archivist Metrics](https://github.com/tubearchivist/tubearchivist-metrics) to create statistics in Prometheus/OpenMetrics format.

## Installing
For minimal system requirements, the Tube Archivist stack needs around 2GB of available memory for a small testing setup and around 4GB of available memory for a mid to large sized installation. At minimum, a dual core CPU with 4 threads; a quad core or better is recommended.
This project requires docker. Ensure it is installed and running on your system.
## Installing and updating
Take a look at the example `docker-compose.yml` file provided. Use the *latest* or the named semantic version tag. The *unstable* tag is for intermediate testing and as the name implies, is **unstable** and should not be used on your main installation but in a [testing environment](CONTRIBUTING.md).

The documentation has additional user provided instructions for [Unraid](https://docs.tubearchivist.com/installation/unraid/), [Synology](https://docs.tubearchivist.com/installation/synology/) and [Podman](https://docs.tubearchivist.com/installation/podman/).
For minimal system requirements, the Tube Archivist stack needs around 2GB of available memory for a small testing setup and around 4GB of available memory for a mid to large sized installation.

The instructions here should get you up and running quickly; for Docker beginners and a full explanation of each environment variable, see the [docs](https://docs.tubearchivist.com/installation/docker-compose/).
Tube Archivist depends on three main components split up into separate docker containers:

Take a look at the example [docker-compose.yml](https://github.com/tubearchivist/tubearchivist/blob/master/docker-compose.yml) and configure the required environment variables.
### Tube Archivist
The main Python application that displays and serves your video collection, built with Django.
- Serves the interface on port `8000`
- Needs a volume for the video archive at **/youtube**
- And another volume to save application data at **/cache**.
- The environment variables `ES_URL` and `REDIS_HOST` are needed to tell Tube Archivist where Elasticsearch and Redis respectively are located.
- The environment variables `HOST_UID` and `HOST_GID` allow Tube Archivist to `chown` the video files to the main host system user instead of the container user. Those two variables are optional, not setting them will disable that functionality. That might be needed if the underlying filesystem doesn't support `chown` like *NFS*.
- Set the environment variable `TA_HOST` to match with the system running Tube Archivist. This can be a domain like *example.com*, a subdomain like *ta.example.com* or an IP address like *192.168.1.20*, add without the protocol and without the port. You can add multiple hostnames separated with a space. Any wrong configurations here will result in a `Bad Request (400)` response.
- Change the environment variables `TA_USERNAME` and `TA_PASSWORD` to create the initial credentials.
- `ELASTIC_PASSWORD` is the password for Elasticsearch. The environment variable `ELASTIC_USER` is optional, should you want to change the username from the default *elastic*.
- For the scheduler to know what time it is, set your timezone with the `TZ` environment variable, defaults to *UTC*.

All environment variables are explained in detail in the docs [here](https://docs.tubearchivist.com/installation/env-vars/).

**TubeArchivist**:
| Environment Var | Value | State |
| ----------- | ----------- | ----------- |
| TA_HOST | Server IP or hostname `http://tubearchivist.local:8000` | Required |
| TA_USERNAME | Initial username when logging into TA | Required |
| TA_PASSWORD | Initial password when logging into TA | Required |
| ELASTIC_PASSWORD | Password for ElasticSearch | Required |
| REDIS_CON | Connection string to Redis | Required |
| TZ | Set your timezone for the scheduler | Required |
| TA_PORT | Overwrite Nginx port | Optional |
| TA_BACKEND_PORT | Overwrite container internal backend server port | Optional |
| TA_ENABLE_AUTH_PROXY | Enables support for forwarding auth in reverse proxies | [Read more](https://docs.tubearchivist.com/configuration/forward-auth/) |
| TA_AUTH_PROXY_USERNAME_HEADER | Header containing username to log in | Optional |
| TA_AUTH_PROXY_LOGOUT_URL | Logout URL for forwarded auth | Optional |
| ES_URL | URL That ElasticSearch runs on | Optional |
| ES_DISABLE_VERIFY_SSL | Disable ElasticSearch SSL certificate verification | Optional |
| ES_SNAPSHOT_DIR | Custom path where elastic search stores snapshots for master/data nodes | Optional |
| HOST_GID | Allow TA to own the video files instead of container user | Optional |
| HOST_UID | Allow TA to own the video files instead of container user | Optional |
| ELASTIC_USER | Change the default ElasticSearch user | Optional |
| TA_LDAP | Configure TA to use LDAP Authentication | [Read more](https://docs.tubearchivist.com/configuration/ldap/) |
| DISABLE_STATIC_AUTH | Remove authentication from media files (Google Cast...) | [Read more](https://docs.tubearchivist.com/installation/env-vars/#disable_static_auth) |
| TA_AUTO_UPDATE_YTDLP | Configure TA to automatically install the latest yt-dlp on container start | Optional |
| DJANGO_DEBUG | Return additional error messages, for debug only | Optional |
| TA_LOGIN_AUTH_MODE | Configure the order of login authentication backends (Default: single) | Optional |

| TA_LOGIN_AUTH_MODE value | Description |
| ------------------------ | ----------- |
| single | Only use a single backend (default, or LDAP, or Forward auth, selected by TA_LDAP or TA_ENABLE_AUTH_PROXY) |
| local | Use local password database only |
| ldap | Use LDAP backend only |
| forwardauth | Use reverse proxy headers only |
| ldap_local | Use LDAP backend in addition to the local password database |

**ElasticSearch**
| Environment Var | Value | State |
| ----------- | ----------- | ----------- |
| ELASTIC_PASSWORD | Matching password `ELASTIC_PASSWORD` from TubeArchivist | Required |
| http.port | Change the port ElasticSearch runs on | Optional |


## Update
Always use the *latest* (the default) or a named semantic version tag for the docker images. The *unstable* tags are only for your testing environment, there might not be an update path for these testing builds.

You will see the current version number of **Tube Archivist** in the footer of the interface. There is a daily version check task querying tubearchivist.com, notifying you of any new releases in the footer. To update, you need to update the docker images, the method for which will depend on your platform. For example, if you're using `docker-compose`, run `docker-compose pull` and then restart with `docker-compose up -d`. After updating, check the footer to verify you are running the expected version.
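For example, sticking with the `docker-compose` commands mentioned above:

```bash
# pull the newer images referenced in your docker-compose file
docker-compose pull

# recreate the containers with the updated images
docker-compose up -d
```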

- This project is tested for updates between one or two releases maximum. Further updates back may or may not be supported and you might have to reset your index and configurations to update. Ideally apply new updates at least once per month.
- There can be breaking changes between updates, particularly as the application grows, new environment variables or settings might be required for you to set in your docker-compose file. *Always* check the **release notes**: Any breaking changes will be marked there.
- All testing and development is done with the Elasticsearch version number as mentioned in the provided *docker-compose.yml* file. This will be updated when a new release of Elasticsearch is available. Running an older version of Elasticsearch is most likely not going to result in any issues, but it's still recommended to run the same version as mentioned. Use `bbilly1/tubearchivist-es` to automatically get the recommended version.

## Getting Started
1. Go through the **settings** page and look at the available options. Particularly set *Download Format* to your desired video quality before downloading. **Tube Archivist** downloads the best available quality by default. To support iOS or MacOS and some other browsers a compatible format must be specified. For example:
```
bestvideo[vcodec*=avc1]+bestaudio[acodec*=mp4a]/mp4
```
2. Subscribe to some of your favorite YouTube channels on the **channels** page.
3. On the **downloads** page, click on *Rescan subscriptions* to add videos from the subscribed channels to your Download queue or click on *Add to download queue* to manually add Video IDs, links, channels or playlists.
4. Click on *Start download* and let **Tube Archivist** do its thing.
5. Enjoy your archived collection!


### Port Collisions
### Port collisions
If you have a collision on port `8000`, the best solution is to use docker's *HOST_PORT* and *CONTAINER_PORT* distinction: To for example change the interface to port 9000 use `9000:8000` in your docker-compose file (see the sketch below).

For more information on port collisions, check the docs.
Should that not be an option, the Tube Archivist container takes these two additional environment variables:
- **TA_PORT**: To actually change the port where nginx listens, make sure to also change the ports value in your docker-compose file.
- **TA_UWSGI_PORT**: To change the default uwsgi port 8080 used for container internal networking between uwsgi serving the django application and nginx.

## Common Errors
Here is a list of common errors and their solutions.
Changing either of these two environment variables will change the files *nginx.conf* and *uwsgi.ini* at startup using `sed` in your container.
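A sketch of the *HOST_PORT*:*CONTAINER_PORT* remapping mentioned above, assuming the service is named `tubearchivist` as in the example compose file:

```yml
services:
  tubearchivist:
    ports:
      - "9000:8000"
```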

### `vm.max_map_count`
### LDAP Authentication
You can configure LDAP with the following environment variables (see the sketch below):

- `TA_LDAP` (ex: `true`) Set to anything besides empty string to use LDAP authentication **instead** of local user authentication.
- `TA_LDAP_SERVER_URI` (ex: `ldap://ldap-server:389`) Set to the uri of your LDAP server.
- `TA_LDAP_DISABLE_CERT_CHECK` (ex: `true`) Set to anything besides empty string to disable certificate checking when connecting over LDAPS.
- `TA_LDAP_BIND_DN` (ex: `uid=search-user,ou=users,dc=your-server`) DN of the user that is able to perform searches on your LDAP account.
- `TA_LDAP_BIND_PASSWORD` (ex: `yoursecretpassword`) Password for the search user.
- `TA_LDAP_USER_BASE` (ex: `ou=users,dc=your-server`) Search base for user filter.
- `TA_LDAP_USER_FILTER` (ex: `(objectClass=user)`) Filter for valid users. Login usernames are automatically matched using `uid` and do not need to be specified in this filter.

When LDAP authentication is enabled, django passwords (e.g. the password defined in TA_PASSWORD) will not allow you to log in; only the LDAP server is used.
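Put together, a sketch of those variables on the *tubearchivist* service, reusing the example values from the list above (adjust them to your own LDAP tree):

```yml
services:
  tubearchivist:
    environment:
      - TA_LDAP=true
      - TA_LDAP_SERVER_URI=ldap://ldap-server:389
      - TA_LDAP_BIND_DN=uid=search-user,ou=users,dc=your-server
      - TA_LDAP_BIND_PASSWORD=yoursecretpassword
      - TA_LDAP_USER_BASE=ou=users,dc=your-server
      - TA_LDAP_USER_FILTER=(objectClass=user)
```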

### Elasticsearch
**Note**: Tube Archivist depends on Elasticsearch 8.

Use `bbilly1/tubearchivist-es` to automatically get the recommended version, or use the official image with the version tag in the docker-compose file.

Stores video meta data and makes everything searchable. Also keeps track of the download queue.
- Needs to be accessible over the default port `9200`
- Needs a volume at **/usr/share/elasticsearch/data** to store data

Follow the [documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html) for additional installation details.

### Redis JSON
Functions as a cache and temporary link between the application and the file system. Used to store and display messages and configuration variables.
- Needs to be accessible over the default port `6379`
- Needs a volume at **/data** to make your configuration changes permanent.

### Redis on a custom port
For some architectures it might be required to run Redis JSON on a nonstandard port. To for example change the Redis port to **6380**, set the following values (as sketched below):
- Set the environment variable `REDIS_PORT=6380` on the *tubearchivist* service.
- For the *archivist-redis* service, change the ports to `6380:6380`
- Additionally set the following value to the *archivist-redis* service: `command: --port 6380 --loadmodule /usr/lib/redis/modules/rejson.so`
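A sketch of those three changes in the compose file (all other settings omitted):

```yml
services:
  tubearchivist:
    environment:
      - REDIS_PORT=6380
  archivist-redis:
    command: --port 6380 --loadmodule /usr/lib/redis/modules/rejson.so
    ports:
      - "6380:6380"
```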
|
||||
|
||||
### Updating Tube Archivist
|
||||
You will see the current version number of **Tube Archivist** in the footer of the interface so you can compare it with the latest release to make sure you are running the *latest and greatest*.
|
||||
* There can be breaking changes between updates, particularly as the application grows, new environment variables or settings might be required for you to set in the your docker-compose file. *Always* check the **release notes**: Any breaking changes will be marked there.
|
||||
* All testing and development is done with the Elasticsearch version number as mentioned in the provided *docker-compose.yml* file. This will be updated when a new release of Elasticsearch is available. Running an older version of Elasticsearch is most likely not going to result in any issues, but it's still recommended to run the same version as mentioned. Use `bbilly1/tubearchivist-es` to automatically get the recommended version.
|
||||
|
||||
### Alternative installation instructions:
|
||||
- **arm64**: The Tube Archivist container is multi arch, so is Elasticsearch. RedisJSON doesn't offer arm builds, you can use `bbilly1/rejson`, an unofficial rebuild for arm64.
|
||||
- **Helm Chart**: There is a Helm Chart available at https://github.com/insuusvenerati/helm-charts. Mostly self-explanatory but feel free to ask questions in the discord / subreddit.
|
||||
- **Wiki**: There are additional helpful installation instructions in the [wiki](https://github.com/tubearchivist/tubearchivist/wiki/Installation) for Unraid, Truenas and Synology.
|
||||
|
||||
|
||||
## Potential pitfalls
|
||||
### vm.max_map_count
|
||||
**Elasticsearch** in Docker requires the host machine's kernel setting `vm.max_map_count` to be set to at least 262144.
|
||||
|
||||
To temporarily set the value, run:
|
||||
```
|
||||
sudo sysctl -w vm.max_map_count=262144
|
||||
```
|
||||
|
||||
How to apply the change permanently depends on your host operating system:
|
||||
|
||||
- For example on Ubuntu Server add `vm.max_map_count = 262144` to the file `/etc/sysctl.conf`.
|
||||
- On Arch based systems create a file `/etc/sysctl.d/max_map_count.conf` with the content `vm.max_map_count = 262144`.
|
||||
- On any other platform look up in the documentation on how to pass kernel parameters.
|
||||
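As an example, the instructions above roughly translate to the following commands on a systemd-based host that reads `/etc/sysctl.d/`; adjust the file name and path for your distribution:

```
# persist the setting and reload all sysctl configuration
echo "vm.max_map_count = 262144" | sudo tee /etc/sysctl.d/max_map_count.conf
sudo sysctl --system

# confirm the value is active
sysctl vm.max_map_count
```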
|
||||
|
||||
### Permissions for elasticsearch
|
||||
If you see a message similar to `Unable to access 'path.repo' (/usr/share/elasticsearch/data/snapshot)` or `failed to obtain node locks, tried [/usr/share/elasticsearch/data]` and `maybe these locations are not writable` when initially starting elasticsearch, that probably means the container is not allowed to write files to the volume.
|
||||
To fix that issue, shut down the container and run the following on your host machine:
|
||||
```
|
||||
chown 1000:0 -R /path/to/mount/point
|
||||
```
|
||||
This will match the permissions with the **UID** and **GID** of the elasticsearch process within the container and should fix the issue.
|
||||
|
||||
|
||||
### Disk usage
|
||||
The Elasticsearch index will turn ***read only*** if the disk usage of the container goes above 95% and will only become writable again once usage drops below 90%. You will see error messages like `disk usage exceeded flood-stage watermark`.
|
||||
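If the index stays locked after you have freed up disk space, you can remove the write block manually. A sketch of the usual remedy, assuming Elasticsearch is reachable on `localhost:9200` with the credentials from your docker-compose file:

```
# lift the read-only block from all indexes once disk usage is back below the watermark
curl -u elastic:yourelasticpassword -X PUT "http://localhost:9200/_all/_settings" \
  -H "Content-Type: application/json" \
  -d '{"index.blocks.read_only_allow_delete": null}'
```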
|
||||
Similarly, Tube Archivist will misbehave in all sorts of ways when it runs out of disk space. There are error messages in the logs when that happens, but it's best to make sure you have enough disk space before you start downloading.
|
||||
|
||||
### `error setting rlimit`
|
||||
If you are seeing errors like `failed to create shim: OCI runtime create failed` and `error during container init: error setting rlimits`, this means Docker can't set these limits, usually because they are already set elsewhere or are incompatible for other reasons. The solution is to remove the `ulimits` key from the ES container in your docker compose file and start again.
|
||||
|
||||
This can happen if you have nested virtualizations, e.g. LXC running Docker in Proxmox.
|
||||
|
||||
## Known limitations
|
||||
- Video files created by Tube Archivist need to be playable in your browser of choice. Not every codec is compatible with every browser and might require some testing with format selection.
|
||||
- Every limitation of **yt-dlp** will also be present in Tube Archivist. If **yt-dlp** can't download or extract a video for any reason, Tube Archivist won't be able to either.
|
||||
- There is no flexibility in naming of the media files.
|
||||
## Getting Started
|
||||
1. Go through the **settings** page and look at the available options. In particular, set *Download Format* to your desired video quality before downloading. **Tube Archivist** downloads the best available quality by default. To support iOS or macOS and some other browsers, a compatible format must be specified. For example:
|
||||
```
|
||||
bestvideo[VCODEC=avc1]+bestaudio[ACODEC=mp4a]/mp4
|
||||
```
|
||||
2. Subscribe to some of your favorite YouTube channels on the **channels** page.
|
||||
3. On the **downloads** page, click on *Rescan subscriptions* to add videos from the subscribed channels to your Download queue or click on *Add to download queue* to manually add Video IDs, links, channels or playlists.
|
||||
4. Click on *Start download* and let **Tube Archivist** do its thing.
|
||||
5. Enjoy your archived collection!
|
||||
|
||||
## Roadmap
|
||||
We have come far; nonetheless, we are not short of ideas on how to improve and extend this project. Ideas waiting to be tackled, in no particular order:
|
||||
|
||||
- [ ] Audio download
|
||||
- [ ] User roles
|
||||
- [ ] Podcast mode to serve channel as mp3
|
||||
- [ ] Random and repeat controls ([#108](https://github.com/tubearchivist/tubearchivist/issues/108), [#220](https://github.com/tubearchivist/tubearchivist/issues/220))
|
||||
- [ ] Implement [PyFilesystem](https://github.com/PyFilesystem/pyfilesystem2) for flexible video storage
|
||||
- [ ] Implement [Apprise](https://github.com/caronc/apprise) for notifications ([#97](https://github.com/tubearchivist/tubearchivist/issues/97))
|
||||
- [ ] User created playlists, random and repeat controls ([#108](https://github.com/tubearchivist/tubearchivist/issues/108), [#220](https://github.com/tubearchivist/tubearchivist/issues/220))
|
||||
- [ ] Auto play or play next link ([#226](https://github.com/tubearchivist/tubearchivist/issues/226))
|
||||
- [ ] Show similar videos on video page
|
||||
- [ ] Multi language support
|
||||
- [ ] Show total video downloaded vs total videos available in channel
|
||||
- [ ] Download or Ignore videos by keyword ([#163](https://github.com/tubearchivist/tubearchivist/issues/163))
|
||||
- [ ] Add statistics of index
|
||||
- [ ] Download speed schedule ([#198](https://github.com/tubearchivist/tubearchivist/issues/198))
|
||||
- [ ] Auto ignore videos by keyword ([#163](https://github.com/tubearchivist/tubearchivist/issues/163))
|
||||
- [ ] Custom searchable notes to videos, channels, playlists ([#144](https://github.com/tubearchivist/tubearchivist/issues/144))
|
||||
- [ ] Search comments
|
||||
- [ ] Search download queue
|
||||
- [ ] Per user videos/channel/playlists
|
||||
- [ ] Download video comments
|
||||
|
||||
Implemented:
|
||||
- [X] Configure shorts, streams and video sizes per channel [2024-07-15]
|
||||
- [X] User created playlists [2024-04-10]
|
||||
- [X] User roles, aka read only user [2023-11-10]
|
||||
- [X] Add statistics of index [2023-09-03]
|
||||
- [X] Implement [Apprise](https://github.com/caronc/apprise) for notifications [2023-08-05]
|
||||
- [X] Download video comments [2022-11-30]
|
||||
- [X] Show similar videos on video page [2022-11-30]
|
||||
- [X] Implement complete offline media file import from json file [2022-08-20]
|
||||
- [X] Filter and query in search form, search by url query [2022-07-23]
|
||||
- [X] Make items in grid row configurable to use more of the screen [2022-06-04]
|
||||
@ -200,20 +211,11 @@ Implemented:
|
||||
- [X] Backup and restore [2021-09-22]
|
||||
- [X] Scan your file system to index already downloaded videos [2021-09-14]
|
||||
|
||||
## User Scripts
|
||||
This is a list of useful user scripts, generously created by folks like you to extend this project and its functionality. Make sure to check the respective repository links for detailed license information.
|
||||
|
||||
This is your time to shine, [read this](https://github.com/tubearchivist/tubearchivist/blob/master/CONTRIBUTING.md#user-scripts) then open a PR to add your script here.
|
||||
|
||||
- [danieljue/ta_dl_page_script](https://github.com/danieljue/ta_dl_page_script): Helper browser script to prioritize a channels' videos in download queue.
|
||||
- [dot-mike/ta-scripts](https://github.com/dot-mike/ta-scripts): A collection of personal scripts for managing TubeArchivist.
|
||||
- [DarkFighterLuke/ta_base_url_nginx](https://gist.github.com/DarkFighterLuke/4561b6bfbf83720493dc59171c58ac36): Set base URL with Nginx when you can't use subdomains.
|
||||
- [lamusmaser/ta_migration_helper](https://github.com/lamusmaser/ta_migration_helper): Advanced helper script for migration issues to TubeArchivist v0.4.4 or later.
|
||||
- [lamusmaser/create_info_json](https://gist.github.com/lamusmaser/837fb58f73ea0cad784a33497932e0dd): Script to generate `.info.json` files using `ffmpeg` collecting information from downloaded videos.
|
||||
- [lamusmaser/ta_fix_for_video_redirection](https://github.com/lamusmaser/ta_fix_for_video_redirection): Script to fix videos that were incorrectly indexed by YouTube's "Video is Unavailable" response.
|
||||
- [RoninTech/ta-helper](https://github.com/RoninTech/ta-helper): Helper script to provide a symlink association to reference TubeArchivist videos with their original titles.
|
||||
- [tangyjoust/Tautulli-Notify-TubeArchivist-of-Plex-Watched-State](https://github.com/tangyjoust/Tautulli-Notify-TubeArchivist-of-Plex-Watched-State): Mark videos watched in Plex (through streaming, not manually) back in TubeArchivist through Tautulli.
|
||||
- [Dhs92/delete_shorts](https://github.com/Dhs92/delete_shorts): A script to delete ALL YouTube Shorts from TubeArchivist
|
||||
|
||||
## Donate
|
||||
The best donation to **Tube Archivist** is your time, take a look at the [contribution page](CONTRIBUTING.md) to get started.
|
||||
@ -223,18 +225,11 @@ Second best way to support the development is to provide for caffeinated beverag
|
||||
* [Paypal Subscription](https://www.paypal.com/webapps/billing/plans/subscribe?plan_id=P-03770005GR991451KMFGVPMQ) for a monthly coffee
|
||||
* [ko-fi.com](https://ko-fi.com/bbilly1) for an alternative platform
|
||||
|
||||
## Notable mentions
|
||||
This is a selection of places where this project has been featured on reddit, in the news, blogs or any other online media, newest on top.
|
||||
* **xda-developers.com**: 5 obscure self-hosted services worth checking out - Tube Archivist - To save your essential YouTube videos, [2024-10-13][[link](https://www.xda-developers.com/obscure-self-hosted-services/)]
|
||||
* **selfhosted.show**: why we're trying Tube Archivist, [2024-06-14][[link](https://selfhosted.show/125)]
|
||||
* **ycombinator**: Tube Archivist on Hackernews front page, [2023-07-16][[link](https://news.ycombinator.com/item?id=36744395)]
|
||||
* **linux-community.de**: Tube Archivist bringt Ordnung in die Youtube-Sammlung, [German][2023-05-01][[link](https://www.linux-community.de/ausgaben/linuxuser/2023/05/tube-archivist-bringt-ordnung-in-die-youtube-sammlung/)]
|
||||
* **noted.lol**: Dev Debrief, An Interview With the Developer of Tube Archivist, [2023-03-30] [[link](https://noted.lol/dev-debrief-tube-archivist/)]
|
||||
* **console.substack.com**: Interview With Simon of Tube Archivist, [2023-01-29] [[link](https://console.substack.com/p/console-142#%C2%A7interview-with-simon-of-tube-archivist)]
|
||||
* **reddit.com**: Tube Archivist v0.3.0 - Now Archiving Comments, [2022-12-02] [[link](https://www.reddit.com/r/selfhosted/comments/zaonzp/tube_archivist_v030_now_archiving_comments/)]
|
||||
* **reddit.com**: Tube Archivist v0.2 - Now with Full Text Search, [2022-07-24] [[link](https://www.reddit.com/r/selfhosted/comments/w6jfa1/tube_archivist_v02_now_with_full_text_search/)]
|
||||
* **noted.lol**: How I Control What Media My Kids Watch Using Tube Archivist, [2022-03-27] [[link](https://noted.lol/how-i-control-what-media-my-kids-watch-using-tube-archivist/)]
|
||||
* **thehomelab.wiki**: Tube Archivist - A Youtube-DL Alternative on Steroids, [2022-01-27] [[link](https://thehomelab.wiki/books/news/page/tube-archivist-a-youtube-dl-alternative-on-steroids)]
|
||||
* **reddit.com**: Celebrating TubeArchivist v0.1, [2022-01-09] [[link](https://www.reddit.com/r/selfhosted/comments/rzh084/celebrating_tubearchivist_v01/)]
|
||||
* **linuxunplugged.com**: Pick: tubearchivist — Your self-hosted YouTube media server, [2021-09-11] [[link](https://linuxunplugged.com/425)] and [2021-10-05] [[link](https://linuxunplugged.com/426)]
|
||||
* **reddit.com**: Introducing Tube Archivist, your self hosted Youtube media server, [2021-09-12] [[link](https://www.reddit.com/r/selfhosted/comments/pmj07b/introducing_tube_archivist_your_self_hosted/)]
|
||||
|
||||
## Sponsor
|
||||
Big thank you to [DigitalOcean](https://www.digitalocean.com/) for generously donating credit for the tubearchivist.com VPS and buildserver.
|
||||
<p>
|
||||
<a href="https://www.digitalocean.com/">
|
||||
<img src="https://opensource.nyc3.cdn.digitaloceanspaces.com/attribution/assets/PoweredByDO/DO_Powered_by_Badge_blue.svg" width="201px">
|
||||
</a>
|
||||
</p>
|
||||
|
25
SHOWCASE.MD
@ -1,25 +0,0 @@
|
||||
## Tube Archivist on YouTube
|
||||
[](https://www.youtube.com/watch?v=O8H8Z01c0Ys)
|
||||
Video featuring Tube Archivist generously created by [IBRACORP](https://www.youtube.com/@IBRACORP).
|
||||
|
||||
## Screenshots
|
||||

|
||||
*Login Page*: Secure way to access your media collection.
|
||||
|
||||

|
||||
*Home Page*: Your recent videos, continue watching incomplete videos.
|
||||
|
||||

|
||||
*All Channels*: A list of all your indexed channels, filtered by subscribed only.
|
||||
|
||||

|
||||
*Single Channel*: Single channel page with additional metadata and sub pages.
|
||||
|
||||

|
||||
*Video Page*: Stream your video directly from the interface.
|
||||
|
||||

|
||||
*Downloads Page*: Add, control, and monitor your download queue.
|
||||
|
||||

|
||||
*Search Page*: Use expressions to quickly search through your collection.
|
BIN
assets/tube-archivist-banner.jpg
Normal file
After Width: | Height: | Size: 49 KiB |
Before Width: | Height: | Size: 516 KiB |
Before Width: | Height: | Size: 541 KiB |
Before Width: | Height: | Size: 1.6 MiB |
Before Width: | Height: | Size: 578 KiB |
Before Width: | Height: | Size: 106 KiB |
@ -1,79 +0,0 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<svg id="Layer_1" xmlns="http://www.w3.org/2000/svg" version="1.1" xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 0 1000 1000">
|
||||
<!-- Generator: Adobe Illustrator 29.5.0, SVG Export Plug-In . SVG Version: 2.1.0 Build 137) -->
|
||||
<defs>
|
||||
<style>
|
||||
.st0 {
|
||||
fill: #fff;
|
||||
}
|
||||
|
||||
.st1 {
|
||||
fill: #039a86;
|
||||
}
|
||||
|
||||
.st2 {
|
||||
fill: none;
|
||||
}
|
||||
|
||||
.st3 {
|
||||
clip-path: url(#clippath-1);
|
||||
}
|
||||
|
||||
.st4 {
|
||||
fill: #06131a;
|
||||
}
|
||||
|
||||
.st5 {
|
||||
clip-path: url(#clippath-3);
|
||||
}
|
||||
|
||||
.st6 {
|
||||
display: none;
|
||||
}
|
||||
|
||||
.st7 {
|
||||
clip-path: url(#clippath-2);
|
||||
}
|
||||
|
||||
.st8 {
|
||||
clip-path: url(#clippath);
|
||||
}
|
||||
</style>
|
||||
<clipPath id="clippath">
|
||||
<rect class="st2" x="25.6" y="22.9" width="948.9" height="954.2"/>
|
||||
</clipPath>
|
||||
<clipPath id="clippath-1">
|
||||
<rect class="st2" x="25.6" y="22.9" width="948.9" height="954.2"/>
|
||||
</clipPath>
|
||||
<clipPath id="clippath-2">
|
||||
<rect class="st2" x="25.6" y="22.9" width="948.9" height="954.2"/>
|
||||
</clipPath>
|
||||
<clipPath id="clippath-3">
|
||||
<rect class="st2" x="25.6" y="22.9" width="948.9" height="954.2"/>
|
||||
</clipPath>
|
||||
</defs>
|
||||
<g id="Artwork_1" class="st6">
|
||||
<g class="st8">
|
||||
<g class="st3">
|
||||
<path class="st1" d="M447.2,22.9v15.2C269.3,59.3,118.8,179.4,58.6,348.1l76,21.8c49.9-135.2,169.9-232.2,312.6-252.7v15.4h35.3s0-109.7,0-109.7h-35.3ZM523,34.5v79.1c142.3,7.7,269.2,91.9,331.7,219.9l-14.8,4.2,9.7,33.7,106.6-30.3-9.7-33.9-14.9,4.3c-73.1-161.9-231-269-408.5-277M957.6,382.9l-75.8,21.7c8.9,32.9,13.6,66.8,13.8,100.8-.2,103.8-41.6,203.3-114.9,276.8l-9.4-12.6-28.6,20.8,11.9,16,46.5,64,6.6,9.1,28.6-20.8-8.8-12.1c93.6-88.8,146.7-212.1,147-341.1-.2-41.4-5.9-82.6-16.8-122.6M35.3,383.5l-9.7,33.9,14,4c-5.3,27.7-8.1,55.8-8.4,84,0,145.5,67.3,282.8,182.1,372.1l46.5-64c-94.4-74.4-149.6-187.9-149.8-308.1.3-20.8,2.2-41.6,5.8-62.1l15.1,4.1,9.7-33.9-17.9-4.9-75.7-21.7-11.6-3.3ZM303.8,820.6l-64.8,88.8,28.6,20.8,8.5-11.7c69.4,38.3,147.4,58.5,226.7,58.7,94.9,0,187.7-28.7,266.1-82.2l-46.6-64.1c-64.8,43.9-141.2,67.3-219.5,67.5-62.6-.3-124.2-15.5-179.8-44.4l9.4-12.6-28.6-20.8Z"/>
|
||||
<polygon class="st4" points="114.9 238.4 115.1 324.3 261.3 324.3 261.1 458.5 351.9 458.5 352.1 324.3 495.9 324.3 495.6 238 114.9 238.4"/>
|
||||
<rect class="st4" x="261.1" y="554.4" width="90.8" height="200.1"/>
|
||||
<polygon class="st4" points="622.7 244.2 429.6 754.5 526.4 754.4 666.6 361.6 806 754.4 902.9 754.4 710.4 244.2 622.7 244.2"/>
|
||||
<path class="st1" d="M255.5,476.4c-16.5,0-29.9,13.6-29.9,30.1.2,17.6,16.1,30.1,30,30.1,34.5,0,69.9,0,103.3,0,16.1,0,28.9-14,28.9-30.1,0-16.1-12.2-30.1-28.8-30.1-35.8,0-72.8,0-103.4,0"/>
|
||||
<path class="st1" d="M665.5,483.6c-16.1,0-29.8,12.2-29.8,28.8v172l-37.8-38.9-25,24.5,92.2,93.8,94.3-93.8-25-24.5-38.9,38.9c0-23.6,0-40.8,0-68.6-.3-34.5,0-69,0-103.6,0-16.1-13.7-28.6-29.8-28.6h0Z"/>
|
||||
</g>
|
||||
</g>
|
||||
</g>
|
||||
<g id="Artwork_2">
|
||||
<g class="st7">
|
||||
<g class="st5">
|
||||
<path class="st1" d="M447.2,22.9v15.2C269.3,59.3,118.8,179.4,58.6,348.1l76,21.8c49.9-135.2,169.9-232.2,312.6-252.7v15.4h35.3s0-109.7,0-109.7h-35.3ZM523,34.5v79.1c142.3,7.7,269.2,91.9,331.7,219.9l-14.8,4.2,9.7,33.7,106.6-30.3-9.7-33.9-14.9,4.3c-73.1-161.9-231-269-408.5-277M957.6,382.9l-75.8,21.7c8.9,32.9,13.6,66.8,13.8,100.8-.2,103.8-41.6,203.3-114.9,276.8l-9.4-12.6-28.6,20.8,11.9,16,46.5,64,6.6,9.1,28.6-20.8-8.8-12.1c93.6-88.8,146.7-212.1,147-341.1-.2-41.4-5.9-82.6-16.8-122.6M35.3,383.5l-9.7,33.9,14,4c-5.3,27.7-8.1,55.8-8.4,84,0,145.5,67.3,282.8,182.1,372.1l46.5-64c-94.4-74.4-149.6-187.9-149.8-308.1.3-20.8,2.2-41.6,5.8-62.1l15.1,4.1,9.7-33.9-17.9-4.9-75.7-21.7-11.6-3.3ZM303.8,820.6l-64.8,88.8,28.6,20.8,8.5-11.7c69.4,38.3,147.4,58.5,226.7,58.7,94.9,0,187.7-28.7,266.1-82.2l-46.6-64.1c-64.8,43.9-141.2,67.3-219.5,67.5-62.6-.3-124.2-15.5-179.8-44.4l9.4-12.6-28.6-20.8Z"/>
|
||||
<polygon class="st0" points="114.9 238.4 115.1 324.3 261.3 324.3 261.1 458.5 351.9 458.5 352.1 324.3 495.9 324.3 495.6 238 114.9 238.4"/>
|
||||
<rect class="st0" x="261.1" y="554.4" width="90.8" height="200.1"/>
|
||||
<polygon class="st0" points="622.7 244.2 429.6 754.5 526.4 754.4 666.6 361.6 806 754.4 902.9 754.4 710.4 244.2 622.7 244.2"/>
|
||||
<path class="st1" d="M255.5,476.4c-16.5,0-29.9,13.6-29.9,30.1.2,17.6,16.1,30.1,30,30.1,34.5,0,69.9,0,103.3,0,16.1,0,28.9-14,28.9-30.1,0-16.1-12.2-30.1-28.8-30.1-35.8,0-72.8,0-103.4,0"/>
|
||||
<path class="st1" d="M665.5,483.6c-16.1,0-29.8,12.2-29.8,28.8v172l-37.8-38.9-25,24.5,92.2,93.8,94.3-93.8-25-24.5-38.9,38.9c0-23.6,0-40.8,0-68.6-.3-34.5,0-69,0-103.6,0-16.1-13.7-28.6-29.8-28.6h0Z"/>
|
||||
</g>
|
||||
</g>
|
||||
</g>
|
||||
</svg>
|
Before Width: | Height: | Size: 4.6 KiB |
@ -1,79 +0,0 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<svg id="Layer_1" xmlns="http://www.w3.org/2000/svg" version="1.1" xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 0 1000 1000">
|
||||
<!-- Generator: Adobe Illustrator 29.5.0, SVG Export Plug-In . SVG Version: 2.1.0 Build 137) -->
|
||||
<defs>
|
||||
<style>
|
||||
.st0 {
|
||||
fill: #fff;
|
||||
}
|
||||
|
||||
.st1 {
|
||||
fill: #039a86;
|
||||
}
|
||||
|
||||
.st2 {
|
||||
fill: none;
|
||||
}
|
||||
|
||||
.st3 {
|
||||
clip-path: url(#clippath-1);
|
||||
}
|
||||
|
||||
.st4 {
|
||||
fill: #06131a;
|
||||
}
|
||||
|
||||
.st5 {
|
||||
clip-path: url(#clippath-3);
|
||||
}
|
||||
|
||||
.st6 {
|
||||
display: none;
|
||||
}
|
||||
|
||||
.st7 {
|
||||
clip-path: url(#clippath-2);
|
||||
}
|
||||
|
||||
.st8 {
|
||||
clip-path: url(#clippath);
|
||||
}
|
||||
</style>
|
||||
<clipPath id="clippath">
|
||||
<rect class="st2" x="25.6" y="22.9" width="948.9" height="954.2"/>
|
||||
</clipPath>
|
||||
<clipPath id="clippath-1">
|
||||
<rect class="st2" x="25.6" y="22.9" width="948.9" height="954.2"/>
|
||||
</clipPath>
|
||||
<clipPath id="clippath-2">
|
||||
<rect class="st2" x="25.6" y="22.9" width="948.9" height="954.2"/>
|
||||
</clipPath>
|
||||
<clipPath id="clippath-3">
|
||||
<rect class="st2" x="25.6" y="22.9" width="948.9" height="954.2"/>
|
||||
</clipPath>
|
||||
</defs>
|
||||
<g id="Artwork_1">
|
||||
<g class="st8">
|
||||
<g class="st3">
|
||||
<path class="st1" d="M447.2,22.9v15.2C269.3,59.3,118.8,179.4,58.6,348.1l76,21.8c49.9-135.2,169.9-232.2,312.6-252.7v15.4h35.3s0-109.7,0-109.7h-35.3ZM523,34.5v79.1c142.3,7.7,269.2,91.9,331.7,219.9l-14.8,4.2,9.7,33.7,106.6-30.3-9.7-33.9-14.9,4.3c-73.1-161.9-231-269-408.5-277M957.6,382.9l-75.8,21.7c8.9,32.9,13.6,66.8,13.8,100.8-.2,103.8-41.6,203.3-114.9,276.8l-9.4-12.6-28.6,20.8,11.9,16,46.5,64,6.6,9.1,28.6-20.8-8.8-12.1c93.6-88.8,146.7-212.1,147-341.1-.2-41.4-5.9-82.6-16.8-122.6M35.3,383.5l-9.7,33.9,14,4c-5.3,27.7-8.1,55.8-8.4,84,0,145.5,67.3,282.8,182.1,372.1l46.5-64c-94.4-74.4-149.6-187.9-149.8-308.1.3-20.8,2.2-41.6,5.8-62.1l15.1,4.1,9.7-33.9-17.9-4.9-75.7-21.7-11.6-3.3ZM303.8,820.6l-64.8,88.8,28.6,20.8,8.5-11.7c69.4,38.3,147.4,58.5,226.7,58.7,94.9,0,187.7-28.7,266.1-82.2l-46.6-64.1c-64.8,43.9-141.2,67.3-219.5,67.5-62.6-.3-124.2-15.5-179.8-44.4l9.4-12.6-28.6-20.8Z"/>
|
||||
<polygon class="st4" points="114.9 238.4 115.1 324.3 261.3 324.3 261.1 458.5 351.9 458.5 352.1 324.3 495.9 324.3 495.6 238 114.9 238.4"/>
|
||||
<rect class="st4" x="261.1" y="554.4" width="90.8" height="200.1"/>
|
||||
<polygon class="st4" points="622.7 244.2 429.6 754.5 526.4 754.4 666.6 361.6 806 754.4 902.9 754.4 710.4 244.2 622.7 244.2"/>
|
||||
<path class="st1" d="M255.5,476.4c-16.5,0-29.9,13.6-29.9,30.1.2,17.6,16.1,30.1,30,30.1,34.5,0,69.9,0,103.3,0,16.1,0,28.9-14,28.9-30.1,0-16.1-12.2-30.1-28.8-30.1-35.8,0-72.8,0-103.4,0"/>
|
||||
<path class="st1" d="M665.5,483.6c-16.1,0-29.8,12.2-29.8,28.8v172l-37.8-38.9-25,24.5,92.2,93.8,94.3-93.8-25-24.5-38.9,38.9c0-23.6,0-40.8,0-68.6-.3-34.5,0-69,0-103.6,0-16.1-13.7-28.6-29.8-28.6h0Z"/>
|
||||
</g>
|
||||
</g>
|
||||
</g>
|
||||
<g id="Artwork_2" class="st6">
|
||||
<g class="st7">
|
||||
<g class="st5">
|
||||
<path class="st1" d="M447.2,22.9v15.2C269.3,59.3,118.8,179.4,58.6,348.1l76,21.8c49.9-135.2,169.9-232.2,312.6-252.7v15.4h35.3s0-109.7,0-109.7h-35.3ZM523,34.5v79.1c142.3,7.7,269.2,91.9,331.7,219.9l-14.8,4.2,9.7,33.7,106.6-30.3-9.7-33.9-14.9,4.3c-73.1-161.9-231-269-408.5-277M957.6,382.9l-75.8,21.7c8.9,32.9,13.6,66.8,13.8,100.8-.2,103.8-41.6,203.3-114.9,276.8l-9.4-12.6-28.6,20.8,11.9,16,46.5,64,6.6,9.1,28.6-20.8-8.8-12.1c93.6-88.8,146.7-212.1,147-341.1-.2-41.4-5.9-82.6-16.8-122.6M35.3,383.5l-9.7,33.9,14,4c-5.3,27.7-8.1,55.8-8.4,84,0,145.5,67.3,282.8,182.1,372.1l46.5-64c-94.4-74.4-149.6-187.9-149.8-308.1.3-20.8,2.2-41.6,5.8-62.1l15.1,4.1,9.7-33.9-17.9-4.9-75.7-21.7-11.6-3.3ZM303.8,820.6l-64.8,88.8,28.6,20.8,8.5-11.7c69.4,38.3,147.4,58.5,226.7,58.7,94.9,0,187.7-28.7,266.1-82.2l-46.6-64.1c-64.8,43.9-141.2,67.3-219.5,67.5-62.6-.3-124.2-15.5-179.8-44.4l9.4-12.6-28.6-20.8Z"/>
|
||||
<polygon class="st0" points="114.9 238.4 115.1 324.3 261.3 324.3 261.1 458.5 351.9 458.5 352.1 324.3 495.9 324.3 495.6 238 114.9 238.4"/>
|
||||
<rect class="st0" x="261.1" y="554.4" width="90.8" height="200.1"/>
|
||||
<polygon class="st0" points="622.7 244.2 429.6 754.5 526.4 754.4 666.6 361.6 806 754.4 902.9 754.4 710.4 244.2 622.7 244.2"/>
|
||||
<path class="st1" d="M255.5,476.4c-16.5,0-29.9,13.6-29.9,30.1.2,17.6,16.1,30.1,30,30.1,34.5,0,69.9,0,103.3,0,16.1,0,28.9-14,28.9-30.1,0-16.1-12.2-30.1-28.8-30.1-35.8,0-72.8,0-103.4,0"/>
|
||||
<path class="st1" d="M665.5,483.6c-16.1,0-29.8,12.2-29.8,28.8v172l-37.8-38.9-25,24.5,92.2,93.8,94.3-93.8-25-24.5-38.9,38.9c0-23.6,0-40.8,0-68.6-.3-34.5,0-69,0-103.6,0-16.1-13.7-28.6-29.8-28.6h0Z"/>
|
||||
</g>
|
||||
</g>
|
||||
</g>
|
||||
</svg>
|
Before Width: | Height: | Size: 4.6 KiB |
BIN
assets/tube-archivist-screenshot-channels.png
Normal file
After Width: | Height: | Size: 131 KiB |
BIN
assets/tube-archivist-screenshot-download.png
Normal file
After Width: | Height: | Size: 79 KiB |
BIN
assets/tube-archivist-screenshot-home.png
Normal file
After Width: | Height: | Size: 174 KiB |
BIN
assets/tube-archivist-screenshot-single-channel.png
Normal file
After Width: | Height: | Size: 166 KiB |
BIN
assets/tube-archivist-screenshot-video.png
Normal file
After Width: | Height: | Size: 238 KiB |
Before Width: | Height: | Size: 96 KiB |
Before Width: | Height: | Size: 716 KiB |
Before Width: | Height: | Size: 684 KiB |
@ -1,86 +0,0 @@
|
||||
# Django Setup
|
||||
|
||||
## Apps
|
||||
The backend is split up into the following apps.
|
||||
|
||||
### config
|
||||
Root Django App. Doesn't define any views.
|
||||
|
||||
- Has main `settings.py`
|
||||
- Has main `urls.py` responsible for routing to other apps
|
||||
|
||||
### common
|
||||
Functionality shared between apps.
|
||||
|
||||
Defines views on the root `/api/*` path. Has base views to inherit from.
|
||||
|
||||
- Connections to ES and Redis
|
||||
- Searching
|
||||
- URL parser
|
||||
- Collection of helper functions
|
||||
|
||||
### appsettings
|
||||
Responsible for functionality from the settings pages.
|
||||
|
||||
Defines views at `/api/appsettings/*`.
|
||||
|
||||
- Index setup
|
||||
- Reindexing
|
||||
- Snapshots
|
||||
- Filesystem Scan
|
||||
- Manual import
|
||||
|
||||
### channel
|
||||
Responsible for Channel Indexing functionality.
|
||||
|
||||
Defines views at `/api/channel/*` path.
|
||||
|
||||
### download
|
||||
Implements download functionality with yt-dlp.
|
||||
|
||||
Defines views at `/api/download/*`.
|
||||
|
||||
- Download videos
|
||||
- Queue management
|
||||
- Thumbnails
|
||||
- Subscriptions
|
||||
|
||||
### playlist
|
||||
Implements playlist functionality.
|
||||
|
||||
Defines views at `/api/playlist/*`.
|
||||
|
||||
- Index Playlists
|
||||
- Manual Playlists
|
||||
|
||||
### stats
|
||||
Builds aggregation views for the statistics dashboard.
|
||||
|
||||
Defines views at `/api/stats/*`.
|
||||
|
||||
### task
|
||||
Defines tasks for Celery.
|
||||
|
||||
Defines views at `/api/task/*`.
|
||||
|
||||
- Has main `tasks.py` with all shared_task definitions
|
||||
- Has `CustomPeriodicTask` model
|
||||
- Implements Apprise notification links
|
||||
- Implements schedule functionality
|
||||
|
||||
### user
|
||||
Implements user and auth functionality.
|
||||
|
||||
Defines views at `/api/config/*`.
|
||||
|
||||
- Defines custom `Account` model
|
||||
|
||||
### video
|
||||
Index functionality for videos.
|
||||
|
||||
Defines views at `/api/video/*`.
|
||||
|
||||
- Index videos
|
||||
- Index comments
|
||||
- Index/download subtitles
|
||||
- Media stream parsing
|
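As an illustration of how these app-level routes are consumed, a request against the video index could look like the following. This is only a sketch: the host, port and token value are placeholders, and the exact authentication setup may differ depending on your deployment:

```
# list indexed videos through the API, authenticating with an API token
curl -H "Authorization: Token your-api-token" "http://localhost:8000/api/video/"
```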
@ -1,133 +0,0 @@
|
||||
"""appsettings erializers"""
|
||||
|
||||
# pylint: disable=abstract-method
|
||||
|
||||
from common.serializers import ValidateUnknownFieldsMixin
|
||||
from rest_framework import serializers
|
||||
|
||||
|
||||
class BackupFileSerializer(serializers.Serializer):
|
||||
"""serialize backup file"""
|
||||
|
||||
filename = serializers.CharField()
|
||||
file_path = serializers.CharField()
|
||||
file_size = serializers.IntegerField()
|
||||
timestamp = serializers.CharField()
|
||||
reason = serializers.CharField()
|
||||
|
||||
|
||||
class AppConfigSubSerializer(
|
||||
ValidateUnknownFieldsMixin, serializers.Serializer
|
||||
):
|
||||
"""serialize app config subscriptions"""
|
||||
|
||||
channel_size = serializers.IntegerField(required=False)
|
||||
live_channel_size = serializers.IntegerField(required=False)
|
||||
shorts_channel_size = serializers.IntegerField(required=False)
|
||||
auto_start = serializers.BooleanField(required=False)
|
||||
|
||||
|
||||
class AppConfigDownloadsSerializer(
|
||||
ValidateUnknownFieldsMixin, serializers.Serializer
|
||||
):
|
||||
"""serialize app config downloads config"""
|
||||
|
||||
limit_speed = serializers.IntegerField(allow_null=True)
|
||||
sleep_interval = serializers.IntegerField(allow_null=True)
|
||||
autodelete_days = serializers.IntegerField(allow_null=True)
|
||||
format = serializers.CharField(allow_null=True)
|
||||
format_sort = serializers.CharField(allow_null=True)
|
||||
add_metadata = serializers.BooleanField()
|
||||
add_thumbnail = serializers.BooleanField()
|
||||
subtitle = serializers.CharField(allow_null=True)
|
||||
subtitle_source = serializers.ChoiceField(
|
||||
choices=["auto", "user"], allow_null=True
|
||||
)
|
||||
subtitle_index = serializers.BooleanField()
|
||||
comment_max = serializers.CharField(allow_null=True)
|
||||
comment_sort = serializers.ChoiceField(
|
||||
choices=["top", "new"], allow_null=True
|
||||
)
|
||||
cookie_import = serializers.BooleanField()
|
||||
potoken = serializers.BooleanField()
|
||||
throttledratelimit = serializers.IntegerField(allow_null=True)
|
||||
extractor_lang = serializers.CharField(allow_null=True)
|
||||
integrate_ryd = serializers.BooleanField()
|
||||
integrate_sponsorblock = serializers.BooleanField()
|
||||
|
||||
|
||||
class AppConfigAppSerializer(
|
||||
ValidateUnknownFieldsMixin, serializers.Serializer
|
||||
):
|
||||
"""serialize app config"""
|
||||
|
||||
enable_snapshot = serializers.BooleanField()
|
||||
enable_cast = serializers.BooleanField()
|
||||
|
||||
|
||||
class AppConfigSerializer(ValidateUnknownFieldsMixin, serializers.Serializer):
|
||||
"""serialize appconfig"""
|
||||
|
||||
subscriptions = AppConfigSubSerializer(required=False)
|
||||
downloads = AppConfigDownloadsSerializer(required=False)
|
||||
application = AppConfigAppSerializer(required=False)
|
||||
|
||||
|
||||
class CookieValidationSerializer(serializers.Serializer):
|
||||
"""serialize cookie validation response"""
|
||||
|
||||
cookie_enabled = serializers.BooleanField()
|
||||
status = serializers.BooleanField(required=False)
|
||||
validated = serializers.IntegerField(required=False)
|
||||
validated_str = serializers.CharField(required=False)
|
||||
|
||||
|
||||
class CookieUpdateSerializer(serializers.Serializer):
|
||||
"""serialize cookie to update"""
|
||||
|
||||
cookie = serializers.CharField()
|
||||
|
||||
|
||||
class PoTokenSerializer(serializers.Serializer):
|
||||
"""serialize PO token"""
|
||||
|
||||
potoken = serializers.CharField()
|
||||
|
||||
|
||||
class SnapshotItemSerializer(serializers.Serializer):
|
||||
"""serialize snapshot response"""
|
||||
|
||||
id = serializers.CharField()
|
||||
state = serializers.CharField()
|
||||
es_version = serializers.CharField()
|
||||
start_date = serializers.CharField()
|
||||
end_date = serializers.CharField()
|
||||
end_stamp = serializers.IntegerField()
|
||||
duration_s = serializers.IntegerField()
|
||||
|
||||
|
||||
class SnapshotListSerializer(serializers.Serializer):
|
||||
"""serialize snapshot list response"""
|
||||
|
||||
next_exec = serializers.IntegerField()
|
||||
next_exec_str = serializers.CharField()
|
||||
expire_after = serializers.CharField()
|
||||
snapshots = SnapshotItemSerializer(many=True)
|
||||
|
||||
|
||||
class SnapshotCreateResponseSerializer(serializers.Serializer):
|
||||
"""serialize new snapshot creating response"""
|
||||
|
||||
snapshot_name = serializers.CharField()
|
||||
|
||||
|
||||
class SnapshotRestoreResponseSerializer(serializers.Serializer):
|
||||
"""serialize snapshot restore response"""
|
||||
|
||||
accepted = serializers.BooleanField()
|
||||
|
||||
|
||||
class TokenResponseSerializer(serializers.Serializer):
|
||||
"""serialize token response"""
|
||||
|
||||
token = serializers.CharField()
|
@ -1,273 +0,0 @@
|
||||
"""
|
||||
Functionality:
|
||||
- Handle json zip file based backup
|
||||
- create backup
|
||||
- restore backup
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
import zipfile
|
||||
from datetime import datetime
|
||||
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.es_connect import ElasticWrap, IndexPaginate
|
||||
from common.src.helper import get_mapping, ignore_filelist
|
||||
from task.models import CustomPeriodicTask
|
||||
|
||||
|
||||
class ElasticBackup:
|
||||
"""dump index to nd-json files for later bulk import"""
|
||||
|
||||
INDEX_SPLIT = ["comment"]
|
||||
CACHE_DIR = EnvironmentSettings.CACHE_DIR
|
||||
BACKUP_DIR = os.path.join(CACHE_DIR, "backup")
|
||||
|
||||
def __init__(self, reason=False, task=False) -> None:
|
||||
self.timestamp = datetime.now().strftime("%Y%m%d")
|
||||
self.index_config = get_mapping()
|
||||
self.reason = reason
|
||||
self.task = task
|
||||
|
||||
def backup_all_indexes(self):
|
||||
"""backup all indexes, add reason to init"""
|
||||
print("backup all indexes")
|
||||
if not self.reason:
|
||||
raise ValueError("missing backup reason in ElasticBackup")
|
||||
|
||||
if self.task:
|
||||
self.task.send_progress(["Scanning your index."])
|
||||
for index in self.index_config:
|
||||
index_name = index["index_name"]
|
||||
print(f"backup: export in progress for {index_name}")
|
||||
if not self.index_exists(index_name):
|
||||
print(f"skip backup for not yet existing index {index_name}")
|
||||
continue
|
||||
|
||||
self.backup_index(index_name)
|
||||
|
||||
if self.task:
|
||||
self.task.send_progress(["Compress files to zip archive."])
|
||||
self.zip_it()
|
||||
if self.reason == "auto":
|
||||
self.rotate_backup()
|
||||
|
||||
def backup_index(self, index_name):
|
||||
"""export all documents of a single index"""
|
||||
paginate_kwargs = {
|
||||
"data": {"query": {"match_all": {}}},
|
||||
"keep_source": True,
|
||||
"callback": BackupCallback,
|
||||
"task": self.task,
|
||||
"total": self._get_total(index_name),
|
||||
}
|
||||
|
||||
if index_name in self.INDEX_SPLIT:
|
||||
paginate_kwargs.update({"size": 200})
|
||||
|
||||
paginate = IndexPaginate(f"ta_{index_name}", **paginate_kwargs)
|
||||
_ = paginate.get_results()
|
||||
|
||||
@staticmethod
|
||||
def _get_total(index_name):
|
||||
"""get total documents in index"""
|
||||
path = f"ta_{index_name}/_count"
|
||||
response, _ = ElasticWrap(path).get()
|
||||
|
||||
return response.get("count")
|
||||
|
||||
def zip_it(self):
|
||||
"""pack it up into single zip file"""
|
||||
file_name = f"ta_backup-{self.timestamp}-{self.reason}.zip"
|
||||
|
||||
to_backup = []
|
||||
for file in os.listdir(self.BACKUP_DIR):
|
||||
if file.endswith(".json"):
|
||||
to_backup.append(os.path.join(self.BACKUP_DIR, file))
|
||||
|
||||
backup_file = os.path.join(self.BACKUP_DIR, file_name)
|
||||
|
||||
comp = zipfile.ZIP_DEFLATED
|
||||
with zipfile.ZipFile(backup_file, "w", compression=comp) as zip_f:
|
||||
for backup_file in to_backup:
|
||||
zip_f.write(backup_file, os.path.basename(backup_file))
|
||||
|
||||
# cleanup
|
||||
for backup_file in to_backup:
|
||||
os.remove(backup_file)
|
||||
|
||||
def post_bulk_restore(self, file_name):
|
||||
"""send bulk to es"""
|
||||
file_path = os.path.join(self.CACHE_DIR, file_name)
|
||||
with open(file_path, "r", encoding="utf-8") as f:
|
||||
data = f.read()
|
||||
|
||||
if not data.strip():
|
||||
return
|
||||
|
||||
_, _ = ElasticWrap("_bulk").post(data=data, ndjson=True)
|
||||
|
||||
def get_all_backup_files(self):
|
||||
"""build all available backup files for view"""
|
||||
all_backup_files = ignore_filelist(os.listdir(self.BACKUP_DIR))
|
||||
all_available_backups = [
|
||||
i
|
||||
for i in all_backup_files
|
||||
if i.startswith("ta_") and i.endswith(".zip")
|
||||
]
|
||||
all_available_backups.sort(reverse=True)
|
||||
|
||||
backup_dicts = []
|
||||
for filename in all_available_backups:
|
||||
data = self.build_backup_file_data(filename)
|
||||
backup_dicts.append(data)
|
||||
|
||||
return backup_dicts
|
||||
|
||||
def build_backup_file_data(self, filename):
|
||||
"""build metadata of single backup file"""
|
||||
file_path = os.path.join(self.BACKUP_DIR, filename)
|
||||
if not os.path.exists(file_path):
|
||||
return False
|
||||
|
||||
file_split = filename.split("-")
|
||||
if len(file_split) == 2:
|
||||
timestamp = file_split[1].strip(".zip")
|
||||
reason = False
|
||||
elif len(file_split) == 3:
|
||||
timestamp = file_split[1]
|
||||
reason = file_split[2].strip(".zip")
|
||||
else:
|
||||
raise ValueError
|
||||
|
||||
data = {
|
||||
"filename": filename,
|
||||
"file_path": file_path,
|
||||
"file_size": os.path.getsize(file_path),
|
||||
"timestamp": timestamp,
|
||||
"reason": reason,
|
||||
}
|
||||
|
||||
return data
|
||||
|
||||
def restore(self, filename):
|
||||
"""
|
||||
restore from backup zip file
|
||||
call reset from ElasitIndexWrap first to start blank
|
||||
"""
|
||||
zip_content = self._unpack_zip_backup(filename)
|
||||
self._restore_json_files(zip_content)
|
||||
|
||||
def _unpack_zip_backup(self, filename):
|
||||
"""extract backup zip and return filelist"""
|
||||
file_path = os.path.join(self.BACKUP_DIR, filename)
|
||||
|
||||
with zipfile.ZipFile(file_path, "r") as z:
|
||||
zip_content = z.namelist()
|
||||
z.extractall(self.BACKUP_DIR)
|
||||
|
||||
return zip_content
|
||||
|
||||
def _restore_json_files(self, zip_content):
|
||||
"""go through the unpacked files and restore"""
|
||||
for idx, json_f in enumerate(zip_content):
|
||||
self._notify_restore(idx, json_f, len(zip_content))
|
||||
file_name = os.path.join(self.BACKUP_DIR, json_f)
|
||||
|
||||
if not json_f.startswith("es_") or not json_f.endswith(".json"):
|
||||
os.remove(file_name)
|
||||
continue
|
||||
|
||||
print("restoring: " + json_f)
|
||||
self.post_bulk_restore(file_name)
|
||||
os.remove(file_name)
|
||||
|
||||
def _notify_restore(self, idx, json_f, total_files):
|
||||
"""notify restore progress"""
|
||||
message = [f"Restore index from json backup file {json_f}."]
|
||||
progress = (idx + 1) / total_files
|
||||
self.task.send_progress(message_lines=message, progress=progress)
|
||||
|
||||
@staticmethod
|
||||
def index_exists(index_name):
|
||||
"""check if index already exists to skip"""
|
||||
_, status_code = ElasticWrap(f"ta_{index_name}").get()
|
||||
exists = status_code == 200
|
||||
|
||||
return exists
|
||||
|
||||
def rotate_backup(self):
|
||||
"""delete old backups if needed"""
|
||||
try:
|
||||
task = CustomPeriodicTask.objects.get(name="run_backup")
|
||||
except CustomPeriodicTask.DoesNotExist:
|
||||
return
|
||||
|
||||
rotate = task.task_config.get("rotate")
|
||||
if not rotate:
|
||||
return
|
||||
|
||||
all_backup_files = self.get_all_backup_files()
|
||||
auto = [i for i in all_backup_files if i["reason"] == "auto"]
|
||||
|
||||
if len(auto) <= rotate:
|
||||
print("no backup files to rotate")
|
||||
return
|
||||
|
||||
all_to_delete = auto[rotate:]
|
||||
for to_delete in all_to_delete:
|
||||
self.delete_file(to_delete["filename"])
|
||||
|
||||
def delete_file(self, filename):
|
||||
"""delete backup file"""
|
||||
file_path = os.path.join(self.BACKUP_DIR, filename)
|
||||
if not os.path.exists(file_path):
|
||||
print(f"backup file not found: {filename}")
|
||||
return False
|
||||
|
||||
print(f"remove old backup file: {file_path}")
|
||||
os.remove(file_path)
|
||||
|
||||
return file_path
|
||||
|
||||
|
||||
class BackupCallback:
|
||||
"""handle backup ndjson writer as callback for IndexPaginate"""
|
||||
|
||||
def __init__(self, source, index_name, counter=0):
|
||||
self.source = source
|
||||
self.index_name = index_name
|
||||
self.counter = counter
|
||||
self.timestamp = datetime.now().strftime("%Y%m%d")
|
||||
self.cache_dir = EnvironmentSettings.CACHE_DIR
|
||||
|
||||
def run(self):
|
||||
"""run the junk task"""
|
||||
file_content = self._build_bulk()
|
||||
self._write_es_json(file_content)
|
||||
|
||||
def _build_bulk(self):
|
||||
"""build bulk query data from all_results"""
|
||||
bulk_list = []
|
||||
|
||||
for document in self.source:
|
||||
document_id = document["_id"]
|
||||
es_index = document["_index"]
|
||||
action = {"index": {"_index": es_index, "_id": document_id}}
|
||||
source = document["_source"]
|
||||
bulk_list.append(json.dumps(action))
|
||||
bulk_list.append(json.dumps(source))
|
||||
|
||||
# add last newline
|
||||
bulk_list.append("\n")
|
||||
file_content = "\n".join(bulk_list)
|
||||
|
||||
return file_content
|
||||
|
||||
def _write_es_json(self, file_content):
|
||||
"""write nd-json file for es _bulk API to disk"""
|
||||
index = self.index_name.lstrip("ta_")
|
||||
file_name = f"es_{index}-{self.timestamp}-{self.counter}.json"
|
||||
file_path = os.path.join(self.cache_dir, "backup", file_name)
|
||||
with open(file_path, "a+", encoding="utf-8") as f:
|
||||
f.write(file_content)
|
@ -1,252 +0,0 @@
|
||||
"""
|
||||
Functionality:
|
||||
- read and write config
|
||||
- load config variables into redis
|
||||
"""
|
||||
|
||||
from random import randint
|
||||
from time import sleep
|
||||
from typing import Literal, TypedDict
|
||||
|
||||
import requests
|
||||
from appsettings.src.snapshot import ElasticSnapshot
|
||||
from common.src.es_connect import ElasticWrap
|
||||
from common.src.ta_redis import RedisArchivist
|
||||
from django.conf import settings
|
||||
|
||||
|
||||
class SubscriptionsConfigType(TypedDict):
|
||||
"""describes subscriptions config"""
|
||||
|
||||
channel_size: int
|
||||
live_channel_size: int
|
||||
shorts_channel_size: int
|
||||
auto_start: bool
|
||||
|
||||
|
||||
class DownloadsConfigType(TypedDict):
|
||||
"""describes downloads config"""
|
||||
|
||||
limit_speed: int | None
|
||||
sleep_interval: int | None
|
||||
autodelete_days: int | None
|
||||
format: str | None
|
||||
format_sort: str | None
|
||||
add_metadata: bool
|
||||
add_thumbnail: bool
|
||||
subtitle: str | None
|
||||
subtitle_source: Literal["user", "auto"] | None
|
||||
subtitle_index: bool
|
||||
comment_max: str | None
|
||||
comment_sort: Literal["top", "new"] | None
|
||||
cookie_import: bool
|
||||
potoken: bool
|
||||
throttledratelimit: int | None
|
||||
extractor_lang: str | None
|
||||
integrate_ryd: bool
|
||||
integrate_sponsorblock: bool
|
||||
|
||||
|
||||
class ApplicationConfigType(TypedDict):
|
||||
"""describes application config"""
|
||||
|
||||
enable_snapshot: bool
|
||||
enable_cast: bool
|
||||
|
||||
|
||||
class AppConfigType(TypedDict):
|
||||
"""combined app config type"""
|
||||
|
||||
subscriptions: SubscriptionsConfigType
|
||||
downloads: DownloadsConfigType
|
||||
application: ApplicationConfigType
|
||||
|
||||
|
||||
class AppConfig:
|
||||
"""handle application variables"""
|
||||
|
||||
ES_PATH = "ta_config/_doc/appsettings"
|
||||
ES_UPDATE_PATH = "ta_config/_update/appsettings"
|
||||
CONFIG_DEFAULTS: AppConfigType = {
|
||||
"subscriptions": {
|
||||
"channel_size": 50,
|
||||
"live_channel_size": 50,
|
||||
"shorts_channel_size": 50,
|
||||
"auto_start": False,
|
||||
},
|
||||
"downloads": {
|
||||
"limit_speed": None,
|
||||
"sleep_interval": 10,
|
||||
"autodelete_days": None,
|
||||
"format": None,
|
||||
"format_sort": None,
|
||||
"add_metadata": False,
|
||||
"add_thumbnail": False,
|
||||
"subtitle": None,
|
||||
"subtitle_source": None,
|
||||
"subtitle_index": False,
|
||||
"comment_max": None,
|
||||
"comment_sort": "top",
|
||||
"cookie_import": False,
|
||||
"potoken": False,
|
||||
"throttledratelimit": None,
|
||||
"extractor_lang": None,
|
||||
"integrate_ryd": False,
|
||||
"integrate_sponsorblock": False,
|
||||
},
|
||||
"application": {
|
||||
"enable_snapshot": True,
|
||||
"enable_cast": False,
|
||||
},
|
||||
}
|
||||
|
||||
def __init__(self):
|
||||
self.config = self.get_config()
|
||||
|
||||
def get_config(self) -> AppConfigType:
|
||||
"""get config from ES"""
|
||||
response, status_code = ElasticWrap(self.ES_PATH).get()
|
||||
if not status_code == 200:
|
||||
raise ValueError(f"no config found at {self.ES_PATH}")
|
||||
|
||||
return response["_source"]
|
||||
|
||||
def update_config(self, data: dict) -> AppConfigType:
|
||||
"""update single config value"""
|
||||
new_config = self.config.copy()
|
||||
for key, value in data.items():
|
||||
if (
|
||||
isinstance(value, dict)
|
||||
and key in new_config
|
||||
and isinstance(new_config[key], dict)
|
||||
):
|
||||
new_config[key].update(value)
|
||||
else:
|
||||
new_config[key] = value
|
||||
|
||||
response, status_code = ElasticWrap(self.ES_PATH).post(new_config)
|
||||
if not status_code == 200:
|
||||
print(response)
|
||||
|
||||
self.config = new_config
|
||||
|
||||
return new_config
|
||||
|
||||
def post_process_updated(self, data: dict) -> None:
|
||||
"""apply hooks for some config keys"""
|
||||
for config_value, updated_value in data:
|
||||
if config_value == "application.enable_snapshot" and updated_value:
|
||||
ElasticSnapshot().setup()
|
||||
|
||||
@staticmethod
|
||||
def _fail_message(message_line):
|
||||
"""notify our failure"""
|
||||
key = "message:setting"
|
||||
message = {
|
||||
"status": key,
|
||||
"group": "setting:application",
|
||||
"level": "error",
|
||||
"title": "Cookie import failed",
|
||||
"messages": [message_line],
|
||||
"id": "0000",
|
||||
}
|
||||
RedisArchivist().set_message(key, message=message, expire=True)
|
||||
|
||||
def sync_defaults(self):
|
||||
"""sync defaults at startup, needs to be called with __new__"""
|
||||
return ElasticWrap(self.ES_PATH).post(self.CONFIG_DEFAULTS)
|
||||
|
||||
def add_new_defaults(self) -> list[str]:
|
||||
"""add new default config values to ES, called at startup"""
|
||||
updated = []
|
||||
for key, value in self.CONFIG_DEFAULTS.items():
|
||||
if key not in self.config:
|
||||
# complete new key
|
||||
self.update_config({key: value})
|
||||
updated.append(str({key: value}))
|
||||
continue
|
||||
|
||||
for sub_key, sub_value in value.items(): # type: ignore
|
||||
if sub_key not in self.config[key]:
|
||||
# new partial key
|
||||
to_update = {key: {sub_key: sub_value}}
|
||||
self.update_config(to_update)
|
||||
updated.append(str(to_update))
|
||||
|
||||
return updated
|
||||
|
||||
|
||||
class ReleaseVersion:
|
||||
"""compare local version with remote version"""
|
||||
|
||||
REMOTE_URL = "https://www.tubearchivist.com/api/release/latest/"
|
||||
NEW_KEY = "versioncheck:new"
|
||||
|
||||
def __init__(self) -> None:
|
||||
self.local_version: str = settings.TA_VERSION
|
||||
self.is_unstable: bool = settings.TA_VERSION.endswith("-unstable")
|
||||
self.remote_version: str = ""
|
||||
self.is_breaking: bool = False
|
||||
|
||||
def check(self) -> None:
|
||||
"""check version"""
|
||||
print(f"[{self.local_version}]: look for updates")
|
||||
self.get_remote_version()
|
||||
new_version = self._has_update()
|
||||
if new_version:
|
||||
message = {
|
||||
"status": True,
|
||||
"version": new_version,
|
||||
"is_breaking": self.is_breaking,
|
||||
}
|
||||
RedisArchivist().set_message(self.NEW_KEY, message)
|
||||
print(f"[{self.local_version}]: found new version {new_version}")
|
||||
|
||||
def get_local_version(self) -> str:
|
||||
"""read version from local"""
|
||||
return self.local_version
|
||||
|
||||
def get_remote_version(self) -> None:
|
||||
"""read version from remote"""
|
||||
sleep(randint(0, 60))
|
||||
response = requests.get(self.REMOTE_URL, timeout=20).json()
|
||||
self.remote_version = response["release_version"]
|
||||
self.is_breaking = response["breaking_changes"]
|
||||
|
||||
def _has_update(self) -> str | bool:
|
||||
"""check if there is an update"""
|
||||
remote_parsed = self._parse_version(self.remote_version)
|
||||
local_parsed = self._parse_version(self.local_version)
|
||||
if remote_parsed > local_parsed:
|
||||
return self.remote_version
|
||||
|
||||
if self.is_unstable and local_parsed == remote_parsed:
|
||||
return self.remote_version
|
||||
|
||||
return False
|
||||
|
||||
@staticmethod
|
||||
def _parse_version(version) -> tuple[int, ...]:
|
||||
"""return version parts"""
|
||||
clean = version.rstrip("-unstable").lstrip("v")
|
||||
return tuple((int(i) for i in clean.split(".")))
|
||||
|
||||
def is_updated(self) -> str | bool:
|
||||
"""check if update happened in the mean time"""
|
||||
message = self.get_update()
|
||||
if not message:
|
||||
return False
|
||||
|
||||
local_parsed = self._parse_version(self.local_version)
|
||||
message_parsed = self._parse_version(message.get("version"))
|
||||
|
||||
if local_parsed >= message_parsed:
|
||||
RedisArchivist().del_message(self.NEW_KEY)
|
||||
return settings.TA_VERSION
|
||||
|
||||
return False
|
||||
|
||||
def get_update(self) -> dict | None:
|
||||
"""return new version dict if available"""
|
||||
message = RedisArchivist().get_message_dict(self.NEW_KEY)
|
||||
return message or None
|
@ -1,93 +0,0 @@
|
||||
"""
|
||||
Functionality:
|
||||
- scan the filesystem to delete or index
|
||||
"""
|
||||
|
||||
import os
|
||||
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.es_connect import IndexPaginate
|
||||
from common.src.helper import ignore_filelist
|
||||
from video.src.comments import CommentList
|
||||
from video.src.index import YoutubeVideo, index_new_video
|
||||
|
||||
|
||||
class Scanner:
|
||||
"""scan index and filesystem"""
|
||||
|
||||
VIDEOS: str = EnvironmentSettings.MEDIA_DIR
|
||||
|
||||
def __init__(self, task=False) -> None:
|
||||
self.task = task
|
||||
self.to_delete: set[str] = set()
|
||||
self.to_index: set[str] = set()
|
||||
|
||||
def scan(self) -> None:
|
||||
"""scan the filesystem"""
|
||||
downloaded: set[str] = self._get_downloaded()
|
||||
indexed: set[str] = self._get_indexed()
|
||||
self.to_index = downloaded - indexed
|
||||
self.to_delete = indexed - downloaded
|
||||
|
||||
def _get_downloaded(self) -> set[str]:
|
||||
"""get downloaded ids"""
|
||||
if self.task:
|
||||
self.task.send_progress(["Scan your filesystem for videos."])
|
||||
|
||||
downloaded: set = set()
|
||||
channels = ignore_filelist(os.listdir(self.VIDEOS))
|
||||
for channel in channels:
|
||||
folder = os.path.join(self.VIDEOS, channel)
|
||||
files = ignore_filelist(os.listdir(folder))
|
||||
downloaded.update({i.split(".")[0] for i in files})
|
||||
|
||||
return downloaded
|
||||
|
||||
def _get_indexed(self) -> set:
|
||||
"""get all indexed ids"""
|
||||
if self.task:
|
||||
self.task.send_progress(["Get all videos indexed."])
|
||||
|
||||
data = {"query": {"match_all": {}}, "_source": ["youtube_id"]}
|
||||
response = IndexPaginate("ta_video", data).get_results()
|
||||
return {i["youtube_id"] for i in response}
|
||||
|
||||
def apply(self) -> None:
|
||||
"""apply all changes"""
|
||||
self.delete()
|
||||
self.index()
|
||||
|
||||
def delete(self) -> None:
|
||||
"""delete videos from index"""
|
||||
if not self.to_delete:
|
||||
print("nothing to delete")
|
||||
return
|
||||
|
||||
if self.task:
|
||||
self.task.send_progress(
|
||||
[f"Remove {len(self.to_delete)} videos from index."]
|
||||
)
|
||||
|
||||
for youtube_id in self.to_delete:
|
||||
YoutubeVideo(youtube_id).delete_media_file()
|
||||
|
||||
def index(self) -> None:
|
||||
"""index new"""
|
||||
if not self.to_index:
|
||||
print("nothing to index")
|
||||
return
|
||||
|
||||
total = len(self.to_index)
|
||||
for idx, youtube_id in enumerate(self.to_index):
|
||||
if self.task:
|
||||
self.task.send_progress(
|
||||
message_lines=[
|
||||
f"Index missing video {youtube_id}, {idx + 1}/{total}"
|
||||
],
|
||||
progress=(idx + 1) / total,
|
||||
)
|
||||
index_new_video(youtube_id)
|
||||
|
||||
comment_list = CommentList(task=self.task)
|
||||
comment_list.add(video_ids=list(self.to_index))
|
||||
comment_list.index()
|
@ -1,220 +0,0 @@
|
||||
"""
|
||||
functionality:
|
||||
- setup elastic index at first start
|
||||
- verify and update index mapping and settings if needed
|
||||
- backup and restore metadata
|
||||
"""
|
||||
|
||||
from appsettings.src.backup import ElasticBackup
|
||||
from appsettings.src.config import AppConfig
|
||||
from appsettings.src.snapshot import ElasticSnapshot
|
||||
from common.src.es_connect import ElasticWrap
|
||||
from common.src.helper import get_mapping
|
||||
|
||||
|
||||
class ElasticIndex:
|
||||
"""interact with a single index"""
|
||||
|
||||
def __init__(self, index_name, expected_map=False, expected_set=False):
|
||||
self.index_name = index_name
|
||||
self.expected_map = expected_map
|
||||
self.expected_set = expected_set
|
||||
self.exists, self.details = self.index_exists()
|
||||
|
||||
def index_exists(self):
|
||||
"""check if index already exists and return mapping if it does"""
|
||||
response, status_code = ElasticWrap(f"ta_{self.index_name}").get()
|
||||
exists = status_code == 200
|
||||
details = response.get(f"ta_{self.index_name}", False)
|
||||
|
||||
return exists, details
|
||||
|
||||
def validate(self):
|
||||
"""
|
||||
check if all expected mappings and settings match
|
||||
returns True when rebuild is needed
|
||||
"""
|
||||
|
||||
if self.expected_map:
|
||||
rebuild = self.validate_mappings()
|
||||
if rebuild:
|
||||
return rebuild
|
||||
|
||||
if self.expected_set:
|
||||
rebuild = self.validate_settings()
|
||||
if rebuild:
|
||||
return rebuild
|
||||
|
||||
return False
|
||||
|
||||
def validate_mappings(self):
|
||||
"""check if all mappings are as expected"""
|
||||
now_map = self.details["mappings"]["properties"]
|
||||
|
||||
for key, value in self.expected_map.items():
|
||||
# nested
|
||||
if list(value.keys()) == ["properties"]:
|
||||
for key_n, value_n in value["properties"].items():
|
||||
if key not in now_map:
|
||||
print(f"detected mapping change: {key_n}, {value_n}")
|
||||
return True
|
||||
if key_n not in now_map[key]["properties"].keys():
|
||||
print(f"detected mapping change: {key_n}, {value_n}")
|
||||
return True
|
||||
if not value_n == now_map[key]["properties"][key_n]:
|
||||
print(f"detected mapping change: {key_n}, {value_n}")
|
||||
return True
|
||||
|
||||
continue
|
||||
|
||||
# not nested
|
||||
if key not in now_map.keys():
|
||||
print(f"detected mapping change: {key}, {value}")
|
||||
return True
|
||||
if not value == now_map[key]:
|
||||
print(f"detected mapping change: {key}, {value}")
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
def validate_settings(self):
|
||||
"""check if all settings are as expected"""
|
||||
|
||||
now_set = self.details["settings"]["index"]
|
||||
|
||||
for key, value in self.expected_set.items():
|
||||
if key not in now_set.keys():
|
||||
print(key, value)
|
||||
return True
|
||||
|
||||
if not value == now_set[key]:
|
||||
print(key, value)
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
def rebuild_index(self):
|
||||
"""rebuild with new mapping"""
|
||||
print(f"applying new mappings to index ta_{self.index_name}...")
|
||||
self.create_blank(for_backup=True)
|
||||
self.reindex("backup")
|
||||
self.delete_index(backup=False)
|
||||
self.create_blank()
|
||||
self.reindex("restore")
|
||||
self.delete_index()
|
||||
|
||||
def reindex(self, method):
|
||||
"""create on elastic search"""
|
||||
if method == "backup":
|
||||
source = f"ta_{self.index_name}"
|
||||
destination = f"ta_{self.index_name}_backup"
|
||||
elif method == "restore":
|
||||
source = f"ta_{self.index_name}_backup"
|
||||
destination = f"ta_{self.index_name}"
|
||||
else:
|
||||
raise ValueError("invalid method, expected 'backup' or 'restore'")
|
||||
|
||||
data = {"source": {"index": source}, "dest": {"index": destination}}
|
||||
_, _ = ElasticWrap("_reindex?refresh=true").post(data=data)
|
||||
|
||||
def delete_index(self, backup=True):
|
||||
"""delete index passed as argument"""
|
||||
path = f"ta_{self.index_name}"
|
||||
if backup:
|
||||
path = path + "_backup"
|
||||
|
||||
_, _ = ElasticWrap(path).delete()
|
||||
|
||||
def create_blank(self, for_backup=False):
|
||||
"""apply new mapping and settings for blank new index"""
|
||||
print(f"create new blank index with name ta_{self.index_name}...")
|
||||
path = f"ta_{self.index_name}"
|
||||
if for_backup:
|
||||
path = f"{path}_backup"
|
||||
|
||||
data = {}
|
||||
if self.expected_set:
|
||||
data.update({"settings": self.expected_set})
|
||||
if self.expected_map:
|
||||
data.update({"mappings": {"properties": self.expected_map}})
|
||||
|
||||
_, _ = ElasticWrap(path).put(data)
|
||||
|
||||
|
||||
class ElasitIndexWrap:
|
||||
"""interact with all index mapping and setup"""
|
||||
|
||||
def __init__(self):
|
||||
self.index_config = get_mapping()
|
||||
self.backup_run = False
|
||||
|
||||
def setup(self):
|
||||
"""setup elastic index, run at startup"""
|
||||
for index in self.index_config:
|
||||
index_name, expected_map, expected_set = self._config_split(index)
|
||||
handler = ElasticIndex(index_name, expected_map, expected_set)
|
||||
if not handler.exists:
|
||||
handler.create_blank()
|
||||
continue
|
||||
|
||||
rebuild = handler.validate()
|
||||
if rebuild:
|
||||
self._check_backup()
|
||||
handler.rebuild_index()
|
||||
continue
|
||||
|
||||
# else all good
|
||||
print(f"ta_{index_name} index is created and up to date...")
|
||||
|
||||
def reset(self):
|
||||
"""reset all indexes to blank"""
|
||||
self.delete_all()
|
||||
self.create_all_blank()
|
||||
|
||||
def delete_all(self):
|
||||
"""delete all indexes"""
|
||||
print("reset elastic index")
|
||||
for index in self.index_config:
|
||||
index_name, _, _ = self._config_split(index)
|
||||
handler = ElasticIndex(index_name)
|
||||
handler.delete_index(backup=False)
|
||||
|
||||
def create_all_blank(self):
|
||||
"""create all blank indexes"""
|
||||
print("create all new indexes in elastic from template")
|
||||
for index in self.index_config:
|
||||
index_name, expected_map, expected_set = self._config_split(index)
|
||||
handler = ElasticIndex(index_name, expected_map, expected_set)
|
||||
handler.create_blank()
|
||||
|
||||
@staticmethod
|
||||
def _config_split(index):
|
||||
"""split index config keys"""
|
||||
index_name = index["index_name"]
|
||||
expected_map = index["expected_map"]
|
||||
expected_set = index["expected_set"]
|
||||
|
||||
return index_name, expected_map, expected_set
|
||||
|
||||
def _check_backup(self):
|
||||
"""create backup if needed"""
|
||||
if self.backup_run:
|
||||
return
|
||||
|
||||
try:
|
||||
config = AppConfig().config
|
||||
except ValueError:
|
||||
# create defaults in ES if config not found
|
||||
print("AppConfig not found, creating defaults...")
|
||||
handler = AppConfig.__new__(AppConfig)
|
||||
handler.sync_defaults()
|
||||
config = AppConfig.CONFIG_DEFAULTS
|
||||
|
||||
if config["application"]["enable_snapshot"]:
|
||||
# take snapshot if enabled
|
||||
ElasticSnapshot().take_snapshot_now(wait=True)
|
||||
else:
|
||||
# fallback to json backup
|
||||
ElasticBackup(reason="update").backup_all_indexes()
|
||||
|
||||
self.backup_run = True
|
@@ -1,563 +0,0 @@
|
||||
"""
|
||||
functionality:
|
||||
- periodically refresh documents
|
||||
- index and update in es
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
from datetime import datetime
|
||||
from typing import Callable, TypedDict
|
||||
|
||||
from appsettings.src.config import AppConfig
|
||||
from channel.src.index import YoutubeChannel
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.es_connect import ElasticWrap, IndexPaginate
|
||||
from common.src.helper import rand_sleep
|
||||
from common.src.ta_redis import RedisQueue
|
||||
from download.src.subscriptions import ChannelSubscription
|
||||
from download.src.thumbnails import ThumbManager
|
||||
from download.src.yt_dlp_base import CookieHandler
|
||||
from playlist.src.index import YoutubePlaylist
|
||||
from task.models import CustomPeriodicTask
|
||||
from video.src.comments import Comments
|
||||
from video.src.index import YoutubeVideo
|
||||
|
||||
|
||||
class ReindexConfigType(TypedDict):
|
||||
"""represents config type"""
|
||||
|
||||
index_name: str
|
||||
queue_name: str
|
||||
active_key: str
|
||||
refresh_key: str
|
||||
|
||||
|
||||
class ReindexBase:
|
||||
"""base config class for reindex task"""
|
||||
|
||||
REINDEX_CONFIG: dict[str, ReindexConfigType] = {
|
||||
"video": {
|
||||
"index_name": "ta_video",
|
||||
"queue_name": "reindex:ta_video",
|
||||
"active_key": "active",
|
||||
"refresh_key": "vid_last_refresh",
|
||||
},
|
||||
"channel": {
|
||||
"index_name": "ta_channel",
|
||||
"queue_name": "reindex:ta_channel",
|
||||
"active_key": "channel_active",
|
||||
"refresh_key": "channel_last_refresh",
|
||||
},
|
||||
"playlist": {
|
||||
"index_name": "ta_playlist",
|
||||
"queue_name": "reindex:ta_playlist",
|
||||
"active_key": "playlist_active",
|
||||
"refresh_key": "playlist_last_refresh",
|
||||
},
|
||||
}
|
||||
|
||||
MULTIPLY = 1.2
|
||||
DAYS3 = 60 * 60 * 24 * 3
|
||||
|
||||
def __init__(self):
|
||||
self.config = AppConfig().config
|
||||
self.now = int(datetime.now().timestamp())
|
||||
|
||||
def populate(self, all_ids, reindex_config: ReindexConfigType):
|
||||
"""add all to reindex ids to redis queue"""
|
||||
if not all_ids:
|
||||
return
|
||||
|
||||
RedisQueue(queue_name=reindex_config["queue_name"]).add_list(all_ids)
|
||||
|
||||
|
||||
class ReindexPopulate(ReindexBase):
|
||||
"""add outdated and recent documents to reindex queue"""
|
||||
|
||||
INTERVAL_DEFAULT: int = 90
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.interval = self.INTERVAL_DEFAULT
|
||||
|
||||
def get_interval(self) -> None:
|
||||
"""get reindex days interval from task"""
|
||||
try:
|
||||
task = CustomPeriodicTask.objects.get(name="check_reindex")
|
||||
except CustomPeriodicTask.DoesNotExist:
|
||||
return
|
||||
|
||||
task_config = task.task_config
|
||||
if task_config.get("days"):
|
||||
self.interval = task_config.get("days")
|
||||
|
||||
def add_recent(self) -> None:
|
||||
"""add recent videos to refresh"""
|
||||
gte = datetime.fromtimestamp(self.now - self.DAYS3).date().isoformat()
|
||||
must_list = [
|
||||
{"term": {"active": {"value": True}}},
|
||||
{"range": {"published": {"gte": gte}}},
|
||||
]
|
||||
data = {
|
||||
"size": 10000,
|
||||
"query": {"bool": {"must": must_list}},
|
||||
"sort": [{"published": {"order": "desc"}}],
|
||||
}
|
||||
response, _ = ElasticWrap("ta_video/_search").get(data=data)
|
||||
hits = response["hits"]["hits"]
|
||||
if not hits:
|
||||
return
|
||||
|
||||
all_ids = [i["_source"]["youtube_id"] for i in hits]
|
||||
reindex_config: ReindexConfigType = self.REINDEX_CONFIG["video"]
|
||||
self.populate(all_ids, reindex_config)
|
||||
|
||||
def add_outdated(self) -> None:
|
||||
"""add outdated documents"""
|
||||
for reindex_config in self.REINDEX_CONFIG.values():
|
||||
total_hits = self._get_total_hits(reindex_config)
|
||||
daily_should = self._get_daily_should(total_hits)
|
||||
all_ids = self._get_outdated_ids(reindex_config, daily_should)
|
||||
self.populate(all_ids, reindex_config)
|
||||
|
||||
@staticmethod
|
||||
def _get_total_hits(reindex_config: ReindexConfigType) -> int:
|
||||
"""get total hits from index"""
|
||||
index_name = reindex_config["index_name"]
|
||||
active_key = reindex_config["active_key"]
|
||||
data = {
|
||||
"query": {"term": {active_key: {"value": True}}},
|
||||
"_source": False,
|
||||
}
|
||||
total = IndexPaginate(index_name, data, keep_source=True).get_results()
|
||||
|
||||
return len(total)
|
||||
|
||||
def _get_daily_should(self, total_hits: int) -> int:
|
||||
"""calc how many should reindex daily"""
|
||||
daily_should = int((total_hits // self.interval + 1) * self.MULTIPLY)
|
||||
if daily_should >= 10000:
|
||||
daily_should = 9999
|
||||
|
||||
return daily_should
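# Worked example with illustrative numbers: 900 active documents and the
# default 90 day interval give
#   int((900 // 90 + 1) * 1.2) = int(11 * 1.2) = 13
# documents per run; results of 10000 or more are capped to 9999, presumably
# to stay below the default ES search result window.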
|
||||
|
||||
def _get_outdated_ids(
|
||||
self, reindex_config: ReindexConfigType, daily_should: int
|
||||
) -> list[str]:
|
||||
"""get outdated from index_name"""
|
||||
index_name = reindex_config["index_name"]
|
||||
refresh_key = reindex_config["refresh_key"]
|
||||
now_lte = str(self.now - self.interval * 24 * 60 * 60)
|
||||
must_list = [
|
||||
{"match": {reindex_config["active_key"]: True}},
|
||||
{"range": {refresh_key: {"lte": now_lte}}},
|
||||
]
|
||||
data = {
|
||||
"size": daily_should,
|
||||
"query": {"bool": {"must": must_list}},
|
||||
"sort": [{refresh_key: {"order": "asc"}}],
|
||||
"_source": False,
|
||||
}
|
||||
response, _ = ElasticWrap(f"{index_name}/_search").get(data=data)
|
||||
|
||||
all_ids = [i["_id"] for i in response["hits"]["hits"]]
|
||||
return all_ids
|
||||
|
||||
|
||||
class ReindexManual(ReindexBase):
|
||||
"""
|
||||
manually add ids to reindex queue from API
|
||||
data_example = {
|
||||
"video": ["video1", "video2", "video3"],
|
||||
"channel": ["channel1", "channel2", "channel3"],
|
||||
"playlist": ["playlist1", "playlist2"],
|
||||
}
|
||||
extract_videos to also reindex all videos of channel/playlist
|
||||
"""
|
||||
|
||||
def __init__(self, extract_videos=False):
|
||||
super().__init__()
|
||||
self.extract_videos = extract_videos
|
||||
self.data = False
|
||||
|
||||
def extract_data(self, data) -> None:
|
||||
"""process data"""
|
||||
self.data = data
|
||||
for key, values in self.data.items():
|
||||
reindex_config = self.REINDEX_CONFIG.get(key)
|
||||
if not reindex_config:
|
||||
print(f"reindex type {key} not valid")
|
||||
raise ValueError
|
||||
|
||||
self.process_index(reindex_config, values)
|
||||
|
||||
def process_index(
|
||||
self, index_config: ReindexConfigType, values: list[str]
|
||||
) -> None:
|
||||
"""process values per index"""
|
||||
index_name = index_config["index_name"]
|
||||
if index_name == "ta_video":
|
||||
self._add_videos(values)
|
||||
elif index_name == "ta_channel":
|
||||
self._add_channels(values)
|
||||
elif index_name == "ta_playlist":
|
||||
self._add_playlists(values)
|
||||
|
||||
def _add_videos(self, values: list[str]) -> None:
|
||||
"""add list of videos to reindex queue"""
|
||||
if not values:
|
||||
return
|
||||
|
||||
queue_name = self.REINDEX_CONFIG["video"]["queue_name"]
|
||||
RedisQueue(queue_name).add_list(values)
|
||||
|
||||
def _add_channels(self, values: list[str]) -> None:
|
||||
"""add list of channels to reindex queue"""
|
||||
queue_name = self.REINDEX_CONFIG["channel"]["queue_name"]
|
||||
RedisQueue(queue_name).add_list(values)
|
||||
|
||||
if self.extract_videos:
|
||||
for channel_id in values:
|
||||
all_videos = self._get_channel_videos(channel_id)
|
||||
self._add_videos(all_videos)
|
||||
|
||||
def _add_playlists(self, values: list[str]) -> None:
|
||||
"""add list of playlists to reindex queue"""
|
||||
queue_name = self.REINDEX_CONFIG["playlist"]["queue_name"]
|
||||
RedisQueue(queue_name).add_list(values)
|
||||
|
||||
if self.extract_videos:
|
||||
for playlist_id in values:
|
||||
all_videos = self._get_playlist_videos(playlist_id)
|
||||
self._add_videos(all_videos)
|
||||
|
||||
def _get_channel_videos(self, channel_id: str) -> list[str]:
|
||||
"""get all videos from channel"""
|
||||
data = {
|
||||
"query": {"term": {"channel.channel_id": {"value": channel_id}}},
|
||||
"_source": ["youtube_id"],
|
||||
}
|
||||
all_results = IndexPaginate("ta_video", data).get_results()
|
||||
return [i["youtube_id"] for i in all_results]
|
||||
|
||||
def _get_playlist_videos(self, playlist_id: str) -> list[str]:
|
||||
"""get all videos from playlist"""
|
||||
data = {
|
||||
"query": {"term": {"playlist.keyword": {"value": playlist_id}}},
|
||||
"_source": ["youtube_id"],
|
||||
}
|
||||
all_results = IndexPaginate("ta_video", data).get_results()
|
||||
return [i["youtube_id"] for i in all_results]
|
||||
|
||||
|
||||
class Reindex(ReindexBase):
|
||||
"""reindex all documents from redis queue"""
|
||||
|
||||
def __init__(self, task=False):
|
||||
super().__init__()
|
||||
self.task = task
|
||||
self.processed = {
|
||||
"videos": 0,
|
||||
"channels": 0,
|
||||
"playlists": 0,
|
||||
}
|
||||
|
||||
def reindex_all(self) -> None:
|
||||
"""reindex all in queue"""
|
||||
if not self.cookie_is_valid():
|
||||
print("[reindex] cookie invalid, exiting...")
|
||||
return
|
||||
|
||||
for name, index_config in self.REINDEX_CONFIG.items():
|
||||
if not RedisQueue(index_config["queue_name"]).length():
|
||||
continue
|
||||
|
||||
self.reindex_type(name, index_config)
|
||||
|
||||
def reindex_type(self, name: str, index_config: ReindexConfigType) -> None:
|
||||
"""reindex all of a single index"""
|
||||
reindex = self._get_reindex_map(index_config["index_name"])
|
||||
queue = RedisQueue(index_config["queue_name"])
|
||||
while True:
|
||||
total = queue.max_score()
|
||||
youtube_id, idx = queue.get_next()
|
||||
if not youtube_id or not idx or not total:
|
||||
break
|
||||
|
||||
if self.task:
|
||||
self._notify(name, total, idx)
|
||||
|
||||
reindex(youtube_id)
|
||||
rand_sleep(self.config)
|
||||
|
||||
def _get_reindex_map(self, index_name: str) -> Callable:
|
||||
"""return def to run for index"""
|
||||
def_map = {
|
||||
"ta_video": self._reindex_single_video,
|
||||
"ta_channel": self._reindex_single_channel,
|
||||
"ta_playlist": self._reindex_single_playlist,
|
||||
}
|
||||
|
||||
return def_map[index_name]
|
||||
|
||||
def _notify(self, name: str, total: int, idx: int) -> None:
|
||||
"""send notification back to task"""
|
||||
message = [f"Reindexing {name.title()}s {idx}/{total}"]
|
||||
progress = idx / total
|
||||
self.task.send_progress(message, progress=progress)
|
||||
|
||||
def _reindex_single_video(self, youtube_id: str) -> None:
|
||||
"""refresh data for single video"""
|
||||
video = YoutubeVideo(youtube_id)
|
||||
|
||||
# read current state
|
||||
video.get_from_es()
|
||||
if not video.json_data:
|
||||
return
|
||||
|
||||
es_meta = video.json_data.copy()
|
||||
|
||||
# get new
|
||||
media_url = os.path.join(
|
||||
EnvironmentSettings.MEDIA_DIR, es_meta["media_url"]
|
||||
)
|
||||
video.build_json(media_path=media_url)
|
||||
if not video.youtube_meta:
|
||||
video.deactivate()
|
||||
return
|
||||
|
||||
video.delete_subtitles(subtitles=es_meta.get("subtitles"))
|
||||
video.check_subtitles()
|
||||
|
||||
# add back
|
||||
video.json_data["player"] = es_meta.get("player")
|
||||
video.json_data["date_downloaded"] = es_meta.get("date_downloaded")
|
||||
video.json_data["vid_type"] = es_meta.get("vid_type")
|
||||
video.json_data["channel"] = es_meta.get("channel")
|
||||
if es_meta.get("playlist"):
|
||||
video.json_data["playlist"] = es_meta.get("playlist")
|
||||
|
||||
video.upload_to_es()
|
||||
|
||||
thumb_handler = ThumbManager(youtube_id)
|
||||
thumb_handler.delete_video_thumb()
|
||||
thumb_handler.download_video_thumb(video.json_data["vid_thumb_url"])
|
||||
|
||||
Comments(youtube_id, config=self.config).reindex_comments()
|
||||
self.processed["videos"] += 1
|
||||
|
||||
def _reindex_single_channel(self, channel_id: str) -> None:
|
||||
"""refresh channel data and sync to videos"""
|
||||
# read current state
|
||||
channel = YoutubeChannel(channel_id)
|
||||
channel.get_from_es()
|
||||
if not channel.json_data:
|
||||
return
|
||||
|
||||
es_meta = channel.json_data.copy()
|
||||
|
||||
# get new
|
||||
channel.get_from_youtube()
|
||||
if not channel.youtube_meta:
|
||||
channel.deactivate()
|
||||
channel.get_from_es()
|
||||
channel.sync_to_videos()
|
||||
return
|
||||
|
||||
channel.process_youtube_meta()
|
||||
channel.get_channel_art()
|
||||
|
||||
# add back
|
||||
channel.json_data["channel_subscribed"] = es_meta["channel_subscribed"]
|
||||
overwrites = es_meta.get("channel_overwrites")
|
||||
if overwrites:
|
||||
channel.json_data["channel_overwrites"] = overwrites
|
||||
|
||||
channel.upload_to_es()
|
||||
channel.sync_to_videos()
|
||||
ChannelFullScan(channel_id).scan()
|
||||
self.processed["channels"] += 1
|
||||
|
||||
def _reindex_single_playlist(self, playlist_id: str) -> None:
|
||||
"""refresh playlist data"""
|
||||
playlist = YoutubePlaylist(playlist_id)
|
||||
playlist.get_from_es()
|
||||
if (
|
||||
not playlist.json_data
|
||||
or playlist.json_data["playlist_type"] == "custom"
|
||||
):
|
||||
return
|
||||
|
||||
is_active = playlist.update_playlist()
|
||||
if not is_active:
|
||||
playlist.deactivate()
|
||||
return
|
||||
|
||||
self.processed["playlists"] += 1
|
||||
|
||||
def cookie_is_valid(self) -> bool:
|
||||
"""return true if cookie is enabled and valid"""
|
||||
if not self.config["downloads"]["cookie_import"]:
|
||||
# is not activated, continue reindex
|
||||
return True
|
||||
|
||||
valid = CookieHandler(self.config).validate()
|
||||
return valid
|
||||
|
||||
def build_message(self) -> str:
|
||||
"""build progress message"""
|
||||
message = ""
|
||||
for key, value in self.processed.items():
|
||||
if value:
|
||||
message = message + f"{value} {key}, "
|
||||
|
||||
if message:
|
||||
message = f"reindexed {message.rstrip(', ')}"
|
||||
|
||||
return message
|
||||
|
||||
|
||||
class ReindexProgress(ReindexBase):
|
||||
"""
|
||||
get progress of reindex task
|
||||
request_type: key of self.REINDEX_CONFIG
|
||||
request_id: id of request_type
|
||||
return = {
|
||||
"state": "running" | "queued" | False
|
||||
"total_queued": int
|
||||
"in_queue_name": "queue_name"
|
||||
}
|
||||
"""
|
||||
|
||||
def __init__(self, request_type=False, request_id=False):
|
||||
super().__init__()
|
||||
self.request_type = request_type
|
||||
self.request_id = request_id
|
||||
|
||||
def get_progress(self) -> dict:
|
||||
"""get progress from task"""
|
||||
queue_name, request_type = self._get_queue_name()
|
||||
total = self._get_total_in_queue(queue_name)
|
||||
|
||||
progress = {
|
||||
"total_queued": total,
|
||||
"type": request_type,
|
||||
}
|
||||
state = self._get_state(total, queue_name)
|
||||
progress.update(state)
|
||||
|
||||
return progress
|
||||
|
||||
def _get_queue_name(self):
|
||||
"""return queue_name, queue_type, raise exception on error"""
|
||||
if not self.request_type:
|
||||
return "all", "all"
|
||||
|
||||
reindex_config = self.REINDEX_CONFIG.get(self.request_type)
|
||||
if not reindex_config:
|
||||
print(f"reindex_config not found: {self.request_type}")
|
||||
raise ValueError
|
||||
|
||||
return reindex_config["queue_name"], self.request_type
|
||||
|
||||
def _get_total_in_queue(self, queue_name):
|
||||
"""get all items in queue"""
|
||||
total = 0
|
||||
if queue_name == "all":
|
||||
queues = [i["queue_name"] for i in self.REINDEX_CONFIG.values()]
|
||||
for queue in queues:
|
||||
total += len(RedisQueue(queue).get_all())
|
||||
else:
|
||||
total += len(RedisQueue(queue_name).get_all())
|
||||
|
||||
return total
|
||||
|
||||
def _get_state(self, total, queue_name):
|
||||
"""get state based on request_id"""
|
||||
state_dict = {}
|
||||
if self.request_id:
|
||||
state = RedisQueue(queue_name).in_queue(self.request_id)
|
||||
state_dict.update({"id": self.request_id, "state": state})
|
||||
|
||||
return state_dict
|
||||
|
||||
if total:
|
||||
state = "running"
|
||||
else:
|
||||
state = "empty"
|
||||
|
||||
state_dict.update({"state": state})
|
||||
|
||||
return state_dict
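# Hedged usage sketch (id is a placeholder) for the progress helper above:
#
#   ReindexProgress(request_type="video", request_id="abc123").get_progress()
#   # -> {"total_queued": ..., "type": "video", "id": "abc123", "state": ...}
#
# Without request_type it aggregates across all queues and reports type "all".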
|
||||
|
||||
|
||||
class ChannelFullScan:
|
||||
"""
|
||||
update from v0.3.0 to v0.3.1
|
||||
full scan of channel to fix vid_type mismatch
|
||||
"""
|
||||
|
||||
def __init__(self, channel_id):
|
||||
self.channel_id = channel_id
|
||||
self.to_update = False
|
||||
|
||||
def scan(self):
|
||||
"""match local with remote"""
|
||||
print(f"{self.channel_id}: start full scan")
|
||||
all_local_videos = self._get_all_local()
|
||||
all_remote_videos = self._get_all_remote()
|
||||
self.to_update = []
|
||||
for video in all_local_videos:
|
||||
video_id = video["youtube_id"]
|
||||
remote_match = [i for i in all_remote_videos if i[0] == video_id]
|
||||
if not remote_match:
|
||||
print(f"{video_id}: no remote match found")
|
||||
continue
|
||||
|
||||
expected_type = remote_match[0][-1]
|
||||
if video["vid_type"] != expected_type:
|
||||
self.to_update.append(
|
||||
{
|
||||
"video_id": video_id,
|
||||
"vid_type": expected_type,
|
||||
}
|
||||
)
|
||||
|
||||
self.update()
|
||||
|
||||
def _get_all_remote(self):
|
||||
"""get all channel videos"""
|
||||
sub = ChannelSubscription()
|
||||
all_remote_videos = sub.get_last_youtube_videos(
|
||||
self.channel_id, limit=False
|
||||
)
|
||||
|
||||
return all_remote_videos
|
||||
|
||||
def _get_all_local(self):
|
||||
"""get all local indexed channel_videos"""
|
||||
channel = YoutubeChannel(self.channel_id)
|
||||
all_local_videos = channel.get_channel_videos()
|
||||
|
||||
return all_local_videos
|
||||
|
||||
def update(self):
|
||||
"""build bulk query for updates"""
|
||||
if not self.to_update:
|
||||
print(f"{self.channel_id}: nothing to update")
|
||||
return
|
||||
|
||||
print(f"{self.channel_id}: fixing {len(self.to_update)} videos")
|
||||
bulk_list = []
|
||||
for video in self.to_update:
|
||||
action = {
|
||||
"update": {"_id": video.get("video_id"), "_index": "ta_video"}
|
||||
}
|
||||
source = {"doc": {"vid_type": video.get("vid_type")}}
|
||||
bulk_list.append(json.dumps(action))
|
||||
bulk_list.append(json.dumps(source))
|
||||
# add last newline
|
||||
bulk_list.append("\n")
|
||||
data = "\n".join(bulk_list)
|
||||
_, _ = ElasticWrap("_bulk").post(data=data, ndjson=True)
|
@@ -1,285 +0,0 @@
|
||||
"""
|
||||
functionality:
|
||||
- handle snapshots in ES
|
||||
"""
|
||||
|
||||
from datetime import datetime
|
||||
from time import sleep
|
||||
from zoneinfo import ZoneInfo
|
||||
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.es_connect import ElasticWrap
|
||||
from common.src.helper import get_mapping
|
||||
|
||||
|
||||
class ElasticSnapshot:
|
||||
"""interact with snapshots on ES"""
|
||||
|
||||
REPO = "ta_snapshot"
|
||||
REPO_SETTINGS = {
|
||||
"compress": "true",
|
||||
"chunk_size": "1g",
|
||||
"location": EnvironmentSettings.ES_SNAPSHOT_DIR,
|
||||
}
|
||||
POLICY = "ta_daily"
|
||||
|
||||
def __init__(self):
|
||||
self.all_indices = self._get_all_indices()
|
||||
|
||||
def _get_all_indices(self):
|
||||
"""return all indices names managed by TA"""
|
||||
mapping = get_mapping()
|
||||
all_indices = [f"ta_{i['index_name']}" for i in mapping]
|
||||
|
||||
return all_indices
|
||||
|
||||
def setup(self):
|
||||
"""setup the snapshot in ES, create or update if needed"""
|
||||
print("snapshot: run setup")
|
||||
repo_exists = self._check_repo_exists()
|
||||
if not repo_exists:
|
||||
self.create_repo()
|
||||
|
||||
policy_exists = self._check_policy_exists()
|
||||
if not policy_exists:
|
||||
self.create_policy()
|
||||
|
||||
is_outdated = self._needs_startup_snapshot()
|
||||
if is_outdated:
|
||||
_ = self.take_snapshot_now()
|
||||
|
||||
def _check_repo_exists(self):
|
||||
"""check if expected repo already exists"""
|
||||
path = f"_snapshot/{self.REPO}"
|
||||
response, statuscode = ElasticWrap(path).get()
|
||||
if statuscode == 200:
|
||||
print(f"snapshot: repo {self.REPO} already created")
|
||||
matching = response[self.REPO]["settings"] == self.REPO_SETTINGS
|
||||
if not matching:
|
||||
print(f"snapshot: update repo settings {self.REPO_SETTINGS}")
|
||||
|
||||
return matching
|
||||
|
||||
print(f"snapshot: setup repo {self.REPO} config {self.REPO_SETTINGS}")
|
||||
return False
|
||||
|
||||
def create_repo(self):
|
||||
"""create filesystem repo"""
|
||||
path = f"_snapshot/{self.REPO}"
|
||||
data = {
|
||||
"type": "fs",
|
||||
"settings": self.REPO_SETTINGS,
|
||||
}
|
||||
response, statuscode = ElasticWrap(path).post(data=data)
|
||||
if statuscode == 200:
|
||||
print(f"snapshot: repo setup correctly: {response}")
|
||||
|
||||
def _check_policy_exists(self):
|
||||
"""check if snapshot policy is set correctly"""
|
||||
policy = self._get_policy()
|
||||
expected_policy = self._build_policy_data()
|
||||
if not policy:
|
||||
print(f"snapshot: create policy {self.POLICY} {expected_policy}")
|
||||
return False
|
||||
|
||||
if policy["policy"] != expected_policy:
|
||||
print(f"snapshot: update policy settings {expected_policy}")
|
||||
return False
|
||||
|
||||
print("snapshot: policy is set.")
|
||||
return True
|
||||
|
||||
def _get_policy(self):
|
||||
"""get policy from es"""
|
||||
path = f"_slm/policy/{self.POLICY}"
|
||||
response, statuscode = ElasticWrap(path).get()
|
||||
if statuscode != 200:
|
||||
return False
|
||||
|
||||
return response[self.POLICY]
|
||||
|
||||
def create_policy(self):
|
||||
"""create snapshot lifetime policy"""
|
||||
path = f"_slm/policy/{self.POLICY}"
|
||||
data = self._build_policy_data()
|
||||
response, statuscode = ElasticWrap(path).put(data)
|
||||
if statuscode == 200:
|
||||
print(f"snapshot: policy setup correctly: {response}")
|
||||
|
||||
def _build_policy_data(self):
|
||||
"""build policy dict from config"""
|
||||
at_12 = datetime.now().replace(hour=12, minute=0, second=0)
|
||||
hour = at_12.astimezone(ZoneInfo("UTC")).hour
|
||||
|
||||
return {
|
||||
"schedule": f"0 0 {hour} * * ?",
|
||||
"name": f"<{self.POLICY}_>",
|
||||
"repository": self.REPO,
|
||||
"config": {
|
||||
"indices": self.all_indices,
|
||||
"ignore_unavailable": True,
|
||||
"include_global_state": True,
|
||||
},
|
||||
"retention": {
|
||||
"expire_after": "30d",
|
||||
"min_count": 5,
|
||||
"max_count": 50,
|
||||
},
|
||||
}
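# Worked example with an illustrative timezone: if local noon corresponds to
# 10:00 UTC, the policy above gets the cron-style schedule "0 0 10 * * ?",
# i.e. a daily snapshot at 12:00 local time, kept for 30 days with between
# 5 and 50 snapshots retained.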
|
||||
|
||||
def _needs_startup_snapshot(self):
|
||||
"""check if last snapshot is expired"""
|
||||
snap_dicts = self._get_all_snapshots()
|
||||
if not snap_dicts:
|
||||
print("snapshot: create initial snapshot")
|
||||
return True
|
||||
|
||||
last_stamp = snap_dicts[0]["end_stamp"]
|
||||
now = int(datetime.now().timestamp())
|
||||
outdated = (now - last_stamp) / 60 / 60 > 24
|
||||
if outdated:
|
||||
print("snapshot: is outdated, create new now")
|
||||
|
||||
print("snapshot: last snapshot is up-to-date")
|
||||
return outdated
|
||||
|
||||
def take_snapshot_now(self, wait=False):
|
||||
"""execute daily snapshot now"""
|
||||
path = f"_slm/policy/{self.POLICY}/_execute"
|
||||
response, statuscode = ElasticWrap(path).post()
|
||||
if statuscode == 200:
|
||||
print(f"snapshot: executing now: {response}")
|
||||
|
||||
if wait and "snapshot_name" in response:
|
||||
self._wait_for_snapshot(response["snapshot_name"])
|
||||
|
||||
return response
|
||||
|
||||
def _wait_for_snapshot(self, snapshot_name):
|
||||
"""return after snapshot_name completes"""
|
||||
path = f"_snapshot/{self.REPO}/{snapshot_name}"
|
||||
|
||||
while True:
|
||||
# wait for task to be created
|
||||
sleep(1)
|
||||
_, statuscode = ElasticWrap(path).get()
|
||||
if statuscode == 200:
|
||||
break
|
||||
|
||||
while True:
|
||||
# wait for snapshot success
|
||||
response, statuscode = ElasticWrap(path).get()
|
||||
snapshot_state = response["snapshots"][0]["state"]
|
||||
if snapshot_state == "SUCCESS":
|
||||
break
|
||||
|
||||
print(f"snapshot: {snapshot_name} in state {snapshot_state}")
|
||||
print("snapshot: wait to complete")
|
||||
sleep(5)
|
||||
|
||||
print(f"snapshot: completed - {response}")
|
||||
|
||||
def get_snapshot_stats(self):
|
||||
"""get snapshot info for frontend"""
|
||||
snapshot_info = self._build_policy_details()
|
||||
if snapshot_info:
|
||||
snapshot_info.update({"snapshots": self._get_all_snapshots()})
|
||||
|
||||
return snapshot_info
|
||||
|
||||
def get_single_snapshot(self, snapshot_id):
|
||||
"""get single snapshot metadata"""
|
||||
path = f"_snapshot/{self.REPO}/{snapshot_id}"
|
||||
response, statuscode = ElasticWrap(path).get()
|
||||
if statuscode == 404:
|
||||
print(f"snapshots: not found: {snapshot_id}")
|
||||
return False
|
||||
|
||||
snapshot = response["snapshots"][0]
|
||||
return self._parse_single_snapshot(snapshot)
|
||||
|
||||
def _get_all_snapshots(self):
|
||||
"""get a list of all registered snapshots"""
|
||||
path = f"_snapshot/{self.REPO}/*?sort=start_time&order=desc"
|
||||
response, statuscode = ElasticWrap(path).get()
|
||||
if statuscode == 404:
|
||||
print("snapshots: not configured")
|
||||
return False
|
||||
|
||||
all_snapshots = response["snapshots"]
|
||||
if not all_snapshots:
|
||||
print("snapshots: no snapshots found")
|
||||
return False
|
||||
|
||||
snap_dicts = []
|
||||
for snapshot in all_snapshots:
|
||||
snap_dict = self._parse_single_snapshot(snapshot)
|
||||
snap_dicts.append(snap_dict)
|
||||
|
||||
return snap_dicts
|
||||
|
||||
def _parse_single_snapshot(self, snapshot):
|
||||
"""extract relevant metadata from single snapshot"""
|
||||
snap_dict = {
|
||||
"id": snapshot["snapshot"],
|
||||
"state": snapshot["state"],
|
||||
"es_version": snapshot["version"],
|
||||
"start_date": self._date_converter(snapshot["start_time"]),
|
||||
"end_date": self._date_converter(snapshot["end_time"]),
|
||||
"end_stamp": snapshot["end_time_in_millis"] // 1000,
|
||||
"duration_s": snapshot["duration_in_millis"] // 1000,
|
||||
}
|
||||
return snap_dict
|
||||
|
||||
def _build_policy_details(self):
|
||||
"""get additional policy details"""
|
||||
policy = self._get_policy()
|
||||
if not policy:
|
||||
return False
|
||||
|
||||
next_exec = policy["next_execution_millis"] // 1000
|
||||
next_exec_date = datetime.fromtimestamp(next_exec)
|
||||
next_exec_str = next_exec_date.strftime("%Y-%m-%d %H:%M")
|
||||
expire_after = policy["policy"]["retention"]["expire_after"]
|
||||
policy_metadata = {
|
||||
"next_exec": next_exec,
|
||||
"next_exec_str": next_exec_str,
|
||||
"expire_after": expire_after,
|
||||
}
|
||||
return policy_metadata
|
||||
|
||||
@staticmethod
|
||||
def _date_converter(date_utc):
|
||||
"""convert datetime string"""
|
||||
date = datetime.strptime(date_utc, "%Y-%m-%dT%H:%M:%S.%fZ")
|
||||
utc_date = date.replace(tzinfo=ZoneInfo("UTC"))
|
||||
converted = utc_date.astimezone(ZoneInfo(EnvironmentSettings.TZ))
|
||||
converted_str = converted.strftime("%Y-%m-%d %H:%M")
|
||||
|
||||
return converted_str
|
||||
|
||||
def restore_all(self, snapshot_name):
|
||||
"""restore snapshot by name"""
|
||||
for index in self.all_indices:
|
||||
_, _ = ElasticWrap(index).delete()
|
||||
|
||||
path = f"_snapshot/{self.REPO}/{snapshot_name}/_restore"
|
||||
data = {"indices": "*"}
|
||||
response, statuscode = ElasticWrap(path).post(data=data)
|
||||
if statuscode == 200:
|
||||
print(f"snapshot: executing now: {response}")
|
||||
return response
|
||||
|
||||
print(f"snapshot: failed to restore, {statuscode} {response}")
|
||||
return False
|
||||
|
||||
def delete_single_snapshot(self, snapshot_id):
|
||||
"""delete single snapshot from index"""
|
||||
path = f"_snapshot/{self.REPO}/{snapshot_id}"
|
||||
response, statuscode = ElasticWrap(path).delete()
|
||||
if statuscode == 200:
|
||||
print(f"snapshot: deleting {snapshot_id} {response}")
|
||||
return response
|
||||
|
||||
print(f"snapshot: failed to delete, {statuscode} {response}")
|
||||
return False
|
@@ -1,47 +0,0 @@
|
||||
"""all app settings API urls"""
|
||||
|
||||
from appsettings import views
|
||||
from django.urls import path
|
||||
|
||||
urlpatterns = [
|
||||
path(
|
||||
"config/",
|
||||
views.AppConfigApiView.as_view(),
|
||||
name="api-config",
|
||||
),
|
||||
path(
|
||||
"snapshot/",
|
||||
views.SnapshotApiListView.as_view(),
|
||||
name="api-snapshot-list",
|
||||
),
|
||||
path(
|
||||
"snapshot/<slug:snapshot_id>/",
|
||||
views.SnapshotApiView.as_view(),
|
||||
name="api-snapshot",
|
||||
),
|
||||
path(
|
||||
"backup/",
|
||||
views.BackupApiListView.as_view(),
|
||||
name="api-backup-list",
|
||||
),
|
||||
path(
|
||||
"backup/<str:filename>/",
|
||||
views.BackupApiView.as_view(),
|
||||
name="api-backup",
|
||||
),
|
||||
path(
|
||||
"cookie/",
|
||||
views.CookieView.as_view(),
|
||||
name="api-cookie",
|
||||
),
|
||||
path(
|
||||
"potoken/",
|
||||
views.POTokenView.as_view(),
|
||||
name="api-potoken",
|
||||
),
|
||||
path(
|
||||
"token/",
|
||||
views.TokenView.as_view(),
|
||||
name="api-token",
|
||||
),
|
||||
]
|
@@ -1,493 +0,0 @@
|
||||
"""all app settings API views"""
|
||||
|
||||
from appsettings.serializers import (
|
||||
AppConfigSerializer,
|
||||
BackupFileSerializer,
|
||||
CookieUpdateSerializer,
|
||||
CookieValidationSerializer,
|
||||
PoTokenSerializer,
|
||||
SnapshotCreateResponseSerializer,
|
||||
SnapshotItemSerializer,
|
||||
SnapshotListSerializer,
|
||||
SnapshotRestoreResponseSerializer,
|
||||
TokenResponseSerializer,
|
||||
)
|
||||
from appsettings.src.backup import ElasticBackup
|
||||
from appsettings.src.config import AppConfig
|
||||
from appsettings.src.snapshot import ElasticSnapshot
|
||||
from common.serializers import (
|
||||
AsyncTaskResponseSerializer,
|
||||
ErrorResponseSerializer,
|
||||
)
|
||||
from common.src.ta_redis import RedisArchivist
|
||||
from common.views_base import AdminOnly, AdminWriteOnly, ApiBaseView
|
||||
from django.conf import settings
|
||||
from download.src.yt_dlp_base import CookieHandler, POTokenHandler
|
||||
from drf_spectacular.utils import OpenApiResponse, extend_schema
|
||||
from rest_framework.authtoken.models import Token
|
||||
from rest_framework.response import Response
|
||||
from task.src.task_manager import TaskCommand
|
||||
from task.tasks import run_restore_backup
|
||||
|
||||
|
||||
class BackupApiListView(ApiBaseView):
|
||||
"""resolves to /api/appsettings/backup/
|
||||
GET: returns list of available zip backups
|
||||
POST: take zip backup now
|
||||
"""
|
||||
|
||||
permission_classes = [AdminOnly]
|
||||
task_name = "run_backup"
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(BackupFileSerializer(many=True)),
|
||||
},
|
||||
)
|
||||
def get(request):
|
||||
"""get list of available backup files"""
|
||||
# pylint: disable=unused-argument
|
||||
backup_files = ElasticBackup().get_all_backup_files()
|
||||
serializer = BackupFileSerializer(backup_files, many=True)
|
||||
return Response(serializer.data)
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(AsyncTaskResponseSerializer()),
|
||||
},
|
||||
)
|
||||
def post(self, request):
|
||||
"""start new backup file task"""
|
||||
# pylint: disable=unused-argument
|
||||
response = TaskCommand().start(self.task_name)
|
||||
message = {
|
||||
"message": "backup task started",
|
||||
"task_id": response["task_id"],
|
||||
}
|
||||
serializer = AsyncTaskResponseSerializer(message)
|
||||
|
||||
return Response(serializer.data)
|
||||
|
||||
|
||||
class BackupApiView(ApiBaseView):
|
||||
"""resolves to /api/appsettings/backup/<filename>/
|
||||
GET: return a single backup
|
||||
POST: restore backup
|
||||
DELETE: delete backup
|
||||
"""
|
||||
|
||||
permission_classes = [AdminOnly]
|
||||
task_name = "restore_backup"
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(BackupFileSerializer()),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="file not found"
|
||||
),
|
||||
}
|
||||
)
|
||||
def get(request, filename):
|
||||
"""get single backup"""
|
||||
# pylint: disable=unused-argument
|
||||
backup_file = ElasticBackup().build_backup_file_data(filename)
|
||||
if not backup_file:
|
||||
error = ErrorResponseSerializer({"error": "file not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
serializer = BackupFileSerializer(backup_file)
|
||||
|
||||
return Response(serializer.data)
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(AsyncTaskResponseSerializer()),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="file not found"
|
||||
),
|
||||
}
|
||||
)
|
||||
def post(self, request, filename):
|
||||
"""start new task to restore backup file"""
|
||||
# pylint: disable=unused-argument
|
||||
backup_file = ElasticBackup().build_backup_file_data(filename)
|
||||
if not backup_file:
|
||||
error = ErrorResponseSerializer({"error": "file not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
task = run_restore_backup.delay(filename)
|
||||
message = {
|
||||
"message": "backup restore task started",
|
||||
"filename": filename,
|
||||
"task_id": task.id,
|
||||
}
|
||||
return Response(message)
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
204: OpenApiResponse(description="file deleted"),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="file not found"
|
||||
),
|
||||
}
|
||||
)
|
||||
def delete(request, filename):
|
||||
"""delete backup file"""
|
||||
# pylint: disable=unused-argument
|
||||
|
||||
backup_file = ElasticBackup().delete_file(filename)
|
||||
if not backup_file:
|
||||
error = ErrorResponseSerializer({"error": "file not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
return Response(status=204)
|
||||
|
||||
|
||||
class AppConfigApiView(ApiBaseView):
|
||||
"""resolves to /api/appsettings/config/
|
||||
GET: return app settings
|
||||
POST: update app settings
|
||||
"""
|
||||
|
||||
permission_classes = [AdminWriteOnly]
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(AppConfigSerializer()),
|
||||
}
|
||||
)
|
||||
def get(request):
|
||||
"""get app config"""
|
||||
response = AppConfig().config
|
||||
serializer = AppConfigSerializer(response)
|
||||
return Response(serializer.data)
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
request=AppConfigSerializer(),
|
||||
responses={
|
||||
200: OpenApiResponse(AppConfigSerializer()),
|
||||
400: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Bad request"
|
||||
),
|
||||
},
|
||||
)
|
||||
def post(request):
|
||||
"""update config values, allows partial update"""
|
||||
serializer = AppConfigSerializer(data=request.data, partial=True)
|
||||
serializer.is_valid(raise_exception=True)
|
||||
validated_data = serializer.validated_data
|
||||
updated_config = AppConfig().update_config(validated_data)
|
||||
updated_serializer = AppConfigSerializer(updated_config)
|
||||
return Response(updated_serializer.data)
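# Hedged request example for a partial config update; the key shown
# ("application.enable_snapshot") is used elsewhere in this codebase, the
# value is illustrative:
#
#   POST /api/appsettings/config/
#   {"application": {"enable_snapshot": true}}
#
# The response echoes the full updated config document.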
|
||||
|
||||
|
||||
class CookieView(ApiBaseView):
|
||||
"""resolves to /api/appsettings/cookie/
|
||||
GET: check if cookie is enabled
|
||||
POST: verify validity of cookie
|
||||
PUT: import cookie
|
||||
DELETE: revoke the cookie
|
||||
"""
|
||||
|
||||
permission_classes = [AdminOnly]
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(CookieValidationSerializer()),
|
||||
}
|
||||
)
|
||||
def get(self, request):
|
||||
"""get cookie validation status"""
|
||||
# pylint: disable=unused-argument
|
||||
validation = self._get_cookie_validation()
|
||||
serializer = CookieValidationSerializer(validation)
|
||||
|
||||
return Response(serializer.data)
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(CookieValidationSerializer()),
|
||||
}
|
||||
)
|
||||
def post(self, request):
|
||||
"""validate cookie"""
|
||||
# pylint: disable=unused-argument
|
||||
config = AppConfig().config
|
||||
_ = CookieHandler(config).validate()
|
||||
validation = self._get_cookie_validation()
|
||||
serializer = CookieValidationSerializer(validation)
|
||||
|
||||
return Response(serializer.data)
|
||||
|
||||
@extend_schema(
|
||||
request=CookieUpdateSerializer(),
|
||||
responses={
|
||||
200: OpenApiResponse(CookieValidationSerializer()),
|
||||
400: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Bad request"
|
||||
),
|
||||
},
|
||||
)
|
||||
def put(self, request):
|
||||
"""handle put request"""
|
||||
# pylint: disable=unused-argument
|
||||
|
||||
serializer = CookieUpdateSerializer(data=request.data)
|
||||
serializer.is_valid(raise_exception=True)
|
||||
validated_data = serializer.validated_data
|
||||
|
||||
cookie = validated_data.get("cookie")
|
||||
if not cookie:
|
||||
message = "missing cookie key in request data"
|
||||
print(message)
|
||||
error = ErrorResponseSerializer({"error": message})
|
||||
return Response(error.data, status=400)
|
||||
|
||||
if settings.DEBUG:
|
||||
print(f"[cookie] preview:\n\n{cookie[:300]}")
|
||||
|
||||
config = AppConfig().config
|
||||
handler = CookieHandler(config)
|
||||
handler.set_cookie(cookie)
|
||||
validated = handler.validate()
|
||||
if not validated:
|
||||
message = "[cookie]: import failed, not valid"
|
||||
print(message)
|
||||
error = ErrorResponseSerializer({"error": message})
|
||||
handler.revoke()
|
||||
return Response(error.data, status=400)
|
||||
|
||||
validation = self._get_cookie_validation()
|
||||
serializer = CookieValidationSerializer(validation)
|
||||
|
||||
return Response(serializer.data)
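# Hedged request example for the cookie import above; the cookie content is
# a placeholder in the usual cookies.txt format:
#
#   PUT /api/appsettings/cookie/
#   {"cookie": "# Netscape HTTP Cookie File\n..."}
#
# If validation fails the handler revokes the cookie again and the view
# responds with HTTP 400.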
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
204: OpenApiResponse(description="Cookie revoked"),
|
||||
},
|
||||
)
|
||||
def delete(self, request):
|
||||
"""delete the cookie"""
|
||||
config = AppConfig().config
|
||||
handler = CookieHandler(config)
|
||||
handler.revoke()
|
||||
return Response(status=204)
|
||||
|
||||
@staticmethod
|
||||
def _get_cookie_validation():
|
||||
"""get current cookie validation"""
|
||||
config = AppConfig().config
|
||||
validation = RedisArchivist().get_message_dict("cookie:valid")
|
||||
is_enabled = {"cookie_enabled": config["downloads"]["cookie_import"]}
|
||||
validation.update(is_enabled)
|
||||
|
||||
return validation
|
||||
|
||||
|
||||
class POTokenView(ApiBaseView):
|
||||
"""handle PO token"""
|
||||
|
||||
permission_classes = [AdminOnly]
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(PoTokenSerializer()),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="PO token not found"
|
||||
),
|
||||
}
|
||||
)
|
||||
def get(self, request):
|
||||
"""get PO token"""
|
||||
config = AppConfig().config
|
||||
potoken = POTokenHandler(config).get()
|
||||
if not potoken:
|
||||
error = ErrorResponseSerializer({"error": "PO token not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
serializer = PoTokenSerializer(data={"potoken": potoken})
|
||||
serializer.is_valid(raise_exception=True)
|
||||
|
||||
return Response(serializer.data)
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(PoTokenSerializer()),
|
||||
400: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Bad request"
|
||||
),
|
||||
}
|
||||
)
|
||||
def post(self, request):
|
||||
"""Update PO token"""
|
||||
serializer = PoTokenSerializer(data=request.data)
|
||||
serializer.is_valid(raise_exception=True)
|
||||
validated_data = serializer.validated_data
|
||||
if not validated_data:
|
||||
error = ErrorResponseSerializer(
|
||||
{"error": "missing PO token key in request data"}
|
||||
)
|
||||
return Response(error.data, status=400)
|
||||
|
||||
config = AppConfig().config
|
||||
new_token = validated_data["potoken"]
|
||||
|
||||
POTokenHandler(config).set_token(new_token)
|
||||
return Response(serializer.data)
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
204: OpenApiResponse(description="PO token revoked"),
|
||||
},
|
||||
)
|
||||
def delete(self, request):
|
||||
"""delete PO token"""
|
||||
config = AppConfig().config
|
||||
POTokenHandler(config).revoke_token()
|
||||
return Response(status=204)
|
||||
|
||||
|
||||
class SnapshotApiListView(ApiBaseView):
|
||||
"""resolves to /api/appsettings/snapshot/
|
||||
GET: returns snapshot config plus list of existing snapshots
|
||||
POST: take snapshot now
|
||||
"""
|
||||
|
||||
permission_classes = [AdminOnly]
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(SnapshotListSerializer()),
|
||||
}
|
||||
)
|
||||
def get(request):
|
||||
"""get available snapshots with metadata"""
|
||||
# pylint: disable=unused-argument
|
||||
snapshots = ElasticSnapshot().get_snapshot_stats()
|
||||
serializer = SnapshotListSerializer(snapshots)
|
||||
return Response(serializer.data)
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(SnapshotCreateResponseSerializer()),
|
||||
}
|
||||
)
|
||||
def post(request):
|
||||
"""take snapshot now"""
|
||||
# pylint: disable=unused-argument
|
||||
response = ElasticSnapshot().take_snapshot_now()
|
||||
serializer = SnapshotCreateResponseSerializer(response)
|
||||
return Response(serializer.data)
|
||||
|
||||
|
||||
class SnapshotApiView(ApiBaseView):
|
||||
"""resolves to /api/appsettings/snapshot/<snapshot-id>/
|
||||
GET: return a single snapshot
|
||||
POST: restore snapshot
|
||||
DELETE: delete a snapshot
|
||||
"""
|
||||
|
||||
permission_classes = [AdminOnly]
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(SnapshotItemSerializer()),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="snapshot not found"
|
||||
),
|
||||
}
|
||||
)
|
||||
def get(request, snapshot_id):
|
||||
"""handle get request"""
|
||||
# pylint: disable=unused-argument
|
||||
snapshot = ElasticSnapshot().get_single_snapshot(snapshot_id)
|
||||
|
||||
if not snapshot:
|
||||
error = ErrorResponseSerializer({"error": "snapshot not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
serializer = SnapshotItemSerializer(snapshot)
|
||||
return Response(serializer.data)
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(SnapshotRestoreResponseSerializer()),
|
||||
400: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="bad request"
|
||||
),
|
||||
}
|
||||
)
|
||||
def post(request, snapshot_id):
|
||||
"""restore snapshot"""
|
||||
# pylint: disable=unused-argument
|
||||
response = ElasticSnapshot().restore_all(snapshot_id)
|
||||
if not response:
|
||||
error = ErrorResponseSerializer(
|
||||
{"error": "failed to restore snapshot"}
|
||||
)
|
||||
return Response(error.data, status=400)
|
||||
|
||||
serializer = SnapshotRestoreResponseSerializer(response)
|
||||
|
||||
return Response(serializer.data)
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
204: OpenApiResponse(description="delete snapshot from index"),
|
||||
}
|
||||
)
|
||||
def delete(request, snapshot_id):
|
||||
"""delete snapshot from index"""
|
||||
# pylint: disable=unused-argument
|
||||
response = ElasticSnapshot().delete_single_snapshot(snapshot_id)
|
||||
if not response:
|
||||
error = ErrorResponseSerializer(
|
||||
{"error": "failed to delete snapshot"}
|
||||
)
|
||||
return Response(error.data, status=400)
|
||||
|
||||
return Response(status=204)
|
||||
|
||||
|
||||
class TokenView(ApiBaseView):
|
||||
"""resolves to /api/appsettings/token/
|
||||
GET: get API token
|
||||
DELETE: revoke the token
|
||||
"""
|
||||
|
||||
permission_classes = [AdminOnly]
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(TokenResponseSerializer()),
|
||||
}
|
||||
)
|
||||
def get(request):
|
||||
"""get your API token"""
|
||||
token, _ = Token.objects.get_or_create(user=request.user)
|
||||
serializer = TokenResponseSerializer({"token": token.key})
|
||||
return Response(serializer.data)
|
||||
|
||||
@staticmethod
|
||||
@extend_schema(
|
||||
responses={
|
||||
204: OpenApiResponse(description="delete token"),
|
||||
}
|
||||
)
|
||||
def delete(request):
|
||||
"""delete your API token, new will get created on next get"""
|
||||
print("revoke API token")
|
||||
request.user.auth_token.delete()
|
||||
return Response(status=204)
|
@@ -1,103 +0,0 @@
|
||||
"""channel serializers"""
|
||||
|
||||
# pylint: disable=abstract-method
|
||||
|
||||
from common.serializers import PaginationSerializer, ValidateUnknownFieldsMixin
|
||||
from rest_framework import serializers
|
||||
|
||||
|
||||
class ChannelOverwriteSerializer(
|
||||
ValidateUnknownFieldsMixin, serializers.Serializer
|
||||
):
|
||||
"""serialize channel overwrites"""
|
||||
|
||||
download_format = serializers.CharField(required=False, allow_null=True)
|
||||
autodelete_days = serializers.IntegerField(required=False, allow_null=True)
|
||||
index_playlists = serializers.BooleanField(required=False, allow_null=True)
|
||||
integrate_sponsorblock = serializers.BooleanField(
|
||||
required=False, allow_null=True
|
||||
)
|
||||
subscriptions_channel_size = serializers.IntegerField(
|
||||
required=False, allow_null=True
|
||||
)
|
||||
subscriptions_live_channel_size = serializers.IntegerField(
|
||||
required=False, allow_null=True
|
||||
)
|
||||
subscriptions_shorts_channel_size = serializers.IntegerField(
|
||||
required=False, allow_null=True
|
||||
)
|
||||
|
||||
|
||||
class ChannelSerializer(serializers.Serializer):
|
||||
"""serialize channel"""
|
||||
|
||||
channel_id = serializers.CharField()
|
||||
channel_active = serializers.BooleanField()
|
||||
channel_banner_url = serializers.CharField()
|
||||
channel_thumb_url = serializers.CharField()
|
||||
channel_tvart_url = serializers.CharField()
|
||||
channel_description = serializers.CharField()
|
||||
channel_last_refresh = serializers.CharField()
|
||||
channel_name = serializers.CharField()
|
||||
channel_overwrites = ChannelOverwriteSerializer(required=False)
|
||||
channel_subs = serializers.IntegerField()
|
||||
channel_subscribed = serializers.BooleanField()
|
||||
channel_tags = serializers.ListField(
|
||||
child=serializers.CharField(), required=False
|
||||
)
|
||||
channel_views = serializers.IntegerField()
|
||||
_index = serializers.CharField(required=False)
|
||||
_score = serializers.IntegerField(required=False)
|
||||
|
||||
|
||||
class ChannelListSerializer(serializers.Serializer):
|
||||
"""serialize channel list"""
|
||||
|
||||
data = ChannelSerializer(many=True)
|
||||
paginate = PaginationSerializer()
|
||||
|
||||
|
||||
class ChannelListQuerySerializer(serializers.Serializer):
|
||||
"""serialize list query"""
|
||||
|
||||
filter = serializers.ChoiceField(choices=["subscribed"], required=False)
|
||||
page = serializers.IntegerField(required=False)
|
||||
|
||||
|
||||
class ChannelUpdateSerializer(serializers.Serializer):
|
||||
"""update channel"""
|
||||
|
||||
channel_subscribed = serializers.BooleanField(required=False)
|
||||
channel_overwrites = ChannelOverwriteSerializer(required=False)
|
||||
|
||||
|
||||
class ChannelAggBucketSerializer(serializers.Serializer):
|
||||
"""serialize channel agg bucket"""
|
||||
|
||||
value = serializers.IntegerField()
|
||||
value_str = serializers.CharField(required=False)
|
||||
|
||||
|
||||
class ChannelAggSerializer(serializers.Serializer):
|
||||
"""serialize channel aggregation"""
|
||||
|
||||
total_items = ChannelAggBucketSerializer()
|
||||
total_size = ChannelAggBucketSerializer()
|
||||
total_duration = ChannelAggBucketSerializer()
|
||||
|
||||
|
||||
class ChannelNavSerializer(serializers.Serializer):
|
||||
"""serialize channel navigation"""
|
||||
|
||||
has_pending = serializers.BooleanField()
|
||||
has_ignored = serializers.BooleanField()
|
||||
has_playlists = serializers.BooleanField()
|
||||
has_videos = serializers.BooleanField()
|
||||
has_streams = serializers.BooleanField()
|
||||
has_shorts = serializers.BooleanField()
|
||||
|
||||
|
||||
class ChannelSearchQuerySerializer(serializers.Serializer):
|
||||
"""serialize query parameters for searching"""
|
||||
|
||||
q = serializers.CharField()
|
@@ -1,363 +0,0 @@
|
||||
"""
|
||||
functionality:
|
||||
- get metadata from youtube for a channel
|
||||
- index and update in es
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
from datetime import datetime
|
||||
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.es_connect import ElasticWrap, IndexPaginate
|
||||
from common.src.helper import rand_sleep
|
||||
from common.src.index_generic import YouTubeItem
|
||||
from download.src.thumbnails import ThumbManager
|
||||
from download.src.yt_dlp_base import YtWrap
|
||||
|
||||
|
||||
class YoutubeChannel(YouTubeItem):
|
||||
"""represents a single youtube channel"""
|
||||
|
||||
es_path = False
|
||||
index_name = "ta_channel"
|
||||
yt_base = "https://www.youtube.com/channel/"
|
||||
yt_obs = {
|
||||
"playlist_items": "0,0",
|
||||
"skip_download": True,
|
||||
}
|
||||
|
||||
def __init__(self, youtube_id, task=False):
|
||||
super().__init__(youtube_id)
|
||||
self.all_playlists = False
|
||||
self.task = task
|
||||
|
||||
def build_json(self, upload=False, fallback=False):
|
||||
"""get from es or from youtube"""
|
||||
self.get_from_es()
|
||||
if self.json_data:
|
||||
return
|
||||
|
||||
self.get_from_youtube()
|
||||
if not self.youtube_meta and fallback:
|
||||
self._video_fallback(fallback)
|
||||
else:
|
||||
if not self.youtube_meta:
|
||||
message = f"{self.youtube_id}: Failed to get metadata"
|
||||
raise ValueError(message)
|
||||
|
||||
self.process_youtube_meta()
|
||||
self.get_channel_art()
|
||||
|
||||
if upload:
|
||||
self.upload_to_es()
|
||||
|
||||
def process_youtube_meta(self):
|
||||
"""extract relevant fields"""
|
||||
self.youtube_meta["thumbnails"].reverse()
|
||||
channel_name = self.youtube_meta["uploader"] or self.youtube_meta["id"]
|
||||
self.json_data = {
|
||||
"channel_active": True,
|
||||
"channel_description": self.youtube_meta.get("description", ""),
|
||||
"channel_id": self.youtube_id,
|
||||
"channel_last_refresh": int(datetime.now().timestamp()),
|
||||
"channel_name": channel_name,
|
||||
"channel_subs": self.youtube_meta.get("channel_follower_count", 0),
|
||||
"channel_subscribed": False,
|
||||
"channel_tags": self.youtube_meta.get("tags", []),
|
||||
"channel_banner_url": self._get_banner_art(),
|
||||
"channel_thumb_url": self._get_thumb_art(),
|
||||
"channel_tvart_url": self._get_tv_art(),
|
||||
"channel_views": self.youtube_meta.get("view_count") or 0,
|
||||
}
|
||||
|
||||
def _get_thumb_art(self):
|
||||
"""extract thumb art"""
|
||||
for i in self.youtube_meta["thumbnails"]:
|
||||
if not i.get("width"):
|
||||
continue
|
||||
if i.get("width") == i.get("height"):
|
||||
return i["url"]
|
||||
|
||||
return False
|
||||
|
||||
def _get_tv_art(self):
|
||||
"""extract tv artwork"""
|
||||
for i in self.youtube_meta["thumbnails"]:
|
||||
if i.get("id") == "banner_uncropped":
|
||||
return i["url"]
|
||||
for i in self.youtube_meta["thumbnails"]:
|
||||
if not i.get("width"):
|
||||
continue
|
||||
if i["width"] // i["height"] < 2 and not i["width"] == i["height"]:
|
||||
return i["url"]
|
||||
|
||||
return False
|
||||
|
||||
def _get_banner_art(self):
|
||||
"""extract banner artwork"""
|
||||
for i in self.youtube_meta["thumbnails"]:
|
||||
if not i.get("width"):
|
||||
continue
|
||||
if i["width"] // i["height"] > 5:
|
||||
return i["url"]
|
||||
|
||||
return False
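# Worked example (illustrative dimensions) of the artwork heuristics above:
#   1280x1280 -> width == height                  -> channel thumb
#   1707x960  -> 1707 // 960 == 1 < 2, not square -> TV art
#   2560x424  -> 2560 // 424 == 6 > 5             -> banner
# Thumbnail entries without a "width" key are skipped by all three helpers.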
|
||||
|
||||
def _video_fallback(self, fallback):
|
||||
"""use video metadata as fallback"""
|
||||
print(f"{self.youtube_id}: fallback to video metadata")
|
||||
self.json_data = {
|
||||
"channel_active": False,
|
||||
"channel_last_refresh": int(datetime.now().timestamp()),
|
||||
"channel_subs": fallback.get("channel_follower_count", 0),
|
||||
"channel_name": fallback["uploader"],
|
||||
"channel_banner_url": False,
|
||||
"channel_tvart_url": False,
|
||||
"channel_id": self.youtube_id,
|
||||
"channel_subscribed": False,
|
||||
"channel_tags": [],
|
||||
"channel_description": "",
|
||||
"channel_thumb_url": False,
|
||||
"channel_views": 0,
|
||||
}
|
||||
self._info_json_fallback()
|
||||
|
||||
def _info_json_fallback(self):
|
||||
"""read channel info.json for additional metadata"""
|
||||
info_json = os.path.join(
|
||||
EnvironmentSettings.CACHE_DIR,
|
||||
"import",
|
||||
f"{self.youtube_id}.info.json",
|
||||
)
|
||||
if os.path.exists(info_json):
|
||||
print(f"{self.youtube_id}: read info.json file")
|
||||
with open(info_json, "r", encoding="utf-8") as f:
|
||||
content = json.loads(f.read())
|
||||
|
||||
self.json_data.update(
|
||||
{
|
||||
"channel_subs": content.get("channel_follower_count", 0),
|
||||
"channel_description": content.get("description", False),
|
||||
}
|
||||
)
|
||||
os.remove(info_json)
|
||||
|
||||
def get_channel_art(self):
|
||||
"""download channel art for new channels"""
|
||||
urls = (
|
||||
self.json_data["channel_thumb_url"],
|
||||
self.json_data["channel_banner_url"],
|
||||
self.json_data["channel_tvart_url"],
|
||||
)
|
||||
ThumbManager(self.youtube_id, item_type="channel").download(urls)
|
||||
|
||||
def sync_to_videos(self):
|
||||
"""sync new channel_dict to all videos of channel"""
|
||||
# add ingest pipeline
|
||||
processors = []
|
||||
for field, value in self.json_data.items():
|
||||
if value is None:
|
||||
line = {
|
||||
"script": {
|
||||
"lang": "painless",
|
||||
"source": f"ctx['{field}'] = null;",
|
||||
}
|
||||
}
|
||||
else:
|
||||
line = {"set": {"field": "channel." + field, "value": value}}
|
||||
|
||||
processors.append(line)
|
||||
|
||||
data = {"description": self.youtube_id, "processors": processors}
|
||||
ingest_path = f"_ingest/pipeline/{self.youtube_id}"
|
||||
_, _ = ElasticWrap(ingest_path).put(data)
|
||||
# apply pipeline
|
||||
data = {"query": {"match": {"channel.channel_id": self.youtube_id}}}
|
||||
update_path = f"ta_video/_update_by_query?pipeline={self.youtube_id}"
|
||||
_, _ = ElasticWrap(update_path).post(data)
|
||||
|
||||
def get_folder_path(self):
|
||||
"""get folder where media files get stored"""
|
||||
folder_path = os.path.join(
|
||||
EnvironmentSettings.MEDIA_DIR,
|
||||
self.json_data["channel_id"],
|
||||
)
|
||||
return folder_path
|
||||
|
||||
def delete_es_videos(self):
|
||||
"""delete all channel documents from elasticsearch"""
|
||||
data = {
|
||||
"query": {
|
||||
"term": {"channel.channel_id": {"value": self.youtube_id}}
|
||||
}
|
||||
}
|
||||
_, _ = ElasticWrap("ta_video/_delete_by_query").post(data)
|
||||
|
||||
def delete_es_comments(self):
|
||||
"""delete all comments from this channel"""
|
||||
data = {
|
||||
"query": {
|
||||
"term": {"comment_channel_id": {"value": self.youtube_id}}
|
||||
}
|
||||
}
|
||||
_, _ = ElasticWrap("ta_comment/_delete_by_query").post(data)
|
||||
|
||||
def delete_es_subtitles(self):
|
||||
"""delete all subtitles from this channel"""
|
||||
data = {
|
||||
"query": {
|
||||
"term": {"subtitle_channel_id": {"value": self.youtube_id}}
|
||||
}
|
||||
}
|
||||
_, _ = ElasticWrap("ta_subtitle/_delete_by_query").post(data)
|
||||
|
||||
def delete_playlists(self):
|
||||
"""delete all indexed playlist from es"""
|
||||
from playlist.src.index import YoutubePlaylist
|
||||
|
||||
all_playlists = self.get_indexed_playlists()
|
||||
for playlist in all_playlists:
|
||||
YoutubePlaylist(playlist["playlist_id"]).delete_metadata()
|
||||
|
||||
def delete_channel(self):
|
||||
"""delete channel and all videos"""
|
||||
print(f"{self.youtube_id}: delete channel")
|
||||
self.get_from_es()
|
||||
if not self.json_data:
|
||||
raise FileNotFoundError
|
||||
|
||||
folder_path = self.get_folder_path()
|
||||
print(f"{self.youtube_id}: delete all media files")
|
||||
try:
|
||||
all_videos = os.listdir(folder_path)
|
||||
for video in all_videos:
|
||||
video_path = os.path.join(folder_path, video)
|
||||
os.remove(video_path)
|
||||
os.rmdir(folder_path)
|
||||
except FileNotFoundError:
|
||||
print(f"no videos found for {folder_path}")
|
||||
|
||||
print(f"{self.youtube_id}: delete indexed playlists")
|
||||
self.delete_playlists()
|
||||
print(f"{self.youtube_id}: delete indexed videos")
|
||||
self.delete_es_videos()
|
||||
self.delete_es_comments()
|
||||
self.delete_es_subtitles()
|
||||
self.del_in_es()
|
||||
|
||||
def index_channel_playlists(self):
|
||||
"""add all playlists of channel to index"""
|
||||
print(f"{self.youtube_id}: index all playlists")
|
||||
self.get_from_es()
|
||||
channel_name = self.json_data["channel_name"]
|
||||
self.task.send_progress([f"{channel_name}: Looking for Playlists"])
|
||||
self.get_all_playlists()
|
||||
if not self.all_playlists:
|
||||
print(f"{self.youtube_id}: no playlists found.")
|
||||
return
|
||||
|
||||
total = len(self.all_playlists)
|
||||
for idx, playlist in enumerate(self.all_playlists):
|
||||
if self.task:
|
||||
self._notify_single_playlist(idx, total)
|
||||
|
||||
self._index_single_playlist(playlist)
|
||||
print("add playlist: " + playlist[1])
|
||||
rand_sleep(self.config)
|
||||
|
||||
def _notify_single_playlist(self, idx, total):
|
||||
"""send notification"""
|
||||
channel_name = self.json_data["channel_name"]
|
||||
message = [
|
||||
f"{channel_name}: Scanning channel for playlists",
|
||||
f"Progress: {idx + 1}/{total}",
|
||||
]
|
||||
self.task.send_progress(message, progress=(idx + 1) / total)
|
||||
|
||||
@staticmethod
|
||||
def _index_single_playlist(playlist):
|
||||
"""add single playlist if needed"""
|
||||
from playlist.src.index import YoutubePlaylist
|
||||
|
||||
playlist = YoutubePlaylist(playlist[0])
|
||||
playlist.update_playlist(skip_on_empty=True)
|
||||
|
||||
def get_channel_videos(self):
|
||||
"""get all videos from channel"""
|
||||
data = {
|
||||
"query": {
|
||||
"term": {"channel.channel_id": {"value": self.youtube_id}}
|
||||
},
|
||||
"_source": ["youtube_id", "vid_type"],
|
||||
}
|
||||
all_videos = IndexPaginate("ta_video", data).get_results()
|
||||
return all_videos
|
||||
|
||||
def get_all_playlists(self):
|
||||
"""get all playlists owned by this channel"""
|
||||
url = (
|
||||
f"https://www.youtube.com/channel/{self.youtube_id}"
|
||||
+ "/playlists?view=1&sort=dd&shelf_id=0"
|
||||
)
|
||||
obs = {"skip_download": True, "extract_flat": True}
|
||||
playlists = YtWrap(obs, self.config).extract(url)
|
||||
if not playlists:
|
||||
self.all_playlists = []
|
||||
return
|
||||
|
||||
all_entries = [(i["id"], i["title"]) for i in playlists["entries"]]
|
||||
self.all_playlists = all_entries
|
||||
|
||||
def get_indexed_playlists(self, active_only=False):
|
||||
"""get all indexed playlists from channel"""
|
||||
must_list = [
|
||||
{"term": {"playlist_channel_id": {"value": self.youtube_id}}}
|
||||
]
|
||||
if active_only:
|
||||
must_list.append({"term": {"playlist_active": {"value": True}}})
|
||||
|
||||
data = {"query": {"bool": {"must": must_list}}}
|
||||
|
||||
all_playlists = IndexPaginate("ta_playlist", data).get_results()
|
||||
return all_playlists
|
||||
|
||||
def get_overwrites(self) -> dict:
|
||||
"""get all per channel overwrites"""
|
||||
return self.json_data.get("channel_overwrites", {})
|
||||
|
||||
def set_overwrites(self, overwrites):
|
||||
"""set per channel overwrites"""
|
||||
valid_keys = [
|
||||
"download_format",
|
||||
"autodelete_days",
|
||||
"index_playlists",
|
||||
"integrate_sponsorblock",
|
||||
"subscriptions_channel_size",
|
||||
"subscriptions_live_channel_size",
|
||||
"subscriptions_shorts_channel_size",
|
||||
]
|
||||
|
||||
to_write = self.json_data.get("channel_overwrites", {})
|
||||
for key, value in overwrites.items():
|
||||
if key not in valid_keys:
|
||||
raise ValueError(f"invalid overwrite key: {key}")
|
||||
|
||||
if value is None and key in to_write:
|
||||
to_write.pop(key)
|
||||
continue
|
||||
|
||||
to_write.update({key: value})
|
||||
|
||||
self.json_data["channel_overwrites"] = to_write
|
||||
|
||||
|
||||
def channel_overwrites(channel_id, overwrites):
|
||||
"""collection to overwrite settings per channel"""
|
||||
channel = YoutubeChannel(channel_id)
|
||||
channel.build_json()
|
||||
channel.set_overwrites(overwrites)
|
||||
channel.upload_to_es()
|
||||
channel.sync_to_videos()
|
||||
|
||||
return channel.json_data
|
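# Usage sketch for the overwrite helpers above, assuming it runs inside the
# Django app context where channel.src.index is importable; the channel ID
# and format string are placeholders. Keys must be in set_overwrites()
# valid_keys, and passing None for a key removes an existing overwrite.
from channel.src.index import YoutubeChannel, channel_overwrites

updated = channel_overwrites(
    "UC_placeholder_channel_id",
    {
        "download_format": "bestvideo[height<=1080]+bestaudio/best",
        "integrate_sponsorblock": True,
        "autodelete_days": None,  # None clears a previously set overwrite
    },
)
print(updated.get("channel_overwrites"))
print(YoutubeChannel("UC_placeholder_channel_id").get_overwrites())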
@ -1,97 +0,0 @@
|
||||
"""build channel nav"""
|
||||
|
||||
from common.src.es_connect import ElasticWrap
|
||||
|
||||
|
||||
class ChannelNav:
|
||||
"""get all nav items"""
|
||||
|
||||
def __init__(self, channel_id):
|
||||
self.channel_id = channel_id
|
||||
|
||||
def get_nav(self):
|
||||
"""build nav items"""
|
||||
nav = {
|
||||
"has_pending": self._get_has_pending(),
|
||||
"has_ignored": self._get_has_ignored(),
|
||||
"has_playlists": self._get_has_playlists(),
|
||||
}
|
||||
nav.update(self._get_vid_types())
|
||||
return nav
|
||||
|
||||
def _get_vid_types(self):
|
||||
"""get available vid_types in given channel"""
|
||||
data = {
|
||||
"size": 0,
|
||||
"query": {
|
||||
"term": {"channel.channel_id": {"value": self.channel_id}}
|
||||
},
|
||||
"aggs": {"unique_values": {"terms": {"field": "vid_type"}}},
|
||||
}
|
||||
response, _ = ElasticWrap("ta_video/_search").get(data)
|
||||
buckets = response["aggregations"]["unique_values"]["buckets"]
|
||||
|
||||
type_nav = {
|
||||
"has_videos": False,
|
||||
"has_streams": False,
|
||||
"has_shorts": False,
|
||||
}
|
||||
for bucket in buckets:
|
||||
if bucket["key"] == "videos":
|
||||
type_nav["has_videos"] = True
|
||||
if bucket["key"] == "streams":
|
||||
type_nav["has_streams"] = True
|
||||
if bucket["key"] == "shorts":
|
||||
type_nav["has_shorts"] = True
|
||||
|
||||
return type_nav
|
||||
|
||||
def _get_has_pending(self):
|
||||
"""check if has pending videos in download queue"""
|
||||
data = {
|
||||
"size": 1,
|
||||
"query": {
|
||||
"bool": {
|
||||
"must": [
|
||||
{"term": {"status": {"value": "pending"}}},
|
||||
{"term": {"channel_id": {"value": self.channel_id}}},
|
||||
]
|
||||
}
|
||||
},
|
||||
"_source": False,
|
||||
}
|
||||
response, _ = ElasticWrap("ta_download/_search").get(data=data)
|
||||
|
||||
return bool(response["hits"]["hits"])
|
||||
|
||||
def _get_has_ignored(self):
|
||||
"""Check if there are ignored videos in the download queue"""
|
||||
data = {
|
||||
"size": 1,
|
||||
"query": {
|
||||
"bool": {
|
||||
"must": [
|
||||
{"term": {"status": {"value": "ignore"}}},
|
||||
{"term": {"channel_id": {"value": self.channel_id}}},
|
||||
]
|
||||
}
|
||||
},
|
||||
"_source": False,
|
||||
}
|
||||
response, _ = ElasticWrap("ta_download/_search").get(data=data)
|
||||
|
||||
return bool(response["hits"]["hits"])
|
||||
|
||||
def _get_has_playlists(self):
|
||||
"""check if channel has playlists"""
|
||||
path = "ta_playlist/_search"
|
||||
data = {
|
||||
"size": 1,
|
||||
"query": {
|
||||
"term": {"playlist_channel_id": {"value": self.channel_id}}
|
||||
},
|
||||
"_source": False,
|
||||
}
|
||||
response, _ = ElasticWrap(path).get(data=data)
|
||||
|
||||
return bool(response["hits"]["hits"])
|
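# Usage sketch for ChannelNav above, assuming an indexed channel; the ID is a
# placeholder. get_nav() returns plain booleans the frontend can use to decide
# which tabs to render, for example:
# {"has_pending": False, "has_ignored": False, "has_playlists": True,
#  "has_videos": True, "has_streams": False, "has_shorts": False}
nav = ChannelNav("UC_placeholder_channel_id").get_nav()
for tab, available in nav.items():
    print(tab, available)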
@ -1,32 +0,0 @@
|
||||
"""all channel API urls"""
|
||||
|
||||
from channel import views
|
||||
from django.urls import path
|
||||
|
||||
urlpatterns = [
|
||||
path(
|
||||
"",
|
||||
views.ChannelApiListView.as_view(),
|
||||
name="api-channel-list",
|
||||
),
|
||||
path(
|
||||
"search/",
|
||||
views.ChannelApiSearchView.as_view(),
|
||||
name="api-channel-search",
|
||||
),
|
||||
path(
|
||||
"<slug:channel_id>/",
|
||||
views.ChannelApiView.as_view(),
|
||||
name="api-channel",
|
||||
),
|
||||
path(
|
||||
"<slug:channel_id>/aggs/",
|
||||
views.ChannelAggsApiView.as_view(),
|
||||
name="api-channel-aggs",
|
||||
),
|
||||
path(
|
||||
"<slug:channel_id>/nav/",
|
||||
views.ChannelNavApiView.as_view(),
|
||||
name="api-channel-nav",
|
||||
),
|
||||
]
|
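# Route sketch, assuming this urlpatterns module is mounted under /api/channel/
# by the project urls; the names can then be resolved with Django's reverse().
from django.urls import reverse

print(reverse("api-channel-list"))  # -> /api/channel/
print(reverse("api-channel", kwargs={"channel_id": "UC_placeholder"}))
print(reverse("api-channel-nav", kwargs={"channel_id": "UC_placeholder"}))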
@ -1,281 +0,0 @@
|
||||
"""all channel API views"""
|
||||
|
||||
from channel.serializers import (
|
||||
ChannelAggSerializer,
|
||||
ChannelListQuerySerializer,
|
||||
ChannelListSerializer,
|
||||
ChannelNavSerializer,
|
||||
ChannelSearchQuerySerializer,
|
||||
ChannelSerializer,
|
||||
ChannelUpdateSerializer,
|
||||
)
|
||||
from channel.src.index import YoutubeChannel, channel_overwrites
|
||||
from channel.src.nav import ChannelNav
|
||||
from common.serializers import ErrorResponseSerializer
|
||||
from common.src.urlparser import Parser
|
||||
from common.views_base import AdminWriteOnly, ApiBaseView
|
||||
from download.src.subscriptions import ChannelSubscription
|
||||
from drf_spectacular.utils import (
|
||||
OpenApiParameter,
|
||||
OpenApiResponse,
|
||||
extend_schema,
|
||||
)
|
||||
from rest_framework.response import Response
|
||||
from task.tasks import index_channel_playlists, subscribe_to
|
||||
|
||||
|
||||
class ChannelApiListView(ApiBaseView):
|
||||
"""resolves to /api/channel/
|
||||
GET: returns list of channels
|
||||
POST: edit a list of channels
|
||||
"""
|
||||
|
||||
search_base = "ta_channel/_search/"
|
||||
valid_filter = ["subscribed"]
|
||||
permission_classes = [AdminWriteOnly]
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(ChannelListSerializer()),
|
||||
},
|
||||
parameters=[ChannelListQuerySerializer()],
|
||||
)
|
||||
def get(self, request):
|
||||
"""get request"""
|
||||
self.data.update(
|
||||
{"sort": [{"channel_name.keyword": {"order": "asc"}}]}
|
||||
)
|
||||
|
||||
serializer = ChannelListQuerySerializer(data=request.query_params)
|
||||
serializer.is_valid(raise_exception=True)
|
||||
validated_data = serializer.validated_data
|
||||
|
||||
must_list = []
|
||||
query_filter = validated_data.get("filter")
|
||||
if query_filter:
|
||||
must_list.append({"term": {"channel_subscribed": {"value": True}}})
|
||||
|
||||
self.data["query"] = {"bool": {"must": must_list}}
|
||||
self.get_document_list(request)
|
||||
serializer = ChannelListSerializer(self.response)
|
||||
|
||||
return Response(serializer.data)
|
||||
|
||||
def post(self, request):
|
||||
"""subscribe/unsubscribe to list of channels"""
|
||||
data = request.data
|
||||
try:
|
||||
to_add = data["data"]
|
||||
except KeyError:
|
||||
message = "missing expected data key"
|
||||
print(message)
|
||||
return Response({"message": message}, status=400)
|
||||
|
||||
pending = []
|
||||
for channel_item in to_add:
|
||||
channel_id = channel_item["channel_id"]
|
||||
if channel_item["channel_subscribed"]:
|
||||
pending.append(channel_id)
|
||||
else:
|
||||
self._unsubscribe(channel_id)
|
||||
|
||||
if pending:
|
||||
url_str = " ".join(pending)
|
||||
subscribe_to.delay(url_str, expected_type="channel")
|
||||
|
||||
return Response(data)
|
||||
|
||||
@staticmethod
|
||||
def _unsubscribe(channel_id: str):
|
||||
"""unsubscribe"""
|
||||
print(f"[{channel_id}] unsubscribe from channel")
|
||||
ChannelSubscription().change_subscribe(
|
||||
channel_id, channel_subscribed=False
|
||||
)
|
||||
|
||||
|
||||
class ChannelApiView(ApiBaseView):
|
||||
"""resolves to /api/channel/<channel_id>/
|
||||
GET: returns metadata dict of channel
|
||||
"""
|
||||
|
||||
search_base = "ta_channel/_doc/"
|
||||
permission_classes = [AdminWriteOnly]
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(ChannelSerializer()),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Channel not found"
|
||||
),
|
||||
}
|
||||
)
|
||||
def get(self, request, channel_id):
|
||||
# pylint: disable=unused-argument
|
||||
"""get channel detail"""
|
||||
self.get_document(channel_id)
|
||||
if not self.response:
|
||||
error = ErrorResponseSerializer({"error": "channel not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
response_serializer = ChannelSerializer(self.response)
|
||||
return Response(response_serializer.data, status=self.status_code)
|
||||
|
||||
@extend_schema(
|
||||
request=ChannelUpdateSerializer(),
|
||||
responses={
|
||||
200: OpenApiResponse(ChannelUpdateSerializer()),
|
||||
400: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Bad request"
|
||||
),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Channel not found"
|
||||
),
|
||||
},
|
||||
)
|
||||
def post(self, request, channel_id):
|
||||
"""modify channel"""
|
||||
self.get_document(channel_id)
|
||||
if not self.response:
|
||||
error = ErrorResponseSerializer({"error": "channel not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
serializer = ChannelUpdateSerializer(data=request.data)
|
||||
serializer.is_valid(raise_exception=True)
|
||||
validated_data = serializer.validated_data
|
||||
|
||||
subscribed = validated_data.get("channel_subscribed")
|
||||
if subscribed is not None:
|
||||
ChannelSubscription().change_subscribe(channel_id, subscribed)
|
||||
|
||||
overwrites = validated_data.get("channel_overwrites")
|
||||
if overwrites:
|
||||
channel_overwrites(channel_id, overwrites)
|
||||
if overwrites.get("index_playlists"):
|
||||
index_channel_playlists.delay(channel_id)
|
||||
|
||||
return Response(serializer.data, status=200)
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
204: OpenApiResponse(description="Channel deleted"),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Channel not found"
|
||||
),
|
||||
},
|
||||
)
|
||||
def delete(self, request, channel_id):
|
||||
# pylint: disable=unused-argument
|
||||
"""delete channel"""
|
||||
try:
|
||||
YoutubeChannel(channel_id).delete_channel()
|
||||
return Response(status=204)
|
||||
except FileNotFoundError:
|
||||
pass
|
||||
|
||||
error = ErrorResponseSerializer({"error": "channel not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
|
||||
class ChannelAggsApiView(ApiBaseView):
|
||||
"""resolves to /api/channel/<channel_id>/aggs/
|
||||
GET: get channel aggregations
|
||||
"""
|
||||
|
||||
search_base = "ta_video/_search"
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(ChannelAggSerializer()),
|
||||
},
|
||||
)
|
||||
def get(self, request, channel_id):
|
||||
"""get channel aggregations"""
|
||||
self.data.update(
|
||||
{
|
||||
"query": {
|
||||
"term": {"channel.channel_id": {"value": channel_id}}
|
||||
},
|
||||
"aggs": {
|
||||
"total_items": {"value_count": {"field": "youtube_id"}},
|
||||
"total_size": {"sum": {"field": "media_size"}},
|
||||
"total_duration": {"sum": {"field": "player.duration"}},
|
||||
},
|
||||
}
|
||||
)
|
||||
self.get_aggs()
|
||||
serializer = ChannelAggSerializer(self.response)
|
||||
|
||||
return Response(serializer.data)
|
||||
|
||||
|
||||
class ChannelNavApiView(ApiBaseView):
|
||||
"""resolves to /api/channel/<channel_id>/nav/
|
||||
GET: get channel nav
|
||||
"""
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(ChannelNavSerializer()),
|
||||
},
|
||||
)
|
||||
def get(self, request, channel_id):
|
||||
"""get navigation"""
|
||||
|
||||
nav = ChannelNav(channel_id).get_nav()
|
||||
serializer = ChannelNavSerializer(nav)
|
||||
return Response(serializer.data)
|
||||
|
||||
|
||||
class ChannelApiSearchView(ApiBaseView):
|
||||
"""resolves to /api/channel/search/
|
||||
search for channel
|
||||
"""
|
||||
|
||||
search_base = "ta_channel/_doc/"
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(ChannelSerializer()),
|
||||
400: OpenApiResponse(description="Bad Request"),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Channel not found"
|
||||
),
|
||||
},
|
||||
parameters=[
|
||||
OpenApiParameter(
|
||||
name="q",
|
||||
description="Search query string",
|
||||
required=True,
|
||||
type=str,
|
||||
),
|
||||
],
|
||||
)
|
||||
def get(self, request):
|
||||
"""search for local channel ID"""
|
||||
|
||||
serializer = ChannelSearchQuerySerializer(data=request.query_params)
|
||||
serializer.is_valid(raise_exception=True)
|
||||
validated_data = serializer.validated_data
|
||||
|
||||
query = validated_data.get("q")
|
||||
if not query:
|
||||
message = "missing expected q parameter"
|
||||
return Response({"message": message, "data": False}, status=400)
|
||||
|
||||
try:
|
||||
parsed = Parser(query).parse()[0]
|
||||
except (ValueError, IndexError, AttributeError):
|
||||
error = ErrorResponseSerializer(
|
||||
{"error": f"channel not found: {query}"}
|
||||
)
|
||||
return Response(error.data, status=404)
|
||||
|
||||
if not parsed["type"] == "channel":
|
||||
error = ErrorResponseSerializer({"error": "expected channel data"})
|
||||
return Response(error.data, status=400)
|
||||
|
||||
self.get_document(parsed["url"])
|
||||
serializer = ChannelSerializer(self.response)
|
||||
|
||||
return Response(serializer.data, status=self.status_code)
|
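# Client-side sketch for the views above, assuming a reachable instance with
# DRF token auth; the base URL and token are placeholders.
import requests

BASE = "http://localhost:8000/api/channel/"
HEADERS = {"Authorization": "Token placeholder-token"}

# list subscribed channels only
subscribed = requests.get(
    BASE, params={"filter": "subscribed"}, headers=HEADERS, timeout=10
).json()

# subscribe to one channel, unsubscribe from another
payload = {
    "data": [
        {"channel_id": "UC_placeholder_a", "channel_subscribed": True},
        {"channel_id": "UC_placeholder_b", "channel_subscribed": False},
    ]
}
requests.post(BASE, json=payload, headers=HEADERS, timeout=10)

# resolve a channel URL to an already indexed channel
found = requests.get(
    BASE + "search/",
    params={"q": "https://www.youtube.com/channel/UC_placeholder_a"},
    headers=HEADERS,
    timeout=10,
).json()
print(found.get("channel_id"))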
@ -1,143 +0,0 @@
|
||||
"""common serializers"""
|
||||
|
||||
# pylint: disable=abstract-method
|
||||
|
||||
from rest_framework import serializers
|
||||
|
||||
|
||||
class ValidateUnknownFieldsMixin:
|
||||
"""
|
||||
Mixin to validate and reject unknown fields in a serializer.
|
||||
"""
|
||||
|
||||
def to_internal_value(self, data):
|
||||
"""check expected keys"""
|
||||
allowed_fields = set(self.fields.keys())
|
||||
input_fields = set(data.keys())
|
||||
|
||||
# Find unknown fields
|
||||
unknown_fields = input_fields - allowed_fields
|
||||
if unknown_fields:
|
||||
raise serializers.ValidationError(
|
||||
{"error": f"Unknown fields: {', '.join(unknown_fields)}"}
|
||||
)
|
||||
|
||||
return super().to_internal_value(data)
|
||||
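# Sketch of combining the mixin above with a serializer: unknown keys in the
# payload raise a ValidationError instead of being silently dropped. The
# ExampleSerializer name is illustrative only.
class ExampleSerializer(ValidateUnknownFieldsMixin, serializers.Serializer):
    """hypothetical serializer rejecting unknown fields"""

    name = serializers.CharField()


assert ExampleSerializer(data={"name": "ok"}).is_valid()
assert not ExampleSerializer(data={"name": "ok", "unexpected": 1}).is_valid()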
|
||||
|
||||
class ErrorResponseSerializer(serializers.Serializer):
|
||||
"""error message"""
|
||||
|
||||
error = serializers.CharField()
|
||||
|
||||
|
||||
class PaginationSerializer(serializers.Serializer):
|
||||
"""serialize paginate response"""
|
||||
|
||||
page_size = serializers.IntegerField()
|
||||
page_from = serializers.IntegerField()
|
||||
prev_pages = serializers.ListField(
|
||||
child=serializers.IntegerField(), allow_null=True
|
||||
)
|
||||
current_page = serializers.IntegerField()
|
||||
max_hits = serializers.BooleanField()
|
||||
params = serializers.CharField()
|
||||
last_page = serializers.IntegerField()
|
||||
next_pages = serializers.ListField(
|
||||
child=serializers.IntegerField(), allow_null=True
|
||||
)
|
||||
total_hits = serializers.IntegerField()
|
||||
|
||||
|
||||
class AsyncTaskResponseSerializer(serializers.Serializer):
|
||||
"""serialize new async task"""
|
||||
|
||||
message = serializers.CharField(required=False)
|
||||
task_id = serializers.CharField()
|
||||
status = serializers.CharField(required=False)
|
||||
filename = serializers.CharField(required=False)
|
||||
|
||||
|
||||
class NotificationSerializer(serializers.Serializer):
|
||||
"""serialize notification messages"""
|
||||
|
||||
id = serializers.CharField()
|
||||
title = serializers.CharField()
|
||||
group = serializers.CharField()
|
||||
api_start = serializers.BooleanField()
|
||||
api_stop = serializers.BooleanField()
|
||||
level = serializers.ChoiceField(choices=["info", "error"])
|
||||
messages = serializers.ListField(child=serializers.CharField())
|
||||
progress = serializers.FloatField(required=False)
|
||||
command = serializers.ChoiceField(choices=["STOP", "KILL"], required=False)
|
||||
|
||||
|
||||
class NotificationQueryFilterSerializer(serializers.Serializer):
|
||||
"""serialize notification query filter"""
|
||||
|
||||
filter = serializers.ChoiceField(
|
||||
choices=["download", "settings", "channel"], required=False
|
||||
)
|
||||
|
||||
|
||||
class PingUpdateSerializer(serializers.Serializer):
|
||||
"""serialize update notification"""
|
||||
|
||||
status = serializers.BooleanField()
|
||||
version = serializers.CharField()
|
||||
is_breaking = serializers.BooleanField()
|
||||
|
||||
|
||||
class PingSerializer(serializers.Serializer):
|
||||
"""serialize ping response"""
|
||||
|
||||
response = serializers.ChoiceField(choices=["pong"])
|
||||
user = serializers.IntegerField()
|
||||
version = serializers.CharField()
|
||||
ta_update = PingUpdateSerializer(required=False)
|
||||
|
||||
|
||||
class WatchedDataSerializer(serializers.Serializer):
|
||||
"""mark as watched serializer"""
|
||||
|
||||
id = serializers.CharField()
|
||||
is_watched = serializers.BooleanField()
|
||||
|
||||
|
||||
class RefreshQuerySerializer(serializers.Serializer):
|
||||
"""refresh query filtering"""
|
||||
|
||||
type = serializers.ChoiceField(
|
||||
choices=["video", "channel", "playlist"], required=False
|
||||
)
|
||||
id = serializers.CharField(required=False)
|
||||
|
||||
|
||||
class RefreshResponseSerializer(serializers.Serializer):
|
||||
"""serialize refresh response"""
|
||||
|
||||
state = serializers.ChoiceField(
|
||||
choices=["running", "queued", "empty", False]
|
||||
)
|
||||
total_queued = serializers.IntegerField()
|
||||
in_queue_name = serializers.CharField(required=False)
|
||||
|
||||
|
||||
class RefreshAddQuerySerializer(serializers.Serializer):
|
||||
"""serialize add to refresh queue"""
|
||||
|
||||
extract_videos = serializers.BooleanField(required=False)
|
||||
|
||||
|
||||
class RefreshAddDataSerializer(serializers.Serializer):
|
||||
"""add to refresh queue serializer"""
|
||||
|
||||
video = serializers.ListField(
|
||||
child=serializers.CharField(), required=False
|
||||
)
|
||||
channel = serializers.ListField(
|
||||
child=serializers.CharField(), required=False
|
||||
)
|
||||
playlist = serializers.ListField(
|
||||
child=serializers.CharField(), required=False
|
||||
)
|
@ -1,115 +0,0 @@
|
||||
"""
|
||||
Functionality:
|
||||
- read application settings from the environment
|
||||
- encapsulate these properties for use across the application
|
||||
"""
|
||||
|
||||
from os import environ
|
||||
|
||||
try:
|
||||
from dotenv import load_dotenv
|
||||
|
||||
print("loading local dotenv")
|
||||
load_dotenv(".env")
|
||||
except ModuleNotFoundError:
|
||||
pass
|
||||
|
||||
|
||||
class EnvironmentSettings:
|
||||
"""
|
||||
Handle settings for the application that are driven from the environment.
|
||||
These will not change when the user is using the application.
|
||||
These settings are only provided on startup.
|
||||
"""
|
||||
|
||||
HOST_UID: int = int(environ.get("HOST_UID", False))
|
||||
HOST_GID: int = int(environ.get("HOST_GID", False))
|
||||
DISABLE_STATIC_AUTH: bool = bool(environ.get("DISABLE_STATIC_AUTH"))
|
||||
TZ: str = str(environ.get("TZ", "UTC"))
|
||||
TA_PORT: int = int(environ.get("TA_PORT", False))
|
||||
TA_BACKEND_PORT: int = int(environ.get("TA_BACKEND_PORT", False))
|
||||
TA_USERNAME: str = str(environ.get("TA_USERNAME"))
|
||||
TA_PASSWORD: str = str(environ.get("TA_PASSWORD"))
|
||||
|
||||
# Application Paths
|
||||
MEDIA_DIR: str = str(environ.get("TA_MEDIA_DIR", "/youtube"))
|
||||
APP_DIR: str = str(environ.get("TA_APP_DIR", "/app"))
|
||||
CACHE_DIR: str = str(environ.get("TA_CACHE_DIR", "/cache"))
|
||||
|
||||
# Redis
|
||||
REDIS_CON: str = str(environ.get("REDIS_CON"))
|
||||
REDIS_NAME_SPACE: str = str(environ.get("REDIS_NAME_SPACE", "ta:"))
|
||||
|
||||
# ElasticSearch
|
||||
ES_URL: str = str(environ.get("ES_URL"))
|
||||
ES_PASS: str = str(environ.get("ELASTIC_PASSWORD"))
|
||||
ES_USER: str = str(environ.get("ELASTIC_USER", "elastic"))
|
||||
ES_SNAPSHOT_DIR: str = str(
|
||||
environ.get(
|
||||
"ES_SNAPSHOT_DIR", "/usr/share/elasticsearch/data/snapshot"
|
||||
)
|
||||
)
|
||||
ES_DISABLE_VERIFY_SSL: bool = bool(environ.get("ES_DISABLE_VERIFY_SSL"))
|
||||
|
||||
def get_cache_root(self):
|
||||
"""get root for web server"""
|
||||
if self.CACHE_DIR.startswith("/"):
|
||||
return self.CACHE_DIR
|
||||
|
||||
return f"/{self.CACHE_DIR}"
|
||||
|
||||
def get_media_root(self):
|
||||
"""get root for media folder"""
|
||||
if self.MEDIA_DIR.startswith("/"):
|
||||
return self.MEDIA_DIR
|
||||
|
||||
return f"/{self.MEDIA_DIR}"
|
||||
|
||||
def print_generic(self):
|
||||
"""print generic env vars"""
|
||||
print(
|
||||
f"""
|
||||
HOST_UID: {self.HOST_UID}
|
||||
HOST_GID: {self.HOST_GID}
|
||||
TZ: {self.TZ}
|
||||
DISABLE_STATIC_AUTH: {self.DISABLE_STATIC_AUTH}
|
||||
TA_PORT: {self.TA_PORT}
|
||||
TA_BACKEND_PORT: {self.TA_BACKEND_PORT}
|
||||
TA_USERNAME: {self.TA_USERNAME}
|
||||
TA_PASSWORD: *****"""
|
||||
)
|
||||
|
||||
def print_paths(self):
|
||||
"""debug paths set"""
|
||||
print(
|
||||
f"""
|
||||
MEDIA_DIR: {self.MEDIA_DIR}
|
||||
APP_DIR: {self.APP_DIR}
|
||||
CACHE_DIR: {self.CACHE_DIR}"""
|
||||
)
|
||||
|
||||
def print_redis_conf(self):
|
||||
"""debug redis conf paths"""
|
||||
print(
|
||||
f"""
|
||||
REDIS_CON: {self.REDIS_CON}
|
||||
REDIS_NAME_SPACE: {self.REDIS_NAME_SPACE}"""
|
||||
)
|
||||
|
||||
def print_es_paths(self):
|
||||
"""debug es conf"""
|
||||
print(
|
||||
f"""
|
||||
ES_URL: {self.ES_URL}
|
||||
ES_PASS: *****
|
||||
ES_USER: {self.ES_USER}
|
||||
ES_SNAPSHOT_DIR: {self.ES_SNAPSHOT_DIR}
|
||||
ES_DISABLE_VERIFY_SSL: {self.ES_DISABLE_VERIFY_SSL}"""
|
||||
)
|
||||
|
||||
def print_all(self):
|
||||
"""print all"""
|
||||
self.print_generic()
|
||||
self.print_paths()
|
||||
self.print_redis_conf()
|
||||
self.print_es_paths()
|
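# Usage sketch: the class above resolves everything at import time, so the
# environment (or a local .env picked up by python-dotenv) has to be populated
# before the first import; the values below are illustrative placeholders.
#
#   TA_MEDIA_DIR=/youtube
#   TA_CACHE_DIR=/cache
#   REDIS_CON=redis://archivist-redis:6379
#   ES_URL=http://archivist-es:9200
print(EnvironmentSettings.MEDIA_DIR)
print(EnvironmentSettings().get_media_root())
EnvironmentSettings().print_all()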
@ -1,231 +0,0 @@
|
||||
"""
|
||||
functionality:
|
||||
- wrapper around requests to call elastic search
|
||||
- reusable search_after to extract the whole index
|
||||
"""
|
||||
|
||||
# pylint: disable=missing-timeout
|
||||
|
||||
import json
|
||||
from typing import Any
|
||||
|
||||
import requests
|
||||
import urllib3
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
|
||||
|
||||
class ElasticWrap:
|
||||
"""makes all calls to elastic search
|
||||
returns response json and status code tuple
|
||||
"""
|
||||
|
||||
def __init__(self, path: str):
|
||||
self.url: str = f"{EnvironmentSettings.ES_URL}/{path}"
|
||||
self.auth: tuple[str, str] = (
|
||||
EnvironmentSettings.ES_USER,
|
||||
EnvironmentSettings.ES_PASS,
|
||||
)
|
||||
|
||||
if EnvironmentSettings.ES_DISABLE_VERIFY_SSL:
|
||||
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
|
||||
|
||||
def get(
|
||||
self,
|
||||
data: bool | dict = False,
|
||||
timeout: int = 10,
|
||||
print_error: bool = True,
|
||||
) -> tuple[dict, int]:
|
||||
"""get data from es"""
|
||||
|
||||
kwargs: dict[str, Any] = {
|
||||
"auth": self.auth,
|
||||
"timeout": timeout,
|
||||
}
|
||||
|
||||
if EnvironmentSettings.ES_DISABLE_VERIFY_SSL:
|
||||
kwargs["verify"] = False
|
||||
|
||||
if data:
|
||||
kwargs["json"] = data
|
||||
|
||||
response = requests.get(self.url, **kwargs)
|
||||
|
||||
if print_error and not response.ok:
|
||||
print(response.text)
|
||||
|
||||
return response.json(), response.status_code
|
||||
|
||||
def post(
|
||||
self, data: bool | dict = False, ndjson: bool = False
|
||||
) -> tuple[dict, int]:
|
||||
"""post data to es"""
|
||||
|
||||
kwargs: dict[str, Any] = {"auth": self.auth}
|
||||
|
||||
if ndjson and data:
|
||||
kwargs.update(
|
||||
{
|
||||
"headers": {"Content-type": "application/x-ndjson"},
|
||||
"data": data,
|
||||
}
|
||||
)
|
||||
elif data:
|
||||
kwargs.update(
|
||||
{
|
||||
"headers": {"Content-type": "application/json"},
|
||||
"data": json.dumps(data),
|
||||
}
|
||||
)
|
||||
|
||||
if EnvironmentSettings.ES_DISABLE_VERIFY_SSL:
|
||||
kwargs["verify"] = False
|
||||
|
||||
response = requests.post(self.url, **kwargs)
|
||||
|
||||
if not response.ok:
|
||||
print(response.text)
|
||||
|
||||
return response.json(), response.status_code
|
||||
|
||||
def put(
|
||||
self,
|
||||
data: bool | dict = False,
|
||||
refresh: bool = False,
|
||||
) -> tuple[dict, Any]:
|
||||
"""put data to es"""
|
||||
|
||||
if refresh:
|
||||
self.url = f"{self.url}/?refresh=true"
|
||||
|
||||
kwargs: dict[str, Any] = {
|
||||
"json": data,
|
||||
"auth": self.auth,
|
||||
}
|
||||
|
||||
if EnvironmentSettings.ES_DISABLE_VERIFY_SSL:
|
||||
kwargs["verify"] = False
|
||||
|
||||
response = requests.put(self.url, **kwargs)
|
||||
|
||||
if not response.ok:
|
||||
print(response.text)
|
||||
print(data)
|
||||
raise ValueError("failed to add item to index")
|
||||
|
||||
return response.json(), response.status_code
|
||||
|
||||
def delete(
|
||||
self,
|
||||
data: bool | dict = False,
|
||||
refresh: bool = False,
|
||||
) -> tuple[dict, Any]:
|
||||
"""delete document from es"""
|
||||
|
||||
if refresh:
|
||||
self.url = f"{self.url}/?refresh=true"
|
||||
|
||||
kwargs: dict[str, Any] = {"auth": self.auth}
|
||||
|
||||
if data:
|
||||
kwargs["json"] = data
|
||||
|
||||
if EnvironmentSettings.ES_DISABLE_VERIFY_SSL:
|
||||
kwargs["verify"] = False
|
||||
|
||||
response = requests.delete(self.url, **kwargs)
|
||||
|
||||
if not response.ok:
|
||||
print(response.text)
|
||||
|
||||
return response.json(), response.status_code
|
||||
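# Usage sketch for ElasticWrap above, assuming Elasticsearch is reachable with
# the env settings; the document ID is a placeholder.
doc, status_code = ElasticWrap("ta_video/_doc/placeholder_video").get()
if status_code == 404:
    print("video not indexed")

response, _ = ElasticWrap("ta_video/_search").get(
    data={"size": 1, "query": {"match_all": {}}}
)
print(response["hits"]["total"])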
|
||||
|
||||
class IndexPaginate:
|
||||
"""use search_after to go through whole index
|
||||
kwargs:
|
||||
- size: int, overwrite DEFAULT_SIZE
|
||||
- keep_source: bool, keep _source key from es results
|
||||
- callback: obj, class implementing a run method, called for every result page
|
||||
- task: task object to send notification
|
||||
- total: int, total items in index for progress message
|
||||
"""
|
||||
|
||||
DEFAULT_SIZE = 500
|
||||
|
||||
def __init__(self, index_name, data, **kwargs):
|
||||
self.index_name = index_name
|
||||
self.data = data
|
||||
self.pit_id = False
|
||||
self.kwargs = kwargs
|
||||
|
||||
def get_results(self):
|
||||
"""get all results, add task and total for notifications"""
|
||||
self.get_pit()
|
||||
self.validate_data()
|
||||
all_results = self.run_loop()
|
||||
self.clean_pit()
|
||||
return all_results
|
||||
|
||||
def get_pit(self):
|
||||
"""get pit for index"""
|
||||
path = f"{self.index_name}/_pit?keep_alive=10m"
|
||||
response, _ = ElasticWrap(path).post()
|
||||
self.pit_id = response["id"]
|
||||
|
||||
def validate_data(self):
|
||||
"""add pit and size to data"""
|
||||
if not self.data:
|
||||
self.data = {}
|
||||
|
||||
if "query" not in self.data.keys():
|
||||
self.data.update({"query": {"match_all": {}}})
|
||||
|
||||
if "sort" not in self.data.keys():
|
||||
self.data.update({"sort": [{"_doc": {"order": "desc"}}]})
|
||||
|
||||
self.data["size"] = self.kwargs.get("size") or self.DEFAULT_SIZE
|
||||
self.data["pit"] = {"id": self.pit_id, "keep_alive": "10m"}
|
||||
|
||||
def run_loop(self):
|
||||
"""loop through results until last hit"""
|
||||
all_results = []
|
||||
counter = 0
|
||||
while True:
|
||||
response, _ = ElasticWrap("_search").get(data=self.data)
|
||||
all_hits = response["hits"]["hits"]
|
||||
if not all_hits:
|
||||
break
|
||||
|
||||
for hit in all_hits:
|
||||
if self.kwargs.get("keep_source"):
|
||||
all_results.append(hit)
|
||||
else:
|
||||
all_results.append(hit["_source"])
|
||||
|
||||
if self.kwargs.get("callback"):
|
||||
self.kwargs.get("callback")(
|
||||
all_hits, self.index_name, counter=counter
|
||||
).run()
|
||||
|
||||
if self.kwargs.get("task"):
|
||||
print(f"{self.index_name}: processing page {counter}")
|
||||
self._notify(len(all_results))
|
||||
|
||||
counter += 1
|
||||
|
||||
# update search_after with last hit data
|
||||
self.data["search_after"] = all_hits[-1]["sort"]
|
||||
|
||||
return all_results
|
||||
|
||||
def _notify(self, processed):
|
||||
"""send notification on task"""
|
||||
total = self.kwargs.get("total")
|
||||
progress = processed / total
|
||||
index_clean = self.index_name.removeprefix("ta_").title()
|
||||
message = [f"Processing {index_clean}s {processed}/{total}"]
|
||||
self.kwargs.get("task").send_progress(message, progress=progress)
|
||||
|
||||
def clean_pit(self):
|
||||
"""delete pit from elastic search"""
|
||||
ElasticWrap("_pit").delete(data={"id": self.pit_id})
|
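# Usage sketch for IndexPaginate above, assuming the ta_video index exists.
# The callback class is a made-up example of the run() hook described in the
# class docstring; it receives each raw page of hits plus the page counter.
class PrintPageCallback:
    """hypothetical callback printing one line per page"""

    def __init__(self, source, index_name, counter=False):
        self.source = source
        self.index_name = index_name
        self.counter = counter

    def run(self):
        print(f"{self.index_name}: page {self.counter}, {len(self.source)} hits")


data = {
    "query": {"term": {"channel.channel_id": {"value": "UC_placeholder"}}},
    "_source": ["youtube_id"],
}
videos = IndexPaginate(
    "ta_video", data, size=200, callback=PrintPageCallback
).get_results()
print(len(videos))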
@ -1,300 +0,0 @@
|
||||
"""
|
||||
Loose collection of helper functions
|
||||
- don't import AppConfig class here to avoid circular imports
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
import random
|
||||
import string
|
||||
import subprocess
|
||||
from datetime import datetime, timezone
|
||||
from time import sleep
|
||||
from typing import Any
|
||||
from urllib.parse import urlparse
|
||||
|
||||
import requests
|
||||
from common.src.es_connect import IndexPaginate
|
||||
|
||||
|
||||
def ignore_filelist(filelist: list[str]) -> list[str]:
|
||||
"""ignore temp files for os.listdir sanitizer"""
|
||||
to_ignore = [
|
||||
"@eaDir",
|
||||
"Icon\r\r",
|
||||
"Network Trash Folder",
|
||||
"Temporary Items",
|
||||
]
|
||||
cleaned: list[str] = []
|
||||
for file_name in filelist:
|
||||
if file_name.startswith(".") or file_name in to_ignore:
|
||||
continue
|
||||
|
||||
cleaned.append(file_name)
|
||||
|
||||
return cleaned
|
||||
|
||||
|
||||
def randomizor(length: int) -> str:
|
||||
"""generate random alpha numeric string"""
|
||||
pool: str = string.digits + string.ascii_letters
|
||||
return "".join(random.choice(pool) for i in range(length))
|
||||
|
||||
|
||||
def rand_sleep(config) -> None:
|
||||
"""randomized sleep based on config"""
|
||||
sleep_config = config["downloads"].get("sleep_interval")
|
||||
if not sleep_config:
|
||||
return
|
||||
|
||||
secs = random.randrange(int(sleep_config * 0.5), int(sleep_config * 1.5))
|
||||
sleep(secs)
|
||||
|
||||
|
||||
def requests_headers() -> dict[str, str]:
|
||||
"""build header with random user agent for requests outside of yt-dlp"""
|
||||
|
||||
chrome_versions = (
|
||||
"90.0.4430.212",
|
||||
"90.0.4430.24",
|
||||
"90.0.4430.70",
|
||||
"90.0.4430.72",
|
||||
"90.0.4430.85",
|
||||
"90.0.4430.93",
|
||||
"91.0.4472.101",
|
||||
"91.0.4472.106",
|
||||
"91.0.4472.114",
|
||||
"91.0.4472.124",
|
||||
"91.0.4472.164",
|
||||
"91.0.4472.19",
|
||||
"91.0.4472.77",
|
||||
"92.0.4515.107",
|
||||
"92.0.4515.115",
|
||||
"92.0.4515.131",
|
||||
"92.0.4515.159",
|
||||
"92.0.4515.43",
|
||||
"93.0.4556.0",
|
||||
"93.0.4577.15",
|
||||
"93.0.4577.63",
|
||||
"93.0.4577.82",
|
||||
"94.0.4606.41",
|
||||
"94.0.4606.54",
|
||||
"94.0.4606.61",
|
||||
"94.0.4606.71",
|
||||
"94.0.4606.81",
|
||||
"94.0.4606.85",
|
||||
"95.0.4638.17",
|
||||
"95.0.4638.50",
|
||||
"95.0.4638.54",
|
||||
"95.0.4638.69",
|
||||
"95.0.4638.74",
|
||||
"96.0.4664.18",
|
||||
"96.0.4664.45",
|
||||
"96.0.4664.55",
|
||||
"96.0.4664.93",
|
||||
"97.0.4692.20",
|
||||
)
|
||||
template = (
|
||||
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
|
||||
+ "AppleWebKit/537.36 (KHTML, like Gecko) "
|
||||
+ f"Chrome/{random.choice(chrome_versions)} Safari/537.36"
|
||||
)
|
||||
|
||||
return {"User-Agent": template}
|
||||
|
||||
|
||||
def date_parser(timestamp: int | str) -> str:
|
||||
"""return formatted date string"""
|
||||
if isinstance(timestamp, int):
|
||||
date_obj = datetime.fromtimestamp(timestamp, tz=timezone.utc)
|
||||
elif isinstance(timestamp, str):
|
||||
date_obj = datetime.strptime(timestamp, "%Y-%m-%d")
|
||||
date_obj = date_obj.replace(tzinfo=timezone.utc)
|
||||
else:
|
||||
raise TypeError(f"invalid timestamp: {timestamp}")
|
||||
|
||||
return date_obj.isoformat()
|
||||
|
||||
|
||||
def time_parser(timestamp: str) -> float:
|
||||
"""return seconds from timestamp, false on empty"""
|
||||
if not timestamp:
|
||||
return False
|
||||
|
||||
if timestamp.isnumeric():
|
||||
return int(timestamp)
|
||||
|
||||
hours, minutes, seconds = timestamp.split(":", maxsplit=3)
|
||||
return int(hours) * 60 * 60 + int(minutes) * 60 + float(seconds)
|
||||
|
||||
|
||||
def clear_dl_cache(cache_dir: str) -> int:
|
||||
"""clear leftover files from dl cache"""
|
||||
print("clear download cache")
|
||||
download_cache_dir = os.path.join(cache_dir, "download")
|
||||
leftover_files = ignore_filelist(os.listdir(download_cache_dir))
|
||||
for cached in leftover_files:
|
||||
to_delete = os.path.join(download_cache_dir, cached)
|
||||
os.remove(to_delete)
|
||||
|
||||
return len(leftover_files)
|
||||
|
||||
|
||||
def get_mapping() -> dict:
|
||||
"""read index_mapping.json and get expected mapping and settings"""
|
||||
with open("appsettings/index_mapping.json", "r", encoding="utf-8") as f:
|
||||
index_config: dict = json.load(f).get("index_config")
|
||||
|
||||
return index_config
|
||||
|
||||
|
||||
def is_shorts(youtube_id: str) -> bool:
|
||||
"""check if youtube_id is a shorts video, bot not it it's not a shorts"""
|
||||
shorts_url = f"https://www.youtube.com/shorts/{youtube_id}"
|
||||
cookies = {"SOCS": "CAI"}
|
||||
response = requests.head(
|
||||
shorts_url, cookies=cookies, headers=requests_headers(), timeout=10
|
||||
)
|
||||
|
||||
return response.status_code == 200
|
||||
|
||||
|
||||
def get_duration_sec(file_path: str) -> int:
|
||||
"""get duration of media file from file path"""
|
||||
|
||||
duration = subprocess.run(
|
||||
[
|
||||
"ffprobe",
|
||||
"-v",
|
||||
"error",
|
||||
"-show_entries",
|
||||
"format=duration",
|
||||
"-of",
|
||||
"default=noprint_wrappers=1:nokey=1",
|
||||
file_path,
|
||||
],
|
||||
capture_output=True,
|
||||
check=True,
|
||||
)
|
||||
duration_raw = duration.stdout.decode().strip()
|
||||
if duration_raw == "N/A":
|
||||
return 0
|
||||
|
||||
duration_sec = int(float(duration_raw))
|
||||
return duration_sec
|
||||
|
||||
|
||||
def get_duration_str(seconds: int) -> str:
|
||||
"""Return a human-readable duration string from seconds."""
|
||||
if not seconds:
|
||||
return "NA"
|
||||
|
||||
units = [("y", 31536000), ("d", 86400), ("h", 3600), ("m", 60), ("s", 1)]
|
||||
duration_parts = []
|
||||
|
||||
for unit_label, unit_seconds in units:
|
||||
if seconds >= unit_seconds:
|
||||
unit_count, seconds = divmod(seconds, unit_seconds)
|
||||
duration_parts.append(f"{unit_count:02}{unit_label}")
|
||||
|
||||
duration_parts[0] = duration_parts[0].lstrip("0")
|
||||
|
||||
return " ".join(duration_parts)
|
||||
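# Worked example for get_duration_str(): 90061 seconds splits into 1d (86400)
# + 1h (3600) + 1m (60) + 1s, and the leading zero of the first unit is
# stripped.
assert get_duration_str(90061) == "1d 01h 01m 01s"
assert get_duration_str(59) == "59s"
assert get_duration_str(0) == "NA"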
|
||||
|
||||
def ta_host_parser(ta_host: str) -> tuple[list[str], list[str]]:
|
||||
"""parse ta_host env var for ALLOWED_HOSTS and CSRF_TRUSTED_ORIGINS"""
|
||||
allowed_hosts: list[str] = [
|
||||
"localhost",
|
||||
"tubearchivist",
|
||||
]
|
||||
csrf_trusted_origins: list[str] = [
|
||||
"http://localhost",
|
||||
"http://tubearchivist",
|
||||
]
|
||||
for host in ta_host.split():
|
||||
host_clean = host.strip()
|
||||
if not host_clean.startswith("http"):
|
||||
host_clean = f"http://{host_clean}"
|
||||
|
||||
parsed = urlparse(host_clean)
|
||||
allowed_hosts.append(f"{parsed.hostname}")
|
||||
cors_url = f"{parsed.scheme}://{parsed.hostname}"
|
||||
|
||||
if parsed.port:
|
||||
cors_url = f"{cors_url}:{parsed.port}"
|
||||
|
||||
csrf_trusted_origins.append(cors_url)
|
||||
|
||||
return allowed_hosts, csrf_trusted_origins
|
||||
|
||||
|
||||
def get_stylesheets() -> list:
|
||||
"""Get all valid stylesheets from /static/css"""
|
||||
|
||||
stylesheets = [
|
||||
"dark.css",
|
||||
"light.css",
|
||||
"matrix.css",
|
||||
"midnight.css",
|
||||
"custom.css",
|
||||
]
|
||||
return stylesheets
|
||||
|
||||
|
||||
def check_stylesheet(stylesheet: str):
|
||||
"""Check if a stylesheet exists. Return dark.css as a fallback"""
|
||||
if stylesheet in get_stylesheets():
|
||||
return stylesheet
|
||||
|
||||
return "dark.css"
|
||||
|
||||
|
||||
def is_missing(
|
||||
to_check: str | list[str],
|
||||
index_name: str = "ta_video,ta_download",
|
||||
on_key: str = "youtube_id",
|
||||
) -> list[str]:
|
||||
"""id or list of ids that are missing from index_name"""
|
||||
if isinstance(to_check, str):
|
||||
to_check = [to_check]
|
||||
|
||||
data = {
|
||||
"query": {"terms": {on_key: to_check}},
|
||||
"_source": [on_key],
|
||||
}
|
||||
result = IndexPaginate(index_name, data=data).get_results()
|
||||
existing_ids = [i[on_key] for i in result]
|
||||
dl = [i for i in to_check if i not in existing_ids]
|
||||
|
||||
return dl
|
||||
|
||||
|
||||
def get_channel_overwrites() -> dict[str, dict[str, Any]]:
|
||||
"""get overwrites indexed my channel_id"""
|
||||
data = {
|
||||
"query": {
|
||||
"bool": {"must": [{"exists": {"field": "channel_overwrites"}}]}
|
||||
},
|
||||
"_source": ["channel_id", "channel_overwrites"],
|
||||
}
|
||||
result = IndexPaginate("ta_channel", data).get_results()
|
||||
overwrites = {i["channel_id"]: i["channel_overwrites"] for i in result}
|
||||
|
||||
return overwrites
|
||||
|
||||
|
||||
def calc_is_watched(duration: float, position: float) -> bool:
|
||||
"""considered watched based on duration position"""
|
||||
|
||||
if not duration or duration <= 0:
|
||||
return False
|
||||
|
||||
if duration < 60:
|
||||
threshold = 0.5
|
||||
elif duration > 900:
|
||||
threshold = 1 - (180 / duration)
|
||||
else:
|
||||
threshold = 0.9
|
||||
|
||||
return position >= duration * threshold
|
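# Worked examples for calc_is_watched() above: the watched threshold is 50%
# for clips under a minute, a fixed 90% for mid-length videos, and for
# anything over 15 minutes whatever leaves less than 180 seconds remaining.
assert calc_is_watched(45, 30)          # 30 >= 45 * 0.5
assert not calc_is_watched(600, 500)    # 500 < 600 * 0.9
assert calc_is_watched(3600, 3430)      # 3430 >= 3600 * (1 - 180 / 3600)
assert not calc_is_watched(0, 10)       # unknown duration is never watched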
@ -1,219 +0,0 @@
|
||||
"""
|
||||
Functionality:
|
||||
- processing search results for frontend
|
||||
- this is duplicated code from home.src.frontend.searching.SearchHandler
|
||||
"""
|
||||
|
||||
import urllib.parse
|
||||
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.helper import date_parser, get_duration_str
|
||||
from common.src.ta_redis import RedisArchivist
|
||||
from download.src.thumbnails import ThumbManager
|
||||
|
||||
|
||||
class SearchProcess:
|
||||
"""process search results"""
|
||||
|
||||
def __init__(self, response, match_video_user_progress: None | int = None):
|
||||
self.response = response
|
||||
self.processed = False
|
||||
self.position_index = self.get_user_progress(match_video_user_progress)
|
||||
|
||||
def process(self):
|
||||
"""detect type and process"""
|
||||
if "_source" in self.response.keys():
|
||||
# single
|
||||
self.processed = self._process_result(self.response)
|
||||
|
||||
elif "hits" in self.response.keys():
|
||||
# multiple
|
||||
self.processed = []
|
||||
all_sources = self.response["hits"]["hits"]
|
||||
for result in all_sources:
|
||||
self.processed.append(self._process_result(result))
|
||||
|
||||
return self.processed
|
||||
|
||||
def get_user_progress(self, match_video_user_progress) -> dict | None:
|
||||
"""get user video watch progress"""
|
||||
if not match_video_user_progress:
|
||||
return None
|
||||
|
||||
query = f"{match_video_user_progress}:progress:*"
|
||||
all_positions = RedisArchivist().list_items(query)
|
||||
if not all_positions:
|
||||
return None
|
||||
|
||||
pos_index = {
|
||||
i["youtube_id"]: i["position"]
|
||||
for i in all_positions
|
||||
if not i.get("watched")
|
||||
}
|
||||
return pos_index
|
||||
|
||||
def _process_result(self, result):
|
||||
"""detect which type of data to process"""
|
||||
index = result["_index"]
|
||||
processed = False
|
||||
if index == "ta_video":
|
||||
processed = self._process_video(result["_source"])
|
||||
if index == "ta_channel":
|
||||
processed = self._process_channel(result["_source"])
|
||||
if index == "ta_playlist":
|
||||
processed = self._process_playlist(result["_source"])
|
||||
if index == "ta_download":
|
||||
processed = self._process_download(result["_source"])
|
||||
if index == "ta_comment":
|
||||
processed = self._process_comment(result["_source"])
|
||||
if index == "ta_subtitle":
|
||||
processed = self._process_subtitle(result)
|
||||
|
||||
if isinstance(processed, dict):
|
||||
processed.update(
|
||||
{
|
||||
"_index": index,
|
||||
"_score": round(result.get("_score") or 0, 2),
|
||||
}
|
||||
)
|
||||
|
||||
return processed
|
||||
|
||||
@staticmethod
|
||||
def _process_channel(channel_dict):
|
||||
"""run on single channel"""
|
||||
channel_id = channel_dict["channel_id"]
|
||||
cache_root = EnvironmentSettings().get_cache_root()
|
||||
art_base = f"{cache_root}/channels/{channel_id}"
|
||||
date_str = date_parser(channel_dict["channel_last_refresh"])
|
||||
channel_dict.update(
|
||||
{
|
||||
"channel_last_refresh": date_str,
|
||||
"channel_banner_url": f"{art_base}_banner.jpg",
|
||||
"channel_thumb_url": f"{art_base}_thumb.jpg",
|
||||
"channel_tvart_url": f"{art_base}_tvart.jpg",
|
||||
}
|
||||
)
|
||||
|
||||
return dict(sorted(channel_dict.items()))
|
||||
|
||||
def _process_video(self, video_dict):
|
||||
"""run on single video dict"""
|
||||
video_id = video_dict["youtube_id"]
|
||||
media_url = urllib.parse.quote(video_dict["media_url"])
|
||||
vid_last_refresh = date_parser(video_dict["vid_last_refresh"])
|
||||
published = date_parser(video_dict["published"])
|
||||
vid_thumb_url = ThumbManager(video_id).vid_thumb_path()
|
||||
channel = self._process_channel(video_dict["channel"])
|
||||
|
||||
cache_root = EnvironmentSettings().get_cache_root()
|
||||
media_root = EnvironmentSettings().get_media_root()
|
||||
|
||||
if "subtitles" in video_dict:
|
||||
for idx, _ in enumerate(video_dict["subtitles"]):
|
||||
url = video_dict["subtitles"][idx]["media_url"]
|
||||
video_dict["subtitles"][idx][
|
||||
"media_url"
|
||||
] = f"{media_root}/{url}"
|
||||
else:
|
||||
video_dict["subtitles"] = []
|
||||
|
||||
video_dict.update(
|
||||
{
|
||||
"channel": channel,
|
||||
"media_url": f"{media_root}/{media_url}",
|
||||
"vid_last_refresh": vid_last_refresh,
|
||||
"published": published,
|
||||
"vid_thumb_url": f"{cache_root}/{vid_thumb_url}",
|
||||
}
|
||||
)
|
||||
|
||||
if self.position_index:
|
||||
player_position = self.position_index.get(video_id)
|
||||
total = video_dict["player"].get("duration")
|
||||
if player_position and total:
|
||||
progress = 100 * (player_position / total)
|
||||
video_dict["player"].update(
|
||||
{
|
||||
"progress": progress,
|
||||
"position": player_position,
|
||||
}
|
||||
)
|
||||
|
||||
if "playlist" not in video_dict:
|
||||
video_dict["playlist"] = []
|
||||
|
||||
return dict(sorted(video_dict.items()))
|
||||
|
||||
@staticmethod
|
||||
def _process_playlist(playlist_dict):
|
||||
"""run on single playlist dict"""
|
||||
playlist_id = playlist_dict["playlist_id"]
|
||||
playlist_last_refresh = date_parser(
|
||||
playlist_dict["playlist_last_refresh"]
|
||||
)
|
||||
cache_root = EnvironmentSettings().get_cache_root()
|
||||
playlist_thumbnail = f"{cache_root}/playlists/{playlist_id}.jpg"
|
||||
playlist_dict.update(
|
||||
{
|
||||
"playlist_thumbnail": playlist_thumbnail,
|
||||
"playlist_last_refresh": playlist_last_refresh,
|
||||
}
|
||||
)
|
||||
|
||||
return dict(sorted(playlist_dict.items()))
|
||||
|
||||
def _process_download(self, download_dict):
|
||||
"""run on single download item"""
|
||||
video_id = download_dict["youtube_id"]
|
||||
cache_root = EnvironmentSettings().get_cache_root()
|
||||
vid_thumb_url = ThumbManager(video_id).vid_thumb_path()
|
||||
published = date_parser(download_dict["published"])
|
||||
|
||||
download_dict.update(
|
||||
{
|
||||
"vid_thumb_url": f"{cache_root}/{vid_thumb_url}",
|
||||
"published": published,
|
||||
}
|
||||
)
|
||||
return dict(sorted(download_dict.items()))
|
||||
|
||||
def _process_comment(self, comment_dict):
|
||||
"""run on all comments, create reply thread"""
|
||||
all_comments = comment_dict["comment_comments"]
|
||||
processed_comments = []
|
||||
|
||||
for comment in all_comments:
|
||||
if comment["comment_parent"] == "root":
|
||||
comment.update({"comment_replies": []})
|
||||
processed_comments.append(comment)
|
||||
else:
|
||||
processed_comments[-1]["comment_replies"].append(comment)
|
||||
|
||||
return processed_comments
|
||||
|
||||
def _process_subtitle(self, result):
|
||||
"""take complete result dict to extract highlight"""
|
||||
subtitle_dict = result["_source"]
|
||||
highlight = result.get("highlight")
|
||||
if highlight:
|
||||
# replace lines with the highlighted markdown
|
||||
subtitle_line = highlight.get("subtitle_line")[0]
|
||||
subtitle_dict.update({"subtitle_line": subtitle_line})
|
||||
|
||||
thumb_path = ThumbManager(subtitle_dict["youtube_id"]).vid_thumb_path()
|
||||
subtitle_dict.update({"vid_thumb_url": f"/cache/{thumb_path}"})
|
||||
|
||||
return subtitle_dict
|
||||
|
||||
|
||||
def process_aggs(response):
|
||||
"""convert aggs duration to str"""
|
||||
|
||||
if response.get("aggregations"):
|
||||
aggs = response["aggregations"]
|
||||
if "total_duration" in aggs:
|
||||
duration_sec = int(aggs["total_duration"]["value"])
|
||||
aggs["total_duration"].update(
|
||||
{"value_str": get_duration_str(duration_sec)}
|
||||
)
|
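# Usage sketch tying the processing above to ElasticWrap, assuming indexed
# videos; the search term is a placeholder and user ID 1 is used for progress
# matching.
from common.src.es_connect import ElasticWrap

response, _ = ElasticWrap("ta_video/_search").get(
    data={"size": 5, "query": {"match": {"title": "placeholder"}}}
)
process_aggs(response)
results = SearchProcess(response, match_video_user_progress=1).process()
for video in results:
    print(video["youtube_id"], video["published"])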
@ -1,258 +0,0 @@
|
||||
"""
|
||||
functionality:
|
||||
- interact with redis
|
||||
- hold temporary download queue in redis
|
||||
- interact with celery tasks results
|
||||
"""
|
||||
|
||||
import json
|
||||
|
||||
import redis
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
|
||||
|
||||
class RedisBase:
|
||||
"""connection base for redis"""
|
||||
|
||||
NAME_SPACE: str = EnvironmentSettings.REDIS_NAME_SPACE
|
||||
|
||||
def __init__(self):
|
||||
self.conn = redis.from_url(
|
||||
url=EnvironmentSettings.REDIS_CON, decode_responses=True
|
||||
)
|
||||
|
||||
|
||||
class RedisArchivist(RedisBase):
|
||||
"""collection of methods to interact with redis"""
|
||||
|
||||
CHANNELS: list[str] = [
|
||||
"download",
|
||||
"add",
|
||||
"rescan",
|
||||
"subchannel",
|
||||
"subplaylist",
|
||||
"playlistscan",
|
||||
"setting",
|
||||
]
|
||||
|
||||
def set_message(
|
||||
self,
|
||||
key: str,
|
||||
message: dict | str,
|
||||
expire: bool | int = False,
|
||||
save: bool = False,
|
||||
) -> None:
|
||||
"""write new message to redis"""
|
||||
to_write = (
|
||||
json.dumps(message) if isinstance(message, dict) else message
|
||||
)
|
||||
self.conn.execute_command("SET", self.NAME_SPACE + key, to_write)
|
||||
|
||||
if expire:
|
||||
if isinstance(expire, bool):
|
||||
secs: int = 20
|
||||
else:
|
||||
secs = expire
|
||||
self.conn.execute_command("EXPIRE", self.NAME_SPACE + key, secs)
|
||||
|
||||
if save:
|
||||
self.bg_save()
|
||||
|
||||
def bg_save(self) -> None:
|
||||
"""save to aof"""
|
||||
try:
|
||||
self.conn.bgsave()
|
||||
except redis.exceptions.ResponseError:
|
||||
pass
|
||||
|
||||
def get_message_str(self, key: str) -> str | None:
|
||||
"""get message string"""
|
||||
reply = self.conn.execute_command("GET", self.NAME_SPACE + key)
|
||||
return reply
|
||||
|
||||
def get_message_dict(self, key: str) -> dict:
|
||||
"""get message dict"""
|
||||
reply = self.conn.execute_command("GET", self.NAME_SPACE + key)
|
||||
if not reply:
|
||||
return {}
|
||||
|
||||
return json.loads(reply)
|
||||
|
||||
def get_message(self, key: str) -> dict | None:
|
||||
"""
|
||||
get message dict from redis
|
||||
old json get message, only used for migration, to be removed later
|
||||
"""
|
||||
reply = self.conn.execute_command("JSON.GET", self.NAME_SPACE + key)
|
||||
if reply:
|
||||
return json.loads(reply)
|
||||
|
||||
return {"status": False}
|
||||
|
||||
def list_keys(self, query: str) -> list:
|
||||
"""return all key matches"""
|
||||
reply = self.conn.execute_command(
|
||||
"KEYS", self.NAME_SPACE + query + "*"
|
||||
)
|
||||
if not reply:
|
||||
return []
|
||||
|
||||
return [i.removeprefix(self.NAME_SPACE) for i in reply]
|
||||
|
||||
def list_items(self, query: str) -> list:
|
||||
"""list all matches"""
|
||||
all_matches = self.list_keys(query)
|
||||
if not all_matches:
|
||||
return []
|
||||
|
||||
return [self.get_message_dict(i) for i in all_matches]
|
||||
|
||||
def del_message(self, key: str, save: bool = False) -> bool:
|
||||
"""delete key from redis"""
|
||||
response = self.conn.execute_command("DEL", self.NAME_SPACE + key)
|
||||
if save:
|
||||
self.bg_save()
|
||||
|
||||
return response
|
||||
|
||||
|
||||
class RedisQueue(RedisBase):
|
||||
"""
|
||||
dynamically interact with queues in redis using sorted set
|
||||
- low score number is first in queue
|
||||
- add new items with high score number
|
||||
|
||||
queue names in use:
|
||||
download:channel channels during download
|
||||
download:playlist:full playlists during dl for full refresh
|
||||
download:playlist:quick playlists during dl for quick refresh
|
||||
download:video videos during downloads
|
||||
index:comment videos needing comment indexing
|
||||
reindex:ta_video reindex videos
|
||||
reindex:ta_channel reindex channels
|
||||
reindex:ta_playlist reindex playlists
|
||||
|
||||
"""
|
||||
|
||||
def __init__(self, queue_name: str):
|
||||
super().__init__()
|
||||
self.key = f"{self.NAME_SPACE}{queue_name}"
|
||||
|
||||
def get_all(self) -> list[str]:
|
||||
"""return all elements in list"""
|
||||
result = self.conn.zrange(self.key, 0, -1)
|
||||
return result
|
||||
|
||||
def length(self) -> int:
|
||||
"""return total elements in list"""
|
||||
return self.conn.zcard(self.key)
|
||||
|
||||
def in_queue(self, element) -> str | bool:
|
||||
"""check if element is in list"""
|
||||
result = self.conn.zrank(self.key, element)
|
||||
if result is not None:
|
||||
return "in_queue"
|
||||
|
||||
return False
|
||||
|
||||
def add(self, to_add: str) -> None:
|
||||
"""add single item to queue"""
|
||||
if not to_add:
|
||||
return
|
||||
|
||||
next_score = self._get_next_score()
|
||||
self.conn.zadd(self.key, {to_add: next_score})
|
||||
|
||||
def add_list(self, to_add: list) -> None:
|
||||
"""add list to queue"""
|
||||
if not to_add:
|
||||
return
|
||||
|
||||
next_score = self._get_next_score()
|
||||
mapping = {i[1]: next_score + i[0] for i in enumerate(to_add)}
|
||||
self.conn.zadd(self.key, mapping)
|
||||
|
||||
def max_score(self) -> int | None:
|
||||
"""get max score"""
|
||||
last = self.conn.zrange(self.key, -1, -1, withscores=True)
|
||||
if not last:
|
||||
return None
|
||||
|
||||
return int(last[0][1])
|
||||
|
||||
def _get_next_score(self) -> float:
|
||||
"""get next score in queue to append"""
|
||||
last = self.conn.zrange(self.key, -1, -1, withscores=True)
|
||||
if not last:
|
||||
return 1.0
|
||||
|
||||
return last[0][1] + 1
|
||||
|
||||
def get_next(self) -> tuple[str | None, int | None]:
|
||||
"""return next element in the queue, if available"""
|
||||
result = self.conn.zpopmin(self.key)
|
||||
if not result:
|
||||
return None, None
|
||||
|
||||
item, idx = result[0][0], int(result[0][1])
|
||||
|
||||
return item, idx
|
||||
|
||||
def clear(self) -> None:
|
||||
"""delete list from redis"""
|
||||
self.conn.delete(self.key)
|
||||
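# Usage sketch for RedisQueue above, assuming a reachable redis instance; the
# queue name matches the "download:video" entry in the class docstring and the
# IDs are placeholders. Scores only order the queue, zpopmin always hands back
# the lowest score first.
queue = RedisQueue("download:video")
queue.add_list(["video_id_a", "video_id_b"])
queue.add("video_id_c")
print(queue.length())                 # 3
print(queue.in_queue("video_id_b"))   # "in_queue"
print(queue.get_next())               # ("video_id_a", 1)
queue.clear()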
|
||||
|
||||
class TaskRedis(RedisBase):
|
||||
"""interact with redis tasks"""
|
||||
|
||||
BASE: str = "celery-task-meta-"
|
||||
EXPIRE: int = 60 * 60 * 24
|
||||
COMMANDS: list[str] = ["STOP", "KILL"]
|
||||
|
||||
def get_all(self) -> list:
|
||||
"""return all tasks"""
|
||||
all_keys = self.conn.execute_command("KEYS", f"{self.BASE}*")
|
||||
return [i.replace(self.BASE, "") for i in all_keys]
|
||||
|
||||
def get_single(self, task_id: str) -> dict:
|
||||
"""return content of single task"""
|
||||
result = self.conn.execute_command("GET", self.BASE + task_id)
|
||||
if not result:
|
||||
return {}
|
||||
|
||||
return json.loads(result)
|
||||
|
||||
def set_key(
|
||||
self, task_id: str, message: dict, expire: bool | int = False
|
||||
) -> None:
|
||||
"""set value for lock, initial or update"""
|
||||
key: str = f"{self.BASE}{task_id}"
|
||||
self.conn.execute_command("SET", key, json.dumps(message))
|
||||
|
||||
if expire:
|
||||
self.conn.execute_command("EXPIRE", key, self.EXPIRE)
|
||||
|
||||
def set_command(self, task_id: str, command: str) -> None:
|
||||
"""set task command"""
|
||||
if command not in self.COMMANDS:
|
||||
print(f"{command} not in valid commands {self.COMMANDS}")
|
||||
raise ValueError
|
||||
|
||||
message = self.get_single(task_id)
|
||||
if not message:
|
||||
print(f"{task_id} not found")
|
||||
raise KeyError
|
||||
|
||||
message.update({"command": command})
|
||||
self.set_key(task_id, message)
|
||||
|
||||
def del_task(self, task_id: str) -> None:
|
||||
"""delete task result by id"""
|
||||
self.conn.execute_command("DEL", f"{self.BASE}{task_id}")
|
||||
|
||||
def del_all(self) -> None:
|
||||
"""delete all task results"""
|
||||
all_tasks = self.get_all()
|
||||
for task_id in all_tasks:
|
||||
self.del_task(task_id)
|
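A minimal usage sketch of the sorted-set queue above. The queue name and item values are made up for illustration; it assumes a reachable Redis instance as configured for RedisBase and only calls methods defined in this class.

# illustrative only, not part of the diff: FIFO behaviour via increasing scores
queue = RedisQueue("download:video")
queue.add_list(["id_one", "id_two"])   # both items get increasing scores
queue.add("id_three")                  # appended after the current max score
item, score = queue.get_next()         # pops "id_one", the lowest score
assert queue.in_queue("id_two") == "in_queue"
queue.clear()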
@ -1,192 +0,0 @@
"""
Functionality:
- detect valid youtube ids and links from multi line string
- identify vid_type if possible
"""

from urllib.parse import parse_qs, urlparse

from common.src.ta_redis import RedisArchivist
from download.src.yt_dlp_base import YtWrap
from video.src.constants import VideoTypeEnum


class Parser:
    """
    take a multi line string and detect valid youtube ids
    channel handle lookup is cached, can be disabled for unittests
    """

    def __init__(self, url_str, use_cache=True):
        self.url_list = [i.strip() for i in url_str.split()]
        self.use_cache = use_cache

    def parse(self):
        """parse the list"""
        ids = []
        for url in self.url_list:
            parsed = urlparse(url)
            if parsed.netloc:
                # is url
                identified = self.process_url(parsed)
            else:
                # is not url
                identified = self._find_valid_id(url)

            if "vid_type" not in identified:
                identified.update(self._detect_vid_type(parsed.path))

            ids.append(identified)

        return ids

    def process_url(self, parsed):
        """process as url"""
        if parsed.netloc == "youtu.be":
            # shortened
            youtube_id = parsed.path.strip("/")
            return self._validate_expected(youtube_id, "video")

        if "youtube.com" not in parsed.netloc:
            message = f"invalid domain: {parsed.netloc}"
            raise ValueError(message)

        query_parsed = parse_qs(parsed.query)
        if "v" in query_parsed:
            # video from v query str
            youtube_id = query_parsed["v"][0]
            return self._validate_expected(youtube_id, "video")

        if "list" in query_parsed:
            # playlist from list query str
            youtube_id = query_parsed["list"][0]
            return self._validate_expected(youtube_id, "playlist")

        all_paths = parsed.path.strip("/").split("/")
        if all_paths[0] == "shorts":
            # is shorts video
            item = self._validate_expected(all_paths[1], "video")
            item.update({"vid_type": VideoTypeEnum.SHORTS.value})
            return item

        if all_paths[0] == "channel":
            return self._validate_expected(all_paths[1], "channel")

        if all_paths[0] == "live":
            return self._validate_expected(all_paths[1], "video")

        # detect channel
        channel_id = self._extract_channel_name(parsed.geturl())
        return {"type": "channel", "url": channel_id}

    def _validate_expected(self, youtube_id, expected_type):
        """raise value error if not matching"""
        matched = self._find_valid_id(youtube_id)
        if matched["type"] != expected_type:
            raise ValueError(
                f"{youtube_id} not of expected type {expected_type}"
            )

        return {"type": expected_type, "url": youtube_id}

    def _find_valid_id(self, id_str):
        """detect valid id from length of string"""
        if id_str in ("LL", "WL"):
            return {"type": "playlist", "url": id_str}

        if id_str.startswith("@"):
            url = f"https://www.youtube.com/{id_str}"
            channel_id = self._extract_channel_name(url)
            return {"type": "channel", "url": channel_id}

        len_id_str = len(id_str)
        if len_id_str == 11:
            item_type = "video"
        elif len_id_str == 24:
            item_type = "channel"
        elif len_id_str in (34, 26, 18) or id_str.startswith("TA_playlist_"):
            item_type = "playlist"
        else:
            raise ValueError(f"not a valid id_str: {id_str}")

        return {"type": item_type, "url": id_str}

    def _extract_channel_name(self, url):
        """find channel id from channel name with yt-dlp help, cache result"""
        if self.use_cache:
            cached = self._get_cached(url)
            if cached:
                return cached

        obs_request = {
            "check_formats": None,
            "skip_download": True,
            "extract_flat": True,
            "playlistend": 0,
        }
        url_info = YtWrap(obs_request).extract(url)
        if not url_info:
            raise ValueError(f"failed to retrieve content from URL: {url}")

        channel_id = url_info.get("channel_id", False)
        if channel_id:
            if self.use_cache:
                self._set_cache(url, channel_id)

            return channel_id

        url = url_info.get("url", False)
        if url:
            # handle old channel name redirect with url path split
            channel_id = urlparse(url).path.strip("/").split("/")[1]

            return channel_id

        print(f"failed to extract channel id from {url}")
        raise ValueError

    @staticmethod
    def _get_cached(url) -> str | None:
        """get cached channel ID, if available"""
        path = urlparse(url).path.lstrip("/")
        if not path.startswith("@"):
            return None

        handle = path.split("/")[0]
        if not handle:
            return None

        cache_key = f"channel:handlesearch:{handle.lower()}"
        cached = RedisArchivist().get_message_dict(cache_key)
        if cached:
            return cached["channel_id"]

        return None

    @staticmethod
    def _set_cache(url, channel_id) -> None:
        """set cache"""
        path = urlparse(url).path.lstrip("/")
        if not path.startswith("@"):
            return

        handle = path.split("/")[0]
        if not handle:
            return

        cache_key = f"channel:handlesearch:{handle.lower()}"
        message = {
            "channel_id": channel_id,
            "handle": handle,
        }
        RedisArchivist().set_message(cache_key, message, expire=3600 * 24 * 7)

    def _detect_vid_type(self, path):
        """try to match enum from path, needs to be serializable"""
        last = path.strip("/").split("/")[-1]
        try:
            vid_type = VideoTypeEnum(last).value
        except ValueError:
            vid_type = VideoTypeEnum.UNKNOWN.value

        return {"vid_type": vid_type}
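A hedged example of how the Parser above can be called; the input string and the expected output mirror the unit tests further down, and use_cache=False skips the Redis-backed handle lookup.

# illustrative only, not part of the diff: mixed multi line input,
# one entry per whitespace-separated token
urls = "7DKv5H5Frt0 https://www.youtube.com/playlist?list=WL"
parsed = Parser(urls, use_cache=False).parse()
# parsed -> [
#     {"type": "video", "url": "7DKv5H5Frt0", "vid_type": "unknown"},
#     {"type": "playlist", "url": "WL", "vid_type": "unknown"},
# ]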
@ -1,104 +0,0 @@
"""
functionality:
- handle watched state for videos, channels and playlists
"""

from datetime import datetime

from common.src.es_connect import ElasticWrap
from common.src.ta_redis import RedisArchivist
from common.src.urlparser import Parser


class WatchState:
    """handle watched checkbox for videos and channels"""

    def __init__(self, youtube_id: str, is_watched: bool, user_id: int):
        self.youtube_id = youtube_id
        self.is_watched = is_watched
        self.user_id = user_id
        self.stamp = int(datetime.now().timestamp())
        self.pipeline = f"_ingest/pipeline/watch_{youtube_id}"

    def change(self):
        """change watched state of item(s)"""
        print(f"{self.youtube_id}: change watched state to {self.is_watched}")
        url_type = self._dedect_type()
        if url_type == "video":
            self.change_vid_state()
            return

        self._add_pipeline()
        path = f"ta_video/_update_by_query?pipeline=watch_{self.youtube_id}"
        data = self._build_update_data(url_type)
        _, _ = ElasticWrap(path).post(data)
        self._delete_pipeline()

    def _dedect_type(self):
        """find youtube id type"""
        url_process = Parser(self.youtube_id).parse()
        url_type = url_process[0]["type"]
        return url_type

    def change_vid_state(self):
        """change watched state of video"""
        path = f"ta_video/_update/{self.youtube_id}"
        data = {"doc": {"player": {"watched": self.is_watched}}}
        if self.is_watched:
            data["doc"]["player"]["watched_date"] = self.stamp
        response, status_code = ElasticWrap(path).post(data=data)
        key = f"{self.user_id}:progress:{self.youtube_id}"
        RedisArchivist().del_message(key)
        if status_code != 200:
            print(response)
            raise ValueError("failed to mark video as watched")

    def _build_update_data(self, url_type):
        """build update by query data based on url_type"""
        term_key_map = {
            "channel": "channel.channel_id",
            "playlist": "playlist.keyword",
        }
        term_key = term_key_map.get(url_type)

        return {
            "query": {
                "bool": {
                    "must": [
                        {"term": {term_key: {"value": self.youtube_id}}},
                        {
                            "term": {
                                "player.watched": {
                                    "value": not self.is_watched
                                }
                            }
                        },
                    ],
                }
            }
        }

    def _add_pipeline(self):
        """add ingest pipeline"""
        data = {
            "description": f"{self.youtube_id}: watched {self.is_watched}",
            "processors": [
                {
                    "set": {
                        "field": "player.watched",
                        "value": self.is_watched,
                    }
                },
                {
                    "set": {
                        "field": "player.watched_date",
                        "value": self.stamp,
                    }
                },
            ],
        }
        _, _ = ElasticWrap(self.pipeline).put(data)

    def _delete_pipeline(self):
        """delete pipeline"""
        ElasticWrap(self.pipeline).delete()
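A short, hypothetical call into WatchState; the video ID, channel ID and user ID are placeholders, and the Elasticsearch documents are assumed to exist.

# illustrative only, not part of the diff: mark a single video as watched for user 1
WatchState("7DKv5H5Frt0", is_watched=True, user_id=1).change()
# channels and playlists go through the ingest pipeline + update_by_query path
WatchState("UCBa659QWEk1AI4Tg--mrJ2A", is_watched=False, user_id=1).change()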
@ -1,11 +0,0 @@
"""test configs"""

import os

import pytest


@pytest.fixture(scope="session", autouse=True)
def change_test_dir(request):
    """change directory to project folder"""
    os.chdir(request.config.rootdir / "backend")
@ -1,113 +0,0 @@
"""tests for helper functions"""

import pytest
from common.src.helper import (
    date_parser,
    get_duration_str,
    get_mapping,
    is_shorts,
    randomizor,
    time_parser,
)


def test_randomizor_with_positive_length():
    """test randomizer"""
    length = 10
    result = randomizor(length)
    assert len(result) == length
    assert result.isalnum()


def test_date_parser_with_int():
    """unix timestamp"""
    timestamp = 1621539600
    expected_date = "2021-05-20T19:40:00+00:00"
    assert date_parser(timestamp) == expected_date


def test_date_parser_with_str():
    """iso timestamp"""
    date_str = "2021-05-21"
    expected_date = "2021-05-21T00:00:00+00:00"
    assert date_parser(date_str) == expected_date


def test_date_parser_with_invalid_input():
    """invalid type"""
    invalid_input = [1621539600]
    with pytest.raises(TypeError):
        date_parser(invalid_input)


def test_date_parser_with_invalid_string_format():
    """invalid date string"""
    invalid_date_str = "21/05/2021"
    with pytest.raises(ValueError):
        date_parser(invalid_date_str)


def test_time_parser_with_numeric_string():
    """as number"""
    timestamp = "100"
    expected_seconds = 100
    assert time_parser(timestamp) == expected_seconds


def test_time_parser_with_hh_mm_ss_format():
    """to seconds"""
    timestamp = "01:00:00"
    expected_seconds = 3600.0
    assert time_parser(timestamp) == expected_seconds


def test_time_parser_with_empty_string():
    """handle empty"""
    timestamp = ""
    assert time_parser(timestamp) is False


def test_time_parser_with_invalid_format():
    """not enough to unpack"""
    timestamp = "01:00"
    with pytest.raises(ValueError):
        time_parser(timestamp)


def test_time_parser_with_non_numeric_input():
    """non numeric"""
    timestamp = "1a:00:00"
    with pytest.raises(ValueError):
        time_parser(timestamp)


def test_get_mapping():
    """test mapping"""
    index_config = get_mapping()
    assert isinstance(index_config, list)
    assert all(isinstance(i, dict) for i in index_config)


def test_is_shorts():
    """is shorts id"""
    youtube_id = "YG3-Pw3rixU"
    assert is_shorts(youtube_id)


def test_is_not_shorts():
    """is not shorts id"""
    youtube_id = "Ogr9kbypSNg"
    assert is_shorts(youtube_id) is False


def test_get_duration_str():
    """only seconds"""
    assert get_duration_str(None) == "NA"
    assert get_duration_str(5) == "5s"
    assert get_duration_str(10) == "10s"
    assert get_duration_str(500) == "8m 20s"
    assert get_duration_str(1000) == "16m 40s"
    assert get_duration_str(5000) == "1h 23m 20s"
    assert get_duration_str(500000) == "5d 18h 53m 20s"
    assert get_duration_str(5000000) == "57d 20h 53m 20s"
    assert get_duration_str(50000000) == "1y 213d 16h 53m 20s"
@ -1,145 +0,0 @@
"""tests for url parser"""

import pytest
from common.src.urlparser import Parser

# video id parsing
VIDEO_URL_IN = [
    "7DKv5H5Frt0",
    "https://www.youtube.com/watch?v=7DKv5H5Frt0",
    "https://www.youtube.com/watch?v=7DKv5H5Frt0&t=113&feature=shared",
    "https://www.youtube.com/watch?v=7DKv5H5Frt0&list=PL96C35uN7xGJu6skU4TBYrIWxggkZBrF5&index=1&pp=iAQB",  # noqa: E501
    "https://youtu.be/7DKv5H5Frt0",
    "https://www.youtube.com/live/7DKv5H5Frt0",
]
VIDEO_OUT = [{"type": "video", "url": "7DKv5H5Frt0", "vid_type": "unknown"}]
VIDEO_TEST_CASES = [(i, VIDEO_OUT) for i in VIDEO_URL_IN]

# shorts id parsing
SHORTS_URL_IN = [
    "https://www.youtube.com/shorts/YG3-Pw3rixU",
    "https://youtube.com/shorts/YG3-Pw3rixU?feature=shared",
]
SHORTS_OUT = [{"type": "video", "url": "YG3-Pw3rixU", "vid_type": "shorts"}]
SHORTS_TEST_CASES = [(i, SHORTS_OUT) for i in SHORTS_URL_IN]

# channel id parsing
CHANNEL_URL_IN = [
    "UCBa659QWEk1AI4Tg--mrJ2A",
    "@TomScottGo",
    "https://www.youtube.com/channel/UCBa659QWEk1AI4Tg--mrJ2A",
    "https://www.youtube.com/@TomScottGo",
]
CHANNEL_OUT = [
    {
        "type": "channel",
        "url": "UCBa659QWEk1AI4Tg--mrJ2A",
        "vid_type": "unknown",
    }
]
CHANNEL_TEST_CASES = [(i, CHANNEL_OUT) for i in CHANNEL_URL_IN]

# channel vid type parsing
CHANNEL_VID_TYPES = [
    (
        "https://www.youtube.com/@IBRACORP/videos",
        [
            {
                "type": "channel",
                "url": "UC7aW7chIafJG6ECYAd3N5uQ",
                "vid_type": "videos",
            }
        ],
    ),
    (
        "https://www.youtube.com/@IBRACORP/shorts",
        [
            {
                "type": "channel",
                "url": "UC7aW7chIafJG6ECYAd3N5uQ",
                "vid_type": "shorts",
            }
        ],
    ),
    (
        "https://www.youtube.com/@IBRACORP/streams",
        [
            {
                "type": "channel",
                "url": "UC7aW7chIafJG6ECYAd3N5uQ",
                "vid_type": "streams",
            }
        ],
    ),
]

# playlist id parsing
PLAYLIST_URL_IN = [
    "PL96C35uN7xGJu6skU4TBYrIWxggkZBrF5",
    "https://www.youtube.com/playlist?list=PL96C35uN7xGJu6skU4TBYrIWxggkZBrF5",
]
PLAYLIST_OUT = [
    {
        "type": "playlist",
        "url": "PL96C35uN7xGJu6skU4TBYrIWxggkZBrF5",
        "vid_type": "unknown",
    }
]
PLAYLIST_TEST_CASES = [(i, PLAYLIST_OUT) for i in PLAYLIST_URL_IN]

# personal playlists
EXPECTED_WL = [{"type": "playlist", "url": "WL", "vid_type": "unknown"}]
EXPECTED_LL = [{"type": "playlist", "url": "LL", "vid_type": "unknown"}]
PERSONAL_PLAYLISTS_TEST_CASES = [
    ("WL", EXPECTED_WL),
    ("https://www.youtube.com/playlist?list=WL", EXPECTED_WL),
    ("LL", EXPECTED_LL),
    ("https://www.youtube.com/playlist?list=LL", EXPECTED_LL),
]

# collect tests expected to pass
PASSTING_TESTS = []
PASSTING_TESTS.extend(VIDEO_TEST_CASES)
PASSTING_TESTS.extend(SHORTS_TEST_CASES)
PASSTING_TESTS.extend(CHANNEL_TEST_CASES)
PASSTING_TESTS.extend(CHANNEL_VID_TYPES)
PASSTING_TESTS.extend(PLAYLIST_TEST_CASES)
PASSTING_TESTS.extend(PERSONAL_PLAYLISTS_TEST_CASES)


@pytest.mark.parametrize("url_str, expected_result", PASSTING_TESTS)
def test_passing_parse(url_str, expected_result):
    """test parser"""
    parser = Parser(url_str, use_cache=False)
    parsed = parser.parse()
    assert parsed == expected_result


INVALID_IDS_ERRORS = [
    "aaaaa",
    "https://www.youtube.com/playlist?list=AAAA",
    "https://www.youtube.com/channel/UC9-y-6csu5WGm29I7Jiwpn",
    "https://www.youtube.com/watch?v=CK3_zarXkw",
]


@pytest.mark.parametrize("invalid_value", INVALID_IDS_ERRORS)
def test_invalid_ids(invalid_value):
    """test for invalid IDs"""
    with pytest.raises(ValueError, match="not a valid id_str"):
        parser = Parser(invalid_value, use_cache=False)
        parser.parse()


INVALID_DOMAINS = [
    "https://vimeo.com/32001208",
    "https://peertube.tv/w/8RiJE2j2nw569FVgPNjDt7",
]


@pytest.mark.parametrize("invalid_value", INVALID_DOMAINS)
def test_invalid_domains(invalid_value):
    """raise error on non-YT domains"""
    parser = Parser(invalid_value, use_cache=False)
    with pytest.raises(ValueError, match="invalid domain"):
        parser.parse()
@ -1,33 +0,0 @@
"""all api urls"""

from common import views
from django.urls import path

urlpatterns = [
    path("ping/", views.PingView.as_view(), name="ping"),
    path(
        "refresh/",
        views.RefreshView.as_view(),
        name="api-refresh",
    ),
    path(
        "watched/",
        views.WatchedView.as_view(),
        name="api-watched",
    ),
    path(
        "search/",
        views.SearchView.as_view(),
        name="api-search",
    ),
    path(
        "notification/",
        views.NotificationView.as_view(),
        name="api-notification",
    ),
    path(
        "health/",
        views.HealthCheck.as_view(),
        name="api-health",
    ),
]
@ -1,210 +0,0 @@
"""all API views"""

from appsettings.src.config import ReleaseVersion
from appsettings.src.reindex import ReindexProgress
from common.serializers import (
    AsyncTaskResponseSerializer,
    ErrorResponseSerializer,
    NotificationQueryFilterSerializer,
    NotificationSerializer,
    PingSerializer,
    RefreshAddDataSerializer,
    RefreshAddQuerySerializer,
    RefreshQuerySerializer,
    RefreshResponseSerializer,
    WatchedDataSerializer,
)
from common.src.searching import SearchForm
from common.src.ta_redis import RedisArchivist
from common.src.watched import WatchState
from common.views_base import AdminOnly, ApiBaseView
from drf_spectacular.utils import OpenApiResponse, extend_schema
from rest_framework.response import Response
from rest_framework.views import APIView
from task.tasks import check_reindex


class PingView(ApiBaseView):
    """resolves to /api/ping/
    GET: test your connection
    """

    @staticmethod
    @extend_schema(
        responses={200: OpenApiResponse(PingSerializer())},
    )
    def get(request):
        """get pong"""
        data = {
            "response": "pong",
            "user": request.user.id,
            "version": ReleaseVersion().get_local_version(),
            "ta_update": ReleaseVersion().get_update(),
        }
        serializer = PingSerializer(data)
        return Response(serializer.data)


class RefreshView(ApiBaseView):
    """resolves to /api/refresh/
    GET: get refresh progress
    POST: start a manual refresh task
    """

    permission_classes = [AdminOnly]

    @extend_schema(
        responses={
            200: OpenApiResponse(RefreshResponseSerializer()),
            400: OpenApiResponse(
                ErrorResponseSerializer(), description="Bad request"
            ),
        },
        parameters=[RefreshQuerySerializer()],
    )
    def get(self, request):
        """get refresh status"""
        query_serializer = RefreshQuerySerializer(data=request.query_params)
        query_serializer.is_valid(raise_exception=True)
        validated_query = query_serializer.validated_data
        request_type = validated_query.get("type")
        request_id = validated_query.get("id")

        if request_id and not request_type:
            error = ErrorResponseSerializer(
                {"error": "specified id also needs type"}
            )
            return Response(error.data, status=400)

        try:
            progress = ReindexProgress(
                request_type=request_type, request_id=request_id
            ).get_progress()
        except ValueError:
            error = ErrorResponseSerializer({"error": "bad request"})
            return Response(error.data, status=400)

        response_serializer = RefreshResponseSerializer(progress)

        return Response(response_serializer.data)

    @extend_schema(
        request=RefreshAddDataSerializer(),
        responses={
            200: OpenApiResponse(AsyncTaskResponseSerializer()),
        },
        parameters=[RefreshAddQuerySerializer()],
    )
    def post(self, request):
        """add to reindex queue"""
        query_serializer = RefreshAddQuerySerializer(data=request.query_params)
        query_serializer.is_valid(raise_exception=True)
        validated_query = query_serializer.validated_data

        data_serializer = RefreshAddDataSerializer(data=request.data)
        data_serializer.is_valid(raise_exception=True)
        validated_data = data_serializer.validated_data

        extract_videos = validated_query.get("extract_videos")
        task = check_reindex.delay(
            data=validated_data, extract_videos=extract_videos
        )
        message = {
            "message": "reindex task started",
            "task_id": task.id,
        }
        serializer = AsyncTaskResponseSerializer(message)

        return Response(serializer.data)


class WatchedView(ApiBaseView):
    """resolves to /api/watched/
    POST: change watched state of video, channel or playlist
    """

    @extend_schema(
        request=WatchedDataSerializer(),
        responses={
            200: OpenApiResponse(WatchedDataSerializer()),
            400: OpenApiResponse(
                ErrorResponseSerializer(), description="Bad request"
            ),
        },
    )
    def post(self, request):
        """change watched state"""
        data_serializer = WatchedDataSerializer(data=request.data)
        data_serializer.is_valid(raise_exception=True)
        validated_data = data_serializer.validated_data
        youtube_id = validated_data.get("id")
        is_watched = validated_data.get("is_watched")

        if not youtube_id or is_watched is None:
            error = ErrorResponseSerializer(
                {"error": "missing id or is_watched"}
            )
            return Response(error.data, status=400)

        WatchState(youtube_id, is_watched, request.user.id).change()
        return Response(data_serializer.data)


class SearchView(ApiBaseView):
    """resolves to /api/search/
    GET: run a search with the string in the ?query parameter
    """

    @staticmethod
    def get(request):
        """handle get request
        search through all indexes"""
        search_query = request.GET.get("query", None)
        if search_query is None:
            return Response(
                {"message": "no search query specified"}, status=400
            )

        search_results = SearchForm().multi_search(search_query)
        return Response(search_results)


class NotificationView(ApiBaseView):
    """resolves to /api/notification/
    GET: returns a list of notifications
    filter query to filter messages by group
    """

    valid_filters = ["download", "settings", "channel"]

    @extend_schema(
        responses={
            200: OpenApiResponse(NotificationSerializer(many=True)),
        },
        parameters=[NotificationQueryFilterSerializer],
    )
    def get(self, request):
        """get all notifications"""
        query_serializer = NotificationQueryFilterSerializer(
            data=request.query_params
        )
        query_serializer.is_valid(raise_exception=True)
        validated_query = query_serializer.validated_data
        filter_by = validated_query.get("filter")

        query = "message"
        if filter_by in self.valid_filters:
            query = f"{query}:{filter_by}"

        notifications = RedisArchivist().list_items(query)
        response_serializer = NotificationSerializer(notifications, many=True)

        return Response(response_serializer.data)


class HealthCheck(APIView):
    """health check view, no auth needed"""

    def get(self, request):
        """health check, no auth needed"""
        return Response("OK", status=200)
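A hedged example of calling the ping endpoint defined above from a client script; the host, port and token value are placeholders, not values taken from this diff.

# illustrative only, not part of the diff
import requests

resp = requests.get(
    "http://localhost:8000/api/ping/",
    headers={"Authorization": "Token <api-token>"},
    timeout=10,
)
# expected shape: {"response": "pong", "user": 1, "version": "...", "ta_update": ...}
print(resp.json())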
@ -1,102 +0,0 @@
"""base classes to inherit from"""

from common.src.es_connect import ElasticWrap
from common.src.index_generic import Pagination
from common.src.search_processor import SearchProcess, process_aggs
from rest_framework import permissions
from rest_framework.authentication import (
    SessionAuthentication,
    TokenAuthentication,
)
from rest_framework.views import APIView


def check_admin(user):
    """check for admin permission for restricted views"""
    return user.is_staff or user.groups.filter(name="admin").exists()


class AdminOnly(permissions.BasePermission):
    """allow only admin"""

    def has_permission(self, request, view):
        return check_admin(request.user)


class AdminWriteOnly(permissions.BasePermission):
    """allow only admin writes"""

    def has_permission(self, request, view):
        if request.method in permissions.SAFE_METHODS:
            return permissions.IsAuthenticated().has_permission(request, view)

        return check_admin(request.user)


class ApiBaseView(APIView):
    """base view to inherit from"""

    authentication_classes = [SessionAuthentication, TokenAuthentication]
    permission_classes = [permissions.IsAuthenticated]
    search_base = ""
    data = ""

    def __init__(self):
        super().__init__()
        self.response = {}
        self.data = {"query": {"match_all": {}}}
        self.status_code = False
        self.context = False
        self.pagination_handler = False

    def get_document(self, document_id, progress_match=None):
        """get single document from es"""
        path = f"{self.search_base}{document_id}"
        response, status_code = ElasticWrap(path).get()
        try:
            self.response = SearchProcess(
                response, match_video_user_progress=progress_match
            ).process()
        except KeyError:
            print(f"item not found: {document_id}")

        self.status_code = status_code

    def initiate_pagination(self, request):
        """set initial pagination values"""
        self.pagination_handler = Pagination(request)
        self.data.update(
            {
                "size": self.pagination_handler.pagination["page_size"],
                "from": self.pagination_handler.pagination["page_from"],
            }
        )

    def get_document_list(self, request, pagination=True, progress_match=None):
        """get a list of results"""
        if pagination:
            self.initiate_pagination(request)

        es_handler = ElasticWrap(self.search_base)
        response, status_code = es_handler.get(data=self.data)
        self.response["data"] = SearchProcess(
            response, match_video_user_progress=progress_match
        ).process()
        if self.response["data"]:
            self.status_code = status_code
        else:
            self.status_code = 404

        if pagination and response.get("hits"):
            self.pagination_handler.validate(
                response["hits"]["total"]["value"]
            )
            self.response["paginate"] = self.pagination_handler.pagination

    def get_aggs(self):
        """get aggs alone"""
        self.data["size"] = 0
        response, _ = ElasticWrap(self.search_base).get(data=self.data)
        process_aggs(response)

        self.response = response.get("aggregations")
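A hedged sketch of how a concrete endpoint might build on ApiBaseView; the index name and class are invented for illustration and do not correspond to a real view in this diff.

# illustrative only, not part of the diff
from rest_framework.response import Response


class ExampleListView(ApiBaseView):
    """hypothetical paginated list against a made-up index"""

    search_base = "ta_example/_search/"
    permission_classes = [AdminWriteOnly]

    def get(self, request):
        """return a page of documents"""
        self.get_document_list(request)
        return Response(self.response, status=self.status_code)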
@ -1,36 +0,0 @@
"""change user password"""

from django.contrib.auth import get_user_model
from django.core.management.base import BaseCommand, CommandError

User = get_user_model()


class Command(BaseCommand):
    """change password"""

    help = "Change Password of user"

    def add_arguments(self, parser):
        parser.add_argument("username", type=str)
        parser.add_argument("password", type=str)

    def handle(self, *args, **kwargs):
        """entry point"""
        username = kwargs["username"]
        new_password = kwargs["password"]
        self.stdout.write(f"Changing password for user '{username}'")
        try:
            user = User.objects.get(name=username)
        except User.DoesNotExist as err:
            message = f"Username '{username}' does not exist. "
            message += "Available username(s) are:\n"
            message += ", ".join([i.name for i in User.objects.all()])
            raise CommandError(message) from err

        user.set_password(new_password)
        user.save()

        self.stdout.write(
            self.style.SUCCESS(f" ✓ updated password for user '{username}'")
        )
@ -1,76 +0,0 @@
"""backup config for sqlite reset and restore"""

import json
from pathlib import Path

from django.contrib.auth import get_user_model
from django.core.management.base import BaseCommand
from home.models import CustomPeriodicTask
from home.src.ta.settings import EnvironmentSettings
from rest_framework.authtoken.models import Token

User = get_user_model()


class Command(BaseCommand):
    """export"""

    help = "Exports all users and their auth tokens to a JSON file"
    FILE = Path(EnvironmentSettings.CACHE_DIR) / "backup" / "migration.json"

    def handle(self, *args, **kwargs):
        """entry point"""

        data = {
            "user_data": self.get_users(),
            "schedule_data": self.get_schedules(),
        }

        with open(self.FILE, "w", encoding="utf-8") as json_file:
            json_file.write(json.dumps(data))

    def get_users(self):
        """get users"""

        users = User.objects.all()

        user_data = []

        for user in users:
            user_info = {
                "username": user.name,
                "is_staff": user.is_staff,
                "is_superuser": user.is_superuser,
                "password": user.password,
                "tokens": [],
            }

            try:
                token = Token.objects.get(user=user)
                user_info["tokens"] = [token.key]
            except Token.DoesNotExist:
                user_info["tokens"] = []

            user_data.append(user_info)

        return user_data

    def get_schedules(self):
        """get schedules"""

        all_schedules = CustomPeriodicTask.objects.all()
        schedule_data = []

        for schedule in all_schedules:
            schedule_info = {
                "name": schedule.name,
                "crontab": {
                    "minute": schedule.crontab.minute,
                    "hour": schedule.crontab.hour,
                    "day_of_week": schedule.crontab.day_of_week,
                },
            }

            schedule_data.append(schedule_info)

        return schedule_data
@ -1,89 +0,0 @@
"""restore config from backup"""

import json
from pathlib import Path

from common.src.env_settings import EnvironmentSettings
from django.core.management.base import BaseCommand
from django_celery_beat.models import CrontabSchedule
from rest_framework.authtoken.models import Token
from task.models import CustomPeriodicTask
from task.src.task_config import TASK_CONFIG
from user.models import Account


class Command(BaseCommand):
    """restore"""

    help = "Restores all users and their auth tokens from a JSON file"
    FILE = Path(EnvironmentSettings.CACHE_DIR) / "backup" / "migration.json"

    def handle(self, *args, **options):
        """handle"""
        self.stdout.write("restore users and schedules")
        data = self.get_config()
        self.restore_users(data["user_data"])
        self.restore_schedules(data["schedule_data"])
        self.stdout.write(
            self.style.SUCCESS(
                " ✓ restore completed. Please restart the container."
            )
        )

    def get_config(self) -> dict:
        """get config from backup"""
        with open(self.FILE, "r", encoding="utf-8") as json_file:
            data = json.loads(json_file.read())

        self.stdout.write(
            self.style.SUCCESS(f" ✓ json file found: {self.FILE}")
        )

        return data

    def restore_users(self, user_data: list[dict]) -> None:
        """restore users from config"""
        self.stdout.write("delete existing users")
        Account.objects.all().delete()

        self.stdout.write("recreate users")
        for user_info in user_data:
            user = Account.objects.create(
                name=user_info["username"],
                is_staff=user_info["is_staff"],
                is_superuser=user_info["is_superuser"],
                password=user_info["password"],
            )
            for token in user_info["tokens"]:
                Token.objects.create(user=user, key=token)

            self.stdout.write(
                self.style.SUCCESS(
                    f" ✓ recreated user with name: {user_info['username']}"
                )
            )

    def restore_schedules(self, schedule_data: list[dict]) -> None:
        """restore schedules"""
        self.stdout.write("delete existing schedules")
        CustomPeriodicTask.objects.all().delete()

        self.stdout.write("recreate schedules")
        for schedule in schedule_data:
            task_name = schedule["name"]
            description = TASK_CONFIG[task_name].get("title")
            crontab, _ = CrontabSchedule.objects.get_or_create(
                minute=schedule["crontab"]["minute"],
                hour=schedule["crontab"]["hour"],
                day_of_week=schedule["crontab"]["day_of_week"],
                timezone=EnvironmentSettings.TZ,
            )
            task = CustomPeriodicTask.objects.create(
                name=task_name,
                task=task_name,
                description=description,
                crontab=crontab,
            )
            self.stdout.write(
                self.style.SUCCESS(f" ✓ recreated schedule: {task}")
            )
@ -1,177 +0,0 @@
"""
Functionality:
- check that all connections are working
"""

from time import sleep

import requests
from common.src.env_settings import EnvironmentSettings
from common.src.es_connect import ElasticWrap
from common.src.ta_redis import RedisArchivist
from django.core.management.base import BaseCommand, CommandError

TOPIC = """

#######################
#  Connection check   #
#######################

"""


class Command(BaseCommand):
    """command framework"""

    TIMEOUT = 120
    MIN_MAJOR, MAX_MAJOR = 8, 8
    MIN_MINOR = 0

    # pylint: disable=no-member
    help = "Check connections"

    def handle(self, *args, **options):
        """run all commands"""
        self.stdout.write(TOPIC)
        self._redis_connection_check()
        self._redis_config_set()
        self._es_connection_check()
        self._es_version_check()
        self._es_path_check()

    def _redis_connection_check(self):
        """check if redis connection is established"""
        self.stdout.write("[1] connect to Redis")
        redis_conn = RedisArchivist().conn
        for _ in range(5):
            try:
                pong = redis_conn.execute_command("PING")
                if pong:
                    self.stdout.write(
                        self.style.SUCCESS(" ✓ Redis connection verified")
                    )
                    return

            except Exception:  # pylint: disable=broad-except
                self.stdout.write(" ... retry Redis connection")
                sleep(2)

        message = " 🗙 Redis connection failed"
        self.stdout.write(self.style.ERROR(f"{message}"))
        try:
            redis_conn.execute_command("PING")
        except Exception as err:  # pylint: disable=broad-except
            message = f" 🗙 {type(err).__name__}: {err}"
            self.stdout.write(self.style.ERROR(f"{message}"))

        sleep(60)
        raise CommandError(message)

    def _redis_config_set(self):
        """set config for redis if not set already"""
        self.stdout.write("[2] set Redis config")
        redis_conn = RedisArchivist().conn
        timeout_is = int(redis_conn.config_get("timeout").get("timeout"))
        if not timeout_is:
            redis_conn.config_set("timeout", 3600)

        self.stdout.write(self.style.SUCCESS(" ✓ Redis config set"))

    def _es_connection_check(self):
        """wait for elasticsearch connection"""
        self.stdout.write("[3] connect to Elastic Search")
        total = self.TIMEOUT // 5
        for i in range(total):
            self.stdout.write(f" ... waiting for ES [{i}/{total}]")
            try:
                _, status_code = ElasticWrap("/").get(
                    timeout=1, print_error=False
                )
            except (
                requests.exceptions.ConnectionError,
                requests.exceptions.Timeout,
            ):
                sleep(5)
                continue

            if status_code and status_code == 401:
                sleep(5)
                continue

            if status_code and status_code == 200:
                path = (
                    "_cluster/health?"
                    "wait_for_status=yellow&"
                    "timeout=60s&"
                    "wait_for_active_shards=1"
                )
                _, _ = ElasticWrap(path).get(timeout=60)
                self.stdout.write(
                    self.style.SUCCESS(" ✓ ES connection established")
                )
                return

        response, status_code = ElasticWrap("/").get(
            timeout=1, print_error=False
        )

        message = " 🗙 ES connection failed"
        self.stdout.write(self.style.ERROR(f"{message}"))
        self.stdout.write(f" error message: {response}")
        self.stdout.write(f" status code: {status_code}")
        sleep(60)
        raise CommandError(message)

    def _es_version_check(self):
        """check for minimal elasticsearch version"""
        self.stdout.write("[4] Elastic Search version check")
        response, _ = ElasticWrap("/").get()
        version = response["version"]["number"]
        major = int(version.split(".")[0])

        if self.MIN_MAJOR <= major <= self.MAX_MAJOR:
            self.stdout.write(
                self.style.SUCCESS(" ✓ ES version check passed")
            )
            return

        message = (
            " 🗙 ES version check failed. "
            + f"Expected {self.MIN_MAJOR}.{self.MIN_MINOR} but got {version}"
        )
        self.stdout.write(self.style.ERROR(f"{message}"))
        sleep(60)
        raise CommandError(message)

    def _es_path_check(self):
        """check that path.repo var is set"""
        self.stdout.write("[5] check ES path.repo env var")
        response, _ = ElasticWrap("_nodes/_all/settings").get()
        snaphost_roles = [
            "data",
            "data_cold",
            "data_content",
            "data_frozen",
            "data_hot",
            "data_warm",
            "master",
        ]
        for node in response["nodes"].values():
            if not (set(node["roles"]) & set(snaphost_roles)):
                continue

            if node["settings"]["path"].get("repo"):
                self.stdout.write(
                    self.style.SUCCESS(" ✓ path.repo env var is set")
                )
                return

        message = (
            " 🗙 path.repo env var not found. "
            + "set the following env var to the ES container:\n"
            + " path.repo="
            + EnvironmentSettings.ES_SNAPSHOT_DIR
        )
        self.stdout.write(self.style.ERROR(message))
        sleep(60)
        raise CommandError(message)
@ -1,218 +0,0 @@
"""
Functionality:
- Check environment at startup
- Process config file overwrites from env var
- Stop startup on error
- python management.py ta_envcheck
"""

import os
import re
from time import sleep

from common.src.env_settings import EnvironmentSettings
from django.core.management.base import BaseCommand, CommandError
from user.models import Account

LOGO = """

.... .....
...'',;:cc,. .;::;;,'...
..,;:cccllclc, .:ccllllcc;,..
..,:cllcc:;,'.',. ....'',;ccllc:,..
..;cllc:,'.. ...,:cccc:'.
.;cccc;.. ..,:ccc:'.
.ckkkOkxollllllllllllc. .,:::;. .,cclc;
.:0MMMMMMMMMMMMMMMMMMMX: .cNMMMWx. .;clc:
.;lOXK0000KNMMMMX00000KO; ;KMMMMMNl. .;ccl:,.
.;:c:'.....kMMMNo........ 'OMMMWMMMK: '::;;'.
....... .xMMMNl .dWMMXdOMMMO' ........
.:cc:;. .xMMMNc .lNMMNo.:XMMWx. .:cl:.
.:llc,. .:xxxd, ;KMMMk. .oWMMNl. .:llc'
.cll:. .;:;;:::,. 'OMMMK:';''kWMMK: .;llc,
.cll:. .,;;;;;;,. .,xWMMNl.:l:.;KMMMO' .;llc'
.:llc. .cOOOk; .lKNMMWx..:l:..lNMMWx. .:llc'
.;lcc,. .xMMMNc :KMMMM0, .:lc. .xWMMNl.'ccl:.
.cllc. .xMMMNc 'OMMMMXc...:lc...,0MMMKl:lcc,.
.,ccl:. .xMMMNc .xWMMMWo.,;;:lc;;;.cXMMMXdcc;.
.,clc:. .xMMMNc .lNMMMWk. .':clc:,. .dWMMW0o;.
.,clcc,. .ckkkx; .okkkOx, .';,. 'kKKK0l.
.':lcc:'..... . .. ..,;cllc,.
.,cclc,.... ....;clc;..
..,:,..,c:'.. ...';:,..,:,.
....:lcccc:;,'''.....'',;;:clllc,....
.'',;:cllllllccccclllllcc:,'..
...'',,;;;;;;;;;,''...
.....

"""

TOPIC = """
#######################
#  Environment Setup  #
#######################

"""

EXPECTED_ENV_VARS = [
    "TA_USERNAME",
    "TA_PASSWORD",
    "ELASTIC_PASSWORD",
    "ES_URL",
    "TA_HOST",
]
UNEXPECTED_ENV_VARS = {
    "TA_UWSGI_PORT": "Has been replaced with 'TA_BACKEND_PORT'",
    "REDIS_HOST": "Has been replaced with 'REDIS_CON' connection string",
    "REDIS_PORT": "Has been consolidated in 'REDIS_CON' connection string",
    "ENABLE_CAST": "That is now a toggle in setting and DISABLE_STATIC_AUTH",
}
INST = "https://github.com/tubearchivist/tubearchivist#installing-and-updating"
NGINX = "/etc/nginx/sites-available/default"


class Command(BaseCommand):
    """command framework"""

    # pylint: disable=no-member
    help = "Check environment before startup"

    def handle(self, *args, **options):
        """run all commands"""
        self.stdout.write(LOGO)
        self.stdout.write(TOPIC)
        self._expected_vars()
        self._unexpected_vars()
        self._elastic_user_overwrite()
        self._ta_port_overwrite()
        self._ta_backend_port_overwrite()
        self._disable_static_auth()
        self._create_superuser()

    def _expected_vars(self):
        """check if expected env vars are set"""
        self.stdout.write("[1] checking expected env vars")
        env = os.environ
        for var in EXPECTED_ENV_VARS:
            if not env.get(var):
                message = f" 🗙 expected env var {var} not set\n {INST}"
                self.stdout.write(self.style.ERROR(message))
                sleep(60)
                raise CommandError(message)

        message = " ✓ all expected env vars are set"
        self.stdout.write(self.style.SUCCESS(message))

    def _unexpected_vars(self):
        """check for unexpected env vars"""
        self.stdout.write("[2] checking for unexpected env vars")
        for var, message in UNEXPECTED_ENV_VARS.items():
            if not os.environ.get(var):
                continue

            message = (
                f" 🗙 unexpected env var {var} found\n"
                f" {message} \n"
                " see release notes for a list of all changes."
            )

            self.stdout.write(self.style.ERROR(message))
            sleep(60)
            raise CommandError(message)

        message = " ✓ no unexpected env vars found"
        self.stdout.write(self.style.SUCCESS(message))

    def _elastic_user_overwrite(self):
        """check for ELASTIC_USER overwrite"""
        self.stdout.write("[3] check ES user overwrite")
        env = EnvironmentSettings.ES_USER
        self.stdout.write(self.style.SUCCESS(f" ✓ ES user is set to {env}"))

    def _ta_port_overwrite(self):
        """set TA_PORT overwrite for nginx"""
        self.stdout.write("[4] check TA_PORT overwrite")
        overwrite = EnvironmentSettings.TA_PORT
        if not overwrite:
            self.stdout.write(self.style.SUCCESS(" TA_PORT is not set"))
            return

        regex = re.compile(r"listen [0-9]{1,5}")
        to_overwrite = f"listen {overwrite}"
        changed = file_overwrite(NGINX, regex, to_overwrite)
        if changed:
            message = f" ✓ TA_PORT changed to {overwrite}"
        else:
            message = f" ✓ TA_PORT already set to {overwrite}"

        self.stdout.write(self.style.SUCCESS(message))

    def _ta_backend_port_overwrite(self):
        """set TA_BACKEND_PORT overwrite"""
        self.stdout.write("[5] check TA_BACKEND_PORT overwrite")
        overwrite = EnvironmentSettings.TA_BACKEND_PORT
        if not overwrite:
            message = " TA_BACKEND_PORT is not set"
            self.stdout.write(self.style.SUCCESS(message))
            return

        # modify nginx conf
        regex = re.compile(r"proxy_pass http://localhost:[0-9]{1,5}")
        to_overwrite = f"proxy_pass http://localhost:{overwrite}"
        changed = file_overwrite(NGINX, regex, to_overwrite)

        if changed:
            message = f" ✓ TA_BACKEND_PORT changed to {overwrite}"
        else:
            message = f" ✓ TA_BACKEND_PORT already set to {overwrite}"

        self.stdout.write(self.style.SUCCESS(message))

    def _disable_static_auth(self):
        """cast workaround, remove auth for static files in nginx"""
        self.stdout.write("[7] check DISABLE_STATIC_AUTH overwrite")
        overwrite = EnvironmentSettings.DISABLE_STATIC_AUTH
        if not overwrite:
            self.stdout.write(
                self.style.SUCCESS(" DISABLE_STATIC_AUTH is not set")
            )
            return

        regex = re.compile(r"[^\S\r\n]*auth_request /api/ping/;\n")
        changed = file_overwrite(NGINX, regex, "")
        if changed:
            message = " ✓ process nginx to disable static auth"
        else:
            message = " ✓ static auth is already disabled in nginx"

        self.stdout.write(self.style.SUCCESS(message))

    def _create_superuser(self):
        """create superuser if not exist"""
        self.stdout.write("[8] create superuser")
        is_created = Account.objects.filter(is_superuser=True)
        if is_created:
            message = " superuser already created"
            self.stdout.write(self.style.SUCCESS(message))
            return

        name = EnvironmentSettings.TA_USERNAME
        password = EnvironmentSettings.TA_PASSWORD
        Account.objects.create_superuser(name, password)
        message = f" ✓ new superuser with name {name} created"
        self.stdout.write(self.style.SUCCESS(message))


def file_overwrite(file_path, regex, overwrite):
    """change file content from old to overwrite, return true when changed"""
    with open(file_path, "r", encoding="utf-8") as f:
        file_content = f.read()

    changed = re.sub(regex, overwrite, file_content)
    if changed == file_content:
        return False

    with open(file_path, "w", encoding="utf-8") as f:
        f.write(changed)

    return True
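A small, hypothetical example of the file_overwrite() helper defined just above; the file path and pattern here are placeholders, not values used by the command itself, and re is already imported at the top of the module.

# illustrative only, not part of the diff: rewrite a listen directive in a config copy
pattern = re.compile(r"listen [0-9]{1,5}")
was_changed = file_overwrite("/tmp/nginx-default.conf", pattern, "listen 9000")
print("changed" if was_changed else "already up to date")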
@ -1,391 +0,0 @@
|
||||
"""
|
||||
Functionality:
|
||||
- Application startup
|
||||
- Apply migrations
|
||||
"""
|
||||
|
||||
import os
|
||||
from datetime import datetime
|
||||
from random import randint
|
||||
from time import sleep
|
||||
|
||||
from appsettings.src.config import AppConfig, ReleaseVersion
|
||||
from appsettings.src.index_setup import ElasitIndexWrap
|
||||
from appsettings.src.snapshot import ElasticSnapshot
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.es_connect import ElasticWrap
|
||||
from common.src.helper import clear_dl_cache
|
||||
from common.src.ta_redis import RedisArchivist
|
||||
from django.core.management.base import BaseCommand, CommandError
|
||||
from django.utils import dateformat
|
||||
from django_celery_beat.models import CrontabSchedule, PeriodicTasks
|
||||
from redis.exceptions import ResponseError
|
||||
from task.models import CustomPeriodicTask
|
||||
from task.src.config_schedule import ScheduleBuilder
|
||||
from task.src.task_manager import TaskManager
|
||||
from task.tasks import version_check
|
||||
|
||||
TOPIC = """
|
||||
|
||||
#######################
|
||||
# Application Start #
|
||||
#######################
|
||||
|
||||
"""
|
||||
|
||||
|
||||
class Command(BaseCommand):
|
||||
"""command framework"""
|
||||
|
||||
# pylint: disable=no-member
|
||||
|
||||
def handle(self, *args, **options):
|
||||
"""run all commands"""
|
||||
self.stdout.write(TOPIC)
|
||||
self._make_folders()
|
||||
self._clear_redis_keys()
|
||||
self._clear_tasks()
|
||||
self._clear_dl_cache()
|
||||
self._version_check()
|
||||
self._index_setup()
|
||||
self._snapshot_check()
|
||||
self._mig_app_settings()
|
||||
self._create_default_schedules()
|
||||
self._update_schedule_tz()
|
||||
self._init_app_config()
|
||||
self._mig_channel_tags()
|
||||
self._mig_video_channel_tags()
|
||||
self._mig_fix_download_channel_indexed()
|
||||
|
||||
def _make_folders(self):
|
||||
"""make expected cache folders"""
|
||||
self.stdout.write("[1] create expected cache folders")
|
||||
folders = [
|
||||
"backup",
|
||||
"channels",
|
||||
"download",
|
||||
"import",
|
||||
"playlists",
|
||||
"videos",
|
||||
]
|
||||
cache_dir = EnvironmentSettings.CACHE_DIR
|
||||
for folder in folders:
|
||||
folder_path = os.path.join(cache_dir, folder)
|
||||
os.makedirs(folder_path, exist_ok=True)
|
||||
|
||||
self.stdout.write(self.style.SUCCESS(" ✓ expected folders created"))
|
||||
|
||||
def _clear_redis_keys(self):
|
||||
"""make sure there are no leftover locks or keys set in redis"""
|
||||
self.stdout.write("[2] clear leftover keys in redis")
|
||||
all_keys = [
|
||||
"dl_queue_id",
|
||||
"dl_queue",
|
||||
"downloading",
|
||||
"manual_import",
|
||||
"reindex",
|
||||
"rescan",
|
||||
"run_backup",
|
||||
"startup_check",
|
||||
"reindex:ta_video",
|
||||
"reindex:ta_channel",
|
||||
"reindex:ta_playlist",
|
||||
]
|
||||
|
||||
redis_con = RedisArchivist()
|
||||
has_changed = False
|
||||
for key in all_keys:
|
||||
if redis_con.del_message(key):
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(f" ✓ cleared key {key}")
|
||||
)
|
||||
has_changed = True
|
||||
|
||||
if not has_changed:
|
||||
self.stdout.write(self.style.SUCCESS(" no keys found"))
|
||||
|
||||
def _clear_tasks(self):
|
||||
"""clear tasks and messages"""
|
||||
self.stdout.write("[3] clear task leftovers")
|
||||
TaskManager().fail_pending()
|
||||
redis_con = RedisArchivist()
|
||||
to_delete = redis_con.list_keys("message:")
|
||||
if to_delete:
|
||||
for key in to_delete:
|
||||
redis_con.del_message(key)
|
||||
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(f" ✓ cleared {len(to_delete)} messages")
|
||||
)
|
||||
|
||||
def _clear_dl_cache(self):
|
||||
"""clear leftover files from dl cache"""
|
||||
self.stdout.write("[4] clear leftover files from dl cache")
|
||||
leftover_files = clear_dl_cache(EnvironmentSettings.CACHE_DIR)
|
||||
if leftover_files:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(f" ✓ cleared {leftover_files} files")
|
||||
)
|
||||
else:
|
||||
self.stdout.write(self.style.SUCCESS(" no files found"))
|
||||
|
||||
def _version_check(self):
|
||||
"""remove new release key if updated now"""
|
||||
self.stdout.write("[5] check for first run after update")
|
||||
new_version = ReleaseVersion().is_updated()
|
||||
if new_version:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(f" ✓ update to {new_version} completed")
|
||||
)
|
||||
else:
|
||||
self.stdout.write(self.style.SUCCESS(" no new update found"))
|
||||
|
||||
version_task = CustomPeriodicTask.objects.filter(name="version_check")
|
||||
if not version_task.exists():
|
||||
return
|
||||
|
||||
if not version_task.first().last_run_at:
|
||||
self.style.SUCCESS(" ✓ send initial version check task")
|
||||
version_check.delay()
|
||||
|
||||
def _index_setup(self):
|
||||
"""migration: validate index mappings"""
|
||||
self.stdout.write("[6] validate index mappings")
|
||||
ElasitIndexWrap().setup()
|
||||
|
||||
def _snapshot_check(self):
|
||||
"""migration setup snapshots"""
|
||||
self.stdout.write("[7] setup snapshots")
|
||||
ElasticSnapshot().setup()
|
||||
|
||||
def _mig_app_settings(self) -> None:
|
||||
"""update from v0.4.13 to v0.5.0, migrate application settings"""
|
||||
self.stdout.write("[MIGRATION] move appconfig to ES")
|
||||
try:
|
||||
config = RedisArchivist().get_message("config")
|
||||
except ResponseError:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(" Redis does not support JSON decoding")
|
||||
)
|
||||
return
|
||||
|
||||
if not config or config == {"status": False}:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(" no config values to migrate")
|
||||
)
|
||||
return
|
||||
|
||||
path = "ta_config/_doc/appsettings"
|
||||
response, status_code = ElasticWrap(path).post(config)
|
||||
|
||||
if status_code in [200, 201]:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(" ✓ migrated appconfig to ES")
|
||||
)
|
||||
RedisArchivist().del_message("config", save=True)
|
||||
return
|
||||
|
||||
message = " 🗙 failed to migrate app config"
|
||||
self.stdout.write(self.style.ERROR(message))
|
||||
self.stdout.write(response)
|
||||
sleep(60)
|
||||
raise CommandError(message)
|
||||
|
||||
def _create_default_schedules(self) -> None:
|
||||
"""create default schedules for new installations"""
|
||||
self.stdout.write("[8] create initial schedules")
|
||||
init_has_run = CustomPeriodicTask.objects.filter(
|
||||
name="version_check"
|
||||
).exists()
|
||||
|
||||
if init_has_run:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(
|
||||
" schedule init already done, skipping..."
|
||||
)
|
||||
)
|
||||
return
|
||||
|
||||
builder = ScheduleBuilder()
|
||||
check_reindex = builder.get_set_task(
|
||||
"check_reindex", schedule=builder.SCHEDULES["check_reindex"]
|
||||
)
|
||||
check_reindex.task_config.update({"days": 90})
|
||||
check_reindex.last_run_at = dateformat.make_aware(datetime.now())
|
||||
check_reindex.save()
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(
|
||||
f" ✓ created new default schedule: {check_reindex}"
|
||||
)
|
||||
)
|
||||
|
||||
thumbnail_check = builder.get_set_task(
|
||||
"thumbnail_check", schedule=builder.SCHEDULES["thumbnail_check"]
|
||||
)
|
||||
thumbnail_check.last_run_at = dateformat.make_aware(datetime.now())
|
||||
thumbnail_check.save()
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(
|
||||
f" ✓ created new default schedule: {thumbnail_check}"
|
||||
)
|
||||
)
|
||||
daily_random = f"{randint(0, 59)} {randint(0, 23)} *"
|
||||
version_check_task = builder.get_set_task(
|
||||
"version_check", schedule=daily_random
|
||||
)
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(
|
||||
f" ✓ created new default schedule: {version_check_task}"
|
||||
)
|
||||
)
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(" ✓ all default schedules created")
|
||||
)
|
||||
|
||||
def _update_schedule_tz(self) -> None:
|
||||
"""update timezone for Schedule instances"""
|
||||
self.stdout.write("[9] validate schedules TZ")
|
||||
tz = EnvironmentSettings.TZ
|
||||
to_update = CrontabSchedule.objects.exclude(timezone=tz)
|
||||
|
||||
if not to_update.exists():
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(" all schedules have correct TZ")
|
||||
)
|
||||
return
|
||||
|
||||
updated = to_update.update(timezone=tz)
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(f" ✓ updated {updated} schedules to {tz}.")
|
||||
)
|
||||
PeriodicTasks.update_changed()
|
||||
|
||||
def _init_app_config(self) -> None:
|
||||
"""init default app config to ES"""
|
||||
self.stdout.write("[10] Check AppConfig")
|
||||
response, status_code = ElasticWrap("ta_config/_doc/appsettings").get()
|
||||
if status_code in [200, 201]:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(" skip completed appsettings init")
|
||||
)
|
||||
updated_defaults = AppConfig().add_new_defaults()
|
||||
for new_default in updated_defaults:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(f" added new default: {new_default}")
|
||||
)
|
||||
|
||||
return
|
||||
|
||||
if status_code != 404:
|
||||
message = " 🗙 ta_config index lookup failed"
|
||||
self.stdout.write(self.style.ERROR(message))
|
||||
self.stdout.write(str(response))
|
||||
sleep(60)
|
||||
raise CommandError(message)
|
||||
|
||||
handler = AppConfig.__new__(AppConfig)
|
||||
_, status_code = handler.sync_defaults()
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(" ✓ Created default appsettings.")
|
||||
)
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(f" Status code: {status_code}")
|
||||
)
|
||||
|
||||
def _mig_channel_tags(self) -> None:
|
||||
"""update from v0.4.13 to v0.5.0, migrate incorrect data types"""
|
||||
self.stdout.write("[MIGRATION] fix incorrect channel tags types")
|
||||
path = "ta_channel/_update_by_query"
|
||||
data = {
|
||||
"query": {"match": {"channel_tags": False}},
|
||||
"script": {
|
||||
"source": "ctx._source.channel_tags = []",
|
||||
"lang": "painless",
|
||||
},
|
||||
}
|
||||
response, status_code = ElasticWrap(path).post(data)
|
||||
if status_code in [200, 201]:
|
||||
updated = response.get("updated")
|
||||
if updated:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(f" ✓ fixed {updated} channel tags")
|
||||
)
|
||||
else:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(" no channel tags needed fixing")
|
||||
)
|
||||
return
|
||||
|
||||
message = " 🗙 failed to fix channel tags"
|
||||
self.stdout.write(self.style.ERROR(message))
|
||||
self.stdout.write(str(response))
|
||||
sleep(60)
|
||||
raise CommandError(message)
|
||||
|
||||
def _mig_video_channel_tags(self) -> None:
|
||||
"""update from v0.4.13 to v0.5.0, migrate incorrect data types"""
|
||||
self.stdout.write("[MIGRATION] fix incorrect video channel tags types")
|
||||
path = "ta_video/_update_by_query"
|
||||
data = {
|
||||
"query": {"match": {"channel.channel_tags": False}},
|
||||
"script": {
|
||||
"source": "ctx._source.channel.channel_tags = []",
|
||||
"lang": "painless",
|
||||
},
|
||||
}
|
||||
response, status_code = ElasticWrap(path).post(data)
|
||||
if status_code in [200, 201]:
|
||||
updated = response.get("updated")
|
||||
if updated:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(
|
||||
f" ✓ fixed {updated} video channel tags"
|
||||
)
|
||||
)
|
||||
else:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(
|
||||
" no video channel tags needed fixing"
|
||||
)
|
||||
)
|
||||
return
|
||||
|
||||
message = " 🗙 failed to fix video channel tags"
|
||||
self.stdout.write(self.style.ERROR(message))
|
||||
self.stdout.write(str(response))
|
||||
sleep(60)
|
||||
raise CommandError(message)
|
||||
|
||||
def _mig_fix_download_channel_indexed(self) -> None:
|
||||
"""migrate from v0.5.2 to 0.5.3, fix missing channel_indexed"""
|
||||
self.stdout.write("[MIGRATION] fix incorrect video channel tags types")
|
||||
path = "ta_download/_update_by_query"
|
||||
data = {
|
||||
"query": {
|
||||
"bool": {
|
||||
"must_not": [{"exists": {"field": "channel_indexed"}}]
|
||||
}
|
||||
},
|
||||
"script": {
|
||||
"source": "ctx._source.channel_indexed = false",
|
||||
"lang": "painless",
|
||||
},
|
||||
}
|
||||
response, status_code = ElasticWrap(path).post(data)
|
||||
if status_code in [200, 201]:
|
||||
updated = response.get("updated")
|
||||
if updated:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(f" ✓ fixed {updated} queued videos")
|
||||
)
|
||||
else:
|
||||
self.stdout.write(
|
||||
self.style.SUCCESS(" no queued videos to fix")
|
||||
)
|
||||
return
|
||||
|
||||
message = " 🗙 failed to fix video channel tags"
|
||||
self.stdout.write(self.style.ERROR(message))
|
||||
self.stdout.write(str(response))
|
||||
sleep(60)
|
||||
raise CommandError(message)
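
The _mig_* methods above all repeat one shape: POST an _update_by_query, report the "updated" count, and abort on failure. A minimal sketch of that shared pattern, assuming the same ElasticWrap import used elsewhere in this codebase; the helper name run_update_migration is hypothetical, not part of the command above.

from time import sleep

from common.src.es_connect import ElasticWrap
from django.core.management.base import CommandError


def run_update_migration(command, path: str, data: dict, label: str) -> None:
    """shared shape of the _mig_* helpers: update_by_query, report, abort on error"""
    response, status_code = ElasticWrap(path).post(data)
    if status_code in [200, 201]:
        updated = response.get("updated")
        if updated:
            command.stdout.write(command.style.SUCCESS(f" ✓ {label}: fixed {updated} docs"))
        else:
            command.stdout.write(command.style.SUCCESS(f" {label}: nothing to fix"))
        return

    message = f" 🗙 {label} failed"
    command.stdout.write(command.style.ERROR(message))
    command.stdout.write(str(response))
    sleep(60)
    raise CommandError(message)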
|
@ -1,40 +0,0 @@
|
||||
"""stop on unexpected table"""
|
||||
|
||||
from time import sleep
|
||||
|
||||
from django.core.management.base import BaseCommand, CommandError
|
||||
from django.db import connection
|
||||
|
||||
ERROR_MESSAGE = """
|
||||
🗙 Database is incompatible, see latest release notes for instructions:
|
||||
🗙 https://github.com/tubearchivist/tubearchivist/releases/tag/v0.5.0
|
||||
"""
|
||||
|
||||
|
||||
class Command(BaseCommand):
|
||||
"""command framework"""
|
||||
|
||||
# pylint: disable=no-member
|
||||
|
||||
def handle(self, *args, **options):
|
||||
"""handle"""
|
||||
self.stdout.write("[MIGRATION] Confirming v0.5.0 table layout")
|
||||
all_tables = self.list_tables()
|
||||
for table in all_tables:
|
||||
if table == "home_account":
|
||||
|
||||
self.stdout.write(self.style.ERROR(ERROR_MESSAGE))
|
||||
sleep(60)
|
||||
raise CommandError(ERROR_MESSAGE)
|
||||
|
||||
self.stdout.write(self.style.SUCCESS(" ✓ local DB is up-to-date."))
|
||||
|
||||
def list_tables(self):
|
||||
"""raw list all tables"""
|
||||
with connection.cursor() as cursor:
|
||||
cursor.execute(
|
||||
"SELECT name FROM sqlite_master WHERE type='table';"
|
||||
)
|
||||
tables = cursor.fetchall()
|
||||
|
||||
return [table[0] for table in tables]
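
For reference, the same table check could lean on Django's introspection API instead of the raw sqlite_master query; a brief alternative sketch, not the command's own code:

from django.db import connection


def has_legacy_table(name: str = "home_account") -> bool:
    """True if the pre-v0.5.0 'home_account' table is still present"""
    return name in connection.introspection.table_names()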
|
@ -1,95 +0,0 @@
|
||||
"""download serializers"""
|
||||
|
||||
# pylint: disable=abstract-method
|
||||
|
||||
from common.serializers import PaginationSerializer, ValidateUnknownFieldsMixin
|
||||
from rest_framework import serializers
|
||||
from video.src.constants import VideoTypeEnum
|
||||
|
||||
|
||||
class DownloadItemSerializer(serializers.Serializer):
|
||||
"""serialize download item"""
|
||||
|
||||
auto_start = serializers.BooleanField()
|
||||
channel_id = serializers.CharField()
|
||||
channel_indexed = serializers.BooleanField()
|
||||
channel_name = serializers.CharField()
|
||||
duration = serializers.CharField()
|
||||
published = serializers.CharField()
|
||||
status = serializers.ChoiceField(choices=["pending", "ignore"])
|
||||
timestamp = serializers.IntegerField()
|
||||
title = serializers.CharField()
|
||||
vid_thumb_url = serializers.CharField()
|
||||
vid_type = serializers.ChoiceField(choices=VideoTypeEnum.values())
|
||||
youtube_id = serializers.CharField()
|
||||
message = serializers.CharField(required=False)
|
||||
_index = serializers.CharField(required=False)
|
||||
_score = serializers.IntegerField(required=False)
|
||||
|
||||
|
||||
class DownloadListSerializer(serializers.Serializer):
|
||||
"""serialize download list"""
|
||||
|
||||
data = DownloadItemSerializer(many=True)
|
||||
paginate = PaginationSerializer()
|
||||
|
||||
|
||||
class DownloadListQuerySerializer(
|
||||
ValidateUnknownFieldsMixin, serializers.Serializer
|
||||
):
|
||||
"""serialize query params for download list"""
|
||||
|
||||
filter = serializers.ChoiceField(
|
||||
choices=["pending", "ignore"], required=False
|
||||
)
|
||||
channel = serializers.CharField(required=False, help_text="channel ID")
|
||||
page = serializers.IntegerField(required=False)
|
||||
|
||||
|
||||
class DownloadListQueueDeleteQuerySerializer(serializers.Serializer):
|
||||
"""serialize bulk delete download queue query string"""
|
||||
|
||||
filter = serializers.ChoiceField(choices=["pending", "ignore"])
|
||||
|
||||
|
||||
class AddDownloadItemSerializer(serializers.Serializer):
|
||||
"""serialize single item to add"""
|
||||
|
||||
youtube_id = serializers.CharField()
|
||||
status = serializers.ChoiceField(choices=["pending", "ignore-force"])
|
||||
|
||||
|
||||
class AddToDownloadListSerializer(serializers.Serializer):
|
||||
"""serialize add to download queue data"""
|
||||
|
||||
data = AddDownloadItemSerializer(many=True)
|
||||
|
||||
|
||||
class AddToDownloadQuerySerializer(serializers.Serializer):
|
||||
"""add to queue query serializer"""
|
||||
|
||||
autostart = serializers.BooleanField(required=False)
|
||||
|
||||
|
||||
class DownloadQueueItemUpdateSerializer(serializers.Serializer):
|
||||
"""update single download queue item"""
|
||||
|
||||
status = serializers.ChoiceField(
|
||||
choices=["pending", "ignore", "ignore-force", "priority"]
|
||||
)
|
||||
|
||||
|
||||
class DownloadAggBucketSerializer(serializers.Serializer):
|
||||
"""serialize bucket"""
|
||||
|
||||
key = serializers.ListField(child=serializers.CharField())
|
||||
key_as_string = serializers.CharField()
|
||||
doc_count = serializers.IntegerField()
|
||||
|
||||
|
||||
class DownloadAggsSerializer(serializers.Serializer):
|
||||
"""serialize download channel bucket aggregations"""
|
||||
|
||||
doc_count_error_upper_bound = serializers.IntegerField()
|
||||
sum_other_doc_count = serializers.IntegerField()
|
||||
buckets = DownloadAggBucketSerializer(many=True)
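
A minimal sketch of exercising one of these serializers, assuming DRF is configured as in the file above; the aggregation dict is illustrative sample data, not a real response:

from download.serializers import DownloadAggsSerializer

sample_aggs = {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
        {
            "key": ["UCxxxxxxxx", "Example Channel"],
            "key_as_string": "Example Channel",
            "doc_count": 3,
        }
    ],
}

serializer = DownloadAggsSerializer(data=sample_aggs)
serializer.is_valid(raise_exception=True)
print(serializer.validated_data["buckets"][0]["doc_count"])  # 3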
|
@ -1,367 +0,0 @@
|
||||
"""
|
||||
Functionality:
|
||||
- handle download queue
|
||||
- linked with ta_download index
|
||||
"""
|
||||
|
||||
from datetime import datetime
|
||||
|
||||
from appsettings.src.config import AppConfig
|
||||
from common.src.es_connect import ElasticWrap, IndexPaginate
|
||||
from common.src.helper import get_duration_str, is_shorts, rand_sleep
|
||||
from download.src.subscriptions import ChannelSubscription
|
||||
from download.src.thumbnails import ThumbManager
|
||||
from download.src.yt_dlp_base import YtWrap
|
||||
from playlist.src.index import YoutubePlaylist
|
||||
from video.src.constants import VideoTypeEnum
|
||||
|
||||
|
||||
class PendingIndex:
|
||||
"""base class holding all export methods"""
|
||||
|
||||
def __init__(self):
|
||||
self.all_pending = False
|
||||
self.all_ignored = False
|
||||
self.all_videos = False
|
||||
self.all_channels = False
|
||||
self.channel_overwrites = False
|
||||
self.video_overwrites = False
|
||||
self.to_skip = False
|
||||
|
||||
def get_download(self):
|
||||
"""get a list of all pending videos in ta_download"""
|
||||
data = {
|
||||
"query": {"match_all": {}},
|
||||
"sort": [{"timestamp": {"order": "asc"}}],
|
||||
}
|
||||
all_results = IndexPaginate("ta_download", data).get_results()
|
||||
|
||||
self.all_pending = []
|
||||
self.all_ignored = []
|
||||
self.to_skip = []
|
||||
|
||||
for result in all_results:
|
||||
self.to_skip.append(result["youtube_id"])
|
||||
if result["status"] == "pending":
|
||||
self.all_pending.append(result)
|
||||
elif result["status"] == "ignore":
|
||||
self.all_ignored.append(result)
|
||||
|
||||
def get_indexed(self):
|
||||
"""get a list of all videos indexed"""
|
||||
data = {
|
||||
"query": {"match_all": {}},
|
||||
"sort": [{"published": {"order": "desc"}}],
|
||||
}
|
||||
self.all_videos = IndexPaginate("ta_video", data).get_results()
|
||||
for video in self.all_videos:
|
||||
self.to_skip.append(video["youtube_id"])
|
||||
|
||||
def get_channels(self):
|
||||
"""get a list of all channels indexed"""
|
||||
self.all_channels = []
|
||||
self.channel_overwrites = {}
|
||||
data = {
|
||||
"query": {"match_all": {}},
|
||||
"sort": [{"channel_id": {"order": "asc"}}],
|
||||
}
|
||||
channels = IndexPaginate("ta_channel", data).get_results()
|
||||
|
||||
for channel in channels:
|
||||
channel_id = channel["channel_id"]
|
||||
self.all_channels.append(channel_id)
|
||||
if channel.get("channel_overwrites"):
|
||||
self.channel_overwrites.update(
|
||||
{channel_id: channel.get("channel_overwrites")}
|
||||
)
|
||||
|
||||
self._map_overwrites()
|
||||
|
||||
def _map_overwrites(self):
|
||||
"""map video ids to channel ids overwrites"""
|
||||
self.video_overwrites = {}
|
||||
for video in self.all_pending:
|
||||
video_id = video["youtube_id"]
|
||||
channel_id = video["channel_id"]
|
||||
overwrites = self.channel_overwrites.get(channel_id, False)
|
||||
if overwrites:
|
||||
self.video_overwrites.update({video_id: overwrites})
|
||||
|
||||
|
||||
class PendingInteract:
|
||||
"""interact with items in download queue"""
|
||||
|
||||
def __init__(self, youtube_id=False, status=False):
|
||||
self.youtube_id = youtube_id
|
||||
self.status = status
|
||||
|
||||
def delete_item(self):
|
||||
"""delete single item from pending"""
|
||||
path = f"ta_download/_doc/{self.youtube_id}"
|
||||
_, _ = ElasticWrap(path).delete(refresh=True)
|
||||
|
||||
def delete_by_status(self):
|
||||
"""delete all matching item by status"""
|
||||
data = {"query": {"term": {"status": {"value": self.status}}}}
|
||||
path = "ta_download/_delete_by_query"
|
||||
_, _ = ElasticWrap(path).post(data=data)
|
||||
|
||||
def update_status(self):
|
||||
"""update status of pending item"""
|
||||
if self.status == "priority":
|
||||
data = {
|
||||
"doc": {
|
||||
"status": "pending",
|
||||
"auto_start": True,
|
||||
"message": None,
|
||||
}
|
||||
}
|
||||
else:
|
||||
data = {"doc": {"status": self.status}}
|
||||
|
||||
path = f"ta_download/_update/{self.youtube_id}/?refresh=true"
|
||||
_, _ = ElasticWrap(path).post(data=data)
|
||||
|
||||
def get_item(self):
|
||||
"""return pending item dict"""
|
||||
path = f"ta_download/_doc/{self.youtube_id}"
|
||||
response, status_code = ElasticWrap(path).get()
|
||||
return response["_source"], status_code
|
||||
|
||||
def get_channel(self):
|
||||
"""
|
||||
get channel metadata from queue to not depend on channel to be indexed
|
||||
"""
|
||||
data = {
|
||||
"size": 1,
|
||||
"query": {"term": {"channel_id": {"value": self.youtube_id}}},
|
||||
}
|
||||
response, _ = ElasticWrap("ta_download/_search").get(data=data)
|
||||
hits = response["hits"]["hits"]
|
||||
if not hits:
|
||||
channel_name = "NA"
|
||||
else:
|
||||
channel_name = hits[0]["_source"].get("channel_name", "NA")
|
||||
|
||||
return {
|
||||
"channel_id": self.youtube_id,
|
||||
"channel_name": channel_name,
|
||||
}
|
||||
|
||||
|
||||
class PendingList(PendingIndex):
|
||||
"""manage the pending videos list"""
|
||||
|
||||
yt_obs = {
|
||||
"noplaylist": True,
|
||||
"writethumbnail": True,
|
||||
"simulate": True,
|
||||
"check_formats": None,
|
||||
}
|
||||
|
||||
def __init__(self, youtube_ids=False, task=False):
|
||||
super().__init__()
|
||||
self.config = AppConfig().config
|
||||
self.youtube_ids = youtube_ids
|
||||
self.task = task
|
||||
self.to_skip = False
|
||||
self.missing_videos = False
|
||||
|
||||
def parse_url_list(self, auto_start=False):
|
||||
"""extract youtube ids from list"""
|
||||
self.missing_videos = []
|
||||
self.get_download()
|
||||
self.get_indexed()
|
||||
total = len(self.youtube_ids)
|
||||
for idx, entry in enumerate(self.youtube_ids):
|
||||
self._process_entry(entry, auto_start=auto_start)
|
||||
if not self.task:
|
||||
continue
|
||||
|
||||
self.task.send_progress(
|
||||
message_lines=[f"Extracting items {idx + 1}/{total}"],
|
||||
progress=(idx + 1) / total,
|
||||
)
|
||||
|
||||
def _process_entry(self, entry, auto_start=False):
|
||||
"""process single entry from url list"""
|
||||
vid_type = self._get_vid_type(entry)
|
||||
if entry["type"] == "video":
|
||||
self._add_video(entry["url"], vid_type, auto_start=auto_start)
|
||||
elif entry["type"] == "channel":
|
||||
self._parse_channel(entry["url"], vid_type)
|
||||
elif entry["type"] == "playlist":
|
||||
self._parse_playlist(entry["url"])
|
||||
else:
|
||||
raise ValueError(f"invalid url_type: {entry}")
|
||||
|
||||
@staticmethod
|
||||
def _get_vid_type(entry):
|
||||
"""add vid type enum if available"""
|
||||
vid_type_str = entry.get("vid_type")
|
||||
if not vid_type_str:
|
||||
return VideoTypeEnum.UNKNOWN
|
||||
|
||||
return VideoTypeEnum(vid_type_str)
|
||||
|
||||
def _add_video(self, url, vid_type, auto_start=False):
|
||||
"""add video to list"""
|
||||
if auto_start and url in set(
|
||||
i["youtube_id"] for i in self.all_pending
|
||||
):
|
||||
PendingInteract(youtube_id=url, status="priority").update_status()
|
||||
return
|
||||
|
||||
if url not in self.missing_videos and url not in self.to_skip:
|
||||
self.missing_videos.append((url, vid_type))
|
||||
else:
|
||||
print(f"{url}: skipped adding already indexed video to download.")
|
||||
|
||||
def _parse_channel(self, url, vid_type):
|
||||
"""add all videos of channel to list"""
|
||||
video_results = ChannelSubscription().get_last_youtube_videos(
|
||||
url, limit=False, query_filter=vid_type
|
||||
)
|
||||
for video_id, _, vid_type in video_results:
|
||||
self._add_video(video_id, vid_type)
|
||||
|
||||
def _parse_playlist(self, url):
|
||||
"""add all videos of playlist to list"""
|
||||
playlist = YoutubePlaylist(url)
|
||||
is_active = playlist.update_playlist()
|
||||
if not is_active:
|
||||
message = f"{playlist.youtube_id}: failed to extract metadata"
|
||||
print(message)
|
||||
raise ValueError(message)
|
||||
|
||||
entries = playlist.json_data["playlist_entries"]
|
||||
to_add = [i["youtube_id"] for i in entries if not i["downloaded"]]
|
||||
if not to_add:
|
||||
return
|
||||
|
||||
for video_id in to_add:
|
||||
# match vid_type later
|
||||
self._add_video(video_id, VideoTypeEnum.UNKNOWN)
|
||||
|
||||
def add_to_pending(self, status="pending", auto_start=False):
|
||||
"""add missing videos to pending list"""
|
||||
self.get_channels()
|
||||
|
||||
total = len(self.missing_videos)
|
||||
videos_added = []
|
||||
for idx, (youtube_id, vid_type) in enumerate(self.missing_videos):
|
||||
if self.task and self.task.is_stopped():
|
||||
break
|
||||
|
||||
print(f"{youtube_id}: [{idx + 1}/{total}]: add to queue")
|
||||
self._notify_add(idx, total)
|
||||
video_details = self.get_youtube_details(youtube_id, vid_type)
|
||||
if not video_details:
|
||||
rand_sleep(self.config)
|
||||
continue
|
||||
|
||||
video_details.update(
|
||||
{
|
||||
"status": status,
|
||||
"auto_start": auto_start,
|
||||
}
|
||||
)
|
||||
|
||||
url = video_details["vid_thumb_url"]
|
||||
ThumbManager(youtube_id).download_video_thumb(url)
|
||||
es_url = f"ta_download/_doc/{youtube_id}"
|
||||
_, _ = ElasticWrap(es_url).put(video_details)
|
||||
videos_added.append(youtube_id)
|
||||
|
||||
if idx + 1 != total:
|
||||
rand_sleep(self.config)
|
||||
|
||||
return videos_added
|
||||
|
||||
def _notify_add(self, idx, total):
|
||||
"""send notification for adding videos to download queue"""
|
||||
if not self.task:
|
||||
return
|
||||
|
||||
self.task.send_progress(
|
||||
message_lines=[
|
||||
"Adding new videos to download queue.",
|
||||
f"Extracting items {idx + 1}/{total}",
|
||||
],
|
||||
progress=(idx + 1) / total,
|
||||
)
|
||||
|
||||
def get_youtube_details(self, youtube_id, vid_type=VideoTypeEnum.VIDEOS):
|
||||
"""get details from youtubedl for single pending video"""
|
||||
vid = YtWrap(self.yt_obs, self.config).extract(youtube_id)
|
||||
if not vid:
|
||||
return False
|
||||
|
||||
if vid.get("id") != youtube_id:
|
||||
# skip premium videos with different id
|
||||
print(f"{youtube_id}: skipping premium video, id not matching")
|
||||
return False
|
||||
# stop if video is streaming live now
|
||||
if vid["live_status"] in ["is_upcoming", "is_live"]:
|
||||
print(f"{youtube_id}: skip is_upcoming or is_live")
|
||||
return False
|
||||
|
||||
if vid["live_status"] == "was_live":
|
||||
vid_type = VideoTypeEnum.STREAMS
|
||||
else:
|
||||
if self._check_shorts(vid):
|
||||
vid_type = VideoTypeEnum.SHORTS
|
||||
else:
|
||||
vid_type = VideoTypeEnum.VIDEOS
|
||||
|
||||
if not vid.get("channel"):
|
||||
print(f"{youtube_id}: skip video not part of channel")
|
||||
return False
|
||||
|
||||
return self._parse_youtube_details(vid, vid_type)
|
||||
|
||||
@staticmethod
|
||||
def _check_shorts(vid):
|
||||
"""check if vid is shorts video"""
|
||||
if vid["width"] > vid["height"]:
|
||||
return False
|
||||
|
||||
duration = vid.get("duration")
|
||||
if duration and isinstance(duration, int):
|
||||
if duration > 3 * 60:
|
||||
return False
|
||||
|
||||
return is_shorts(vid["id"])
|
||||
|
||||
def _parse_youtube_details(self, vid, vid_type=VideoTypeEnum.VIDEOS):
|
||||
"""parse response"""
|
||||
vid_id = vid.get("id")
|
||||
|
||||
# build dict
|
||||
youtube_details = {
|
||||
"youtube_id": vid_id,
|
||||
"channel_name": vid["channel"],
|
||||
"vid_thumb_url": vid["thumbnail"],
|
||||
"title": vid["title"],
|
||||
"channel_id": vid["channel_id"],
|
||||
"duration": get_duration_str(vid["duration"]),
|
||||
"published": self._build_published(vid),
|
||||
"timestamp": int(datetime.now().timestamp()),
|
||||
"vid_type": vid_type.value,
|
||||
"channel_indexed": vid["channel_id"] in self.all_channels,
|
||||
}
|
||||
|
||||
return youtube_details
|
||||
|
||||
@staticmethod
|
||||
def _build_published(vid):
|
||||
"""build published date or timestamp"""
|
||||
timestamp = vid["timestamp"]
|
||||
if timestamp:
|
||||
return timestamp
|
||||
|
||||
upload_date = vid["upload_date"]
|
||||
upload_date_time = datetime.strptime(upload_date, "%Y%m%d")
|
||||
published = upload_date_time.strftime("%Y-%m-%d")
|
||||
|
||||
return published
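
A short usage sketch for the queue flow above, assuming the surrounding Tube Archivist modules are importable; the video ID is a placeholder:

from download.src.queue import PendingList

url_list = [{"type": "video", "url": "dQw4w9WgXcQ", "vid_type": "videos"}]

pending = PendingList(youtube_ids=url_list)
pending.parse_url_list()  # splits IDs into missing vs. already queued/indexed
added = pending.add_to_pending(status="pending", auto_start=False)
print(f"queued {len(added)} new videos")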
|
@ -1,441 +0,0 @@
|
||||
"""
|
||||
Functionality:
|
||||
- handle channel subscriptions
|
||||
- handle playlist subscriptions
|
||||
"""
|
||||
|
||||
from appsettings.src.config import AppConfig
|
||||
from channel.src.index import YoutubeChannel
|
||||
from common.src.es_connect import IndexPaginate
|
||||
from common.src.helper import is_missing, rand_sleep
|
||||
from common.src.urlparser import Parser
|
||||
from download.src.thumbnails import ThumbManager
|
||||
from download.src.yt_dlp_base import YtWrap
|
||||
from playlist.src.index import YoutubePlaylist
|
||||
from video.src.constants import VideoTypeEnum
|
||||
from video.src.index import YoutubeVideo
|
||||
|
||||
|
||||
class ChannelSubscription:
|
||||
"""manage the list of channels subscribed"""
|
||||
|
||||
def __init__(self, task=False):
|
||||
self.config = AppConfig().config
|
||||
self.task = task
|
||||
|
||||
@staticmethod
|
||||
def get_channels(subscribed_only=True):
|
||||
"""get a list of all channels subscribed to"""
|
||||
data = {
|
||||
"sort": [{"channel_name.keyword": {"order": "asc"}}],
|
||||
}
|
||||
if subscribed_only:
|
||||
data["query"] = {"term": {"channel_subscribed": {"value": True}}}
|
||||
else:
|
||||
data["query"] = {"match_all": {}}
|
||||
|
||||
all_channels = IndexPaginate("ta_channel", data).get_results()
|
||||
|
||||
return all_channels
|
||||
|
||||
def get_last_youtube_videos(
|
||||
self,
|
||||
channel_id,
|
||||
limit=True,
|
||||
query_filter=None,
|
||||
channel_overwrites=None,
|
||||
):
|
||||
"""get a list of last videos from channel"""
|
||||
query_handler = VideoQueryBuilder(self.config, channel_overwrites)
|
||||
queries = query_handler.build_queries(query_filter)
|
||||
last_videos = []
|
||||
|
||||
for vid_type_enum, limit_amount in queries:
|
||||
obs = {
|
||||
"skip_download": True,
|
||||
"extract_flat": True,
|
||||
}
|
||||
vid_type = vid_type_enum.value
|
||||
|
||||
if limit:
|
||||
obs["playlistend"] = limit_amount
|
||||
|
||||
url = f"https://www.youtube.com/channel/{channel_id}/{vid_type}"
|
||||
channel_query = YtWrap(obs, self.config).extract(url)
|
||||
if not channel_query:
|
||||
continue
|
||||
|
||||
last_videos.extend(
|
||||
[
|
||||
(i["id"], i["title"], vid_type)
|
||||
for i in channel_query["entries"]
|
||||
]
|
||||
)
|
||||
|
||||
return last_videos
|
||||
|
||||
def find_missing(self):
|
||||
"""add missing videos from subscribed channels to pending"""
|
||||
all_channels = self.get_channels()
|
||||
if not all_channels:
|
||||
return False
|
||||
|
||||
missing_videos = []
|
||||
|
||||
total = len(all_channels)
|
||||
for idx, channel in enumerate(all_channels):
|
||||
channel_id = channel["channel_id"]
|
||||
print(f"{channel_id}: find missing videos.")
|
||||
last_videos = self.get_last_youtube_videos(
|
||||
channel_id,
|
||||
channel_overwrites=channel.get("channel_overwrites"),
|
||||
)
|
||||
|
||||
if last_videos:
|
||||
ids_to_add = is_missing([i[0] for i in last_videos])
|
||||
for video_id, _, vid_type in last_videos:
|
||||
if video_id in ids_to_add:
|
||||
missing_videos.append((video_id, vid_type))
|
||||
|
||||
if not self.task:
|
||||
continue
|
||||
|
||||
if self.task.is_stopped():
|
||||
self.task.send_progress(["Received Stop signal."])
|
||||
break
|
||||
|
||||
self.task.send_progress(
|
||||
message_lines=[f"Scanning Channel {idx + 1}/{total}"],
|
||||
progress=(idx + 1) / total,
|
||||
)
|
||||
rand_sleep(self.config)
|
||||
|
||||
return missing_videos
|
||||
|
||||
@staticmethod
|
||||
def change_subscribe(channel_id, channel_subscribed):
|
||||
"""subscribe or unsubscribe from channel and update"""
|
||||
channel = YoutubeChannel(channel_id)
|
||||
channel.build_json()
|
||||
channel.json_data["channel_subscribed"] = channel_subscribed
|
||||
channel.upload_to_es()
|
||||
channel.sync_to_videos()
|
||||
|
||||
return channel.json_data
|
||||
|
||||
|
||||
class VideoQueryBuilder:
|
||||
"""Build queries for yt-dlp."""
|
||||
|
||||
def __init__(self, config: dict, channel_overwrites: dict | None = None):
|
||||
self.config = config
|
||||
self.channel_overwrites = channel_overwrites or {}
|
||||
|
||||
def build_queries(
|
||||
self, video_type: VideoTypeEnum | None, limit: bool = True
|
||||
) -> list[tuple[VideoTypeEnum, int | None]]:
|
||||
"""Build queries for all or specific video type."""
|
||||
query_methods = {
|
||||
VideoTypeEnum.VIDEOS: self.videos_query,
|
||||
VideoTypeEnum.STREAMS: self.streams_query,
|
||||
VideoTypeEnum.SHORTS: self.shorts_query,
|
||||
}
|
||||
|
||||
if video_type:
|
||||
# build query for specific type
|
||||
query_method = query_methods.get(video_type)
|
||||
if query_method:
|
||||
query = query_method(limit)
|
||||
if query[1] != 0:
|
||||
return [query]
|
||||
return []
|
||||
|
||||
# Build and return queries for all video types
|
||||
queries = []
|
||||
for build_query in query_methods.values():
|
||||
query = build_query(limit)
|
||||
if query[1] != 0:
|
||||
queries.append(query)
|
||||
|
||||
return queries
|
||||
|
||||
def videos_query(self, limit: bool) -> tuple[VideoTypeEnum, int | None]:
|
||||
"""Build query for videos."""
|
||||
return self._build_generic_query(
|
||||
video_type=VideoTypeEnum.VIDEOS,
|
||||
overwrite_key="subscriptions_channel_size",
|
||||
config_key="channel_size",
|
||||
limit=limit,
|
||||
)
|
||||
|
||||
def streams_query(self, limit: bool) -> tuple[VideoTypeEnum, int | None]:
|
||||
"""Build query for streams."""
|
||||
return self._build_generic_query(
|
||||
video_type=VideoTypeEnum.STREAMS,
|
||||
overwrite_key="subscriptions_live_channel_size",
|
||||
config_key="live_channel_size",
|
||||
limit=limit,
|
||||
)
|
||||
|
||||
def shorts_query(self, limit: bool) -> tuple[VideoTypeEnum, int | None]:
|
||||
"""Build query for shorts."""
|
||||
return self._build_generic_query(
|
||||
video_type=VideoTypeEnum.SHORTS,
|
||||
overwrite_key="subscriptions_shorts_channel_size",
|
||||
config_key="shorts_channel_size",
|
||||
limit=limit,
|
||||
)
|
||||
|
||||
def _build_generic_query(
|
||||
self,
|
||||
video_type: VideoTypeEnum,
|
||||
overwrite_key: str,
|
||||
config_key: str,
|
||||
limit: bool,
|
||||
) -> tuple[VideoTypeEnum, int | None]:
|
||||
"""Generic query for video page scraping."""
|
||||
if not limit:
|
||||
return (video_type, None)
|
||||
|
||||
if (
|
||||
overwrite_key in self.channel_overwrites
|
||||
and self.channel_overwrites[overwrite_key] is not None
|
||||
):
|
||||
overwrite = self.channel_overwrites[overwrite_key]
|
||||
return (video_type, overwrite)
|
||||
|
||||
if overwrite := self.config["subscriptions"].get(config_key):
|
||||
return (video_type, overwrite)
|
||||
|
||||
return (video_type, 0)
|
||||
|
||||
|
||||
class PlaylistSubscription:
|
||||
"""manage the playlist download functionality"""
|
||||
|
||||
def __init__(self, task=False):
|
||||
self.config = AppConfig().config
|
||||
self.task = task
|
||||
|
||||
@staticmethod
|
||||
def get_playlists(subscribed_only=True):
|
||||
"""get a list of all active playlists"""
|
||||
data = {
|
||||
"sort": [{"playlist_channel.keyword": {"order": "desc"}}],
|
||||
}
|
||||
data["query"] = {
|
||||
"bool": {"must": [{"term": {"playlist_active": {"value": True}}}]}
|
||||
}
|
||||
if subscribed_only:
|
||||
data["query"]["bool"]["must"].append(
|
||||
{"term": {"playlist_subscribed": {"value": True}}}
|
||||
)
|
||||
|
||||
all_playlists = IndexPaginate("ta_playlist", data).get_results()
|
||||
|
||||
return all_playlists
|
||||
|
||||
def process_url_str(self, new_playlists, subscribed=True):
|
||||
"""process playlist subscribe form url_str"""
|
||||
for idx, playlist in enumerate(new_playlists):
|
||||
playlist_id = playlist["url"]
|
||||
if not playlist["type"] == "playlist":
|
||||
print(f"{playlist_id} not a playlist, skipping...")
|
||||
continue
|
||||
|
||||
playlist_h = YoutubePlaylist(playlist_id)
|
||||
playlist_h.build_json()
|
||||
if not playlist_h.json_data:
|
||||
message = f"{playlist_h.youtube_id}: failed to extract data"
|
||||
print(message)
|
||||
raise ValueError(message)
|
||||
|
||||
playlist_h.json_data["playlist_subscribed"] = subscribed
|
||||
playlist_h.upload_to_es()
|
||||
playlist_h.add_vids_to_playlist()
|
||||
self.channel_validate(playlist_h.json_data["playlist_channel_id"])
|
||||
|
||||
url = playlist_h.json_data["playlist_thumbnail"]
|
||||
thumb = ThumbManager(playlist_id, item_type="playlist")
|
||||
thumb.download_playlist_thumb(url)
|
||||
|
||||
if self.task:
|
||||
self.task.send_progress(
|
||||
message_lines=[
|
||||
f"Processing {idx + 1} of {len(new_playlists)}"
|
||||
],
|
||||
progress=(idx + 1) / len(new_playlists),
|
||||
)
|
||||
|
||||
@staticmethod
|
||||
def channel_validate(channel_id):
|
||||
"""make sure channel of playlist is there"""
|
||||
channel = YoutubeChannel(channel_id)
|
||||
channel.build_json(upload=True)
|
||||
|
||||
@staticmethod
|
||||
def change_subscribe(playlist_id, subscribe_status):
|
||||
"""change the subscribe status of a playlist"""
|
||||
playlist = YoutubePlaylist(playlist_id)
|
||||
playlist.build_json()
|
||||
playlist.json_data["playlist_subscribed"] = subscribe_status
|
||||
playlist.upload_to_es()
|
||||
return playlist.json_data
|
||||
|
||||
def find_missing(self):
|
||||
"""find videos in subscribed playlists not downloaded yet"""
|
||||
all_playlists = [i["playlist_id"] for i in self.get_playlists()]
|
||||
if not all_playlists:
|
||||
return False
|
||||
|
||||
missing_videos = []
|
||||
total = len(all_playlists)
|
||||
for idx, playlist_id in enumerate(all_playlists):
|
||||
playlist = YoutubePlaylist(playlist_id)
|
||||
is_active = playlist.update_playlist()
|
||||
if not is_active:
|
||||
playlist.deactivate()
|
||||
continue
|
||||
|
||||
playlist_entries = playlist.json_data["playlist_entries"]
|
||||
size_limit = self.config["subscriptions"]["channel_size"]
|
||||
if size_limit:
|
||||
del playlist_entries[size_limit:]
|
||||
|
||||
to_check = [
|
||||
i["youtube_id"]
|
||||
for i in playlist_entries
|
||||
if i["downloaded"] is False
|
||||
]
|
||||
needs_downloading = is_missing(to_check)
|
||||
missing_videos.extend(needs_downloading)
|
||||
|
||||
if not self.task:
|
||||
continue
|
||||
|
||||
if self.task.is_stopped():
|
||||
self.task.send_progress(["Received Stop signal."])
|
||||
break
|
||||
|
||||
self.task.send_progress(
|
||||
message_lines=[f"Scanning Playlists {idx + 1}/{total}"],
|
||||
progress=(idx + 1) / total,
|
||||
)
|
||||
rand_sleep(self.config)
|
||||
|
||||
return missing_videos
|
||||
|
||||
|
||||
class SubscriptionScanner:
|
||||
"""add missing videos to queue"""
|
||||
|
||||
def __init__(self, task=False):
|
||||
self.task = task
|
||||
self.missing_videos = False
|
||||
self.auto_start = AppConfig().config["subscriptions"].get("auto_start")
|
||||
|
||||
def scan(self):
|
||||
"""scan channels and playlists"""
|
||||
if self.task:
|
||||
self.task.send_progress(["Rescanning channels and playlists."])
|
||||
|
||||
self.missing_videos = []
|
||||
self.scan_channels()
|
||||
if self.task and not self.task.is_stopped():
|
||||
self.scan_playlists()
|
||||
|
||||
return self.missing_videos
|
||||
|
||||
def scan_channels(self):
|
||||
"""get missing from channels"""
|
||||
channel_handler = ChannelSubscription(task=self.task)
|
||||
missing = channel_handler.find_missing()
|
||||
if not missing:
|
||||
return
|
||||
|
||||
for vid_id, vid_type in missing:
|
||||
self.missing_videos.append(
|
||||
{"type": "video", "vid_type": vid_type, "url": vid_id}
|
||||
)
|
||||
|
||||
def scan_playlists(self):
|
||||
"""get missing from playlists"""
|
||||
playlist_handler = PlaylistSubscription(task=self.task)
|
||||
missing = playlist_handler.find_missing()
|
||||
if not missing:
|
||||
return
|
||||
|
||||
for i in missing:
|
||||
self.missing_videos.append(
|
||||
{
|
||||
"type": "video",
|
||||
"vid_type": VideoTypeEnum.VIDEOS.value,
|
||||
"url": i,
|
||||
}
|
||||
)
|
||||
|
||||
|
||||
class SubscriptionHandler:
|
||||
"""subscribe to channels and playlists from url_str"""
|
||||
|
||||
def __init__(self, url_str, task=False):
|
||||
self.url_str = url_str
|
||||
self.task = task
|
||||
self.to_subscribe = False
|
||||
|
||||
def subscribe(self, expected_type=False):
|
||||
"""subscribe to url_str items"""
|
||||
if self.task:
|
||||
self.task.send_progress(["Processing form content."])
|
||||
self.to_subscribe = Parser(self.url_str).parse()
|
||||
|
||||
total = len(self.to_subscribe)
|
||||
for idx, item in enumerate(self.to_subscribe):
|
||||
if self.task:
|
||||
self._notify(idx, item, total)
|
||||
|
||||
self.subscribe_type(item, expected_type=expected_type)
|
||||
|
||||
def subscribe_type(self, item, expected_type):
|
||||
"""process single item"""
|
||||
if item["type"] == "playlist":
|
||||
if expected_type and expected_type != "playlist":
|
||||
raise TypeError(
|
||||
f"expected {expected_type} url but got {item.get('type')}"
|
||||
)
|
||||
|
||||
PlaylistSubscription().process_url_str([item])
|
||||
return
|
||||
|
||||
if item["type"] == "video":
|
||||
# extract channel id from video
|
||||
video = YoutubeVideo(item["url"])
|
||||
video.get_from_youtube()
|
||||
video.process_youtube_meta()
|
||||
channel_id = video.channel_id
|
||||
elif item["type"] == "channel":
|
||||
channel_id = item["url"]
|
||||
else:
|
||||
raise ValueError("failed to subscribe to: " + item["url"])
|
||||
|
||||
if expected_type and expected_type != "channel":
|
||||
raise TypeError(
|
||||
f"expected {expected_type} url but got {item.get('type')}"
|
||||
)
|
||||
|
||||
self._subscribe(channel_id)
|
||||
|
||||
def _subscribe(self, channel_id):
|
||||
"""subscribe to channel"""
|
||||
_ = ChannelSubscription().change_subscribe(
|
||||
channel_id, channel_subscribed=True
|
||||
)
|
||||
|
||||
def _notify(self, idx, item, total):
|
||||
"""send notification message to redis"""
|
||||
subscribe_type = item["type"].title()
|
||||
message_lines = [
|
||||
f"Subscribe to {subscribe_type}",
|
||||
f"Progress: {idx + 1}/{total}",
|
||||
]
|
||||
self.task.send_progress(message_lines, progress=(idx + 1) / total)
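
A hedged sketch of how the scanner above can feed the download queue; the actual Celery task wiring lives outside this file:

from download.src.queue import PendingList
from download.src.subscriptions import SubscriptionScanner

scanner = SubscriptionScanner()
missing = scanner.scan()  # [{"type": "video", "vid_type": ..., "url": ...}, ...]
if missing:
    pending = PendingList(youtube_ids=missing)
    pending.parse_url_list()
    pending.add_to_pending(auto_start=scanner.auto_start)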
|
@ -1,219 +0,0 @@
|
||||
"""
|
||||
functionality:
|
||||
- base class to make all calls to yt-dlp
|
||||
- handle yt-dlp errors
|
||||
"""
|
||||
|
||||
from datetime import datetime
|
||||
from http import cookiejar
|
||||
from io import StringIO
|
||||
|
||||
import yt_dlp
|
||||
from appsettings.src.config import AppConfig
|
||||
from common.src.ta_redis import RedisArchivist
|
||||
from django.conf import settings
|
||||
|
||||
|
||||
class YtWrap:
|
||||
"""wrap calls to yt"""
|
||||
|
||||
OBS_BASE = {
|
||||
"default_search": "ytsearch",
|
||||
"quiet": True,
|
||||
"socket_timeout": 10,
|
||||
"extractor_retries": 3,
|
||||
"retries": 10,
|
||||
}
|
||||
|
||||
def __init__(self, obs_request, config=False):
|
||||
self.obs_request = obs_request
|
||||
self.config = config
|
||||
self.build_obs()
|
||||
|
||||
def build_obs(self):
|
||||
"""build yt-dlp obs"""
|
||||
self.obs = self.OBS_BASE.copy()
|
||||
self.obs.update(self.obs_request)
|
||||
if self.config:
|
||||
self._add_cookie()
|
||||
self._add_potoken()
|
||||
|
||||
if getattr(settings, "DEBUG", False):
|
||||
del self.obs["quiet"]
|
||||
print(self.obs)
|
||||
|
||||
def _add_cookie(self):
|
||||
"""add cookie if enabled"""
|
||||
if self.config["downloads"]["cookie_import"]:
|
||||
cookie_io = CookieHandler(self.config).get()
|
||||
self.obs["cookiefile"] = cookie_io
|
||||
|
||||
def _add_potoken(self):
|
||||
"""add potoken if enabled"""
|
||||
if self.config["downloads"].get("potoken"):
|
||||
potoken = POTokenHandler(self.config).get()
|
||||
self.obs.update(
|
||||
{
|
||||
"extractor_args": {
|
||||
"youtube": {
|
||||
"po_token": [potoken],
|
||||
"player-client": ["web", "default"],
|
||||
},
|
||||
}
|
||||
}
|
||||
)
|
||||
|
||||
def download(self, url):
|
||||
"""make download request"""
|
||||
self.obs.update({"check_formats": "selected"})
|
||||
with yt_dlp.YoutubeDL(self.obs) as ydl:
|
||||
try:
|
||||
ydl.download([url])
|
||||
except yt_dlp.utils.DownloadError as err:
|
||||
print(f"{url}: failed to download with message {err}")
|
||||
if "Temporary failure in name resolution" in str(err):
|
||||
raise ConnectionError("lost the internet, abort!") from err
|
||||
|
||||
return False, str(err)
|
||||
|
||||
self._validate_cookie()
|
||||
|
||||
return True, True
|
||||
|
||||
def extract(self, url):
|
||||
"""make extract request"""
|
||||
with yt_dlp.YoutubeDL(self.obs) as ydl:
|
||||
try:
|
||||
response = ydl.extract_info(url)
|
||||
except cookiejar.LoadError as err:
|
||||
print(f"cookie file is invalid: {err}")
|
||||
return False
|
||||
except yt_dlp.utils.ExtractorError as err:
|
||||
print(f"{url}: failed to extract: {err}, continue...")
|
||||
return False
|
||||
except yt_dlp.utils.DownloadError as err:
|
||||
if "This channel does not have a" in str(err):
|
||||
return False
|
||||
|
||||
print(f"{url}: failed to get info from youtube: {err}")
|
||||
if "Temporary failure in name resolution" in str(err):
|
||||
raise ConnectionError("lost the internet, abort!") from err
|
||||
|
||||
return False
|
||||
|
||||
self._validate_cookie()
|
||||
|
||||
return response
|
||||
|
||||
def _validate_cookie(self):
|
||||
"""check cookie and write it back for next use"""
|
||||
if not self.obs.get("cookiefile"):
|
||||
return
|
||||
|
||||
new_cookie = self.obs["cookiefile"].read()
|
||||
old_cookie = RedisArchivist().get_message_str("cookie")
|
||||
if new_cookie and old_cookie != new_cookie:
|
||||
print("refreshed stored cookie")
|
||||
RedisArchivist().set_message("cookie", new_cookie, save=True)
|
||||
|
||||
|
||||
class CookieHandler:
|
||||
"""handle youtube cookie for yt-dlp"""
|
||||
|
||||
def __init__(self, config):
|
||||
self.cookie_io = False
|
||||
self.config = config
|
||||
|
||||
def get(self):
|
||||
"""get cookie io stream"""
|
||||
cookie = RedisArchivist().get_message_str("cookie")
|
||||
self.cookie_io = StringIO(cookie)
|
||||
return self.cookie_io
|
||||
|
||||
def set_cookie(self, cookie):
|
||||
"""set cookie str and activate in config"""
|
||||
cookie_clean = cookie.strip("\x00")
|
||||
RedisArchivist().set_message("cookie", cookie_clean, save=True)
|
||||
AppConfig().update_config({"downloads": {"cookie_import": True}})
|
||||
self.config["downloads"]["cookie_import"] = True
|
||||
print("[cookie]: activated and stored in Redis")
|
||||
|
||||
@staticmethod
|
||||
def revoke():
|
||||
"""revoke cookie"""
|
||||
RedisArchivist().del_message("cookie")
|
||||
RedisArchivist().del_message("cookie:valid")
|
||||
AppConfig().update_config({"downloads": {"cookie_import": False}})
|
||||
print("[cookie]: revoked")
|
||||
|
||||
def validate(self):
|
||||
"""validate cookie using the liked videos playlist"""
|
||||
validation = RedisArchivist().get_message_dict("cookie:valid")
|
||||
if validation:
|
||||
print("[cookie]: used cached cookie validation")
|
||||
return True
|
||||
|
||||
print("[cookie] validating cookie")
|
||||
obs_request = {
|
||||
"skip_download": True,
|
||||
"extract_flat": True,
|
||||
}
|
||||
validator = YtWrap(obs_request, self.config)
|
||||
response = bool(validator.extract("LL"))
|
||||
self.store_validation(response)
|
||||
|
||||
# update in redis to avoid expiring
|
||||
modified = validator.obs["cookiefile"].getvalue().strip("\x00")
|
||||
if modified:
|
||||
cookie_clean = modified.strip("\x00")
|
||||
RedisArchivist().set_message("cookie", cookie_clean)
|
||||
|
||||
if not response:
|
||||
mess_dict = {
|
||||
"status": "message:download",
|
||||
"level": "error",
|
||||
"title": "Cookie validation failed, exiting...",
|
||||
"message": "",
|
||||
}
|
||||
RedisArchivist().set_message(
|
||||
"message:download", mess_dict, expire=4
|
||||
)
|
||||
print("[cookie]: validation failed, exiting...")
|
||||
|
||||
print(f"[cookie]: validation success: {response}")
|
||||
return response
|
||||
|
||||
@staticmethod
|
||||
def store_validation(response):
|
||||
"""remember last validation"""
|
||||
now = datetime.now()
|
||||
message = {
|
||||
"status": response,
|
||||
"validated": int(now.timestamp()),
|
||||
"validated_str": now.strftime("%Y-%m-%d %H:%M"),
|
||||
}
|
||||
RedisArchivist().set_message("cookie:valid", message, expire=3600)
|
||||
|
||||
|
||||
class POTokenHandler:
|
||||
"""handle po token"""
|
||||
|
||||
REDIS_KEY = "potoken"
|
||||
|
||||
def __init__(self, config):
|
||||
self.config = config
|
||||
|
||||
def get(self) -> str | None:
|
||||
"""get PO token"""
|
||||
potoken = RedisArchivist().get_message_str(self.REDIS_KEY)
|
||||
return potoken
|
||||
|
||||
def set_token(self, new_token: str) -> None:
|
||||
"""set new PO token"""
|
||||
RedisArchivist().set_message(self.REDIS_KEY, new_token)
|
||||
AppConfig().update_config({"downloads": {"potoken": True}})
|
||||
|
||||
def revoke_token(self) -> None:
|
||||
"""revoke token"""
|
||||
RedisArchivist().del_message(self.REDIS_KEY)
|
||||
AppConfig().update_config({"downloads": {"potoken": False}})
|
@ -1,471 +0,0 @@
|
||||
"""
|
||||
functionality:
|
||||
- handle yt_dlp
|
||||
- build options and post processor
|
||||
- download video files
|
||||
- move to archive
|
||||
"""
|
||||
|
||||
import os
|
||||
import shutil
|
||||
from datetime import datetime
|
||||
|
||||
from appsettings.src.config import AppConfig
|
||||
from channel.src.index import YoutubeChannel
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.es_connect import ElasticWrap, IndexPaginate
|
||||
from common.src.helper import (
|
||||
get_channel_overwrites,
|
||||
ignore_filelist,
|
||||
rand_sleep,
|
||||
)
|
||||
from common.src.ta_redis import RedisQueue
|
||||
from download.src.queue import PendingList
|
||||
from download.src.subscriptions import PlaylistSubscription
|
||||
from download.src.yt_dlp_base import YtWrap
|
||||
from playlist.src.index import YoutubePlaylist
|
||||
from video.src.comments import CommentList
|
||||
from video.src.constants import VideoTypeEnum
|
||||
from video.src.index import YoutubeVideo, index_new_video
|
||||
|
||||
|
||||
class DownloaderBase:
|
||||
"""base class for shared config"""
|
||||
|
||||
CACHE_DIR = EnvironmentSettings.CACHE_DIR
|
||||
MEDIA_DIR = EnvironmentSettings.MEDIA_DIR
|
||||
CHANNEL_QUEUE = "download:channel"
|
||||
PLAYLIST_QUEUE = "download:playlist:full"
|
||||
PLAYLIST_QUICK = "download:playlist:quick"
|
||||
VIDEO_QUEUE = "download:video"
|
||||
|
||||
def __init__(self, task):
|
||||
self.task = task
|
||||
self.config = AppConfig().config
|
||||
self.channel_overwrites = get_channel_overwrites()
|
||||
self.now = int(datetime.now().timestamp())
|
||||
|
||||
|
||||
class VideoDownloader(DownloaderBase):
|
||||
"""handle the video download functionality"""
|
||||
|
||||
def __init__(self, task=False):
|
||||
super().__init__(task)
|
||||
self.obs = False
|
||||
self._build_obs()
|
||||
|
||||
def run_queue(self, auto_only=False) -> tuple[int, int]:
|
||||
"""setup download queue in redis loop until no more items"""
|
||||
downloaded = 0
|
||||
failed = 0
|
||||
while True:
|
||||
video_data = self._get_next(auto_only)
|
||||
if self.task.is_stopped() or not video_data:
|
||||
self._reset_auto()
|
||||
break
|
||||
|
||||
if downloaded > 0:
|
||||
rand_sleep(self.config)
|
||||
|
||||
youtube_id = video_data["youtube_id"]
|
||||
channel_id = video_data["channel_id"]
|
||||
print(f"{youtube_id}: Downloading video")
|
||||
self._notify(video_data, "Validate download format")
|
||||
|
||||
success = self._dl_single_vid(youtube_id, channel_id)
|
||||
if not success:
|
||||
failed += 1
|
||||
continue
|
||||
|
||||
self._notify(video_data, "Add video metadata to index", progress=1)
|
||||
video_type = VideoTypeEnum(video_data["vid_type"])
|
||||
vid_dict = index_new_video(youtube_id, video_type=video_type)
|
||||
RedisQueue(self.CHANNEL_QUEUE).add(channel_id)
|
||||
RedisQueue(self.VIDEO_QUEUE).add(youtube_id)
|
||||
|
||||
self._notify(video_data, "Move downloaded file to archive")
|
||||
self.move_to_archive(vid_dict)
|
||||
self._delete_from_pending(youtube_id)
|
||||
downloaded += 1
|
||||
|
||||
# post processing
|
||||
DownloadPostProcess(self.task).run()
|
||||
|
||||
return downloaded, failed
|
||||
|
||||
def _notify(self, video_data, message, progress=False):
|
||||
"""send progress notification to task"""
|
||||
if not self.task:
|
||||
return
|
||||
|
||||
typ = VideoTypeEnum(video_data["vid_type"]).value.rstrip("s").title()
|
||||
title = video_data.get("title")
|
||||
self.task.send_progress(
|
||||
[f"Processing {typ}: {title}", message], progress=progress
|
||||
)
|
||||
|
||||
def _get_next(self, auto_only):
|
||||
"""get next item in queue"""
|
||||
must_list = [{"term": {"status": {"value": "pending"}}}]
|
||||
must_not_list = [{"exists": {"field": "message"}}]
|
||||
if auto_only:
|
||||
must_list.append({"term": {"auto_start": {"value": True}}})
|
||||
|
||||
data = {
|
||||
"size": 1,
|
||||
"query": {"bool": {"must": must_list, "must_not": must_not_list}},
|
||||
"sort": [
|
||||
{"auto_start": {"order": "desc"}},
|
||||
{"timestamp": {"order": "asc"}},
|
||||
],
|
||||
}
|
||||
path = "ta_download/_search"
|
||||
response, _ = ElasticWrap(path).get(data=data)
|
||||
if not response["hits"]["hits"]:
|
||||
return False
|
||||
|
||||
return response["hits"]["hits"][0]["_source"]
|
||||
|
||||
def _progress_hook(self, response):
|
||||
"""process the progress_hooks from yt_dlp"""
|
||||
progress = False
|
||||
try:
|
||||
size = response.get("_total_bytes_str")
|
||||
if size.strip() == "N/A":
|
||||
size = response.get("_total_bytes_estimate_str", "N/A")
|
||||
|
||||
percent = response["_percent_str"]
|
||||
progress = float(percent.strip("%")) / 100
|
||||
speed = response["_speed_str"]
|
||||
eta = response["_eta_str"]
|
||||
message = f"{percent} of {size} at {speed} - time left: {eta}"
|
||||
except KeyError:
|
||||
message = "processing"
|
||||
|
||||
if self.task:
|
||||
title = response["info_dict"]["title"]
|
||||
self.task.send_progress([title, message], progress=progress)
|
||||
|
||||
def _build_obs(self):
|
||||
"""collection to build all obs passed to yt-dlp"""
|
||||
self._build_obs_basic()
|
||||
self._build_obs_user()
|
||||
self._build_obs_postprocessors()
|
||||
|
||||
def _build_obs_basic(self):
|
||||
"""initial obs"""
|
||||
self.obs = {
|
||||
"merge_output_format": "mp4",
|
||||
"outtmpl": (self.CACHE_DIR + "/download/%(id)s.mp4"),
|
||||
"progress_hooks": [self._progress_hook],
|
||||
"noprogress": True,
|
||||
"continuedl": True,
|
||||
"writethumbnail": False,
|
||||
"noplaylist": True,
|
||||
"color": "no_color",
|
||||
}
|
||||
|
||||
def _build_obs_user(self):
|
||||
"""build user customized options"""
|
||||
if self.config["downloads"]["format"]:
|
||||
self.obs["format"] = self.config["downloads"]["format"]
|
||||
if self.config["downloads"]["format_sort"]:
|
||||
format_sort = self.config["downloads"]["format_sort"]
|
||||
format_sort_list = [i.strip() for i in format_sort.split(",")]
|
||||
self.obs["format_sort"] = format_sort_list
|
||||
if self.config["downloads"]["limit_speed"]:
|
||||
self.obs["ratelimit"] = (
|
||||
self.config["downloads"]["limit_speed"] * 1024
|
||||
)
|
||||
|
||||
throttle = self.config["downloads"]["throttledratelimit"]
|
||||
if throttle:
|
||||
self.obs["throttledratelimit"] = throttle * 1024
|
||||
|
||||
def _build_obs_postprocessors(self):
|
||||
"""add postprocessor to obs"""
|
||||
postprocessors = []
|
||||
|
||||
if self.config["downloads"]["add_metadata"]:
|
||||
postprocessors.append(
|
||||
{
|
||||
"key": "FFmpegMetadata",
|
||||
"add_chapters": True,
|
||||
"add_metadata": True,
|
||||
}
|
||||
)
|
||||
postprocessors.append(
|
||||
{
|
||||
"key": "MetadataFromField",
|
||||
"formats": [
|
||||
"%(title)s:%(meta_title)s",
|
||||
"%(uploader)s:%(meta_artist)s",
|
||||
":(?P<album>)",
|
||||
],
|
||||
"when": "pre_process",
|
||||
}
|
||||
)
|
||||
|
||||
if self.config["downloads"]["add_thumbnail"]:
|
||||
postprocessors.append(
|
||||
{
|
||||
"key": "EmbedThumbnail",
|
||||
"already_have_thumbnail": True,
|
||||
}
|
||||
)
|
||||
self.obs["writethumbnail"] = True
|
||||
|
||||
self.obs["postprocessors"] = postprocessors
|
||||
|
||||
def _set_overwrites(self, obs: dict, channel_id: str) -> None:
|
||||
"""add overwrites to obs"""
|
||||
overwrites = self.channel_overwrites.get(channel_id)
|
||||
if overwrites and overwrites.get("download_format"):
|
||||
obs["format"] = overwrites.get("download_format")
|
||||
|
||||
def _dl_single_vid(self, youtube_id: str, channel_id: str) -> bool:
|
||||
"""download single video"""
|
||||
obs = self.obs.copy()
|
||||
self._set_overwrites(obs, channel_id)
|
||||
dl_cache = os.path.join(self.CACHE_DIR, "download")
|
||||
|
||||
success, message = YtWrap(obs, self.config).download(youtube_id)
|
||||
if not success:
|
||||
self._handle_error(youtube_id, message)
|
||||
|
||||
if self.obs["writethumbnail"]:
|
||||
# webp files don't get cleaned up automatically
|
||||
all_cached = ignore_filelist(os.listdir(dl_cache))
|
||||
to_clean = [i for i in all_cached if not i.endswith(".mp4")]
|
||||
for file_name in to_clean:
|
||||
file_path = os.path.join(dl_cache, file_name)
|
||||
os.remove(file_path)
|
||||
|
||||
return success
|
||||
|
||||
@staticmethod
|
||||
def _handle_error(youtube_id, message):
|
||||
"""store error message"""
|
||||
data = {"doc": {"message": message}}
|
||||
_, _ = ElasticWrap(f"ta_download/_update/{youtube_id}").post(data=data)
|
||||
|
||||
def move_to_archive(self, vid_dict):
|
||||
"""move downloaded video from cache to archive"""
|
||||
host_uid = EnvironmentSettings.HOST_UID
|
||||
host_gid = EnvironmentSettings.HOST_GID
|
||||
# make folder
|
||||
folder = os.path.join(
|
||||
self.MEDIA_DIR, vid_dict["channel"]["channel_id"]
|
||||
)
|
||||
if not os.path.exists(folder):
|
||||
os.makedirs(folder)
|
||||
if host_uid and host_gid:
|
||||
os.chown(folder, host_uid, host_gid)
|
||||
# move media file
|
||||
media_file = vid_dict["youtube_id"] + ".mp4"
|
||||
old_path = os.path.join(self.CACHE_DIR, "download", media_file)
|
||||
new_path = os.path.join(self.MEDIA_DIR, vid_dict["media_url"])
|
||||
# move media file and fix permission
|
||||
shutil.move(old_path, new_path, copy_function=shutil.copyfile)
|
||||
if host_uid and host_gid:
|
||||
os.chown(new_path, host_uid, host_gid)
|
||||
|
||||
@staticmethod
|
||||
def _delete_from_pending(youtube_id):
|
||||
"""delete downloaded video from pending index if its there"""
|
||||
path = f"ta_download/_doc/{youtube_id}?refresh=true"
|
||||
_, _ = ElasticWrap(path).delete()
|
||||
|
||||
def _reset_auto(self):
|
||||
"""reset autostart to defaults after queue stop"""
|
||||
path = "ta_download/_update_by_query"
|
||||
data = {
|
||||
"query": {"term": {"auto_start": {"value": True}}},
|
||||
"script": {
|
||||
"source": "ctx._source.auto_start = false",
|
||||
"lang": "painless",
|
||||
},
|
||||
}
|
||||
response, _ = ElasticWrap(path).post(data=data)
|
||||
updated = response.get("updated")
|
||||
if updated:
|
||||
print(f"[download] reset auto start on {updated} videos.")
|
||||
|
||||
|
||||
class DownloadPostProcess(DownloaderBase):
|
||||
"""handle task to run after download queue finishes"""
|
    def run(self):
        """run all functions"""
        self.auto_delete_all()
        self.auto_delete_overwrites()
        self.refresh_playlist()
        self.match_videos()
        self.get_comments()

    def auto_delete_all(self):
        """handle auto delete"""
        autodelete_days = self.config["downloads"]["autodelete_days"]
        if not autodelete_days:
            return

        print(f"auto delete older than {autodelete_days} days")
        now_lte = str(self.now - autodelete_days * 24 * 60 * 60)
        channel_overwrite = "channel.channel_overwrites.autodelete_days"
        data = {
            "query": {
                "bool": {
                    "must": [
                        {"range": {"player.watched_date": {"lte": now_lte}}},
                        {"term": {"player.watched": True}},
                    ],
                    "must_not": [
                        {"exists": {"field": channel_overwrite}},
                    ],
                }
            },
            "sort": [{"player.watched_date": {"order": "asc"}}],
        }
        self._auto_delete_watched(data)

    def auto_delete_overwrites(self):
        """handle per channel auto delete from overwrites"""
        for channel_id, value in self.channel_overwrites.items():
            if "autodelete_days" in value:
                autodelete_days = value.get("autodelete_days")
                print(f"{channel_id}: delete older than {autodelete_days}d")
                now_lte = str(self.now - autodelete_days * 24 * 60 * 60)
                must_list = [
                    {"range": {"player.watched_date": {"lte": now_lte}}},
                    {"term": {"channel.channel_id": {"value": channel_id}}},
                    {"term": {"player.watched": True}},
                ]
                data = {
                    "query": {"bool": {"must": must_list}},
                    "sort": [{"player.watched_date": {"order": "desc"}}],
                }
                self._auto_delete_watched(data)

    @staticmethod
    def _auto_delete_watched(data):
        """delete watched videos after x days"""
        to_delete = IndexPaginate("ta_video", data).get_results()
        if not to_delete:
            return

        for video in to_delete:
            youtube_id = video["youtube_id"]
            print(f"{youtube_id}: auto delete video")
            YoutubeVideo(youtube_id).delete_media_file()

        print("add deleted to ignore list")
        vids = [{"type": "video", "url": i["youtube_id"]} for i in to_delete]
        pending = PendingList(youtube_ids=vids)
        pending.parse_url_list()
        _ = pending.add_to_pending(status="ignore")

    def refresh_playlist(self) -> None:
        """match videos with playlists"""
        self.add_playlists_to_refresh()

        queue = RedisQueue(self.PLAYLIST_QUEUE)
        while True:
            total = queue.max_score()
            playlist_id, idx = queue.get_next()
            if not playlist_id or not idx or not total:
                break

            playlist = YoutubePlaylist(playlist_id)
            playlist.update_playlist(skip_on_empty=True)

            if not self.task:
                continue

            channel_name = playlist.json_data["playlist_channel"]
            playlist_title = playlist.json_data["playlist_name"]
            message = [
                f"Post Processing Playlists for: {channel_name}",
                f"{playlist_title} [{idx}/{total}]",
            ]
            progress = idx / total
            self.task.send_progress(message, progress=progress)
            rand_sleep(self.config)

    def add_playlists_to_refresh(self) -> None:
        """add playlists to refresh"""
        if self.task:
            message = ["Post Processing Playlists", "Scanning for Playlists"]
            self.task.send_progress(message)

        self._add_playlist_sub()
        self._add_channel_playlists()
        self._add_video_playlists()

    def _add_playlist_sub(self):
        """add subscribed playlists to refresh"""
        subs = PlaylistSubscription().get_playlists()
        to_add = [i["playlist_id"] for i in subs]
        RedisQueue(self.PLAYLIST_QUEUE).add_list(to_add)

    def _add_channel_playlists(self):
        """add playlists from channels to refresh"""
        queue = RedisQueue(self.CHANNEL_QUEUE)
        while True:
            channel_id, _ = queue.get_next()
            if not channel_id:
                break

            channel = YoutubeChannel(channel_id)
            channel.get_from_es()
            overwrites = channel.get_overwrites()
            if "index_playlists" in overwrites:
                channel.get_all_playlists()
                to_add = [i[0] for i in channel.all_playlists]
                RedisQueue(self.PLAYLIST_QUEUE).add_list(to_add)

    def _add_video_playlists(self):
        """add other playlists for quick sync"""
        all_playlists = RedisQueue(self.PLAYLIST_QUEUE).get_all()
        must_not = [{"terms": {"playlist_id": all_playlists}}]
        video_ids = RedisQueue(self.VIDEO_QUEUE).get_all()
        must = [{"terms": {"playlist_entries.youtube_id": video_ids}}]
        data = {
            "query": {"bool": {"must_not": must_not, "must": must}},
            "_source": ["playlist_id"],
        }
        playlists = IndexPaginate("ta_playlist", data).get_results()
        to_add = [i["playlist_id"] for i in playlists]
        RedisQueue(self.PLAYLIST_QUICK).add_list(to_add)

    def match_videos(self) -> None:
        """scan rest of indexed playlists to match videos"""
        queue = RedisQueue(self.PLAYLIST_QUICK)
        while True:
            total = queue.max_score()
            playlist_id, idx = queue.get_next()
            if not playlist_id or not idx or not total:
                break

            playlist = YoutubePlaylist(playlist_id)
            playlist.get_from_es()
            playlist.add_vids_to_playlist()
            playlist.remove_vids_from_playlist()

            if not self.task:
                continue

            message = [
                "Post Processing Playlists.",
                f"Validate Playlists: - {idx}/{total}",
            ]
            progress = idx / total
            self.task.send_progress(message, progress=progress)

    def get_comments(self):
        """get comments from youtube"""
        video_queue = RedisQueue(self.VIDEO_QUEUE)
        comment_list = CommentList(task=self.task)
        comment_list.add(video_ids=video_queue.get_all())

        video_queue.clear()
        comment_list.index()
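A minimal sketch of the cleanup query that auto_delete_all builds above, assuming an example autodelete_days of 30; the cutoff is plain epoch-seconds arithmetic and the field names come from the code above:

import time

autodelete_days = 30  # assumed example value from config["downloads"]
now_lte = str(int(time.time()) - autodelete_days * 24 * 60 * 60)

# watched documents older than the cutoff, without a per-channel
# autodelete override, returned oldest first
query = {
    "query": {
        "bool": {
            "must": [
                {"range": {"player.watched_date": {"lte": now_lte}}},
                {"term": {"player.watched": True}},
            ],
            "must_not": [
                {"exists": {"field": "channel.channel_overwrites.autodelete_days"}},
            ],
        }
    },
    "sort": [{"player.watched_date": {"order": "asc"}}],
}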
@@ -1,18 +0,0 @@
"""all download API urls"""

from django.urls import path
from download import views

urlpatterns = [
    path("", views.DownloadApiListView.as_view(), name="api-download-list"),
    path(
        "aggs/",
        views.DownloadAggsApiView.as_view(),
        name="api-download-aggs",
    ),
    path(
        "<slug:video_id>/",
        views.DownloadApiView.as_view(),
        name="api-download",
    ),
]
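For orientation, a hedged sketch of how these routes could be called from Python with requests; the base URL and token header are placeholder assumptions, not values from this diff:

import requests

BASE = "http://localhost:8000/api/download"  # assumed local instance
HEADERS = {"Authorization": "Token <your-token>"}  # assumed auth token

# list pending items (DownloadApiListView GET)
pending = requests.get(
    f"{BASE}/", params={"filter": "pending"}, headers=HEADERS, timeout=10
)

# queue a video and start downloading (DownloadApiListView POST)
payload = {"data": [{"youtube_id": "dQw4w9WgXcQ", "status": "pending"}]}  # example ID
queued = requests.post(
    f"{BASE}/", json=payload, params={"autostart": True}, headers=HEADERS, timeout=10
)

# per-channel aggregation of the queue (DownloadAggsApiView GET)
aggs = requests.get(f"{BASE}/aggs/", headers=HEADERS, timeout=10)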
@@ -1,292 +0,0 @@
"""all download API views"""

from common.serializers import (
    AsyncTaskResponseSerializer,
    ErrorResponseSerializer,
)
from common.views_base import AdminOnly, ApiBaseView
from download.serializers import (
    AddToDownloadListSerializer,
    AddToDownloadQuerySerializer,
    DownloadAggsSerializer,
    DownloadItemSerializer,
    DownloadListQuerySerializer,
    DownloadListQueueDeleteQuerySerializer,
    DownloadListSerializer,
    DownloadQueueItemUpdateSerializer,
)
from download.src.queue import PendingInteract
from drf_spectacular.utils import OpenApiResponse, extend_schema
from rest_framework.response import Response
from task.tasks import download_pending, extrac_dl


class DownloadApiListView(ApiBaseView):
    """resolves to /api/download/
    GET: returns latest videos in the download queue
    POST: add a list of videos to download queue
    DELETE: remove items based on query filter
    """

    search_base = "ta_download/_search/"
    valid_filter = ["pending", "ignore"]
    permission_classes = [AdminOnly]

    @extend_schema(
        responses={
            200: OpenApiResponse(DownloadListSerializer()),
        },
        parameters=[DownloadListQuerySerializer()],
    )
    def get(self, request):
        """get download queue list"""
        query_filter = request.GET.get("filter", False)
        self.data.update(
            {
                "sort": [
                    {"auto_start": {"order": "desc"}},
                    {"timestamp": {"order": "asc"}},
                ],
            }
        )

        serializer = DownloadListQuerySerializer(data=request.query_params)
        serializer.is_valid(raise_exception=True)
        validated_data = serializer.validated_data

        must_list = []
        query_filter = validated_data.get("filter")
        if query_filter:
            must_list.append({"term": {"status": {"value": query_filter}}})

        filter_channel = validated_data.get("channel")
        if filter_channel:
            must_list.append(
                {"term": {"channel_id": {"value": filter_channel}}}
            )

        self.data["query"] = {"bool": {"must": must_list}}

        self.get_document_list(request)
        serializer = DownloadListSerializer(self.response)

        return Response(serializer.data)

    @staticmethod
    @extend_schema(
        request=AddToDownloadListSerializer(),
        parameters=[AddToDownloadQuerySerializer()],
        responses={
            200: OpenApiResponse(
                AsyncTaskResponseSerializer(),
                description="New async task started",
            ),
            400: OpenApiResponse(
                ErrorResponseSerializer(), description="Bad request"
            ),
        },
    )
    def post(request):
        """add list of videos to download queue"""
        data_serializer = AddToDownloadListSerializer(data=request.data)
        data_serializer.is_valid(raise_exception=True)
        validated_data = data_serializer.validated_data

        query_serializer = AddToDownloadQuerySerializer(
            data=request.query_params
        )
        query_serializer.is_valid(raise_exception=True)
        validated_query = query_serializer.validated_data

        auto_start = validated_query.get("autostart")
        print(f"auto_start: {auto_start}")
        to_add = validated_data["data"]

        pending = [i["youtube_id"] for i in to_add if i["status"] == "pending"]
        url_str = " ".join(pending)
        task = extrac_dl.delay(url_str, auto_start=auto_start)

        message = {
            "message": "add to queue task started",
            "task_id": task.id,
        }
        response_serializer = AsyncTaskResponseSerializer(message)

        return Response(response_serializer.data)

    @extend_schema(
        parameters=[DownloadListQueueDeleteQuerySerializer()],
        responses={
            204: OpenApiResponse(description="Download items deleted"),
            400: OpenApiResponse(
                ErrorResponseSerializer(), description="Bad request"
            ),
        },
    )
    def delete(self, request):
        """bulk delete download queue items by filter"""
        serializer = DownloadListQueueDeleteQuerySerializer(
            data=request.query_params
        )
        serializer.is_valid(raise_exception=True)
        validated_query = serializer.validated_data

        query_filter = validated_query["filter"]
        message = f"delete queue by status: {query_filter}"
        print(message)
        PendingInteract(status=query_filter).delete_by_status()

        return Response(status=204)


class DownloadApiView(ApiBaseView):
    """resolves to /api/download/<video_id>/
    GET: returns metadata dict of an item in the download queue
    POST: update status of item to pending or ignore
    DELETE: forget from download queue
    """

    search_base = "ta_download/_doc/"
    valid_status = ["pending", "ignore", "ignore-force", "priority"]
    permission_classes = [AdminOnly]

    @extend_schema(
        responses={
            200: OpenApiResponse(DownloadItemSerializer()),
            404: OpenApiResponse(
                ErrorResponseSerializer(),
                description="Download item not found",
            ),
        },
    )
    def get(self, request, video_id):
        # pylint: disable=unused-argument
        """get download queue item"""
        self.get_document(video_id)
        if not self.response:
            error = ErrorResponseSerializer(
                {"error": "Download item not found"}
            )
            return Response(error.data, status=404)

        response_serializer = DownloadItemSerializer(self.response)

        return Response(response_serializer.data, status=self.status_code)

    @extend_schema(
        request=DownloadQueueItemUpdateSerializer(),
        responses={
            200: OpenApiResponse(
                DownloadQueueItemUpdateSerializer(),
                description="Download item update",
            ),
            400: OpenApiResponse(
                ErrorResponseSerializer(), description="Bad request"
            ),
            404: OpenApiResponse(
                ErrorResponseSerializer(),
                description="Download item not found",
            ),
        },
    )
    def post(self, request, video_id):
        """post to video to change status"""
        data_serializer = DownloadQueueItemUpdateSerializer(data=request.data)
        data_serializer.is_valid(raise_exception=True)
        validated_data = data_serializer.validated_data
        item_status = validated_data["status"]

        if item_status == "ignore-force":
            extrac_dl.delay(video_id, status="ignore")
            return Response(data_serializer.data)

        _, status_code = PendingInteract(video_id).get_item()
        if status_code == 404:
            error = ErrorResponseSerializer(
                {"error": "Download item not found"}
            )
            return Response(error.data, status=404)

        print(f"{video_id}: change status to {item_status}")
        PendingInteract(video_id, item_status).update_status()
        if item_status == "priority":
            download_pending.delay(auto_only=True)

        return Response(data_serializer.data)

    @staticmethod
    @extend_schema(
        responses={
            204: OpenApiResponse(description="delete download item"),
            404: OpenApiResponse(
                ErrorResponseSerializer(),
                description="Download item not found",
            ),
        },
    )
    def delete(request, video_id):
        # pylint: disable=unused-argument
        """delete single video from queue"""
        print(f"{video_id}: delete from queue")
        PendingInteract(video_id).delete_item()

        return Response(status=204)


class DownloadAggsApiView(ApiBaseView):
    """resolves to /api/download/aggs/
    GET: get download aggregations
    """

    search_base = "ta_download/_search"
    valid_filter_view = ["ignore", "pending"]

    @extend_schema(
        parameters=[DownloadListQueueDeleteQuerySerializer()],
        responses={
            200: OpenApiResponse(DownloadAggsSerializer()),
            400: OpenApiResponse(
                ErrorResponseSerializer(), description="bad request"
            ),
        },
    )
    def get(self, request):
        """get aggs"""
        serializer = DownloadListQueueDeleteQuerySerializer(
            data=request.query_params
        )
        serializer.is_valid(raise_exception=True)
        validated_query = serializer.validated_data

        filter_view = validated_query.get("filter")
        if filter_view:
            if filter_view not in self.valid_filter_view:
                message = f"invalid filter: {filter_view}"
                return Response({"message": message}, status=400)

            self.data.update(
                {
                    "query": {"term": {"status": {"value": filter_view}}},
                }
            )

        self.data.update(
            {
                "aggs": {
                    "channel_downloads": {
                        "multi_terms": {
                            "size": 30,
                            "terms": [
                                {"field": "channel_name.keyword"},
                                {"field": "channel_id"},
                            ],
                            "order": {"_count": "desc"},
                        }
                    }
                }
            }
        )
        self.get_aggs()
        serializer = DownloadAggsSerializer(self.response["channel_downloads"])

        return Response(serializer.data)
@@ -1,92 +0,0 @@
"""playlist serializers"""

# pylint: disable=abstract-method

from common.serializers import PaginationSerializer
from rest_framework import serializers


class PlaylistEntrySerializer(serializers.Serializer):
    """serialize single playlist entry"""

    youtube_id = serializers.CharField()
    title = serializers.CharField()
    uploader = serializers.CharField()
    idx = serializers.IntegerField()
    downloaded = serializers.BooleanField()


class PlaylistSerializer(serializers.Serializer):
    """serialize playlist"""

    playlist_active = serializers.BooleanField()
    playlist_channel = serializers.CharField()
    playlist_channel_id = serializers.CharField()
    playlist_description = serializers.CharField()
    playlist_entries = PlaylistEntrySerializer(many=True)
    playlist_id = serializers.CharField()
    playlist_last_refresh = serializers.CharField()
    playlist_name = serializers.CharField()
    playlist_subscribed = serializers.BooleanField()
    playlist_thumbnail = serializers.CharField()
    playlist_type = serializers.ChoiceField(choices=["regular", "custom"])
    _index = serializers.CharField(required=False)
    _score = serializers.IntegerField(required=False)


class PlaylistListSerializer(serializers.Serializer):
    """serialize list of playlists"""

    data = PlaylistSerializer(many=True)
    paginate = PaginationSerializer()


class PlaylistListQuerySerializer(serializers.Serializer):
    """serialize playlist list query params"""

    channel = serializers.CharField(required=False)
    subscribed = serializers.BooleanField(required=False)
    type = serializers.ChoiceField(
        choices=["regular", "custom"], required=False
    )
    page = serializers.IntegerField(required=False)


class PlaylistSingleAddSerializer(serializers.Serializer):
    """single item to add"""

    playlist_id = serializers.CharField()
    playlist_subscribed = serializers.ChoiceField(choices=[True])


class PlaylistBulkAddSerializer(serializers.Serializer):
    """bulk add playlists serializers"""

    data = PlaylistSingleAddSerializer(many=True)


class PlaylistSingleUpdate(serializers.Serializer):
    """update state of single playlist"""

    playlist_subscribed = serializers.BooleanField()


class PlaylistListCustomPostSerializer(serializers.Serializer):
    """serialize list post custom playlist"""

    playlist_name = serializers.CharField()


class PlaylistCustomPostSerializer(serializers.Serializer):
    """serialize playlist custom action"""

    action = serializers.ChoiceField(
        choices=["create", "remove", "up", "down", "top", "bottom"]
    )
    video_id = serializers.CharField()


class PlaylistDeleteQuerySerializer(serializers.Serializer):
    """serialize playlist delete query params"""

    delete_videos = serializers.BooleanField(required=False)
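A short sketch of how the list-query serializer above behaves, using standard DRF validation; the parameter values are made up for illustration and the snippet assumes it is run inside the project's Django shell:

from playlist.serializers import PlaylistListQuerySerializer

params = {"channel": "UC-example", "subscribed": "true", "type": "regular"}
serializer = PlaylistListQuerySerializer(data=params)
serializer.is_valid(raise_exception=True)
# {'channel': 'UC-example', 'subscribed': True, 'type': 'regular'}
print(serializer.validated_data)

bad = PlaylistListQuerySerializer(data={"type": "unknown"})
print(bad.is_valid())  # False, "unknown" is not one of the declared choices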
@@ -1,10 +0,0 @@
"""playlist constants"""

import enum


class PlaylistTypesEnum(enum.Enum):
    """all playlist_type options"""

    REGULAR = "regular"
    CUSTOM = "custom"
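A tiny usage sketch of the enum above, mirroring the safe lookup pattern used by parse_type in query_building.py further down:

from playlist.src.constants import PlaylistTypesEnum

requested = "custom"
if hasattr(PlaylistTypesEnum, requested.upper()):
    print(PlaylistTypesEnum[requested.upper()].value)  # "custom"
else:
    raise ValueError(f"'{requested}' not in PlaylistTypesEnum")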
@ -1,444 +0,0 @@
|
||||
"""
|
||||
functionality:
|
||||
- get metadata from youtube for a playlist
|
||||
- index and update in es
|
||||
"""
|
||||
|
||||
import json
|
||||
from datetime import datetime
|
||||
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.es_connect import ElasticWrap, IndexPaginate
|
||||
from common.src.index_generic import YouTubeItem
|
||||
from download.src.thumbnails import ThumbManager
|
||||
from video.src import index as ta_video
|
||||
|
||||
|
||||
class YoutubePlaylist(YouTubeItem):
|
||||
"""represents a single youtube playlist"""
|
||||
|
||||
es_path = False
|
||||
index_name = "ta_playlist"
|
||||
yt_obs = {
|
||||
"extract_flat": True,
|
||||
"allow_playlist_files": True,
|
||||
}
|
||||
yt_base = "https://www.youtube.com/playlist?list="
|
||||
|
||||
def __init__(self, youtube_id):
|
||||
super().__init__(youtube_id)
|
||||
self.all_members = False
|
||||
self.nav = False
|
||||
|
||||
def build_json(self, scrape=False):
|
||||
"""collection to create json_data"""
|
||||
self.get_from_es()
|
||||
if self.json_data:
|
||||
subscribed = self.json_data.get("playlist_subscribed")
|
||||
else:
|
||||
subscribed = False
|
||||
|
||||
if scrape or not self.json_data:
|
||||
self.get_from_youtube()
|
||||
if not self.youtube_meta:
|
||||
self.json_data = False
|
||||
return
|
||||
|
||||
self.process_youtube_meta()
|
||||
self._ensure_channel()
|
||||
ids_found = self.get_local_vids()
|
||||
self.get_entries(ids_found)
|
||||
self.json_data["playlist_entries"] = self.all_members
|
||||
self.json_data["playlist_subscribed"] = subscribed
|
||||
|
||||
def process_youtube_meta(self):
|
||||
"""extract relevant fields from youtube"""
|
||||
try:
|
||||
playlist_thumbnail = self.youtube_meta["thumbnails"][-1]["url"]
|
||||
except IndexError:
|
||||
print(f"{self.youtube_id}: thumbnail extraction failed")
|
||||
playlist_thumbnail = False
|
||||
|
||||
self.json_data = {
|
||||
"playlist_id": self.youtube_id,
|
||||
"playlist_active": True,
|
||||
"playlist_name": self.youtube_meta["title"],
|
||||
"playlist_channel": self.youtube_meta["channel"],
|
||||
"playlist_channel_id": self.youtube_meta["channel_id"],
|
||||
"playlist_thumbnail": playlist_thumbnail,
|
||||
"playlist_description": self.youtube_meta["description"] or False,
|
||||
"playlist_last_refresh": int(datetime.now().timestamp()),
|
||||
"playlist_type": "regular",
|
||||
}
|
||||
|
||||
def _ensure_channel(self):
|
||||
"""make sure channel is indexed"""
|
||||
from channel.src.index import YoutubeChannel
|
||||
|
||||
channel_id = self.json_data["playlist_channel_id"]
|
||||
channel_handler = YoutubeChannel(channel_id)
|
||||
channel_handler.build_json(upload=True)
|
||||
|
||||
def get_local_vids(self) -> list[str]:
|
||||
"""get local video ids from youtube entries"""
|
||||
entries = self.youtube_meta["entries"]
|
||||
data = {
|
||||
"query": {"terms": {"youtube_id": [i["id"] for i in entries]}},
|
||||
"_source": ["youtube_id"],
|
||||
}
|
||||
indexed_vids = IndexPaginate("ta_video", data).get_results()
|
||||
ids_found = [i["youtube_id"] for i in indexed_vids]
|
||||
|
||||
return ids_found
|
||||
|
||||
def get_entries(self, ids_found) -> None:
|
||||
"""get all videos in playlist, match downloaded with ids_found"""
|
||||
all_members = []
|
||||
for idx, entry in enumerate(self.youtube_meta["entries"]):
|
||||
to_append = {
|
||||
"youtube_id": entry["id"],
|
||||
"title": entry["title"],
|
||||
"uploader": entry.get("channel"),
|
||||
"idx": idx,
|
||||
"downloaded": entry["id"] in ids_found,
|
||||
}
|
||||
all_members.append(to_append)
|
||||
|
||||
self.all_members = all_members
|
||||
|
||||
def get_playlist_art(self):
|
||||
"""download artwork of playlist"""
|
||||
url = self.json_data["playlist_thumbnail"]
|
||||
ThumbManager(self.youtube_id, item_type="playlist").download(url)
|
||||
|
||||
def add_vids_to_playlist(self):
|
||||
"""sync the playlist id to videos"""
|
||||
script = (
|
||||
'if (!ctx._source.containsKey("playlist")) '
|
||||
+ "{ctx._source.playlist = [params.playlist]} "
|
||||
+ "else if (!ctx._source.playlist.contains(params.playlist)) "
|
||||
+ "{ctx._source.playlist.add(params.playlist)} "
|
||||
+ "else {ctx.op = 'none'}"
|
||||
)
|
||||
|
||||
bulk_list = []
|
||||
for entry in self.json_data["playlist_entries"]:
|
||||
video_id = entry["youtube_id"]
|
||||
action = {"update": {"_id": video_id, "_index": "ta_video"}}
|
||||
source = {
|
||||
"script": {
|
||||
"source": script,
|
||||
"lang": "painless",
|
||||
"params": {"playlist": self.youtube_id},
|
||||
}
|
||||
}
|
||||
bulk_list.append(json.dumps(action))
|
||||
bulk_list.append(json.dumps(source))
|
||||
|
||||
# add last newline
|
||||
bulk_list.append("\n")
|
||||
query_str = "\n".join(bulk_list)
|
||||
|
||||
ElasticWrap("_bulk").post(query_str, ndjson=True)
|
||||
|
||||
def remove_vids_from_playlist(self):
|
||||
"""remove playlist ids from videos if needed"""
|
||||
needed = [i["youtube_id"] for i in self.json_data["playlist_entries"]]
|
||||
data = {
|
||||
"query": {"match": {"playlist": self.youtube_id}},
|
||||
"_source": ["youtube_id"],
|
||||
}
|
||||
data = {
|
||||
"query": {"term": {"playlist.keyword": {"value": self.youtube_id}}}
|
||||
}
|
||||
result = IndexPaginate("ta_video", data).get_results()
|
||||
to_remove = [
|
||||
i["youtube_id"] for i in result if i["youtube_id"] not in needed
|
||||
]
|
||||
s = "ctx._source.playlist.removeAll(Collections.singleton(params.rm))"
|
||||
for video_id in to_remove:
|
||||
query = {
|
||||
"script": {
|
||||
"source": s,
|
||||
"lang": "painless",
|
||||
"params": {"rm": self.youtube_id},
|
||||
},
|
||||
"query": {"match": {"youtube_id": video_id}},
|
||||
}
|
||||
path = "ta_video/_update_by_query"
|
||||
_, status_code = ElasticWrap(path).post(query)
|
||||
if status_code == 200:
|
||||
print(f"{self.youtube_id}: removed {video_id} from playlist")
|
||||
|
||||
def update_playlist(self, skip_on_empty=False):
|
||||
"""update metadata for playlist with data from YouTube"""
|
||||
self.build_json(scrape=True)
|
||||
if not self.json_data:
|
||||
# return false to deactivate
|
||||
return False
|
||||
|
||||
if skip_on_empty:
|
||||
has_item_downloaded = any(
|
||||
i["downloaded"] for i in self.json_data["playlist_entries"]
|
||||
)
|
||||
if not has_item_downloaded:
|
||||
return True
|
||||
|
||||
self.upload_to_es()
|
||||
self.add_vids_to_playlist()
|
||||
self.remove_vids_from_playlist()
|
||||
self.get_playlist_art()
|
||||
return True
|
||||
|
||||
def build_nav(self, youtube_id):
|
||||
"""find next and previous in playlist of a given youtube_id"""
|
||||
cache_root = EnvironmentSettings().get_cache_root()
|
||||
all_entries_available = self.json_data["playlist_entries"]
|
||||
all_entries = [i for i in all_entries_available if i["downloaded"]]
|
||||
current = [i for i in all_entries if i["youtube_id"] == youtube_id]
|
||||
# stop if not found or playlist of 1
|
||||
if not current or not len(all_entries) > 1:
|
||||
return
|
||||
|
||||
current_idx = all_entries.index(current[0])
|
||||
if current_idx == 0:
|
||||
previous_item = None
|
||||
else:
|
||||
previous_item = all_entries[current_idx - 1]
|
||||
prev_id = previous_item["youtube_id"]
|
||||
prev_thumb_path = ThumbManager(prev_id).vid_thumb_path()
|
||||
previous_item["vid_thumb"] = f"{cache_root}/{prev_thumb_path}"
|
||||
|
||||
if current_idx == len(all_entries) - 1:
|
||||
next_item = None
|
||||
else:
|
||||
next_item = all_entries[current_idx + 1]
|
||||
next_id = next_item["youtube_id"]
|
||||
next_thumb_path = ThumbManager(next_id).vid_thumb_path()
|
||||
next_item["vid_thumb"] = f"{cache_root}/{next_thumb_path}"
|
||||
|
||||
self.nav = {
|
||||
"playlist_meta": {
|
||||
"current_idx": current[0]["idx"],
|
||||
"playlist_id": self.youtube_id,
|
||||
"playlist_name": self.json_data["playlist_name"],
|
||||
"playlist_channel": self.json_data["playlist_channel"],
|
||||
},
|
||||
"playlist_previous": previous_item,
|
||||
"playlist_next": next_item,
|
||||
}
|
||||
return
|
||||
|
||||
def delete_metadata(self):
|
||||
"""delete metadata for playlist"""
|
||||
self.delete_videos_metadata()
|
||||
script = (
|
||||
"ctx._source.playlist.removeAll("
|
||||
+ "Collections.singleton(params.playlist)) "
|
||||
)
|
||||
data = {
|
||||
"query": {
|
||||
"term": {"playlist.keyword": {"value": self.youtube_id}}
|
||||
},
|
||||
"script": {
|
||||
"source": script,
|
||||
"lang": "painless",
|
||||
"params": {"playlist": self.youtube_id},
|
||||
},
|
||||
}
|
||||
_, _ = ElasticWrap("ta_video/_update_by_query").post(data)
|
||||
self.del_in_es()
|
||||
|
||||
def is_custom_playlist(self):
|
||||
self.get_from_es()
|
||||
return self.json_data["playlist_type"] == "custom"
|
||||
|
||||
def delete_videos_metadata(self, channel_id=None):
|
||||
"""delete video metadata for a specific channel"""
|
||||
self.get_from_es()
|
||||
playlist = self.json_data["playlist_entries"]
|
||||
i = 0
|
||||
while i < len(playlist):
|
||||
video_id = playlist[i]["youtube_id"]
|
||||
video = ta_video.YoutubeVideo(video_id)
|
||||
video.get_from_es()
|
||||
if (
|
||||
channel_id is None
|
||||
or video.json_data["channel"]["channel_id"] == channel_id
|
||||
):
|
||||
playlist.pop(i)
|
||||
self.remove_playlist_from_video(video_id)
|
||||
i -= 1
|
||||
i += 1
|
||||
self.set_playlist_thumbnail()
|
||||
self.upload_to_es()
|
||||
|
||||
def delete_videos_playlist(self):
|
||||
"""delete playlist with all videos"""
|
||||
print(f"{self.youtube_id}: delete playlist")
|
||||
self.get_from_es()
|
||||
all_youtube_id = [
|
||||
i["youtube_id"]
|
||||
for i in self.json_data["playlist_entries"]
|
||||
if i["downloaded"]
|
||||
]
|
||||
for youtube_id in all_youtube_id:
|
||||
ta_video.YoutubeVideo(youtube_id).delete_media_file()
|
||||
|
||||
self.delete_metadata()
|
||||
|
||||
def create(self, name):
|
||||
self.json_data = {
|
||||
"playlist_id": self.youtube_id,
|
||||
"playlist_active": False,
|
||||
"playlist_name": name,
|
||||
"playlist_last_refresh": int(datetime.now().timestamp()),
|
||||
"playlist_entries": [],
|
||||
"playlist_type": "custom",
|
||||
"playlist_channel": None,
|
||||
"playlist_channel_id": None,
|
||||
"playlist_description": False,
|
||||
"playlist_thumbnail": False,
|
||||
"playlist_subscribed": False,
|
||||
}
|
||||
self.upload_to_es()
|
||||
self.get_playlist_art()
|
||||
return True
|
||||
|
||||
def add_video_to_playlist(self, video_id):
|
||||
self.get_from_es()
|
||||
video_metadata = self.get_video_metadata(video_id)
|
||||
video_metadata["idx"] = len(self.json_data["playlist_entries"])
|
||||
|
||||
if not self.playlist_entries_contains(video_id):
|
||||
self.json_data["playlist_entries"].append(video_metadata)
|
||||
self.json_data["playlist_last_refresh"] = int(
|
||||
datetime.now().timestamp()
|
||||
)
|
||||
self.set_playlist_thumbnail()
|
||||
self.upload_to_es()
|
||||
video = ta_video.YoutubeVideo(video_id)
|
||||
video.get_from_es()
|
||||
if "playlist" not in video.json_data:
|
||||
video.json_data["playlist"] = []
|
||||
video.json_data["playlist"].append(self.youtube_id)
|
||||
video.upload_to_es()
|
||||
return True
|
||||
|
||||
def remove_playlist_from_video(self, video_id):
|
||||
video = ta_video.YoutubeVideo(video_id)
|
||||
video.get_from_es()
|
||||
if video.json_data is not None and "playlist" in video.json_data:
|
||||
video.json_data["playlist"].remove(self.youtube_id)
|
||||
video.upload_to_es()
|
||||
|
||||
def move_video(self, video_id, action, hide_watched=False):
|
||||
self.get_from_es()
|
||||
video_index = self.get_video_index(video_id)
|
||||
playlist = self.json_data["playlist_entries"]
|
||||
item = playlist[video_index]
|
||||
playlist.pop(video_index)
|
||||
if action == "remove":
|
||||
self.remove_playlist_from_video(item["youtube_id"])
|
||||
else:
|
||||
if action == "up":
|
||||
while True:
|
||||
video_index = max(0, video_index - 1)
|
||||
if (
|
||||
not hide_watched
|
||||
or video_index == 0
|
||||
or (
|
||||
not self.get_video_is_watched(
|
||||
playlist[video_index]["youtube_id"]
|
||||
)
|
||||
)
|
||||
):
|
||||
break
|
||||
elif action == "down":
|
||||
while True:
|
||||
video_index = min(len(playlist), video_index + 1)
|
||||
if (
|
||||
not hide_watched
|
||||
or video_index == len(playlist)
|
||||
or (
|
||||
not self.get_video_is_watched(
|
||||
playlist[video_index - 1]["youtube_id"]
|
||||
)
|
||||
)
|
||||
):
|
||||
break
|
||||
elif action == "top":
|
||||
video_index = 0
|
||||
else:
|
||||
video_index = len(playlist)
|
||||
playlist.insert(video_index, item)
|
||||
self.json_data["playlist_last_refresh"] = int(
|
||||
datetime.now().timestamp()
|
||||
)
|
||||
|
||||
for i, item in enumerate(playlist):
|
||||
item["idx"] = i
|
||||
|
||||
self.set_playlist_thumbnail()
|
||||
self.upload_to_es()
|
||||
|
||||
return True
|
||||
|
||||
def del_video(self, video_id):
|
||||
playlist = self.json_data["playlist_entries"]
|
||||
|
||||
i = 0
|
||||
while i < len(playlist):
|
||||
if video_id == playlist[i]["youtube_id"]:
|
||||
playlist.pop(i)
|
||||
self.set_playlist_thumbnail()
|
||||
i -= 1
|
||||
i += 1
|
||||
|
||||
def get_video_index(self, video_id):
|
||||
for i, child in enumerate(self.json_data["playlist_entries"]):
|
||||
if child["youtube_id"] == video_id:
|
||||
return i
|
||||
return -1
|
||||
|
||||
def playlist_entries_contains(self, video_id):
|
||||
return (
|
||||
len(
|
||||
list(
|
||||
filter(
|
||||
lambda x: x["youtube_id"] == video_id,
|
||||
self.json_data["playlist_entries"],
|
||||
)
|
||||
)
|
||||
)
|
||||
> 0
|
||||
)
|
||||
|
||||
def get_video_is_watched(self, video_id):
|
||||
video = ta_video.YoutubeVideo(video_id)
|
||||
video.get_from_es()
|
||||
return video.json_data["player"]["watched"]
|
||||
|
||||
def set_playlist_thumbnail(self):
|
||||
playlist = self.json_data["playlist_entries"]
|
||||
self.json_data["playlist_thumbnail"] = False
|
||||
|
||||
for video in playlist:
|
||||
url = ThumbManager(video["youtube_id"]).vid_thumb_path()
|
||||
if url is not None:
|
||||
self.json_data["playlist_thumbnail"] = url
|
||||
break
|
||||
self.get_playlist_art()
|
||||
|
||||
def get_video_metadata(self, video_id):
|
||||
video = ta_video.YoutubeVideo(video_id)
|
||||
video.get_from_es()
|
||||
video_json_data = {
|
||||
"youtube_id": video.json_data["youtube_id"],
|
||||
"title": video.json_data["title"],
|
||||
"uploader": video.json_data["channel"]["channel_name"],
|
||||
"idx": 0,
|
||||
"downloaded": "date_downloaded" in video.json_data
|
||||
and video.json_data["date_downloaded"] > 0,
|
||||
}
|
||||
return video_json_data
|
@@ -1,52 +0,0 @@
"""build query for playlists"""

from playlist.src.constants import PlaylistTypesEnum


class QueryBuilder:
    """contain functionality"""

    def __init__(self, **kwargs):
        self.request_params = kwargs

    def build_data(self) -> dict:
        """build data dict"""
        data = {}
        data["query"] = self.build_query()
        if sort := self.parse_sort():
            data.update(sort)

        return data

    def build_query(self) -> dict:
        """build query key"""
        must_list = []
        channel = self.request_params.get("channel")
        if channel:
            must_list.append({"match": {"playlist_channel_id": channel}})

        subscribed = self.request_params.get("subscribed")
        if subscribed:
            must_list.append({"match": {"playlist_subscribed": subscribed}})

        playlist_type = self.request_params.get("type")
        if playlist_type:
            type_list = self.parse_type(playlist_type)
            must_list.append(type_list)

        query = {"bool": {"must": must_list}}

        return query

    def parse_type(self, playlist_type: str) -> dict:
        """parse playlist type"""
        if not hasattr(PlaylistTypesEnum, playlist_type.upper()):
            raise ValueError(f"'{playlist_type}' not in PlaylistTypesEnum")

        type_parsed = getattr(PlaylistTypesEnum, playlist_type.upper()).value

        return {"match": {"playlist_type.keyword": type_parsed}}

    def parse_sort(self) -> dict:
        """return sort"""
        return {"sort": [{"playlist_name.keyword": {"order": "asc"}}]}
@@ -1,30 +0,0 @@
"""test playlist query building"""

import pytest
from playlist.src.query_building import QueryBuilder


def test_build_data():
    """test for correct key building"""
    qb = QueryBuilder(
        channel="test_channel",
        subscribed=True,
        type="regular",
    )
    result = qb.build_data()
    must_list = result["query"]["bool"]["must"]
    assert "query" in result
    assert "sort" in result
    assert result["sort"] == [{"playlist_name.keyword": {"order": "asc"}}]
    assert {"match": {"playlist_channel_id": "test_channel"}} in must_list
    assert {"match": {"playlist_subscribed": True}} in must_list


def test_parse_type():
    """validate type"""
    qb = QueryBuilder(type="regular")
    with pytest.raises(ValueError):
        qb.parse_type("invalid")

    result = qb.parse_type("custom")
    assert result == {"match": {"playlist_type.keyword": "custom"}}
@@ -1,27 +0,0 @@
"""all playlist API urls"""

from django.urls import path
from playlist import views

urlpatterns = [
    path(
        "",
        views.PlaylistApiListView.as_view(),
        name="api-playlist-list",
    ),
    path(
        "custom/",
        views.PlaylistCustomApiListView.as_view(),
        name="api-custom-playlist-list",
    ),
    path(
        "custom/<slug:playlist_id>/",
        views.PlaylistCustomApiView.as_view(),
        name="api-custom-playlist",
    ),
    path(
        "<slug:playlist_id>/",
        views.PlaylistApiView.as_view(),
        name="api-playlist",
    ),
]
@ -1,273 +0,0 @@
|
||||
"""all playlist API views"""
|
||||
|
||||
import uuid
|
||||
|
||||
from common.serializers import (
|
||||
AsyncTaskResponseSerializer,
|
||||
ErrorResponseSerializer,
|
||||
)
|
||||
from common.views_base import AdminWriteOnly, ApiBaseView
|
||||
from download.src.subscriptions import PlaylistSubscription
|
||||
from drf_spectacular.utils import OpenApiResponse, extend_schema
|
||||
from playlist.serializers import (
|
||||
PlaylistBulkAddSerializer,
|
||||
PlaylistCustomPostSerializer,
|
||||
PlaylistDeleteQuerySerializer,
|
||||
PlaylistListCustomPostSerializer,
|
||||
PlaylistListQuerySerializer,
|
||||
PlaylistListSerializer,
|
||||
PlaylistSerializer,
|
||||
PlaylistSingleUpdate,
|
||||
)
|
||||
from playlist.src.index import YoutubePlaylist
|
||||
from playlist.src.query_building import QueryBuilder
|
||||
from rest_framework.response import Response
|
||||
from task.tasks import subscribe_to
|
||||
from user.src.user_config import UserConfig
|
||||
|
||||
|
||||
class PlaylistApiListView(ApiBaseView):
|
||||
"""resolves to /api/playlist/
|
||||
GET: returns list of indexed playlists
|
||||
params:
|
||||
- channel:str=<channel-id>
|
||||
- subscribed: bool
|
||||
- type:enum=regular|custom
|
||||
POST: change subscribe state
|
||||
"""
|
||||
|
||||
search_base = "ta_playlist/_search/"
|
||||
permission_classes = [AdminWriteOnly]
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(PlaylistListSerializer()),
|
||||
400: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Bad request"
|
||||
),
|
||||
},
|
||||
parameters=[PlaylistListQuerySerializer],
|
||||
)
|
||||
def get(self, request):
|
||||
"""get playlist list"""
|
||||
query_serializer = PlaylistListQuerySerializer(
|
||||
data=request.query_params
|
||||
)
|
||||
query_serializer.is_valid(raise_exception=True)
|
||||
validated_query = query_serializer.validated_data
|
||||
try:
|
||||
data = QueryBuilder(**validated_query).build_data()
|
||||
except ValueError as err:
|
||||
error = ErrorResponseSerializer({"error": str(err)})
|
||||
return Response(error.data, status=400)
|
||||
|
||||
self.data = data
|
||||
self.get_document_list(request)
|
||||
|
||||
response_serializer = PlaylistListSerializer(self.response)
|
||||
|
||||
return Response(response_serializer.data)
|
||||
|
||||
@extend_schema(
|
||||
request=PlaylistBulkAddSerializer(),
|
||||
responses={
|
||||
200: OpenApiResponse(AsyncTaskResponseSerializer()),
|
||||
400: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Bad request"
|
||||
),
|
||||
},
|
||||
)
|
||||
def post(self, request):
|
||||
"""async subscribe to list of playlists"""
|
||||
data_serializer = PlaylistBulkAddSerializer(data=request.data)
|
||||
data_serializer.is_valid(raise_exception=True)
|
||||
validated_data = data_serializer.validated_data
|
||||
|
||||
pending = [i["playlist_id"] for i in validated_data["data"]]
|
||||
if not pending:
|
||||
error = ErrorResponseSerializer({"error": "nothing to subscribe"})
|
||||
return Response(error.data, status=400)
|
||||
|
||||
url_str = " ".join(pending)
|
||||
task = subscribe_to.delay(url_str, expected_type="playlist")
|
||||
|
||||
message = {
|
||||
"message": "playlist subscribe task started",
|
||||
"task_id": task.id,
|
||||
}
|
||||
serializer = AsyncTaskResponseSerializer(message)
|
||||
|
||||
return Response(serializer.data)
|
||||
|
||||
|
||||
class PlaylistCustomApiListView(ApiBaseView):
|
||||
"""resolves to /api/playlist/custom/
|
||||
POST: Create new custom playlist
|
||||
"""
|
||||
|
||||
search_base = "ta_playlist/_search/"
|
||||
permission_classes = [AdminWriteOnly]
|
||||
|
||||
@extend_schema(
|
||||
request=PlaylistListCustomPostSerializer(),
|
||||
responses={
|
||||
200: OpenApiResponse(PlaylistSerializer()),
|
||||
400: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="Bad request"
|
||||
),
|
||||
},
|
||||
)
|
||||
def post(self, request):
|
||||
"""create new custom playlist"""
|
||||
serializer = PlaylistListCustomPostSerializer(data=request.data)
|
||||
serializer.is_valid(raise_exception=True)
|
||||
validated_data = serializer.validated_data
|
||||
|
||||
custom_name = validated_data["playlist_name"]
|
||||
playlist_id = f"TA_playlist_{uuid.uuid4()}"
|
||||
custom_playlist = YoutubePlaylist(playlist_id)
|
||||
custom_playlist.create(custom_name)
|
||||
|
||||
response_serializer = PlaylistSerializer(custom_playlist.json_data)
|
||||
|
||||
return Response(response_serializer.data)
|
||||
|
||||
|
||||
class PlaylistCustomApiView(ApiBaseView):
|
||||
"""resolves to /api/playlist/custom/<playlist_id>/
|
||||
POST: modify custom playlist
|
||||
"""
|
||||
|
||||
search_base = "ta_playlist/_doc/"
|
||||
permission_classes = [AdminWriteOnly]
|
||||
|
||||
@extend_schema(
|
||||
request=PlaylistCustomPostSerializer(),
|
||||
responses={
|
||||
200: OpenApiResponse(PlaylistSerializer()),
|
||||
400: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="bad request"
|
||||
),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="playlist not found"
|
||||
),
|
||||
},
|
||||
)
|
||||
def post(self, request, playlist_id):
|
||||
"""modify custom playlist"""
|
||||
data_serializer = PlaylistCustomPostSerializer(data=request.data)
|
||||
data_serializer.is_valid(raise_exception=True)
|
||||
validated_data = data_serializer.validated_data
|
||||
|
||||
self.get_document(playlist_id)
|
||||
if not self.response:
|
||||
error = ErrorResponseSerializer({"error": "playlist not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
if not self.response["playlist_type"] == "custom":
|
||||
error = ErrorResponseSerializer(
|
||||
{"error": f"playlist with ID {playlist_id} is not custom"}
|
||||
)
|
||||
return Response(error.data, status=400)
|
||||
|
||||
action = validated_data.get("action")
|
||||
video_id = validated_data.get("video_id")
|
||||
|
||||
playlist = YoutubePlaylist(playlist_id)
|
||||
if action == "create":
|
||||
try:
|
||||
playlist.add_video_to_playlist(video_id)
|
||||
except TypeError:
|
||||
error = ErrorResponseSerializer(
|
||||
{"error": f"failed to add video {video_id} to playlist"}
|
||||
)
|
||||
return Response(error.data, status=400)
|
||||
else:
|
||||
hide = UserConfig(request.user.id).get_value("hide_watched")
|
||||
playlist.move_video(video_id, action, hide_watched=hide)
|
||||
|
||||
response_serializer = PlaylistSerializer(playlist.json_data)
|
||||
|
||||
return Response(response_serializer.data)
|
||||
|
||||
|
||||
class PlaylistApiView(ApiBaseView):
|
||||
"""resolves to /api/playlist/<playlist_id>/
|
||||
GET: returns metadata dict of playlist
|
||||
"""
|
||||
|
||||
search_base = "ta_playlist/_doc/"
|
||||
permission_classes = [AdminWriteOnly]
|
||||
valid_custom_actions = ["create", "remove", "up", "down", "top", "bottom"]
|
||||
|
||||
@extend_schema(
|
||||
responses={
|
||||
200: OpenApiResponse(PlaylistSerializer()),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="playlist not found"
|
||||
),
|
||||
},
|
||||
)
|
||||
def get(self, request, playlist_id):
|
||||
# pylint: disable=unused-argument
|
||||
"""get playlist"""
|
||||
self.get_document(playlist_id)
|
||||
if not self.response:
|
||||
error = ErrorResponseSerializer({"error": "playlist not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
response_serializer = PlaylistSerializer(self.response)
|
||||
|
||||
return Response(response_serializer.data)
|
||||
|
||||
@extend_schema(
|
||||
request=PlaylistSingleUpdate(),
|
||||
responses={
|
||||
200: OpenApiResponse(PlaylistSerializer()),
|
||||
404: OpenApiResponse(
|
||||
ErrorResponseSerializer(), description="playlist not found"
|
||||
),
|
||||
},
|
||||
)
|
||||
def post(self, request, playlist_id):
|
||||
"""update subscribed state of playlist"""
|
||||
data_serializer = PlaylistSingleUpdate(data=request.data)
|
||||
data_serializer.is_valid(raise_exception=True)
|
||||
validated_data = data_serializer.validated_data
|
||||
|
||||
self.get_document(playlist_id)
|
||||
if not self.response:
|
||||
error = ErrorResponseSerializer({"error": "playlist not found"})
|
||||
return Response(error.data, status=404)
|
||||
|
||||
subscribed = validated_data["playlist_subscribed"]
|
||||
playlist_sub = PlaylistSubscription()
|
||||
json_data = playlist_sub.change_subscribe(playlist_id, subscribed)
|
||||
|
||||
response_serializer = PlaylistSerializer(json_data)
|
||||
return Response(response_serializer.data)
|
||||
|
||||
@extend_schema(
|
||||
parameters=[PlaylistDeleteQuerySerializer],
|
||||
responses={
|
||||
204: OpenApiResponse(description="playlist deleted"),
|
||||
},
|
||||
)
|
||||
def delete(self, request, playlist_id):
|
||||
"""delete playlist"""
|
||||
print(f"{playlist_id}: delete playlist")
|
||||
|
||||
query_serializer = PlaylistDeleteQuerySerializer(
|
||||
data=request.query_params
|
||||
)
|
||||
query_serializer.is_valid(raise_exception=True)
|
||||
validated_query = query_serializer.validated_data
|
||||
|
||||
delete_videos = validated_query.get("delete_videos", False)
|
||||
|
||||
if delete_videos:
|
||||
YoutubePlaylist(playlist_id).delete_videos_playlist()
|
||||
else:
|
||||
YoutubePlaylist(playlist_id).delete_metadata()
|
||||
|
||||
return Response(status=204)
|
@@ -1,10 +0,0 @@
-r requirements.txt
ipython==9.3.0
pre-commit==4.2.0
pylint-django==2.6.1
pylint==3.3.7
pytest-django==4.11.1
pytest==8.4.1
python-dotenv==1.1.1
requirementscheck==0.0.6
types-requests==2.32.4.20250611
@@ -1,15 +0,0 @@
apprise==1.9.3
celery==5.5.3
django-auth-ldap==5.2.0
django-celery-beat==2.8.1
django-cors-headers==4.7.0
Django==5.2.3
djangorestframework==3.16.0
drf-spectacular==0.28.0
Pillow==11.2.1
redis==6.2.0
requests==2.32.4
ryd-client==0.0.6
uvicorn==0.35.0
whitenoise==6.9.0
yt-dlp[default]==2025.6.30
@@ -1,110 +0,0 @@
"""serializers for stats"""

# pylint: disable=abstract-method

from rest_framework import serializers


class VideoStatsItemSerializer(serializers.Serializer):
    """serialize video stats item"""

    doc_count = serializers.IntegerField()
    media_size = serializers.IntegerField()
    duration = serializers.IntegerField()
    duration_str = serializers.CharField()


class VideoStatsSerializer(serializers.Serializer):
    """serialize video stats"""

    doc_count = serializers.IntegerField()
    media_size = serializers.IntegerField()
    duration = serializers.IntegerField()
    duration_str = serializers.CharField()
    type_videos = VideoStatsItemSerializer(allow_null=True)
    type_shorts = VideoStatsItemSerializer(allow_null=True)
    type_streams = VideoStatsItemSerializer(allow_null=True)
    active_true = VideoStatsItemSerializer(allow_null=True)
    active_false = VideoStatsItemSerializer(allow_null=True)


class ChannelStatsSerializer(serializers.Serializer):
    """serialize channel stats"""

    doc_count = serializers.IntegerField(allow_null=True)
    active_true = serializers.IntegerField(allow_null=True)
    active_false = serializers.IntegerField(allow_null=True)
    subscribed_true = serializers.IntegerField(allow_null=True)
    subscribed_false = serializers.IntegerField(allow_null=True)


class PlaylistStatsSerializer(serializers.Serializer):
    """serialize playlists stats"""

    doc_count = serializers.IntegerField(allow_null=True)
    active_true = serializers.IntegerField(allow_null=True)
    active_false = serializers.IntegerField(allow_null=True)
    subscribed_false = serializers.IntegerField(allow_null=True)
    subscribed_true = serializers.IntegerField(allow_null=True)


class DownloadStatsSerializer(serializers.Serializer):
    """serialize download stats"""

    pending = serializers.IntegerField(allow_null=True)
    ignore = serializers.IntegerField(allow_null=True)
    pending_videos = serializers.IntegerField(allow_null=True)
    pending_shorts = serializers.IntegerField(allow_null=True)
    pending_streams = serializers.IntegerField(allow_null=True)


class WatchTotalStatsSerializer(serializers.Serializer):
    """serialize total watch stats"""

    duration = serializers.IntegerField()
    duration_str = serializers.CharField()
    items = serializers.IntegerField()


class WatchItemStatsSerializer(serializers.Serializer):
    """serialize watch item stats"""

    duration = serializers.IntegerField()
    duration_str = serializers.CharField()
    progress = serializers.FloatField()
    items = serializers.IntegerField()


class WatchStatsSerializer(serializers.Serializer):
    """serialize watch stats"""

    total = WatchTotalStatsSerializer(allow_null=True)
    unwatched = WatchItemStatsSerializer(allow_null=True)
    watched = WatchItemStatsSerializer(allow_null=True)


class DownloadHistItemSerializer(serializers.Serializer):
    """serialize download hist item"""

    date = serializers.CharField()
    count = serializers.IntegerField()
    media_size = serializers.IntegerField()


class BiggestChannelQuerySerializer(serializers.Serializer):
    """serialize biggest channel query"""

    order = serializers.ChoiceField(
        choices=["doc_count", "duration", "media_size"], default="doc_count"
    )


class BiggestChannelItemSerializer(serializers.Serializer):
    """serialize biggest channel item"""

    id = serializers.CharField()
    name = serializers.CharField()
    doc_count = serializers.IntegerField()
    duration = serializers.IntegerField()
    duration_str = serializers.CharField()
    media_size = serializers.IntegerField()
@ -1,369 +0,0 @@
|
||||
"""aggregations"""
|
||||
|
||||
from common.src.env_settings import EnvironmentSettings
|
||||
from common.src.es_connect import ElasticWrap
|
||||
from common.src.helper import get_duration_str
|
||||
|
||||
|
||||
class AggBase:
|
||||
"""base class for aggregation calls"""
|
||||
|
||||
path: str = ""
|
||||
data: dict = {}
|
||||
name: str = ""
|
||||
|
||||
def get(self):
|
||||
"""make get call"""
|
||||
response, _ = ElasticWrap(self.path).get(self.data)
|
||||
print(f"[agg][{self.name}] took {response.get('took')} ms to process")
|
||||
|
||||
return response.get("aggregations")
|
||||
|
||||
def process(self):
|
||||
"""implement in subclassess"""
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
class Video(AggBase):
|
||||
"""get video stats"""
|
||||
|
||||
name = "video_stats"
|
||||
path = "ta_video/_search"
|
||||
data = {
|
||||
"size": 0,
|
||||
"aggs": {
|
||||
"video_type": {
|
||||
"terms": {"field": "vid_type"},
|
||||
"aggs": {
|
||||
"media_size": {"sum": {"field": "media_size"}},
|
||||
"duration": {"sum": {"field": "player.duration"}},
|
||||
},
|
||||
},
|
||||
"video_active": {
|
||||
"terms": {"field": "active"},
|
||||
"aggs": {
|
||||
"media_size": {"sum": {"field": "media_size"}},
|
||||
"duration": {"sum": {"field": "player.duration"}},
|
||||
},
|
||||
},
|
||||
"video_media_size": {"sum": {"field": "media_size"}},
|
||||
"video_count": {"value_count": {"field": "youtube_id"}},
|
||||
"duration": {"sum": {"field": "player.duration"}},
|
||||
},
|
||||
}
|
||||
|
||||
def process(self):
|
||||
"""process aggregation"""
|
||||
aggregations = self.get()
|
||||
if not aggregations:
|
||||
return None
|
||||
|
||||
duration = int(aggregations["duration"]["value"])
|
||||
response = {
|
||||
"doc_count": aggregations["video_count"]["value"],
|
||||
"media_size": int(aggregations["video_media_size"]["value"]),
|
||||
"duration": duration,
|
||||
"duration_str": get_duration_str(duration),
|
||||
}
|
||||
for bucket in aggregations["video_type"]["buckets"]:
|
||||
duration = int(bucket["duration"].get("value"))
|
||||
response.update(
|
||||
{
|
||||
f"type_{bucket['key']}": {
|
||||
"doc_count": bucket.get("doc_count"),
|
||||
"media_size": int(bucket["media_size"].get("value")),
|
||||
"duration": duration,
|
||||
"duration_str": get_duration_str(duration),
|
||||
}
|
||||
}
|
||||
)
|
||||
|
||||
for bucket in aggregations["video_active"]["buckets"]:
|
||||
duration = int(bucket["duration"].get("value"))
|
||||
response.update(
|
||||
{
|
||||
f"active_{bucket['key_as_string']}": {
|
||||
"doc_count": bucket.get("doc_count"),
|
||||
"media_size": int(bucket["media_size"].get("value")),
|
||||
"duration": duration,
|
||||
"duration_str": get_duration_str(duration),
|
||||
}
|
||||
}
|
||||
)
|
||||
|
||||
return response
|
||||
|
||||
|
||||
class Channel(AggBase):
|
||||
"""get channel stats"""
|
||||
|
||||
name = "channel_stats"
|
||||
path = "ta_channel/_search"
|
||||
data = {
|
||||
"size": 0,
|
||||
"aggs": {
|
||||
"channel_count": {"value_count": {"field": "channel_id"}},
|
||||
"channel_active": {"terms": {"field": "channel_active"}},
|
||||
"channel_subscribed": {"terms": {"field": "channel_subscribed"}},
|
||||
},
|
||||
}
|
||||
|
||||
def process(self):
|
||||
"""process aggregation"""
|
||||
aggregations = self.get()
|
||||
if not aggregations:
|
||||
return None
|
||||
|
||||
response = {
|
||||
"doc_count": aggregations["channel_count"].get("value"),
|
||||
}
|
||||
for bucket in aggregations["channel_active"]["buckets"]:
|
||||
key = f"active_{bucket['key_as_string']}"
|
||||
response.update({key: bucket.get("doc_count")})
|
||||
for bucket in aggregations["channel_subscribed"]["buckets"]:
|
||||
key = f"subscribed_{bucket['key_as_string']}"
|
||||
response.update({key: bucket.get("doc_count")})
|
||||
|
||||
return response
|
||||
|
||||
|
||||
class Playlist(AggBase):
|
||||
"""get playlist stats"""
|
||||
|
||||
name = "playlist_stats"
|
||||
path = "ta_playlist/_search"
|
||||
data = {
|
||||
"size": 0,
|
||||
"aggs": {
|
||||
"playlist_count": {"value_count": {"field": "playlist_id"}},
|
||||
"playlist_active": {"terms": {"field": "playlist_active"}},
|
||||
"playlist_subscribed": {"terms": {"field": "playlist_subscribed"}},
|
||||
},
|
||||
}
|
||||
|
||||
def process(self):
|
||||
"""process aggregation"""
|
||||
aggregations = self.get()
|
||||
if not aggregations:
|
||||
return None
|
||||
|
||||
response = {"doc_count": aggregations["playlist_count"].get("value")}
|
||||
for bucket in aggregations["playlist_active"]["buckets"]:
|
||||
key = f"active_{bucket['key_as_string']}"
|
||||
response.update({key: bucket.get("doc_count")})
|
||||
for bucket in aggregations["playlist_subscribed"]["buckets"]:
|
||||
key = f"subscribed_{bucket['key_as_string']}"
|
||||
response.update({key: bucket.get("doc_count")})
|
||||
|
||||
return response
|
||||
|
||||
|
||||
class Download(AggBase):
|
||||
"""get downloads queue stats"""
|
||||
|
||||
name = "download_queue_stats"
|
||||
path = "ta_download/_search"
|
||||
    data = {
        "size": 0,
        "aggs": {
            "status": {"terms": {"field": "status"}},
            "video_type": {
                "filter": {"term": {"status": "pending"}},
                "aggs": {"type_pending": {"terms": {"field": "vid_type"}}},
            },
        },
    }

    def process(self):
        """process aggregation"""
        aggregations = self.get()
        response = {}
        if not aggregations:
            return None

        for bucket in aggregations["status"]["buckets"]:
            response.update({bucket["key"]: bucket.get("doc_count")})

        for bucket in aggregations["video_type"]["type_pending"]["buckets"]:
            key = f"pending_{bucket['key']}"
            response.update({key: bucket.get("doc_count")})

        return response


class WatchProgress(AggBase):
    """get watch progress"""

    name = "watch_progress"
    path = "ta_video/_search"
    data = {
        "size": 0,
        "aggs": {
            name: {
                "terms": {"field": "player.watched"},
                "aggs": {
                    "watch_docs": {
                        "filter": {"terms": {"player.watched": [True, False]}},
                        "aggs": {
                            "true_count": {"value_count": {"field": "_index"}},
                            "duration": {"sum": {"field": "player.duration"}},
                        },
                    },
                },
            },
            "total_duration": {"sum": {"field": "player.duration"}},
            "total_vids": {"value_count": {"field": "_index"}},
        },
    }

    def process(self):
        """make the call"""
        aggregations = self.get()
        response = {}
        if not aggregations:
            return None

        buckets = aggregations[self.name]["buckets"]
        all_duration = int(aggregations["total_duration"].get("value"))
        response.update(
            {
                "total": {
                    "duration": all_duration,
                    "duration_str": get_duration_str(all_duration),
                    "items": aggregations["total_vids"].get("value"),
                }
            }
        )

        for bucket in buckets:
            response.update(self._build_bucket(bucket, all_duration))

        return response

    @staticmethod
    def _build_bucket(bucket, all_duration):
        """parse bucket"""

        duration = int(bucket["watch_docs"]["duration"]["value"])
        duration_str = get_duration_str(duration)
        items = bucket["watch_docs"]["true_count"]["value"]
        if bucket["key_as_string"] == "false":
            key = "unwatched"
        else:
            key = "watched"

        bucket_parsed = {
            key: {
                "duration": duration,
                "duration_str": duration_str,
                "progress": duration / all_duration if all_duration else 0,
                "items": items,
            }
        }

        return bucket_parsed


class DownloadHist(AggBase):
    """get downloads histogram last week"""

    name = "videos_last_week"
    path = "ta_video/_search"
    data = {
        "size": 0,
        "aggs": {
            name: {
                "date_histogram": {
                    "field": "date_downloaded",
                    "calendar_interval": "day",
                    "format": "yyyy-MM-dd",
                    "order": {"_key": "desc"},
                    "time_zone": EnvironmentSettings.TZ,
                },
                "aggs": {
                    "total_videos": {"value_count": {"field": "youtube_id"}},
                    "media_size": {"sum": {"field": "media_size"}},
                },
            }
        },
        "query": {
            "range": {
                "date_downloaded": {
                    "gte": "now-7d/d",
                    "time_zone": EnvironmentSettings.TZ,
                }
            }
        },
    }

    def process(self):
        """process query"""
        aggregations = self.get()
        if not aggregations:
            return None

        buckets = aggregations[self.name]["buckets"]

        response = [
            {
                "date": i.get("key_as_string"),
                "count": i.get("doc_count"),
                "media_size": i["media_size"].get("value"),
            }
            for i in buckets
        ]

        return response


class BiggestChannel(AggBase):
    """get channel aggregations"""

    def __init__(self, order):
        self.data["aggs"][self.name]["multi_terms"]["order"] = {order: "desc"}

    name = "channel_stats"
    path = "ta_video/_search"
    data = {
        "size": 0,
        "aggs": {
            name: {
                "multi_terms": {
                    "terms": [
                        {"field": "channel.channel_name.keyword"},
                        {"field": "channel.channel_id"},
                    ],
                    "order": {"doc_count": "desc"},
                },
                "aggs": {
                    "doc_count": {"value_count": {"field": "_index"}},
                    "duration": {"sum": {"field": "player.duration"}},
                    "media_size": {"sum": {"field": "media_size"}},
                },
            },
        },
    }
    order_choices = ["doc_count", "duration", "media_size"]

    def process(self):
        """process aggregation, order_by validated in the view"""

        aggregations = self.get()
        if not aggregations:
            return None

        buckets = aggregations[self.name]["buckets"]

        response = [
            {
                "id": i["key"][1],
                "name": i["key"][0].title(),
                "doc_count": i["doc_count"]["value"],
                "duration": i["duration"]["value"],
                "duration_str": get_duration_str(int(i["duration"]["value"])),
                "media_size": i["media_size"]["value"],
            }
            for i in buckets
        ]

        return response
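For context: each aggregation class above only declares the Elasticsearch request body (data) and the search path (path); the inherited AggBase.get() helper, presumably defined earlier in this module outside this hunk, runs the search and returns the aggregations object that process() unpacks. Below is a minimal, editor-added sketch of that round trip, assuming a locally reachable Elasticsearch that holds the ta_video index; the host, port, and credentials are placeholders, not the project's actual configuration.

import requests

ES_URL = "http://localhost:9200"  # placeholder host, adjust to your deployment
ES_AUTH = ("elastic", "changeme")  # placeholder credentials

# the same request body as BiggestChannel.data, ordered by total duration
body = {
    "size": 0,
    "aggs": {
        "channel_stats": {
            "multi_terms": {
                "terms": [
                    {"field": "channel.channel_name.keyword"},
                    {"field": "channel.channel_id"},
                ],
                "order": {"duration": "desc"},
            },
            "aggs": {
                "doc_count": {"value_count": {"field": "_index"}},
                "duration": {"sum": {"field": "player.duration"}},
                "media_size": {"sum": {"field": "media_size"}},
            },
        },
    },
}

response = requests.post(
    f"{ES_URL}/ta_video/_search", json=body, auth=ES_AUTH, timeout=10
)
response.raise_for_status()
for bucket in response.json()["aggregations"]["channel_stats"]["buckets"]:
    channel_name, channel_id = bucket["key"]
    print(channel_name, channel_id, int(bucket["duration"]["value"]))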
@ -1,42 +0,0 @@
"""all stats API urls"""

from django.urls import path
from stats import views

urlpatterns = [
    path(
        "video/",
        views.StatVideoView.as_view(),
        name="api-stats-video",
    ),
    path(
        "channel/",
        views.StatChannelView.as_view(),
        name="api-stats-channel",
    ),
    path(
        "playlist/",
        views.StatPlaylistView.as_view(),
        name="api-stats-playlist",
    ),
    path(
        "download/",
        views.StatDownloadView.as_view(),
        name="api-stats-download",
    ),
    path(
        "watch/",
        views.StatWatchProgress.as_view(),
        name="api-stats-watch",
    ),
    path(
        "downloadhist/",
        views.StatDownloadHist.as_view(),
        name="api-stats-downloadhist",
    ),
    path(
        "biggestchannels/",
        views.StatBiggestChannel.as_view(),
        name="api-stats-biggestchannels",
    ),
]
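The route names above suggest these patterns are mounted under an /api/stats/ prefix at the project level. That include is not part of this hunk, so the sketch below is an editor-added illustration only; the exact prefix and the project urls module are assumptions.

"""project-level urls.py (illustrative sketch, not from this diff)"""

from django.urls import include, path

urlpatterns = [
    # mounts stats.urls so that e.g. /api/stats/video/ resolves to StatVideoView
    path("api/stats/", include("stats.urls")),
]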
@ -1,139 +0,0 @@
"""all stats API views"""

from common.serializers import ErrorResponseSerializer
from common.views_base import ApiBaseView
from drf_spectacular.utils import OpenApiResponse, extend_schema
from rest_framework.response import Response
from stats.serializers import (
    BiggestChannelItemSerializer,
    BiggestChannelQuerySerializer,
    ChannelStatsSerializer,
    DownloadHistItemSerializer,
    DownloadStatsSerializer,
    PlaylistStatsSerializer,
    VideoStatsSerializer,
    WatchStatsSerializer,
)
from stats.src.aggs import (
    BiggestChannel,
    Channel,
    Download,
    DownloadHist,
    Playlist,
    Video,
    WatchProgress,
)


class StatVideoView(ApiBaseView):
    """resolves to /api/stats/video/
    GET: return video stats
    """

    @extend_schema(responses=VideoStatsSerializer())
    def get(self, request):
        """get video stats"""
        # pylint: disable=unused-argument
        serializer = VideoStatsSerializer(Video().process())

        return Response(serializer.data)


class StatChannelView(ApiBaseView):
    """resolves to /api/stats/channel/
    GET: return channel stats
    """

    @extend_schema(responses=ChannelStatsSerializer())
    def get(self, request):
        """get channel stats"""
        # pylint: disable=unused-argument
        serializer = ChannelStatsSerializer(Channel().process())

        return Response(serializer.data)


class StatPlaylistView(ApiBaseView):
    """resolves to /api/stats/playlist/
    GET: return playlist stats
    """

    @extend_schema(responses=PlaylistStatsSerializer())
    def get(self, request):
        """get playlist stats"""
        # pylint: disable=unused-argument
        serializer = PlaylistStatsSerializer(Playlist().process())

        return Response(serializer.data)


class StatDownloadView(ApiBaseView):
    """resolves to /api/stats/download/
    GET: return download stats
    """

    @extend_schema(responses=DownloadStatsSerializer())
    def get(self, request):
        """get download stats"""
        # pylint: disable=unused-argument
        serializer = DownloadStatsSerializer(Download().process())

        return Response(serializer.data)


class StatWatchProgress(ApiBaseView):
    """resolves to /api/stats/watch/
    GET: return watch/unwatch progress stats
    """

    @extend_schema(responses=WatchStatsSerializer())
    def get(self, request):
        """get watched stats"""
        # pylint: disable=unused-argument
        serializer = WatchStatsSerializer(WatchProgress().process())

        return Response(serializer.data)


class StatDownloadHist(ApiBaseView):
    """resolves to /api/stats/downloadhist/
    GET: return download video count histogram for last days
    """

    @extend_schema(responses=DownloadHistItemSerializer(many=True))
    def get(self, request):
        """get download hist items"""
        # pylint: disable=unused-argument
        download_items = DownloadHist().process()
        serializer = DownloadHistItemSerializer(download_items, many=True)

        return Response(serializer.data)


class StatBiggestChannel(ApiBaseView):
    """resolves to /api/stats/biggestchannels/
    GET: return biggest channels
    param: order
    """

    @extend_schema(
        responses={
            200: OpenApiResponse(BiggestChannelItemSerializer(many=True)),
            400: OpenApiResponse(
                ErrorResponseSerializer(), description="Bad request"
            ),
        },
    )
    def get(self, request):
        """get biggest channels stats"""
        query_serializer = BiggestChannelQuerySerializer(
            data=request.query_params
        )
        query_serializer.is_valid(raise_exception=True)
        validated_query = query_serializer.validated_data
        order = validated_query["order"]

        channel_items = BiggestChannel(order).process()
        serializer = BiggestChannelItemSerializer(channel_items, many=True)

        return Response(serializer.data)
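To close out, an editor-added client-side sketch of calling two of the endpoints defined above. The base URL and the Authorization header value are placeholders; the header format assumes the project's token-based API authentication, while the host, port, and token are deployment-specific.

import requests

BASE_URL = "http://localhost:8000/api/stats"  # placeholder host and port
HEADERS = {"Authorization": "Token <your-api-token>"}  # placeholder token

# overall video statistics from StatVideoView
video_stats = requests.get(f"{BASE_URL}/video/", headers=HEADERS, timeout=10)
print(video_stats.json())

# biggest channels ordered by media size; an invalid order value is
# rejected with HTTP 400 by BiggestChannelQuerySerializer in the view
biggest = requests.get(
    f"{BASE_URL}/biggestchannels/",
    params={"order": "media_size"},
    headers=HEADERS,
    timeout=10,
)
print(biggest.json())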