-
-Tube Archivist has a new home: https://github.com/tubearchivist/tubearchivist
-
-## Table of contents:
-* [Wiki](https://github.com/tubearchivist/tubearchivist/wiki) for detailed documentation, with [FAQ](https://github.com/tubearchivist/tubearchivist/wiki/FAQ)
-* [Core functionality](#core-functionality)
-* [Screenshots](#screenshots)
-* [Problem Tube Archivist tries to solve](#problem-tube-archivist-tries-to-solve)
-* [Connect](#connect)
-* [Installing and updating](#installing-and-updating)
-* [Getting Started](#getting-started)
-* [Potential pitfalls](#potential-pitfalls)
-* [Roadmap](#roadmap)
-* [Known limitations](#known-limitations)
-* [Donate](#donate)
-
-------------------------
-
-## Core functionality
-* Subscribe to your favorite YouTube channels
-* Download videos using **yt-dlp**
-* Index and make videos searchable
-* Play videos
-* Keep track of viewed and unviewed videos
-
-## Tube Archivist on YouTube
-[![ibracorp-youtube-video-thumb](assets/tube-archivist-ibracorp-O8H8Z01c0Ys.jpg)](https://www.youtube.com/watch?v=O8H8Z01c0Ys)
-
-## Screenshots
-![home screenshot](assets/tube-archivist-screenshot-home.png?raw=true "Tube Archivist Home")
-*Home Page*
-
-![channels screenshot](assets/tube-archivist-screenshot-channels.png?raw=true "Tube Archivist Channels")
-*All Channels*
-
-![single channel screenshot](assets/tube-archivist-screenshot-single-channel.png?raw=true "Tube Archivist Single Channel")
-*Single Channel*
-
-![video page screenshot](assets/tube-archivist-screenshot-video.png?raw=true "Tube Archivist Video Page")
-*Video Page*
-
-![downloads page screenshot](assets/tube-archivist-screenshot-download.png?raw=true "Tube Archivist Downloads Page")
-*Downloads Page*
-
-## Problem Tube Archivist tries to solve
-Once your YouTube video collection grows, it becomes hard to search and find a specific video. That's where Tube Archivist comes in: by indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos offline, without hassle, through a convenient web interface.
-
-## Connect
-- [Discord](https://discord.gg/AFwz8nE7BK): Connect with us on our Discord server.
-- [r/TubeArchivist](https://www.reddit.com/r/TubeArchivist/): Join our Subreddit.
-
-## Installing and updating
-Take a look at the example `docker-compose.yml` file provided. Use the *latest* tag or a named semantic version tag. The *unstable* tag is for intermediate testing and, as the name implies, is **unstable**: it should not be used on your main installation, but only in a [testing environment](CONTRIBUTING.md).
-
-Tube Archivist depends on three main components split up into separate docker containers:
-
-### Tube Archivist
-The main Python application that displays and serves your video collection, built with Django.
- - Serves the interface on port `8000`
- - Needs a volume for the video archive at **/youtube**
- - And another volume to save application data at **/cache**.
- - The environment variables `ES_URL` and `REDIS_HOST` are needed to tell Tube Archivist where Elasticsearch and Redis respectively are located.
- - The environment variables `HOST_UID` and `HOST_GID` allow Tube Archivist to `chown` the video files to the main host system user instead of the container user. Those two variables are optional; leaving them unset disables that functionality, which might be needed if the underlying filesystem doesn't support `chown`, such as *NFS*.
- - Change the environment variables `TA_USERNAME` and `TA_PASSWORD` to create the initial credentials.
- - `ELASTIC_PASSWORD` sets the password for Elasticsearch. The environment variable `ELASTIC_USER` is optional, should you want to change the username from the default *elastic*.
- - For the scheduler to know what time it is, set your timezone with the `TZ` environment variable; it defaults to *UTC*. A sketch putting these variables together follows below.
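-
-A minimal sketch of the *tubearchivist* service; the image tag, volume names and credentials here are illustrative, so compare with the provided `docker-compose.yml`:
-```
-tubearchivist:
-  image: bbilly1/tubearchivist:latest
-  ports:
-    - 8000:8000
-  volumes:
-    - media:/youtube    # your video archive
-    - cache:/cache      # application data
-  environment:
-    - ES_URL=http://archivist-es:9200    # where Elasticsearch is reachable
-    - REDIS_HOST=archivist-redis         # where Redis is reachable
-    - HOST_UID=1000                      # optional, chown media files to this host user
-    - HOST_GID=1000
-    - TA_USERNAME=tubearchivist          # initial credentials
-    - TA_PASSWORD=verysecret
-    - ELASTIC_PASSWORD=verysecret        # must match the Elasticsearch container
-    - TZ=UTC                             # set your timezone
-```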
-
-### Port collisions
-If you have a collision on port `8000`, the best solution is to use Docker's *HOST_PORT*:*CONTAINER_PORT* distinction: for example, to serve the interface on port 9000, use `9000:8000` in your docker-compose file.
-
-Should that not be an option, the Tube Archivist container takes these two additional environment variables:
-- **TA_PORT**: Changes the port where nginx listens inside the container; make sure to also change the ports value in your docker-compose file.
-- **TA_UWSGI_PORT**: Changes the default uwsgi port 8080 used for container-internal networking between uwsgi, serving the Django application, and nginx.
-
-Changing either of these two environment variables will rewrite the files *nginx.conf* and *uwsgi.ini* at startup using `sed` inside your container.
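-
-For example, a sketch moving the interface to port 9000 via `TA_PORT`; note that the published container port changes with it:
-```
-tubearchivist:
-  ports:
-    - 9000:9000
-  environment:
-    - TA_PORT=9000
-```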
-
-### Elasticsearch
-**Note**: The newest Tube Archivist depends on Elasticsearch version 7.17 to provide an automatic update path in the future.
-
-Use `bbilly1/tubearchivist-es` to automatically get the recommended version, or use the official image with the version tag in the docker-compose file.
-
-Elasticsearch stores video metadata and makes everything searchable. It also keeps track of the download queue.
- - Needs to be accessible over the default port `9200`
- - Needs a volume at **/usr/share/elasticsearch/data** to store data
-
-Follow the [documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html) for additional installation details.
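-
-As a sketch, an *archivist-es* service could look like the following; the heap size, single-node settings, password and volume name are typical values for a small setup, not requirements:
-```
-archivist-es:
-  image: bbilly1/tubearchivist-es
-  environment:
-    - "ELASTIC_PASSWORD=verysecret"      # must match the Tube Archivist container
-    - "ES_JAVA_OPTS=-Xms512m -Xmx512m"   # keep the Java heap in check
-    - "xpack.security.enabled=true"      # enable password authentication
-    - "discovery.type=single-node"
-  ports:
-    - 9200:9200
-  volumes:
-    - es:/usr/share/elasticsearch/data
-```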
-
-### Redis JSON
-Functions as a cache and temporary link between the application and the file system. Used to store and display messages and configuration variables.
- - Needs to be accessible over the default port `6379`
- - Needs a volume at **/data** to make your configuration changes permanent.
-
-### Redis on a custom port
-For some architectures it might be required to run Redis JSON on a nonstandard port. For example, to change the Redis port to **6380**, set the following values:
-- Set the environment variable `REDIS_PORT=6380` on the *tubearchivist* service.
-- For the *archivist-redis* service, change the ports to `6380:6380`.
-- Additionally, set the following value on the *archivist-redis* service: `command: --port 6380 --loadmodule /usr/lib/redis/modules/rejson.so`. See the sketch below.
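-
-A sketch of the two affected services put together; the image name and tag are illustrative:
-```
-tubearchivist:
-  environment:
-    - REDIS_HOST=archivist-redis
-    - REDIS_PORT=6380
-
-archivist-redis:
-  image: redislabs/rejson:latest
-  command: --port 6380 --loadmodule /usr/lib/redis/modules/rejson.so
-  ports:
-    - 6380:6380
-```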
-
-### Updating Tube Archivist
-You will see the current version number of **Tube Archivist** in the footer of the interface so you can compare it with the latest release to make sure you are running the *latest and greatest*.
-* There can be breaking changes between updates, particularly as the application grows; new environment variables or settings might be required for you to set in your docker-compose file. *Always* check the **release notes**: Any breaking changes will be marked there.
-* All testing and development is done with the Elasticsearch version number as mentioned in the provided *docker-compose.yml* file. This will be updated when a new release of Elasticsearch is available. Running an older version of Elasticsearch is most likely not going to result in any issues, but it's still recommended to run the same version as mentioned. Use `bbilly1/tubearchivist-es` to automatically get the recommended version. A typical update flow is sketched below.
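-
-Assuming you use the *latest* tags, an update boils down to pulling the new images and recreating the containers:
-```
-docker-compose pull
-docker-compose up -d
-```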
-
-### Alternative installation instructions:
-- **arm64**: The Tube Archivist container is multi-arch, and so is Elasticsearch. RedisJSON doesn't offer arm builds; you can use `bbilly1/rejson`, an unofficial rebuild for arm64.
-- **Synology**: There is a [discussion thread](https://github.com/tubearchivist/tubearchivist/discussions/48) with Synology installation instructions.
-- **Unraid**: The three containers needed are all in the Community Applications. First install `TubeArchivist RedisJSON`, followed by `TubeArchivist ES`, and finally install `TubeArchivist`. If you have Unraid-specific issues, report those to the [support thread](https://forums.unraid.net/topic/114073-support-crocs-tube-archivist/ "support thread").
-- **Helm Chart**: There is a Helm Chart available at https://github.com/insuusvenerati/helm-charts. It is mostly self-explanatory, but feel free to ask questions on the Discord server or subreddit.
-
-
-## Potential pitfalls
-### vm.max_map_count
-**Elasticsearch** in Docker requires the host machine's kernel setting `vm.max_map_count` to be at least 262144.
-
-To temporarily set the value, run:
-```
-sudo sysctl -w vm.max_map_count=262144
-```
-
-How to apply the change permanently depends on your host operating system:
-- For example on Ubuntu Server add `vm.max_map_count = 262144` to the file */etc/sysctl.conf*.
-- On Arch based systems create a file */etc/sysctl.d/max_map_count.conf* with the content `vm.max_map_count = 262144`.
-- On any other platform, look up how to set kernel parameters in that platform's documentation; a sketch for sysctl-based systems follows below.
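-
-For systems configured through *sysctl*, the following sketch persists the setting and applies it immediately:
-```
-echo 'vm.max_map_count = 262144' | sudo tee /etc/sysctl.d/max_map_count.conf
-sudo sysctl --system
-```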
-
-### Permissions for elasticsearch
-If you see a message similar to `AccessDeniedException[/usr/share/elasticsearch/data/nodes]` when initially starting Elasticsearch, the container is not allowed to write files to the volume.
-That's most likely the case when you run `docker-compose` as an unprivileged user. To fix the issue, shut down the container and run the following on your host machine:
-```
-chown 1000:0 /path/to/mount/point
-```
-This will match the permissions with the **UID** and **GID** of Elasticsearch within the container and should fix the issue.
-
-### Disk usage
-If the disk usage of the container goes above 95%, the Elasticsearch index will turn *read only* until the usage drops below 90% again; you will see error messages like `disk usage exceeded flood-stage watermark`.
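-
-Should the block not clear by itself after freeing up space, it can be lifted manually through the documented Elasticsearch index settings API; a sketch assuming the default user and port:
-```
-curl -u elastic:verysecret -X PUT "localhost:9200/_all/_settings" \
-  -H 'Content-Type: application/json' \
-  -d '{"index.blocks.read_only_allow_delete": null}'
-```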
-
-Similarly, Tube Archivist will misbehave in all sorts of ways when running out of disk space. There are some error messages in the logs when that happens, but it's best to make sure you have enough disk space before starting to download.
+This is a [Next.js](https://nextjs.org/) project bootstrapped with [`create-next-app`](https://github.com/vercel/next.js/tree/canary/packages/create-next-app).
## Getting Started
-1. Go through the **settings** page and look at the available options. In particular, set *Download Format* to your desired video quality before downloading. **Tube Archivist** downloads the best available quality by default. To support iOS or macOS and some other browsers, a compatible format must be specified. For example:
+
+First, run the development server:
+
+```bash
+npm run dev
+# or
+yarn dev
```
-bestvideo[vcodec*=avc1]+bestaudio[acodec*=mp4a]/mp4
-```
-2. Subscribe to some of your favorite YouTube channels on the **channels** page.
-3. On the **downloads** page, click on *Rescan subscriptions* to add videos from the subscribed channels to your Download queue or click on *Add to download queue* to manually add Video IDs, links, channels or playlists.
-4. Click on *Start download* and let **Tube Archivist** do its thing.
-5. Enjoy your archived collection!
-
-## Roadmap
-We have come far; nonetheless, we are not short of ideas on how to improve and extend this project. Issues waiting to be tackled, in no particular order:
-- [ ] User roles
-- [ ] Podcast mode to serve channel as mp3
-- [ ] Implement [PyFilesystem](https://github.com/PyFilesystem/pyfilesystem2) for flexible video storage
-- [ ] Implement [Apprise](https://github.com/caronc/apprise) for notifications ([#97](https://github.com/tubearchivist/tubearchivist/issues/97))
-- [ ] Add passing browser cookies to yt-dlp ([#199](https://github.com/tubearchivist/tubearchivist/issues/199))
-- [ ] User created playlists, random and repeat controls ([#108](https://github.com/tubearchivist/tubearchivist/issues/108), [#220](https://github.com/tubearchivist/tubearchivist/issues/220))
-- [ ] Auto play or play next link ([#226](https://github.com/tubearchivist/tubearchivist/issues/226))
-- [ ] Show similar videos on video page
-- [ ] Multi language support
-- [ ] Show total videos downloaded vs total videos available in channel
-- [ ] Make items in grid row configurable to use more of the screen
-- [ ] Add statistics of index
-- [ ] Implement complete offline media file import from json file ([#138](https://github.com/tubearchivist/tubearchivist/issues/138))
-- [ ] Filter and query in search form, search by url query ([#134](https://github.com/tubearchivist/tubearchivist/issues/134), [#139](https://github.com/tubearchivist/tubearchivist/issues/139))
-- [ ] Auto ignore videos by keyword ([#163](https://github.com/tubearchivist/tubearchivist/issues/163))
-- [ ] Custom searchable notes to videos, channels, playlists ([#144](https://github.com/tubearchivist/tubearchivist/issues/144))
-- [ ] Download video comments
+Open [http://localhost:3000](http://localhost:3000) with your browser to see the result.
-Implemented:
-- [X] Add [SponsorBlock](https://sponsor.ajay.app/) integration [2022-04-16]
-- [X] Implement per channel settings [2022-03-26]
-- [X] Subtitle download & indexing [2022-02-13]
-- [X] Fancy advanced unified search interface [2022-01-08]
-- [X] Auto rescan and auto download on a schedule [2021-12-17]
-- [X] Optional automatic deletion of watched items after a specified time [2021-12-17]
-- [X] Create playlists [2021-11-27]
-- [X] Access control [2021-11-01]
-- [X] Delete videos and channel [2021-10-16]
-- [X] Add thumbnail embed option [2021-10-16]
-- [X] Create a github wiki for user documentation [2021-10-03]
-- [X] Grid and list view for both channel and video list pages [2021-10-03]
-- [X] Un-ignore videos [2021-10-03]
-- [X] Dynamic download queue [2021-09-26]
-- [X] Backup and restore [2021-09-22]
-- [X] Scan your file system to index already downloaded videos [2021-09-14]
+You can start editing the page by modifying `pages/index.tsx`. The page auto-updates as you edit the file.
-## Known limitations
-- Video files created by Tube Archivist need to be playable in your browser of choice. Not every codec is compatible with every browser, so finding a working format might require some testing with format selection.
-- Every limitation of **yt-dlp** will also be present in Tube Archivist. If **yt-dlp** can't download or extract a video for any reason, Tube Archivist won't be able to either.
-- For now this is meant to be run in a trusted network environment. Not everything is properly authenticated.
-- There is currently no flexibility in naming of the media files.
+[API routes](https://nextjs.org/docs/api-routes/introduction) can be accessed on [http://localhost:3000/api/hello](http://localhost:3000/api/hello). This endpoint can be edited in `pages/api/hello.ts`.
+The `pages/api` directory is mapped to `/api/*`. Files in this directory are treated as [API routes](https://nextjs.org/docs/api-routes/introduction) instead of React pages.
-## Donate
-The best donation to **Tube Archivist** is your time, take a look at the [contribution page](CONTRIBUTING.md) to get started.
-The second best way to support the development is to provide for caffeinated beverages:
-* [Paypal.me](https://paypal.me/bbilly1) for a one time coffee
-* [Paypal Subscription](https://www.paypal.com/webapps/billing/plans/subscribe?plan_id=P-03770005GR991451KMFGVPMQ) for a monthly coffee
-* [ko-fi.com](https://ko-fi.com/bbilly1) for an alternative platform
+## Learn More
+
+To learn more about Next.js, take a look at the following resources:
+
+- [Next.js Documentation](https://nextjs.org/docs) - learn about Next.js features and API.
+- [Learn Next.js](https://nextjs.org/learn) - an interactive Next.js tutorial.
+
+You can check out [the Next.js GitHub repository](https://github.com/vercel/next.js/) - your feedback and contributions are welcome!
+
+## Deploy on Vercel
+
+The easiest way to deploy your Next.js app is to use the [Vercel Platform](https://vercel.com/new?utm_medium=default-template&filter=next.js&utm_source=create-next-app&utm_campaign=create-next-app-readme) from the creators of Next.js.
+
+Check out our [Next.js deployment documentation](https://nextjs.org/docs/deployment) for more details.
diff --git a/deploy.sh b/deploy.sh
deleted file mode 100755
index 92ca8de..0000000
--- a/deploy.sh
+++ /dev/null
@@ -1,205 +0,0 @@
-#!/bin/bash
-
-# deploy all needed project files to different servers:
-# test for local vm for testing
-# blackhole for local production
-# unstable to publish intermediate releases
-# docker to publish regular release
-
-# create builder:
-# docker buildx create --name tubearchivist
-# docker buildx use tubearchivist
-# docker buildx inspect --bootstrap
-
-# more details:
-# https://github.com/tubearchivist/tubearchivist/issues/6
-
-set -e
-
-function sync_blackhole {
-
- # docker commands need sudo, only build amd64
- host="blackhole.local"
-
- read -sp 'Password: ' remote_pw
- export PASS=$remote_pw
-
- rsync -a --progress --delete-after \
- --exclude ".git" \
- --exclude ".gitignore" \
- --exclude "**/cache" \
- --exclude "**/__pycache__/" \
- --exclude "db.sqlite3" \
- . -e ssh "$host":tubearchivist
-
- echo "$PASS" | ssh "$host" 'sudo -S docker buildx build --platform linux/amd64 -t bbilly1/tubearchivist:latest tubearchivist --load 2>/dev/null'
- echo "$PASS" | ssh "$host" 'sudo -S docker-compose up -d 2>/dev/null'
-
-}
-
-function sync_test {
-
- # docker commands don't need sudo in testing vm
- # pass argument to build for specific platform
-
- host="tubearchivist.local"
- # make base folder
- ssh "$host" "mkdir -p docker"
-
- # copy project files to build image
- rsync -a --progress --delete-after \
- --exclude ".git" \
- --exclude ".gitignore" \
- --exclude "**/cache" \
- --exclude "**/__pycache__/" \
- --exclude "db.sqlite3" \
- . -e ssh "$host":tubearchivist
-
- # copy default docker-compose file if not exist
- rsync --progress --ignore-existing docker-compose.yml -e ssh "$host":docker
-
- if [[ $1 = "amd64" ]]; then
- platform="linux/amd64"
- elif [[ $1 = "arm64" ]]; then
- platform="linux/arm64"
- elif [[ $1 = "multi" ]]; then
- platform="linux/amd64,linux/arm64"
- else
- platform="linux/amd64"
- fi
-
- ssh "$host" "docker buildx build --build-arg INSTALL_DEBUG=1 --platform $platform -t bbilly1/tubearchivist:latest tubearchivist --load"
- ssh "$host" 'docker-compose -f docker/docker-compose.yml up -d'
-
-}
-
-
-# run same tests and checks as with github action but locally
-# takes filename to validate as optional argument
-function validate {
-
- if [[ $1 ]]; then
- check_path="$1"
- else
- check_path="."
- fi
-
- echo "run validate on $check_path"
-
- echo "running black"
- black --diff --color --check -l 79 "$check_path"
- echo "running codespell"
- codespell --skip="./.git" "$check_path"
- echo "running flake8"
- flake8 "$check_path" --count --max-complexity=10 --max-line-length=79 \
- --show-source --statistics
- echo "running isort"
- isort --check-only --diff --profile black -l 79 "$check_path"
- printf " \n> all validations passed\n"
-
-}
-
-
-# update latest tag compatible es for set and forget
-function sync_latest_es {
-
- VERSION=$(grep "bbilly1/tubearchivist-es" docker-compose.yml | awk '{print $NF}')
- printf "\nsync new ES version %s\nContinue?\n" "$VERSION"
- read -rn 1
-
- if [[ $(systemctl is-active docker) != 'active' ]]; then
- echo "starting docker"
- sudo systemctl start docker
- fi
-
- sudo docker image pull docker.elastic.co/elasticsearch/elasticsearch:"$VERSION"
-
- sudo docker tag \
- docker.elastic.co/elasticsearch/elasticsearch:"$VERSION" \
- bbilly1/tubearchivist-es
-
- sudo docker tag \
- docker.elastic.co/elasticsearch/elasticsearch:"$VERSION" \
- bbilly1/tubearchivist-es:"$VERSION"
-
- sudo docker push bbilly1/tubearchivist-es
- sudo docker push bbilly1/tubearchivist-es:"$VERSION"
-
-}
-
-
-# publish unstable tag to docker
-function sync_unstable {
-
- if [[ $(systemctl is-active docker) != 'active' ]]; then
- echo "starting docker"
- sudo systemctl start docker
- fi
-
- # start amd64 build
- sudo docker buildx build \
- --platform linux/amd64 \
- -t bbilly1/tubearchivist:unstable --push .
-
-}
-
-
-function sync_docker {
-
- # check things
- if [[ $(git branch --show-current) != 'master' ]]; then
- echo 'you are not on master, dummy!'
- return
- fi
-
- if [[ $(systemctl is-active docker) != 'active' ]]; then
- echo "starting docker"
- sudo systemctl start docker
- fi
-
- echo "latest tags:"
- git tag | tail -n 10
-
- printf "\ncreate new version:\n"
- read -r VERSION
-
- echo "build and push $VERSION?"
- read -rn 1
-
- # start build
- sudo docker buildx build \
- --platform linux/amd64,linux/arm64 \
- -t bbilly1/tubearchivist:latest \
- -t bbilly1/tubearchivist:"$VERSION" --push .
-
- # create release tag
- echo "commits since last version:"
- git log "$(git describe --tags --abbrev=0)"..HEAD --oneline
- git tag -a "$VERSION" -m "new release version $VERSION"
- git push all "$VERSION"
-
-}
-
-
-if [[ $1 == "blackhole" ]]; then
- sync_blackhole
-elif [[ $1 == "test" ]]; then
- sync_test "$2"
-elif [[ $1 == "validate" ]]; then
- # check package versions in requirements.txt for updates
- python version_check.py
- validate "$2"
-elif [[ $1 == "docker" ]]; then
- sync_docker
- sync_unstable
-elif [[ $1 == "unstable" ]]; then
- sync_unstable
-elif [[ $1 == "es" ]]; then
- sync_latest_es
-else
- echo "valid options are: blackhole | test | validate | docker | unstable | es"
-fi
-
-
-##
-exit 0
diff --git a/docker_assets/nginx.conf b/docker_assets/nginx.conf
deleted file mode 100644
index e134a8e..0000000
--- a/docker_assets/nginx.conf
+++ /dev/null
@@ -1,29 +0,0 @@
-server {
-
- listen 8000;
-
- location /cache/videos/ {
- alias /cache/videos/;
- }
-
- location /cache/channels/ {
- alias /cache/channels/;
- }
-
- location /cache/playlists/ {
- alias /cache/playlists/;
- }
-
- location /media/ {
- alias /youtube/;
- types {
- text/vtt vtt;
- }
- }
-
- location / {
- include uwsgi_params;
- uwsgi_pass localhost:8080;
- }
-
-}
\ No newline at end of file
diff --git a/docker_assets/run.sh b/docker_assets/run.sh
deleted file mode 100644
index 31022c7..0000000
--- a/docker_assets/run.sh
+++ /dev/null
@@ -1,50 +0,0 @@
-#!/bin/bash
-# startup script inside the container for tubearchivist
-
-if [[ -z "$ELASTIC_USER" ]]; then
- export ELASTIC_USER=elastic
-fi
-
-ENV_VARS=("TA_USERNAME" "TA_PASSWORD" "ELASTIC_PASSWORD" "ELASTIC_USER")
-for each in "${ENV_VARS[@]}"; do
- if ! [[ -v $each ]]; then
- echo "missing environment variable $each"
- exit 1
- fi
-done
-
-# ugly nginx and uwsgi port overwrite with env vars
-if [[ -n "$TA_PORT" ]]; then
- sed -i "s/8000/$TA_PORT/g" /etc/nginx/sites-available/default
-fi
-
-if [[ -n "$TA_UWSGI_PORT" ]]; then
- sed -i "s/8080/$TA_UWSGI_PORT/g" /etc/nginx/sites-available/default
- sed -i "s/8080/$TA_UWSGI_PORT/g" /app/uwsgi.ini
-fi
-
-# wait for elasticsearch
-counter=0
-until curl -u "$ELASTIC_USER":"$ELASTIC_PASSWORD" "$ES_URL" -fs; do
- echo "waiting for elastic search to start"
- counter=$((counter+1))
- if [[ $counter -eq 12 ]]; then
- # fail after 2 min
- echo "failed to connect to elastic search, exiting..."
- exit 1
- fi
- sleep 10
-done
-
-# start python application
-python manage.py makemigrations
-python manage.py migrate
-export DJANGO_SUPERUSER_PASSWORD=$TA_PASSWORD && \
- python manage.py createsuperuser --noinput --name "$TA_USERNAME"
-
-python manage.py collectstatic --noinput -c
-nginx &
-celery -A home.tasks worker --loglevel=INFO &
-celery -A home beat --loglevel=INFO \
- -s "${BEAT_SCHEDULE_PATH:-/cache/celerybeat-schedule}" &
-uwsgi --ini uwsgi.ini
diff --git a/docker_assets/uwsgi.ini b/docker_assets/uwsgi.ini
deleted file mode 100644
index f8752a4..0000000
--- a/docker_assets/uwsgi.ini
+++ /dev/null
@@ -1,8 +0,0 @@
-[uwsgi]
-module = config.wsgi:application
-master = True
-pidfile = /tmp/project-master.pid
-vacuum = True
-max-requests = 5000
-socket = :8080
-buffer-size = 8192
\ No newline at end of file
diff --git a/docs/Channels.md b/docs/Channels.md
deleted file mode 100644
index e6f7654..0000000
--- a/docs/Channels.md
+++ /dev/null
@@ -1,32 +0,0 @@
-# Channels Overview and Channel Detail Page
-
-The channels are organized on two different levels, similar to the [playlists](Playlists):
-
-## Channels Overview
-Accessible at `/channel/` of your Tube Archivist, the **Overview Page** shows a list of all channels you have indexed.
-- You can filter that list to show or hide subscribed channels with the toggle. Clicking on the channel banner or the channel name will direct you to the *Channel Detail Page*.
-- If you are subscribed to a channel, an *Unsubscribe* button will show; if you aren't subscribed, a *Subscribe* button will show instead.
-
-The **Subscribe to Channels** button opens a text field to subscribe to a channel. You have a few options:
-- Enter the YouTube channel ID, a 24 character alphanumeric string. For example *UCBa659QWEk1AI4Tg--mrJ2A*
-- Enter the URL to the channel page on YouTube. For example *https://www.youtube.com/channel/UCBa659QWEk1AI4Tg--mrJ2A*
-- Enter the URL using the channel name, for example: *https://www.youtube.com/c/TomScottGo*.
-- Enter the video URL for any video and let Tube Archivist extract the channel ID for you. For example *https://www.youtube.com/watch?v=2tdiKTSdE9Y*
-- Add one per line.
-
-You can search your indexed channels by clicking on the search icon. This will open a dedicated page.
-
-## Channel Detail
-Each channel will get a dedicated channel detail page accessible at `/channel/<channel-id>/` of your Tube Archivist. This page shows all the videos you have downloaded from this channel plus additional metadata.
-- If you are subscribed to the channel, an *Unsubscribe* button will show, else the *Subscribe* button will show.
-- You can *Show* the channel description, that matches with the *About* tab on YouTube.
-- The **Mark as Watched** button will mark all videos of this channel as watched.
-- The button **Delete Channel** will delete the channel plus all videos of this channel, both media files and metadata; additionally, this will also delete playlist metadata belonging to that channel.
-- The button **Show Playlists** will go to the [playlists](Playlists) page and filter the list to only show playlists from this channel.
-
-### Channel Customize
-Clicking on the *Configure* button will open a form with options to configure settings on a per channel basis. Any configurations here will overwrite your settings from the [settings](Settings) page.
-- **Download Format**: Overwrite the download qualities for videos from this channel.
-- **Auto Delete**: Automatically delete watched videos from this channel after selected days.
-- **Index Playlists**: Automatically add all playlists with at least one video downloaded to your index. Only do this for channels where you care about playlists, as this slows down indexing of new videos: each new video has to be checked against the channel's playlists.
-- **SponsorBlock**: Using [SponsorBlock](https://sponsor.ajay.app/) to get and skip sponsored content. Customize per channel: You can *disable* or *enable* SponsorBlock for certain channels only to overwrite the behavior set on the [Settings](settings) page. Selecting *unset* will remove the overwrite and your setting will fall back to the default on the settings page.
diff --git a/docs/Downloads.md b/docs/Downloads.md
deleted file mode 100644
index 2174644..0000000
--- a/docs/Downloads.md
+++ /dev/null
@@ -1,43 +0,0 @@
-# Downloads Page
-Accessible at `/downloads/` of your Tube Archivist, this page handles all the download functionality.
-
-
-## Rescan Subscriptions
-The **Rescan Subscriptions** icon will start a background task to look for new videos from the channels and playlists you are subscribed to. You can define the channel and playlist page size on the [settings page](Settings#subscriptions). With the default page size, expect this process to take around 2-3 seconds for each channel or playlist you are subscribed to. A status message will show the progress.
-
-Then for every video found, **Tube Archivist** will skip the video if it has already been downloaded or if you added it to the *ignored* list before. All the other videos will get added to the download queue. Expect this to take around 2 seconds for each video as **Tube Archivist** needs to grab some additional metadata. New videos will get added at the bottom of the download queue.
-
-## Download Queue
-The **Start Download** icon will start the download process from the top of the queue. Take a look at the relevant settings on the [Settings Page](Settings#downloads). Once the process has started, a progress message will show with additional details and controls:
-- The stop icon will gracefully stop the download process once the current video has finished successfully.
-- The cancel icon is equivalent to killing the process and will stop the download immediately. Any leftover files will get deleted, the canceled video will still be available in the download queue.
-
-After downloading, Tube Archivist tries to add new videos to already indexed playlists.
-
-## Add to Download Queue
-The **Add to Download Queue** icon opens a text field to manually add videos to the download queue. You have a few options:
-- Add a link to a YouTube video. For example *https://www.youtube.com/watch?v=2tdiKTSdE9Y*.
-- Add a YouTube video ID. For example *2tdiKTSdE9Y*.
-- Add a link to a YouTube video by providing the shortened URL, for example *https://youtu.be/2tdiKTSdE9Y*.
-- Add a Channel ID or Channel URL to add every available video to the download queue. This will ignore the channel page size as described before and is meant for an initial download of the whole channel. You can still ignore selected videos before starting the download.
-- Add a channel name like for example *https://www.youtube.com/c/TomScottGo*.
-- Add a playlist ID or URL to add every available video in the list to the download queue, for example *https://www.youtube.com/playlist?list=PL96C35uN7xGLLeET0dOWaKHkAlPsrkcha* or *PL96C35uN7xGLLeET0dOWaKHkAlPsrkcha*.
- - Note: When adding a playlist to the queue, this playlist will automatically get [indexed](Playlists#playlist-detail).
- - Note: When you add a link to a video in a playlist, Tube Archivist assumes you want to download only the specific video and not the whole playlist, for example *https://www.youtube.com/watch?v=CINVwWHlzTY&list=PL96C35uN7xGLLeET0dOWaKHkAlPsrkcha* will only add one video *CINVwWHlzTY* to the queue.
-- Add one link per line.
-
-## The Download Queue
-Below the three buttons you'll find the download queue. New items get added at the bottom of the queue; the next video to download once you click on **Start Download** will be the first in the list.
-
-Every video in the download queue has two buttons:
-- **Ignore**: This will remove that video from the download queue and this video will not get added again, even when you **Rescan Subscriptions**.
-- **Download now**: This will give priority to this video. If the download process is already running, the prioritized video will get downloaded as soon as the current video is finished. If there is no download process running, this will start downloading this single video and stop after that.
-
-The button **Delete all queued** will delete all pending videos from the download queue.
-
-You can flip the view by activating **Show Only Ignored Videos**. This will show all videos you have previously *ignored*.
-Every video in the ignored list has two buttons:
-- **Forget**: This will delete the item from the ignored list.
-- **Add to Queue**: This will add the ignored video back to the download queue.
-
-The button **Delete all ignored** will delete all videos you have previously ignored.
diff --git a/docs/FAQ.md b/docs/FAQ.md
deleted file mode 100644
index 7d5c9ee..0000000
--- a/docs/FAQ.md
+++ /dev/null
@@ -1,34 +0,0 @@
-# Frequently Asked Questions
-
-## 1. Scope of this project
-Tube Archivist is *Your self hosted YouTube media server*, which also defines the primary scope of what this project tries to do:
-- **Self hosted**: This assumes you have full control over the underlying operating system and hardware and can configure things to work properly with Docker, its volumes and networks, as well as whatever disk storage and filesystem you choose to use.
-- **YouTube**: Downloading, indexing and playing videos from YouTube, there are currently no plans to expand this to any additional platforms.
-- **Media server**: This project tries to be a standalone media server in its own web interface.
-
-In addition to that, progress is also happening on:
-- **API**: Endpoints for additional integrations.
-- **Browser Extension**: To integrate between youtube.com and Tube Archivist.
-
-Defining the scope is important for the success of any project:
-- A scope too broad will result in development effort spreading too thin and runs the danger that this project tries to do too many things and none of them well.
-- A scope too narrow will make this project uninteresting and will exclude audiences that could also benefit from this project.
-- Not defining a scope will easily lead to misunderstandings and false hopes of where this project tries to go.
-
-Of course this is subject to change, as this project continues to grow and more people contribute.
-
-## 2. Emby/Plex/Jellyfin/Kodi integrations
-Although there are similarities between these excellent projects and Tube Archivist, they have a very different use case. Trying to fit the metadata relations and database structure of a YouTube archival project into these media servers that specialize in Movies and TV shows is always going to be limiting.
-
-Part of the scope is to be its own media server, so that's where the focus and effort of this project is. That being said, the nature of self hosted and open source software gives you all the possible freedom to use your media as you wish.
-
-## 3. To Docker or not to Docker
-This project is a classical Docker application: there are multiple moving parts that need to be able to interact with each other and need to be compatible with multiple architectures and operating systems. Additionally, Docker drastically reduces development complexity, which is highly appreciated.
-
-So Docker is the only supported installation method. If you don't have any experience with Docker, consider investing the time to learn this very useful technology.
-
-## 4. Finetuning Elasticsearch
-A minimal configuration of Elasticsearch (ES) is provided in the example docker-compose.yml file. ES is highly configurable and very interesting to learn more about. Refer to the [documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html) if you want to get into it.
-
-## 5. Advanced Authentication
-If you'd like to use things like SSO, LDAP or 2FA to log in, consider using something like Authelia as a reverse proxy so this project can focus on its core task. Tube Archivist has a *remember me* checkbox at login to extend your session's lifetime in your browser.
diff --git a/docs/Home.md b/docs/Home.md
deleted file mode 100644
index a1b5098..0000000
--- a/docs/Home.md
+++ /dev/null
@@ -1,31 +0,0 @@
-# Tube Archivist Wiki
-Welcome to the official Tube Archivist Wiki. This is an up-to-date documentation of user functionality.
-
-Table of contents:
-* [FAQ](FAQ): Frequently asked questions about what this project is and tries to do
-* [Channels](Channels): Browse your channels, handle channel subscriptions
-* [Playlists](Playlists): Browse your indexed playlists, handle playlist subscriptions
-* [Downloads](Downloads): Scanning subscriptions, handle download queue
-* [Settings](Settings): All the configuration options
-* [Video](Video): All details of a single video and playlist navigation.
-* [Users](Users): User management admin interface
-* [Installation](Installation): WIP - detailed installation instructions for various platforms.
-
-## Getting Started
-1. [Subscribe](Channels#channels-overview) to some of your favourite YouTube channels.
-2. [Scan](Downloads#rescan-subscriptions) subscriptions to add the latest videos to the download queue.
-3. [Add](Downloads#add-to-download-queue) additional videos, channels or playlists - ignore the ones you don't want to download.
-4. [Download](Downloads#download-queue) and let **Tube Archivist** do its thing.
-5. Sit back and enjoy your archived and indexed collection!
-
-## General Navigation
-* Clicking on the channel name or the channel icon brings you to the dedicated channel page to show videos from that channel.
-* Clicking on a video title brings you to the dedicated video page and shows additional details.
-* Clicking on a video thumbnail opens the video player and starts streaming the selected video.
-* Clicking on the search icon will open a dedicated search page to search over your complete index.
-* The pagination - if available - builds links for up to 10'000 results; use the search, sort or filter functionality to find what you are looking for.
-
-
-An empty checkbox icon will show for videos you haven't marked as watched. Click on it and the icon will change to a filled checkbox indicating it as watched - click again to revert.
-
-When available the gridview icon will display the list in a grid, the listview icon will arrange the items in a list. The sort icon will open additional sort options.
diff --git a/docs/Installation.md b/docs/Installation.md
deleted file mode 100644
index a6cb554..0000000
--- a/docs/Installation.md
+++ /dev/null
@@ -1,56 +0,0 @@
-# Detailed Installation Instructions for Various Platforms
-
-## Unraid
-
-Tube Archivist and all of its dependencies are located in the [community applications](https://forums.unraid.net/topic/38582-plug-in-community-applications/) store. The three containers you will need are as follows:
-
-- **TubeArchivist-RedisJSON**: This container acts as a cache and temporary link between the application and the file system. Used to store and display messages and configuration variables.
-- **TubeArchivist-ES**: ElasticSearch stores video meta data and makes everything searchable. Also keeps track of the download queue.
-- **TubeArchivist**: Once your YouTube video collection grows, it becomes hard to search and find a specific video. That's where Tube Archivist comes in: By indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos without hassle offline through a convenient web interface.
-
-### Step 1: Install `TubeArchivist-RedisJSON`
-
-![enter image description here](https://i.imgur.com/ycAqFRU.png)
-This is the easiest of the three containers to set up; just make sure that you do not have any port conflicts, and that your `/data` is mounted to the correct path. The other containers will map to the same directory.
-
-If you need to install `TubeArchivist-RedisJSON` on a different port, you'll have to follow [these steps](https://github.com/tubearchivist/tubearchivist#redis-on-a-custom-port) later on when installing the `TubeArchivist` container.
-
-
-### Step 2: Install `TubeArchivist-ES`
-![enter image description here](https://i.imgur.com/o6tsTdt.png)
-Elasticsearch is also pretty easy to set up. Again, make sure you have no port conflicts, make sure that you map `/usr/share/elasticsearch/data` to the same directory as `RedisJSON`, and make sure to change the default password to something more secure.
-
-There are three additional settings in the "show more settings" area, but leave those as they are.
-
-
-### Step 3: Install `TubeArchivist`
-
-![enter image description here](https://i.imgur.com/dwSCfgO.png)
-It's finally time to set up TubeArchivist!
-
-- `Port:` Again, make sure that you have no port conflicts on 8000.
-- `Youtube Media Path:` This is where you'll download all of your videos to. Make sure that this is an empty directory to not cause confusion when starting the application. If you have existing videos that you'd like to import into Tube Archivist, please check out the [settings wiki](https://github.com/tubearchivist/tubearchivist/wiki/Settings#manual-media-files-import).
-- `Appdata:` This should be the same base path as the other two containers.
-- `TA Username:` This will be your username for TubeArchivist.
-- `TA Password:` This will be your password for TubeArchivist.
-- `Redis:` This will be JUST the IP address of your Redis container.
-- `ElasticSearch Password:` This is the password you defined in the `TubeArchivist-ES` container.
-- `ElasticSearch:` This seems to cause some confusion, but it's a pretty simple step: just replace the IP and port to match your `TubeArchivist-ES` container (example: if your IP is 192.168.1.15, the value should be http://192.168.1.15:9200).
-- `Time Zone:` This is an important step for your scheduler. To find your timezone, use a site like [TimeZoneConverter](http://www.timezoneconverter.com/cgi-bin/findzone.tzc).
-
-### From there, you should be able to start up your containers and you're good to go!
-If you're still having trouble, join us on [discord](https://discord.gg/AFwz8nE7BK) and come to the #unraid channel.
diff --git a/docs/Playlists.md b/docs/Playlists.md
deleted file mode 100644
index cede675..0000000
--- a/docs/Playlists.md
+++ /dev/null
@@ -1,23 +0,0 @@
-# Playlist Overview and Playlist Detail Page
-The playlists are organized in two different levels, similar to the [channels](Channels):
-
-## Playlist Overview
-Accessible at `/playlist/` of your Tube Archivist, this **Overview Page** shows a list of all playlists you have indexed over all your channels.
-- You can filter that list with the toggle to show only playlists you are subscribed to.
-
-You can index playlists of a channel from the channel detail page as described [here](Channels#channel-detail).
-
-The **Subscribe to Playlist** button opens a text field to subscribe to playlists. You have a few options:
-- Enter the YouTube playlist ID, for example: *PL96C35uN7xGLLeET0dOWaKHkAlPsrkcha*
-- Enter the dedicated YouTube playlist URL, for example: *https://www.youtube.com/playlist?list=PL96C35uN7xGLLeET0dOWaKHkAlPsrkcha*
-- Add one per line.
-- NOTE: It doesn't make sense to subscribe to a playlist if you are already subscribed to the corresponding channel, as this will slow down the **Rescan Subscriptions** [task](Downloads#rescan-subscriptions).
-
-You can search your indexed playlists by clicking on the search icon. This will open a dedicated page.
-
-## Playlist Detail
-Each playlist will get a dedicated playlist detail page accessible at `/playlist/<playlist-id>/` of your Tube Archivist. This page shows all the videos you have downloaded from this playlist.
-
-- If you are subscribed to the playlist, an Unsubscribe button will show, else the Subscribe button will show.
-- The Mark as Watched button will mark all videos of this playlist as watched.
-- The **Delete Playlist** button will give you the option to delete just the *metadata* which won't delete any media files or *delete all* which will delete metadata plus all videos belonging to this playlist.
\ No newline at end of file
diff --git a/docs/Settings.md b/docs/Settings.md
deleted file mode 100644
index 50d1d4c..0000000
--- a/docs/Settings.md
+++ /dev/null
@@ -1,136 +0,0 @@
-# Settings Page
-Accessible at `/settings/` of your **Tube Archivist**, this page holds all the configurations and additional functionality related to the database.
-
-Click on **Update Settings** at the bottom of the form to apply your configurations.
-
-## Color scheme
-Switch between the easy on the eyes dark theme and the burning bright theme.
-
-## Archive View
-- **Page Size**: Defines how many results get displayed on a given page. Same value goes for all archive views.
-
-## Subscriptions
-Settings related to the channel management.
-- **Channel Page Size**: Defines how many pages will get analyzed by **Tube Archivist** each time you click on *Rescan Subscriptions*. The default page size used by yt-dlp is **50**, which is also the recommended value to set here. Any higher value will slow down the rescan process: for example, if you set the value to 51, yt-dlp will have to go through 2 pages of results instead of 1, doubling the time that process takes.
-
-## Downloads
-Settings related to the download process.
-- **Download Limit**: Stop the download process after downloading the set quantity of videos.
-- **Download Speed Limit**: Set your download speed limit in KB/s. This will pass the option `--limit-rate` to yt-dlp.
-- **Throttled Rate Limit**: Restart download if the download speed drops below this value in KB/s. This will pass the option `--throttled-rate` to yt-dlp. Using this option might have a negative effect if you have an unstable or slow internet connection.
-- **Sleep Interval**: Time in seconds to sleep between requests to YouTube. It's a good idea to set this to **3** seconds. Might be necessary to avoid throttling.
-- **Auto Delete Watched Videos**: Automatically delete videos marked as watched after selected days. If activated, checks your videos after download task is finished.
-
-## Download Format
-Additional settings passed to yt-dlp.
-- **Format**: This controls which streams get downloaded and is equivalent to passing `--format` to yt-dlp. Use one of the recommended formats or look at the documentation of [yt-dlp](https://github.com/yt-dlp/yt-dlp#format-selection). Please note: The option `--merge-output-format mp4` is automatically passed to yt-dlp to guarantee browser compatibility. Similarly, `--check-formats` is passed as well to check that the selected formats are actually downloadable.
-- **Embed Metadata**: This saves the available tags directly into the media file by passing `--embed-metadata` to yt-dlp.
-- **Embed Thumbnail**: This will save the thumbnail into the media file by passing `--embed-thumbnail` to yt-dlp.
-
-## Subtitles
-- **Download Setting**: Select the subtitle language you'd like to download. Add a comma separated list for multiple languages, for example *en, de*.
-- **Source Settings**: User created subtitles are provided by the uploader and are usually the video script. Auto generated subtitles are from YouTube; quality varies, particularly for auto translated tracks.
-- **Index Settings**: Enabling subtitle indexing will add the lines to Elasticsearch and will make subtitles searchable. This will increase the index size and is not recommended on low-end hardware.
-
-## Integrations
-All third party integrations of TubeArchivist will **always** be *opt in*.
-- **API**: Your access token for the Tube Archivist API.
-- **returnyoutubedislike.com**: This will retrieve dislikes and average ratings for each video by integrating with the API from [returnyoutubedislike.com](https://www.returnyoutubedislike.com/).
-- **SponsorBlock**: Using [SponsorBlock](https://sponsor.ajay.app/) to get and skip sponsored content. If a video doesn't have timestamps, or has unlocked timestamps, use the browser addon to contribute to this excellent project. Can also be activated and deactivated as a per [channel overwrite](Settings#channel-customize).
-- **Cast**: Enabling the cast integration in the settings page will load an additional JS library from **Google**.
- * Requirements
- - HTTPS
- * To use the cast integration HTTPS needs to be enabled, which can be done using a reverse proxy. This is a requirement by Google as communication to the cast device is required to be encrypted, but the content itself is not.
- - Supported Browser
- * A supported browser is required for this integration such as Google Chrome. Other browsers, especially Chromium-based browsers, may support casting by enabling it in the settings.
- - Subtitles
- * Subtitles are supported however they do not work out of the box and require additional configuration. Due to requirements by Google, to use subtitles you need additional headers which will need to be configured in your reverse proxy. See this [page](https://developers.google.com/cast/docs/web_sender/advanced#cors_requirements) for the specific requirements.
- > You need the following headers: Content-Type, Accept-Encoding, and Range. Note that the last two headers, Accept-Encoding and Range, are additional headers that you may not have needed previously.
- > Wildcards "*" cannot be used for the Access-Control-Allow-Origin header. If the page has protected media content, it must use a domain instead of a wildcard.
-
-
-# Scheduler Setup
-Schedule settings expect a cron-like format, where the first value is minute, the second is hour and the third is day of the week. Day 0 is Sunday, day 1 is Monday, etc.
-
-Examples:
-- **0 15 \***: Run task every day at 15:00 in the afternoon.
-- **30 8 \*/2**: Run task every second day of the week (Sun, Tue, Thu, Sat) at 08:30 in the morning.
-- **0 \*/3,8-17 \***: Execute every hour divisible by 3, and every hour during office hours (8 in the morning - 5 in the afternoon).
-- **0 8,16 \***: Execute every day at 8 in the morning and at 4 in the afternoon.
-- **auto**: Sensible default.
-- **0**: (zero), deactivate that task.
-
-NOTE:
-- Changes in the scheduler settings require a container restart to take effect.
-- Cron formats like *number*/*number* are non-standard cron and are not supported by the scheduler; for example **0 0/12 \*** is invalid, use **0 \*/12 \*** instead.
-- Avoid unnecessarily frequent schedules so you don't get blocked by YouTube. For that reason, * or wildcards for minutes are not supported.
-
-## Rescan Subscriptions
-This is the equivalent of the task run from the downloads page: it looks through your channels and playlists and adds missing videos to the download queue.
-
-## Start download
-Start downloading all videos currently in the download queue.
-
-## Refresh Metadata
-Rescan videos, channels and playlists on YouTube and update metadata periodically. This will also deactivate an item and exclude it from future refreshes if the link on YouTube is no longer available. This task is meant to be run once per day, so set your schedule accordingly.
-
-The field **Refresh older than x days** takes the number of days after which TubeArchivist will consider an item *outdated*. This value is used to calculate how many items need to be refreshed today based on the total indexed. This will spread out the requests to YouTube. A sensible value here is **90** days.
-
-## Thumbnail check
-This will check if all expected thumbnails are there and will delete any artwork without matching video.
-
-## Index backup
-Create a zip file of the metadata and select **Max auto backups to keep** to automatically delete old backups created from this task.
-
-
-# Actions
-Additional database functionality.
-
-## Manual Media Files Import
-So far this depends on the video you are trying to import still being available on YouTube to get the metadata. Add the files you'd like to import to the */cache/import* folder. Then start the process from the settings page *Manual Media Files Import*. Make sure to follow one of the two methods below.
-
-### Method 1:
-Add a matching *.json* file with the media file. Both files need to have the same base name, for example:
-- For the media file: \<base-name\>.mp4
-- For the JSON file: \<base-name\>.info.json
-- Alternate JSON file: \<base-name\>.json
-
-**Tube Archivist** then looks for the 'id' key within the JSON file to identify the video.
-
-### Method 2:
-Detect the YouTube ID from the filename; this accepts the default yt-dlp naming convention for file names like:
-- \<video-title\> \[\<video-id\>\].mp4
-- The YouTube ID in square brackets at the end of the filename is the crucial part.
-
-### Some notes:
-- This will **consume** the files you put into the import folder: files will get converted to mp4 if needed (this might take a long time...) and moved to the archive; *.json* files will get deleted upon completion to avoid having duplicates on the next run.
-- For best file transcoding quality, convert your media files with desired settings first before importing (#138).
-- There should be no subdirectories added to */cache/import*, only video files. If your existing video library has video files inside subdirectories, you can get all the files into one directory by running `find ./ -mindepth 2 -type f -exec mv '{}' . \;` from the top-level directory of your existing video library. You can also delete any remaining empty subdirectories with `find ./ -mindepth 1 -type d -delete`.
-- Maybe start with a subset of your files to import to make sure everything goes well...
-- Follow the logs to monitor progress and errors: `docker-compose logs -f tubearchivist`.
-
-## Embed thumbnails into media file
-This will write or overwrite all thumbnails in the media file using the downloaded thumbnail. This is only necessary if you didn't download the files with the option *Embed Thumbnail* enabled or want to make sure all media files get the newest thumbnail. Follow the docker-compose logs to monitor progress.
-
-## Backup Database
-This will back up your metadata into a zip file. The file will get stored at *cache/backup* and will contain the necessary **nd-json** files to restore the Elasticsearch index, plus a complete export of the index in a set of conventional **json** files.
-
-BE AWARE: This will **not** back up any media files, just the metadata from Elasticsearch.
-
-## Restore From Backup
-The restore functionality expects the same zip file in *cache/backup* as created by the **Backup database** function. This will recreate the index from the snapshot. There will be a list of all available backups to choose from. The *source* tag can have these different values:
-- **manual**: For backups manually created from here on the settings page.
-- **auto**: For backups automatically created via a scheduled task.
-- **update**: For backups created after a Tube Archivist update due to changes in the index.
-- **False**: Undefined.
-
-BE AWARE: This will **replace** your current index with the one from the backup file. This won't restore any media files.
-
-## Rescan Filesystem
-This function will go through all your media files and look at the whole index to try to find any issues:
-- Should the filename not match with the indexed media URL, this will rename the video files correctly and update the index with the new link.
-- When you delete media files from the filesystem outside of the Tube Archivist interface, this will delete leftover metadata from the index.
-- When you have media files that are not indexed yet, this will grab the metadata from YouTube like it was a newly downloaded video. This can be useful when restoring from an older backup file with missing metadata but already downloaded media files. NOTE: This only works if the media files are named in the same convention as Tube Archivist uses; particularly, the YouTube ID needs to be at the same position in the filename. Alternatively, see above for *Manual Media Files Import*.
-- This will also check all of your thumbnails and download any that are missing.
-
-BE AWARE: There is no undo.
diff --git a/docs/Users.md b/docs/Users.md
deleted file mode 100644
index 27453d7..0000000
--- a/docs/Users.md
+++ /dev/null
@@ -1,20 +0,0 @@
-# User Management
-
-For now, **Tube Archivist** is a single user application. You can create multiple users with different names and passwords; they will share the same videos and permissions, but some interface configurations are on a per-user basis. *More is on the roadmap*.
-
-## Superuser
-The first user gets created with the environment variables **TA_USERNAME** and **TA_PASSWORD** from your docker-compose file. That first user will automatically have *superuser* privileges.
-
-## Admin Interface
-When logged in from your *superuser* account, you are able to access the admin interface from the settings page or at `/admin/`. This interface holds all functionality for user management.
-
-## Create additional users
-From the admin interface, click on *Accounts* to get a list of all users. From there you can create additional users: click on *Add Account*, provide a name, password and password confirmation, and click on *Save* to create the user.
-
-## Changing users
-You can delete or change permissions and password of a user by clicking on the username from the *Accounts* list page and follow the interface from there. Changing the password of the *superuser* here will overwrite the password originally set with the environment variables.
-
-## Reset
-Delete all user configurations by deleting the file `cache/db.sqlite3` and restart the container. This will create the superuser again from the environment variables.
-
-NOTE: Future improvements here will most likely require such a reset.
\ No newline at end of file
diff --git a/docs/Video.md b/docs/Video.md
deleted file mode 100644
index d9b36fd..0000000
--- a/docs/Video.md
+++ /dev/null
@@ -1,11 +0,0 @@
-# Video Page
-
-Every video downloaded gets a dedicated page accessible at `/video/<video-id>/` of your Tube Archivist.
-
-Clicking on the channel name or the channel icon will bring you to the dedicated channel detail [page](Channels#channel-detail).
-
-The button **Delete Video** will delete that video including the media file.
-
-When available, a playlist navigation will show at the bottom. Clicking on the playlist name will bring you to the dedicated [Playlist Detail](Playlists#playlist-detail) page showing all videos downloaded from that playlist. The number in square brackets indicates the position of the current video in that playlist.
-
-Clicking on the next or previous video name or thumbnail will bring you to that dedicated video page.
\ No newline at end of file
diff --git a/docs/assets/TubeArchivist-ES.png b/docs/assets/TubeArchivist-ES.png
deleted file mode 100644
index ae8f656..0000000
Binary files a/docs/assets/TubeArchivist-ES.png and /dev/null differ
diff --git a/docs/assets/TubeArchivist-RedisJSON.png b/docs/assets/TubeArchivist-RedisJSON.png
deleted file mode 100644
index c86c37a..0000000
Binary files a/docs/assets/TubeArchivist-RedisJSON.png and /dev/null differ
diff --git a/docs/assets/TubeArchivist.png b/docs/assets/TubeArchivist.png
deleted file mode 100644
index 20c61d7..0000000
Binary files a/docs/assets/TubeArchivist.png and /dev/null differ
diff --git a/docs/assets/icon-add.png b/docs/assets/icon-add.png
deleted file mode 100644
index 1b8486c..0000000
Binary files a/docs/assets/icon-add.png and /dev/null differ
diff --git a/docs/assets/icon-close-blue.png b/docs/assets/icon-close-blue.png
deleted file mode 100644
index 20d86c8..0000000
Binary files a/docs/assets/icon-close-blue.png and /dev/null differ
diff --git a/docs/assets/icon-close-red.png b/docs/assets/icon-close-red.png
deleted file mode 100644
index 15d256d..0000000
Binary files a/docs/assets/icon-close-red.png and /dev/null differ
diff --git a/docs/assets/icon-download.png b/docs/assets/icon-download.png
deleted file mode 100644
index 5a90b27..0000000
Binary files a/docs/assets/icon-download.png and /dev/null differ
diff --git a/docs/assets/icon-gridview.png b/docs/assets/icon-gridview.png
deleted file mode 100644
index 868c3d0..0000000
Binary files a/docs/assets/icon-gridview.png and /dev/null differ
diff --git a/docs/assets/icon-listview.png b/docs/assets/icon-listview.png
deleted file mode 100644
index 457967a..0000000
Binary files a/docs/assets/icon-listview.png and /dev/null differ
diff --git a/docs/assets/icon-rescan.png b/docs/assets/icon-rescan.png
deleted file mode 100644
index 73b505a..0000000
Binary files a/docs/assets/icon-rescan.png and /dev/null differ
diff --git a/docs/assets/icon-search.png b/docs/assets/icon-search.png
deleted file mode 100644
index ee82c61..0000000
Binary files a/docs/assets/icon-search.png and /dev/null differ
diff --git a/docs/assets/icon-seen.png b/docs/assets/icon-seen.png
deleted file mode 100644
index fc5988c..0000000
Binary files a/docs/assets/icon-seen.png and /dev/null differ
diff --git a/docs/assets/icon-sort.png b/docs/assets/icon-sort.png
deleted file mode 100644
index c8d895c..0000000
Binary files a/docs/assets/icon-sort.png and /dev/null differ
diff --git a/docs/assets/icon-stop.png b/docs/assets/icon-stop.png
deleted file mode 100644
index 9761259..0000000
Binary files a/docs/assets/icon-stop.png and /dev/null differ
diff --git a/docs/assets/icon-unseen.png b/docs/assets/icon-unseen.png
deleted file mode 100644
index 8b7778b..0000000
Binary files a/docs/assets/icon-unseen.png and /dev/null differ
diff --git a/tubearchivist/www/next-env.d.ts b/next-env.d.ts
similarity index 100%
rename from tubearchivist/www/next-env.d.ts
rename to next-env.d.ts
diff --git a/tubearchivist/www/next.config.js b/next.config.js
similarity index 100%
rename from tubearchivist/www/next.config.js
rename to next.config.js
diff --git a/tubearchivist/www/package.json b/package.json
similarity index 95%
rename from tubearchivist/www/package.json
rename to package.json
index 72afb53..2c938d0 100644
--- a/tubearchivist/www/package.json
+++ b/package.json
@@ -1,5 +1,5 @@
{
- "name": "www",
+ "name": "tubearchivist-frontend",
"version": "0.1.0",
"private": true,
"scripts": {
diff --git a/tubearchivist/www/public/favicon.ico b/public/favicon.ico
similarity index 100%
rename from tubearchivist/www/public/favicon.ico
rename to public/favicon.ico
diff --git a/tubearchivist/static/favicon/android-chrome-192x192.png b/public/favicon/android-chrome-192x192.png
similarity index 100%
rename from tubearchivist/static/favicon/android-chrome-192x192.png
rename to public/favicon/android-chrome-192x192.png
diff --git a/tubearchivist/static/favicon/android-chrome-512x512.png b/public/favicon/android-chrome-512x512.png
similarity index 100%
rename from tubearchivist/static/favicon/android-chrome-512x512.png
rename to public/favicon/android-chrome-512x512.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-114x114-precomposed.png b/public/favicon/apple-touch-icon-114x114-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-114x114-precomposed.png
rename to public/favicon/apple-touch-icon-114x114-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-114x114.png b/public/favicon/apple-touch-icon-114x114.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-114x114.png
rename to public/favicon/apple-touch-icon-114x114.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-120x120-precomposed.png b/public/favicon/apple-touch-icon-120x120-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-120x120-precomposed.png
rename to public/favicon/apple-touch-icon-120x120-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-120x120.png b/public/favicon/apple-touch-icon-120x120.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-120x120.png
rename to public/favicon/apple-touch-icon-120x120.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-144x144-precomposed.png b/public/favicon/apple-touch-icon-144x144-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-144x144-precomposed.png
rename to public/favicon/apple-touch-icon-144x144-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-144x144.png b/public/favicon/apple-touch-icon-144x144.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-144x144.png
rename to public/favicon/apple-touch-icon-144x144.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-152x152-precomposed.png b/public/favicon/apple-touch-icon-152x152-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-152x152-precomposed.png
rename to public/favicon/apple-touch-icon-152x152-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-152x152.png b/public/favicon/apple-touch-icon-152x152.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-152x152.png
rename to public/favicon/apple-touch-icon-152x152.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-180x180-precomposed.png b/public/favicon/apple-touch-icon-180x180-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-180x180-precomposed.png
rename to public/favicon/apple-touch-icon-180x180-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-180x180.png b/public/favicon/apple-touch-icon-180x180.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-180x180.png
rename to public/favicon/apple-touch-icon-180x180.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-57x57-precomposed.png b/public/favicon/apple-touch-icon-57x57-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-57x57-precomposed.png
rename to public/favicon/apple-touch-icon-57x57-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-57x57.png b/public/favicon/apple-touch-icon-57x57.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-57x57.png
rename to public/favicon/apple-touch-icon-57x57.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-60x60-precomposed.png b/public/favicon/apple-touch-icon-60x60-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-60x60-precomposed.png
rename to public/favicon/apple-touch-icon-60x60-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-60x60.png b/public/favicon/apple-touch-icon-60x60.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-60x60.png
rename to public/favicon/apple-touch-icon-60x60.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-72x72-precomposed.png b/public/favicon/apple-touch-icon-72x72-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-72x72-precomposed.png
rename to public/favicon/apple-touch-icon-72x72-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-72x72.png b/public/favicon/apple-touch-icon-72x72.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-72x72.png
rename to public/favicon/apple-touch-icon-72x72.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-76x76-precomposed.png b/public/favicon/apple-touch-icon-76x76-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-76x76-precomposed.png
rename to public/favicon/apple-touch-icon-76x76-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-76x76.png b/public/favicon/apple-touch-icon-76x76.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-76x76.png
rename to public/favicon/apple-touch-icon-76x76.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon-precomposed.png b/public/favicon/apple-touch-icon-precomposed.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon-precomposed.png
rename to public/favicon/apple-touch-icon-precomposed.png
diff --git a/tubearchivist/static/favicon/apple-touch-icon.png b/public/favicon/apple-touch-icon.png
similarity index 100%
rename from tubearchivist/static/favicon/apple-touch-icon.png
rename to public/favicon/apple-touch-icon.png
diff --git a/tubearchivist/static/favicon/browserconfig.xml b/public/favicon/browserconfig.xml
similarity index 100%
rename from tubearchivist/static/favicon/browserconfig.xml
rename to public/favicon/browserconfig.xml
diff --git a/tubearchivist/static/favicon/favicon-16x16.png b/public/favicon/favicon-16x16.png
similarity index 100%
rename from tubearchivist/static/favicon/favicon-16x16.png
rename to public/favicon/favicon-16x16.png
diff --git a/tubearchivist/static/favicon/favicon-32x32.png b/public/favicon/favicon-32x32.png
similarity index 100%
rename from tubearchivist/static/favicon/favicon-32x32.png
rename to public/favicon/favicon-32x32.png
diff --git a/tubearchivist/static/favicon/favicon.ico b/public/favicon/favicon.ico
similarity index 100%
rename from tubearchivist/static/favicon/favicon.ico
rename to public/favicon/favicon.ico
diff --git a/tubearchivist/static/favicon/mstile-150x150.png b/public/favicon/mstile-150x150.png
similarity index 100%
rename from tubearchivist/static/favicon/mstile-150x150.png
rename to public/favicon/mstile-150x150.png
diff --git a/tubearchivist/static/favicon/safari-pinned-tab.svg b/public/favicon/safari-pinned-tab.svg
similarity index 100%
rename from tubearchivist/static/favicon/safari-pinned-tab.svg
rename to public/favicon/safari-pinned-tab.svg
diff --git a/tubearchivist/static/favicon/site.webmanifest b/public/favicon/site.webmanifest
similarity index 100%
rename from tubearchivist/static/favicon/site.webmanifest
rename to public/favicon/site.webmanifest
diff --git a/tubearchivist/static/img/banner-tube-archivist-dark.png b/public/img/banner-tube-archivist-dark.png
similarity index 100%
rename from tubearchivist/static/img/banner-tube-archivist-dark.png
rename to public/img/banner-tube-archivist-dark.png
diff --git a/tubearchivist/static/img/banner-tube-archivist-light.png b/public/img/banner-tube-archivist-light.png
similarity index 100%
rename from tubearchivist/static/img/banner-tube-archivist-light.png
rename to public/img/banner-tube-archivist-light.png
diff --git a/tubearchivist/static/img/default-channel-banner.jpg b/public/img/default-channel-banner.jpg
similarity index 100%
rename from tubearchivist/static/img/default-channel-banner.jpg
rename to public/img/default-channel-banner.jpg
diff --git a/tubearchivist/static/img/default-channel-icon.jpg b/public/img/default-channel-icon.jpg
similarity index 100%
rename from tubearchivist/static/img/default-channel-icon.jpg
rename to public/img/default-channel-icon.jpg
diff --git a/tubearchivist/static/img/default-video-thumb.jpg b/public/img/default-video-thumb.jpg
similarity index 100%
rename from tubearchivist/static/img/default-video-thumb.jpg
rename to public/img/default-video-thumb.jpg
diff --git a/tubearchivist/static/img/icon-add.svg b/public/img/icon-add.svg
similarity index 100%
rename from tubearchivist/static/img/icon-add.svg
rename to public/img/icon-add.svg
diff --git a/tubearchivist/static/img/icon-close.svg b/public/img/icon-close.svg
similarity index 100%
rename from tubearchivist/static/img/icon-close.svg
rename to public/img/icon-close.svg
diff --git a/tubearchivist/static/img/icon-download.svg b/public/img/icon-download.svg
similarity index 100%
rename from tubearchivist/static/img/icon-download.svg
rename to public/img/icon-download.svg
diff --git a/tubearchivist/static/img/icon-exit.svg b/public/img/icon-exit.svg
similarity index 100%
rename from tubearchivist/static/img/icon-exit.svg
rename to public/img/icon-exit.svg
diff --git a/tubearchivist/static/img/icon-eye.svg b/public/img/icon-eye.svg
similarity index 100%
rename from tubearchivist/static/img/icon-eye.svg
rename to public/img/icon-eye.svg
diff --git a/tubearchivist/static/img/icon-gear.svg b/public/img/icon-gear.svg
similarity index 100%
rename from tubearchivist/static/img/icon-gear.svg
rename to public/img/icon-gear.svg
diff --git a/tubearchivist/static/img/icon-gridview.svg b/public/img/icon-gridview.svg
similarity index 100%
rename from tubearchivist/static/img/icon-gridview.svg
rename to public/img/icon-gridview.svg
diff --git a/tubearchivist/static/img/icon-listview.svg b/public/img/icon-listview.svg
similarity index 100%
rename from tubearchivist/static/img/icon-listview.svg
rename to public/img/icon-listview.svg
diff --git a/tubearchivist/static/img/icon-play.svg b/public/img/icon-play.svg
similarity index 100%
rename from tubearchivist/static/img/icon-play.svg
rename to public/img/icon-play.svg
diff --git a/tubearchivist/static/img/icon-rescan.svg b/public/img/icon-rescan.svg
similarity index 100%
rename from tubearchivist/static/img/icon-rescan.svg
rename to public/img/icon-rescan.svg
diff --git a/tubearchivist/static/img/icon-search.svg b/public/img/icon-search.svg
similarity index 100%
rename from tubearchivist/static/img/icon-search.svg
rename to public/img/icon-search.svg
diff --git a/tubearchivist/static/img/icon-seen.svg b/public/img/icon-seen.svg
similarity index 100%
rename from tubearchivist/static/img/icon-seen.svg
rename to public/img/icon-seen.svg
diff --git a/tubearchivist/static/img/icon-sort.svg b/public/img/icon-sort.svg
similarity index 100%
rename from tubearchivist/static/img/icon-sort.svg
rename to public/img/icon-sort.svg
diff --git a/tubearchivist/static/img/icon-star-empty.svg b/public/img/icon-star-empty.svg
similarity index 100%
rename from tubearchivist/static/img/icon-star-empty.svg
rename to public/img/icon-star-empty.svg
diff --git a/tubearchivist/static/img/icon-star-full.svg b/public/img/icon-star-full.svg
similarity index 100%
rename from tubearchivist/static/img/icon-star-full.svg
rename to public/img/icon-star-full.svg
diff --git a/tubearchivist/static/img/icon-star-half.svg b/public/img/icon-star-half.svg
similarity index 100%
rename from tubearchivist/static/img/icon-star-half.svg
rename to public/img/icon-star-half.svg
diff --git a/tubearchivist/static/img/icon-stop.svg b/public/img/icon-stop.svg
similarity index 100%
rename from tubearchivist/static/img/icon-stop.svg
rename to public/img/icon-stop.svg
diff --git a/tubearchivist/static/img/icon-thumb.svg b/public/img/icon-thumb.svg
similarity index 100%
rename from tubearchivist/static/img/icon-thumb.svg
rename to public/img/icon-thumb.svg
diff --git a/tubearchivist/static/img/icon-unseen.svg b/public/img/icon-unseen.svg
similarity index 100%
rename from tubearchivist/static/img/icon-unseen.svg
rename to public/img/icon-unseen.svg
diff --git a/tubearchivist/static/img/logo-tube-archivist-dark.png b/public/img/logo-tube-archivist-dark.png
similarity index 100%
rename from tubearchivist/static/img/logo-tube-archivist-dark.png
rename to public/img/logo-tube-archivist-dark.png
diff --git a/tubearchivist/static/img/logo-tube-archivist-light.png b/public/img/logo-tube-archivist-light.png
similarity index 100%
rename from tubearchivist/static/img/logo-tube-archivist-light.png
rename to public/img/logo-tube-archivist-light.png
diff --git a/tubearchivist/static/cast-videos.js b/public/js/cast-videos.js
similarity index 100%
rename from tubearchivist/static/cast-videos.js
rename to public/js/cast-videos.js
diff --git a/tubearchivist/static/progress.js b/public/js/progress.js
similarity index 100%
rename from tubearchivist/static/progress.js
rename to public/js/progress.js
diff --git a/tubearchivist/www/public/js/script.js b/public/js/script.js
similarity index 100%
rename from tubearchivist/www/public/js/script.js
rename to public/js/script.js
diff --git a/tubearchivist/www/public/vercel.svg b/public/vercel.svg
similarity index 100%
rename from tubearchivist/www/public/vercel.svg
rename to public/vercel.svg
diff --git a/tubearchivist/www/src/components/BoxedContent.tsx b/src/components/BoxedContent.tsx
similarity index 100%
rename from tubearchivist/www/src/components/BoxedContent.tsx
rename to src/components/BoxedContent.tsx
diff --git a/tubearchivist/www/src/components/CustomHead.tsx b/src/components/CustomHead.tsx
similarity index 100%
rename from tubearchivist/www/src/components/CustomHead.tsx
rename to src/components/CustomHead.tsx
diff --git a/tubearchivist/www/src/components/Footer.tsx b/src/components/Footer.tsx
similarity index 100%
rename from tubearchivist/www/src/components/Footer.tsx
rename to src/components/Footer.tsx
diff --git a/tubearchivist/www/src/components/Header.tsx b/src/components/Header.tsx
similarity index 100%
rename from tubearchivist/www/src/components/Header.tsx
rename to src/components/Header.tsx
diff --git a/tubearchivist/www/src/components/Layout.tsx b/src/components/Layout.tsx
similarity index 100%
rename from tubearchivist/www/src/components/Layout.tsx
rename to src/components/Layout.tsx
diff --git a/tubearchivist/www/src/components/Nav.tsx b/src/components/Nav.tsx
similarity index 100%
rename from tubearchivist/www/src/components/Nav.tsx
rename to src/components/Nav.tsx
diff --git a/tubearchivist/www/src/components/VideoList/VideoList.tsx b/src/components/VideoList/VideoList.tsx
similarity index 100%
rename from tubearchivist/www/src/components/VideoList/VideoList.tsx
rename to src/components/VideoList/VideoList.tsx
diff --git a/tubearchivist/www/src/components/VideoList/index.ts b/src/components/VideoList/index.ts
similarity index 100%
rename from tubearchivist/www/src/components/VideoList/index.ts
rename to src/components/VideoList/index.ts
diff --git a/tubearchivist/www/src/components/VideoPlayer/VideoPlayer.tsx b/src/components/VideoPlayer/VideoPlayer.tsx
similarity index 100%
rename from tubearchivist/www/src/components/VideoPlayer/VideoPlayer.tsx
rename to src/components/VideoPlayer/VideoPlayer.tsx
diff --git a/tubearchivist/www/src/components/VideoPlayer/index.tsx b/src/components/VideoPlayer/index.tsx
similarity index 100%
rename from tubearchivist/www/src/components/VideoPlayer/index.tsx
rename to src/components/VideoPlayer/index.tsx
diff --git a/tubearchivist/www/public/img/banner-tube-archivist-dark.png b/src/images/banner-tube-archivist-dark.png
similarity index 100%
rename from tubearchivist/www/public/img/banner-tube-archivist-dark.png
rename to src/images/banner-tube-archivist-dark.png
diff --git a/tubearchivist/www/public/img/banner-tube-archivist-light.png b/src/images/banner-tube-archivist-light.png
similarity index 100%
rename from tubearchivist/www/public/img/banner-tube-archivist-light.png
rename to src/images/banner-tube-archivist-light.png
diff --git a/tubearchivist/www/public/img/default-channel-banner.jpg b/src/images/default-channel-banner.jpg
similarity index 100%
rename from tubearchivist/www/public/img/default-channel-banner.jpg
rename to src/images/default-channel-banner.jpg
diff --git a/tubearchivist/www/public/img/default-channel-icon.jpg b/src/images/default-channel-icon.jpg
similarity index 100%
rename from tubearchivist/www/public/img/default-channel-icon.jpg
rename to src/images/default-channel-icon.jpg
diff --git a/tubearchivist/www/public/img/default-video-thumb.jpg b/src/images/default-video-thumb.jpg
similarity index 100%
rename from tubearchivist/www/public/img/default-video-thumb.jpg
rename to src/images/default-video-thumb.jpg
diff --git a/tubearchivist/www/public/img/icon-add.svg b/src/images/icon-add.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-add.svg
rename to src/images/icon-add.svg
diff --git a/tubearchivist/www/public/img/icon-close.svg b/src/images/icon-close.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-close.svg
rename to src/images/icon-close.svg
diff --git a/tubearchivist/www/public/img/icon-download.svg b/src/images/icon-download.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-download.svg
rename to src/images/icon-download.svg
diff --git a/tubearchivist/www/public/img/icon-exit.svg b/src/images/icon-exit.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-exit.svg
rename to src/images/icon-exit.svg
diff --git a/tubearchivist/www/public/img/icon-eye.svg b/src/images/icon-eye.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-eye.svg
rename to src/images/icon-eye.svg
diff --git a/tubearchivist/www/public/img/icon-gear.svg b/src/images/icon-gear.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-gear.svg
rename to src/images/icon-gear.svg
diff --git a/tubearchivist/www/public/img/icon-gridview.svg b/src/images/icon-gridview.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-gridview.svg
rename to src/images/icon-gridview.svg
diff --git a/tubearchivist/www/public/img/icon-listview.svg b/src/images/icon-listview.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-listview.svg
rename to src/images/icon-listview.svg
diff --git a/tubearchivist/www/public/img/icon-play.svg b/src/images/icon-play.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-play.svg
rename to src/images/icon-play.svg
diff --git a/tubearchivist/www/public/img/icon-rescan.svg b/src/images/icon-rescan.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-rescan.svg
rename to src/images/icon-rescan.svg
diff --git a/tubearchivist/www/public/img/icon-search.svg b/src/images/icon-search.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-search.svg
rename to src/images/icon-search.svg
diff --git a/tubearchivist/www/public/img/icon-seen.svg b/src/images/icon-seen.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-seen.svg
rename to src/images/icon-seen.svg
diff --git a/tubearchivist/www/public/img/icon-sort.svg b/src/images/icon-sort.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-sort.svg
rename to src/images/icon-sort.svg
diff --git a/tubearchivist/www/public/img/icon-star-empty.svg b/src/images/icon-star-empty.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-star-empty.svg
rename to src/images/icon-star-empty.svg
diff --git a/tubearchivist/www/public/img/icon-star-full.svg b/src/images/icon-star-full.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-star-full.svg
rename to src/images/icon-star-full.svg
diff --git a/tubearchivist/www/public/img/icon-star-half.svg b/src/images/icon-star-half.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-star-half.svg
rename to src/images/icon-star-half.svg
diff --git a/tubearchivist/www/public/img/icon-stop.svg b/src/images/icon-stop.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-stop.svg
rename to src/images/icon-stop.svg
diff --git a/tubearchivist/www/public/img/icon-thumb.svg b/src/images/icon-thumb.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-thumb.svg
rename to src/images/icon-thumb.svg
diff --git a/tubearchivist/www/public/img/icon-unseen.svg b/src/images/icon-unseen.svg
similarity index 100%
rename from tubearchivist/www/public/img/icon-unseen.svg
rename to src/images/icon-unseen.svg
diff --git a/tubearchivist/www/public/img/logo-tube-archivist-dark.png b/src/images/logo-tube-archivist-dark.png
similarity index 100%
rename from tubearchivist/www/public/img/logo-tube-archivist-dark.png
rename to src/images/logo-tube-archivist-dark.png
diff --git a/tubearchivist/www/public/img/logo-tube-archivist-light.png b/src/images/logo-tube-archivist-light.png
similarity index 100%
rename from tubearchivist/www/public/img/logo-tube-archivist-light.png
rename to src/images/logo-tube-archivist-light.png
diff --git a/tubearchivist/www/src/lib/constants.ts b/src/lib/constants.ts
similarity index 100%
rename from tubearchivist/www/src/lib/constants.ts
rename to src/lib/constants.ts
diff --git a/tubearchivist/www/src/lib/getChannels.ts b/src/lib/getChannels.ts
similarity index 100%
rename from tubearchivist/www/src/lib/getChannels.ts
rename to src/lib/getChannels.ts
diff --git a/tubearchivist/www/src/lib/getDownloads.ts b/src/lib/getDownloads.ts
similarity index 100%
rename from tubearchivist/www/src/lib/getDownloads.ts
rename to src/lib/getDownloads.ts
diff --git a/tubearchivist/www/src/lib/getPlaylists.ts b/src/lib/getPlaylists.ts
similarity index 100%
rename from tubearchivist/www/src/lib/getPlaylists.ts
rename to src/lib/getPlaylists.ts
diff --git a/tubearchivist/www/src/lib/getVideos.ts b/src/lib/getVideos.ts
similarity index 100%
rename from tubearchivist/www/src/lib/getVideos.ts
rename to src/lib/getVideos.ts
diff --git a/tubearchivist/www/src/lib/utils.ts b/src/lib/utils.ts
similarity index 100%
rename from tubearchivist/www/src/lib/utils.ts
rename to src/lib/utils.ts
diff --git a/tubearchivist/www/src/pages/_app.tsx b/src/pages/_app.tsx
similarity index 100%
rename from tubearchivist/www/src/pages/_app.tsx
rename to src/pages/_app.tsx
diff --git a/tubearchivist/www/src/pages/_document.tsx b/src/pages/_document.tsx
similarity index 100%
rename from tubearchivist/www/src/pages/_document.tsx
rename to src/pages/_document.tsx
diff --git a/tubearchivist/www/src/pages/api/auth/[...nextauth].ts b/src/pages/api/auth/[...nextauth].ts
similarity index 100%
rename from tubearchivist/www/src/pages/api/auth/[...nextauth].ts
rename to src/pages/api/auth/[...nextauth].ts
diff --git a/tubearchivist/www/src/pages/auth/login.tsx b/src/pages/auth/login.tsx
similarity index 100%
rename from tubearchivist/www/src/pages/auth/login.tsx
rename to src/pages/auth/login.tsx
diff --git a/tubearchivist/www/src/pages/channel.tsx b/src/pages/channel.tsx
similarity index 100%
rename from tubearchivist/www/src/pages/channel.tsx
rename to src/pages/channel.tsx
diff --git a/tubearchivist/www/src/pages/download.tsx b/src/pages/download.tsx
similarity index 100%
rename from tubearchivist/www/src/pages/download.tsx
rename to src/pages/download.tsx
diff --git a/tubearchivist/www/src/pages/index.tsx b/src/pages/index.tsx
similarity index 100%
rename from tubearchivist/www/src/pages/index.tsx
rename to src/pages/index.tsx
diff --git a/tubearchivist/www/src/pages/playlist.tsx b/src/pages/playlist.tsx
similarity index 100%
rename from tubearchivist/www/src/pages/playlist.tsx
rename to src/pages/playlist.tsx
diff --git a/tubearchivist/www/src/pages/video/[videoId].tsx b/src/pages/video/[videoId].tsx
similarity index 100%
rename from tubearchivist/www/src/pages/video/[videoId].tsx
rename to src/pages/video/[videoId].tsx
diff --git a/tubearchivist/static/css/dark.css b/src/styles/dark.css
similarity index 100%
rename from tubearchivist/static/css/dark.css
rename to src/styles/dark.css
diff --git a/tubearchivist/www/src/styles/globals.css b/src/styles/globals.css
similarity index 100%
rename from tubearchivist/www/src/styles/globals.css
rename to src/styles/globals.css
diff --git a/tubearchivist/static/css/light.css b/src/styles/light.css
similarity index 100%
rename from tubearchivist/static/css/light.css
rename to src/styles/light.css
diff --git a/tubearchivist/www/src/types/channel.ts b/src/types/channel.ts
similarity index 100%
rename from tubearchivist/www/src/types/channel.ts
rename to src/types/channel.ts
diff --git a/tubearchivist/www/src/types/download.ts b/src/types/download.ts
similarity index 100%
rename from tubearchivist/www/src/types/download.ts
rename to src/types/download.ts
diff --git a/tubearchivist/www/src/types/next-auth.d.ts b/src/types/next-auth.d.ts
similarity index 100%
rename from tubearchivist/www/src/types/next-auth.d.ts
rename to src/types/next-auth.d.ts
diff --git a/tubearchivist/www/src/types/playlist.ts b/src/types/playlist.ts
similarity index 100%
rename from tubearchivist/www/src/types/playlist.ts
rename to src/types/playlist.ts
diff --git a/tubearchivist/www/src/types/playlists.ts b/src/types/playlists.ts
similarity index 100%
rename from tubearchivist/www/src/types/playlists.ts
rename to src/types/playlists.ts
diff --git a/tubearchivist/www/src/types/video.ts b/src/types/video.ts
similarity index 100%
rename from tubearchivist/www/src/types/video.ts
rename to src/types/video.ts
diff --git a/tubearchivist/www/src/types/videos.ts b/src/types/videos.ts
similarity index 100%
rename from tubearchivist/www/src/types/videos.ts
rename to src/types/videos.ts
diff --git a/tubearchivist/www/tsconfig.json b/tsconfig.json
similarity index 100%
rename from tubearchivist/www/tsconfig.json
rename to tsconfig.json
diff --git a/tubearchivist/api/README.md b/tubearchivist/api/README.md
deleted file mode 100644
index e2bed71..0000000
--- a/tubearchivist/api/README.md
+++ /dev/null
@@ -1,202 +0,0 @@
-# TubeArchivist API
-Documentation of available API endpoints.
-**Note: This is very early alpha and will change!**
-
-## Authentication
-An API token will get created automatically and is accessible on the settings page. The token needs to be passed as an authorization header with every request. Additionally, session-based authentication is enabled too: when you are logged into your TubeArchivist instance, you'll have access to the API in the browser for testing.
-
-Curl example:
-```shell
-curl -v /api/video/<video-id>/ \
- -H "Authorization: Token xxxxxxxxxx"
-```
-
-Python requests example:
-```python
-import requests
-
-url = "/api/video//"
-headers = {"Authorization": "Token xxxxxxxxxx"}
-response = requests.get(url, headers=headers)
-```
-
-## Pagination
-The list views return a paginate object with the following keys:
-- page_size: int current page size set in config
-- page_from: int first result idx
-- prev_pages: array of ints of previous pages, if available
-- current_page: int current page from query
-- max_hits: bool, set if the max of 10k results is reached
-- last_page: int of last page link
-- next_pages: array of ints of next pages
-- total_hits: int total results
-
-Pass the page number as a query parameter: `page=2`. Defaults to *0*; `page=1` is redundant and falls back to *0*. If a page query doesn't return any results, you'll get `HTTP 404 Not Found`.
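-
-As a usage sketch (assuming an instance at `http://localhost:8000` and a valid token, both placeholders), this walks through all pages of the video list until the API answers 404:
-```python
-import requests
-
-BASE = "http://localhost:8000"  # assumption: your TubeArchivist instance
-HEADERS = {"Authorization": "Token xxxxxxxxxx"}
-
-page = 0
-while True:
-    response = requests.get(
-        f"{BASE}/api/video/", params={"page": page}, headers=HEADERS
-    )
-    if response.status_code == 404:
-        break  # a page query without results returns HTTP 404
-    videos = response.json()["data"]
-    print(f"page {page}: {len(videos)} videos")
-    page += 1
-```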
-
-## Login View
-Return token and user ID for username and password:
-POST /api/login
-```json
-{
- "username": "tubearchivist",
- "password": "verysecret"
-}
-```
-
-After a successful login, this returns:
-```json
-{
- "token": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
- "user_id": 1
-}
-```
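-
-A minimal login round trip with `requests` might look like this (credentials are the placeholders from above, the base URL is an assumption):
-```python
-import requests
-
-BASE = "http://localhost:8000"  # assumption: your TubeArchivist instance
-
-credentials = {"username": "tubearchivist", "password": "verysecret"}
-response = requests.post(f"{BASE}/api/login/", json=credentials)
-token = response.json()["token"]
-
-# reuse the token as the authorization header for all further requests
-headers = {"Authorization": f"Token {token}"}
-```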
-
-## Video List View
-/api/video/
-
-## Video Item View
-/api/video/\<video-id>/
-
-## Video Progress View
-/api/video/\<video-id>/progress
-
-Progress is stored for each user.
-
-### Get last player position of a video
-GET /api/video/\<video-id>/progress
-```json
-{
- "youtube_id": "",
- "user_id": 1,
- "position": 100
-}
-```
-
-### Post player position of video
-POST /api/video/\<video-id>/progress
-```json
-{
- "position": 100
-}
-```
-
-### Delete player position of video
-DELETE /api/video/\<video-id>/progress
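-
-Putting the three endpoints together, a hypothetical player integration could sync its position like this (a sketch; the base URL, video ID, and 30 second interval are all assumptions):
-```python
-import time
-
-import requests
-
-BASE = "http://localhost:8000"  # assumption: your TubeArchivist instance
-HEADERS = {"Authorization": "Token xxxxxxxxxx"}
-URL = f"{BASE}/api/video/NYj3DnI81AQ/progress"  # placeholder video ID
-
-# resume playback from the last stored position
-position = requests.get(URL, headers=HEADERS).json().get("position", 0)
-
-while position < 300:  # simulate five minutes of playback
-    time.sleep(30)
-    position += 30
-    requests.post(URL, json={"position": position}, headers=HEADERS)
-
-# watched to the end: reset the stored position
-requests.delete(URL, headers=HEADERS)
-```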
-
-
-## Sponsor Block View
-/api/video/\<video-id>/sponsor/
-
-Integrate with SponsorBlock.
-
-### Get list of segments
-GET /api/video/\<video-id>/sponsor/
-
-
-### Vote on existing segment
-**This only simulates the request**
-POST /api/video/\<video-id>/sponsor/
-```json
-{
- "vote": {
- "uuid": "",
- "yourVote": 1
- }
-}
-```
-yourVote needs to be an *int*: 0 for downvote, 1 for upvote, 20 to undo a vote.
-
-### Create new segment
-**This only simulates the request**
-POST /api/video/\<video-id>/sponsor/
-```json
-{
- "segment": {
- "startTime": 5,
- "endTime": 10
- }
-}
-```
-Timestamps can be either *int* or *float*; the end time can't be before the start time.
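-
-Both sponsor calls in one sketch (base URL, video ID, and UUID are placeholders; as noted above, these requests are only simulated):
-```python
-import requests
-
-BASE = "http://localhost:8000"  # assumption: your TubeArchivist instance
-HEADERS = {"Authorization": "Token xxxxxxxxxx"}
-URL = f"{BASE}/api/video/NYj3DnI81AQ/sponsor/"  # placeholder video ID
-
-# upvote an existing segment
-vote = {"vote": {"uuid": "<uuid>", "yourVote": 1}}
-requests.post(URL, json=vote, headers=HEADERS)
-
-# suggest a new segment from second 5 to second 10
-segment = {"segment": {"startTime": 5, "endTime": 10}}
-requests.post(URL, json=segment, headers=HEADERS)
-```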
-
-
-## Channel List View
-/api/channel/
-
-### Subscribe to a list of channels
-POST /api/channel/
-```json
-{
- "data": [
- {"channel_id": "UC9-y-6csu5WGm29I7JiwpnA", "channel_subscribed": true}
- ]
-}
-```
-
-## Channel Item View
-/api/channel/\<channel-id>/
-
-## Channel Videos View
-/api/channel/\<channel-id>/video/
-
-## Playlist List View
-/api/playlist/
-
-## Playlist Item View
-/api/playlist/\<playlist-id>/
-
-## Playlist Videos View
-/api/playlist/\<playlist-id>/video/
-
-## Download Queue List View
-GET /api/download/
-
-Parameter:
-- filter: `pending` or `ignore`
-
-### Add list of videos to download queue
-POST /api/download/
-```json
-{
- "data": [
- {"youtube_id": "NYj3DnI81AQ", "status": "pending"}
- ]
-}
-```
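-
-The same request from Python (a sketch, using the example ID from above; the base URL is an assumption):
-```python
-import requests
-
-BASE = "http://localhost:8000"  # assumption: your TubeArchivist instance
-HEADERS = {"Authorization": "Token xxxxxxxxxx"}
-
-payload = {"data": [{"youtube_id": "NYj3DnI81AQ", "status": "pending"}]}
-requests.post(f"{BASE}/api/download/", json=payload, headers=HEADERS)
-```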
-
-### Delete download queue items by filter
-DELETE /api/download/?filter=ignore
-DELETE /api/download/?filter=pending
-
-## Download Queue Item View
-GET /api/download/\<video-id>/
-POST /api/download/\<video-id>/
-
-Ignore a video in the download queue:
-```json
-{
- "status": "ignore"
-}
-```
-
-Add a previously ignored video back to the queue:
-```json
-{
- "status": "pending"
-}
-```
-
-DELETE /api/download/\<video-id>/
-Forget or delete the item from the download queue.
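-
-For example, ignoring a queued video and then removing it entirely (a sketch with placeholder base URL and ID):
-```python
-import requests
-
-BASE = "http://localhost:8000"  # assumption: your TubeArchivist instance
-HEADERS = {"Authorization": "Token xxxxxxxxxx"}
-URL = f"{BASE}/api/download/NYj3DnI81AQ/"  # placeholder video ID
-
-# flip the queue item to ignore
-requests.post(URL, json={"status": "ignore"}, headers=HEADERS)
-
-# forget it from the download queue entirely
-requests.delete(URL, headers=HEADERS)
-```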
-
-## Ping View
-Validate your connection with the API.
-GET /api/ping
-
-When valid, returns a message with the user ID:
-```json
-{
- "response": "pong",
- "user": 1
-}
-```
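-
-A quick connectivity check (a sketch, same placeholder base URL and token as above):
-```python
-import requests
-
-BASE = "http://localhost:8000"  # assumption: your TubeArchivist instance
-HEADERS = {"Authorization": "Token xxxxxxxxxx"}
-
-response = requests.get(f"{BASE}/api/ping", headers=HEADERS)
-assert response.json()["response"] == "pong"
-```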
diff --git a/tubearchivist/api/__init__.py b/tubearchivist/api/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/api/admin.py b/tubearchivist/api/admin.py
deleted file mode 100644
index 4fd5490..0000000
--- a/tubearchivist/api/admin.py
+++ /dev/null
@@ -1,3 +0,0 @@
-from django.contrib import admin # noqa: F401
-
-# Register your models here.
diff --git a/tubearchivist/api/apps.py b/tubearchivist/api/apps.py
deleted file mode 100644
index 9acf5fb..0000000
--- a/tubearchivist/api/apps.py
+++ /dev/null
@@ -1,10 +0,0 @@
-"""apps file for api package"""
-
-from django.apps import AppConfig
-
-
-class ApiConfig(AppConfig):
- """app config"""
-
- default_auto_field = "django.db.models.BigAutoField"
- name = "api"
diff --git a/tubearchivist/api/migrations/__init__.py b/tubearchivist/api/migrations/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/api/models.py b/tubearchivist/api/models.py
deleted file mode 100644
index b225e99..0000000
--- a/tubearchivist/api/models.py
+++ /dev/null
@@ -1,3 +0,0 @@
-"""api models"""
-
-# from django.db import models
diff --git a/tubearchivist/api/serializers.py b/tubearchivist/api/serializers.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/api/src/__init__.py b/tubearchivist/api/src/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/api/src/search_processor.py b/tubearchivist/api/src/search_processor.py
deleted file mode 100644
index 6a1e2dd..0000000
--- a/tubearchivist/api/src/search_processor.py
+++ /dev/null
@@ -1,125 +0,0 @@
-"""
-Functionality:
-- processing search results for frontend
-- this is duplicated code from home.src.frontend.searching.SearchHandler
-"""
-
-import urllib.parse
-
-from home.src.download.thumbnails import ThumbManager
-from home.src.ta.config import AppConfig
-from home.src.ta.helper import date_praser
-
-
-class SearchProcess:
- """process search results"""
-
- CONFIG = AppConfig().config
- CACHE_DIR = CONFIG["application"]["cache_dir"]
-
- def __init__(self, response):
- self.response = response
- self.processed = False
-
- def process(self):
- """dedect type and process"""
- if "_source" in self.response.keys():
- # single
- self.processed = self._process_result(self.response)
-
- elif "hits" in self.response.keys():
- # multiple
- self.processed = []
- all_sources = self.response["hits"]["hits"]
- for result in all_sources:
- self.processed.append(self._process_result(result))
-
- return self.processed
-
- def _process_result(self, result):
- """dedect which type of data to process"""
- index = result["_index"]
- processed = False
- if index == "ta_video":
- processed = self._process_video(result["_source"])
- if index == "ta_channel":
- processed = self._process_channel(result["_source"])
- if index == "ta_playlist":
- processed = self._process_playlist(result["_source"])
- if index == "ta_download":
- processed = self._process_download(result["_source"])
-
- return processed
-
- @staticmethod
- def _process_channel(channel_dict):
- """run on single channel"""
- channel_id = channel_dict["channel_id"]
- art_base = f"/cache/channels/{channel_id}"
- date_str = date_praser(channel_dict["channel_last_refresh"])
- channel_dict.update(
- {
- "channel_last_refresh": date_str,
- "channel_banner_url": f"{art_base}_banner.jpg",
- "channel_thumb_url": f"{art_base}_thumb.jpg",
- "channel_tvart_url": False,
- }
- )
-
- return dict(sorted(channel_dict.items()))
-
- def _process_video(self, video_dict):
- """run on single video dict"""
- video_id = video_dict["youtube_id"]
- media_url = urllib.parse.quote(video_dict["media_url"])
- vid_last_refresh = date_praser(video_dict["vid_last_refresh"])
- published = date_praser(video_dict["published"])
- vid_thumb_url = ThumbManager().vid_thumb_path(video_id)
- channel = self._process_channel(video_dict["channel"])
-
- if "subtitles" in video_dict:
- for idx, _ in enumerate(video_dict["subtitles"]):
- url = video_dict["subtitles"][idx]["media_url"]
- video_dict["subtitles"][idx]["media_url"] = f"/media/{url}"
-
- video_dict.update(
- {
- "channel": channel,
- "media_url": f"/media/{media_url}",
- "vid_last_refresh": vid_last_refresh,
- "published": published,
- "vid_thumb_url": f"{self.CACHE_DIR}/{vid_thumb_url}",
- }
- )
-
- return dict(sorted(video_dict.items()))
-
- @staticmethod
- def _process_playlist(playlist_dict):
- """run on single playlist dict"""
- playlist_id = playlist_dict["playlist_id"]
- playlist_last_refresh = date_praser(
- playlist_dict["playlist_last_refresh"]
- )
- playlist_dict.update(
- {
- "playlist_thumbnail": f"/cache/playlists/{playlist_id}.jpg",
- "playlist_last_refresh": playlist_last_refresh,
- }
- )
-
- return dict(sorted(playlist_dict.items()))
-
- def _process_download(self, download_dict):
- """run on single download item"""
- video_id = download_dict["youtube_id"]
- vid_thumb_url = ThumbManager().vid_thumb_path(video_id)
- published = date_praser(download_dict["published"])
-
- download_dict.update(
- {
- "vid_thumb_url": f"{self.CACHE_DIR}/{vid_thumb_url}",
- "published": published,
- }
- )
- return dict(sorted(download_dict.items()))
diff --git a/tubearchivist/api/tests.py b/tubearchivist/api/tests.py
deleted file mode 100644
index e55d689..0000000
--- a/tubearchivist/api/tests.py
+++ /dev/null
@@ -1,3 +0,0 @@
-from django.test import TestCase # noqa: F401
-
-# Create your tests here.
diff --git a/tubearchivist/api/urls.py b/tubearchivist/api/urls.py
deleted file mode 100644
index b19f5c7..0000000
--- a/tubearchivist/api/urls.py
+++ /dev/null
@@ -1,84 +0,0 @@
-"""all api urls"""
-
-from api.views import (
- ChannelApiListView,
- ChannelApiVideoView,
- ChannelApiView,
- DownloadApiListView,
- DownloadApiView,
- LoginApiView,
- PingView,
- PlaylistApiListView,
- PlaylistApiVideoView,
- PlaylistApiView,
- VideoApiListView,
- VideoApiView,
- VideoProgressView,
- VideoSponsorView,
-)
-from django.urls import path
-
-urlpatterns = [
- path("ping/", PingView.as_view(), name="ping"),
- path("login/", LoginApiView.as_view(), name="api-login"),
- path(
- "video/",
- VideoApiListView.as_view(),
- name="api-video-list",
- ),
- path(
- "video//",
- VideoApiView.as_view(),
- name="api-video",
- ),
- path(
- "video//progress/",
- VideoProgressView.as_view(),
- name="api-video-progress",
- ),
- path(
- "video//sponsor/",
- VideoSponsorView.as_view(),
- name="api-video-sponsor",
- ),
- path(
- "channel/",
- ChannelApiListView.as_view(),
- name="api-channel-list",
- ),
- path(
- "channel//",
- ChannelApiView.as_view(),
- name="api-channel",
- ),
- path(
- "channel//video/",
- ChannelApiVideoView.as_view(),
- name="api-channel-video",
- ),
- path(
- "playlist/",
- PlaylistApiListView.as_view(),
- name="api-playlist-list",
- ),
- path(
- "playlist//",
- PlaylistApiView.as_view(),
- name="api-playlist",
- ),
- path(
- "playlist//video/",
- PlaylistApiVideoView.as_view(),
- name="api-playlist-video",
- ),
- path(
- "download/",
- DownloadApiListView.as_view(),
- name="api-download-list",
- ),
- path(
- "download//",
- DownloadApiView.as_view(),
- name="api-download",
- ),
-]
diff --git a/tubearchivist/api/views.py b/tubearchivist/api/views.py
deleted file mode 100644
index a92cf60..0000000
--- a/tubearchivist/api/views.py
+++ /dev/null
@@ -1,448 +0,0 @@
-"""all API views"""
-
-from api.src.search_processor import SearchProcess
-from home.src.download.queue import PendingInteract
-from home.src.es.connect import ElasticWrap
-from home.src.index.generic import Pagination
-from home.src.index.video import SponsorBlock
-from home.src.ta.config import AppConfig
-from home.src.ta.helper import UrlListParser
-from home.src.ta.ta_redis import RedisArchivist, RedisQueue
-from home.tasks import extrac_dl, subscribe_to
-from rest_framework.authentication import (
- SessionAuthentication,
- TokenAuthentication,
-)
-from rest_framework.authtoken.models import Token
-from rest_framework.authtoken.views import ObtainAuthToken
-from rest_framework.permissions import IsAuthenticated
-from rest_framework.response import Response
-from rest_framework.views import APIView
-
-
-class ApiBaseView(APIView):
- """base view to inherit from"""
-
- authentication_classes = [SessionAuthentication, TokenAuthentication]
- permission_classes = [IsAuthenticated]
- search_base = False
- data = False
-
- def __init__(self):
- super().__init__()
- self.response = {"data": False, "config": AppConfig().config}
- self.data = {"query": {"match_all": {}}}
- self.status_code = False
- self.context = False
- self.pagination_handler = False
-
- def get_document(self, document_id):
- """get single document from es"""
- path = f"{self.search_base}{document_id}"
- print(path)
- response, status_code = ElasticWrap(path).get()
- try:
- self.response["data"] = SearchProcess(response).process()
- except KeyError:
- print(f"item not found: {document_id}")
- self.response["data"] = False
- self.status_code = status_code
-
- def initiate_pagination(self, request):
- """set initial pagination values"""
- user_id = request.user.id
- page_get = int(request.GET.get("page", 0))
- self.pagination_handler = Pagination(page_get, user_id)
- self.data.update(
- {
- "size": self.pagination_handler.pagination["page_size"],
- "from": self.pagination_handler.pagination["page_from"],
- }
- )
-
- def get_document_list(self, request):
- """get a list of results"""
- print(self.search_base)
- self.initiate_pagination(request)
- es_handler = ElasticWrap(self.search_base)
- response, status_code = es_handler.get(data=self.data)
- self.response["data"] = SearchProcess(response).process()
- if self.response["data"]:
- self.status_code = status_code
- else:
- self.status_code = 404
-
- self.pagination_handler.validate(response["hits"]["total"]["value"])
- self.response["paginate"] = self.pagination_handler.pagination
-
-
-class VideoApiView(ApiBaseView):
- """resolves to /api/video//
- GET: returns metadata dict of video
- """
-
- search_base = "ta_video/_doc/"
-
- def get(self, request, video_id):
- # pylint: disable=unused-argument
- """get request"""
- self.get_document(video_id)
- return Response(self.response, status=self.status_code)
-
-
-class VideoApiListView(ApiBaseView):
- """resolves to /api/video/
- GET: returns list of videos
- """
-
- search_base = "ta_video/_search/"
-
- def get(self, request):
- """get request"""
- self.data.update({"sort": [{"published": {"order": "desc"}}]})
- self.get_document_list(request)
-
- return Response(self.response)
-
-
-class VideoProgressView(ApiBaseView):
- """resolves to /api/video//
- handle progress status for video
- """
-
- def get(self, request, video_id):
- """get progress for a single video"""
- user_id = request.user.id
- key = f"{user_id}:progress:{video_id}"
- video_progress = RedisArchivist().get_message(key)
- position = video_progress.get("position", 0)
-
- self.response = {
- "youtube_id": video_id,
- "user_id": user_id,
- "position": position,
- }
- return Response(self.response)
-
- def post(self, request, video_id):
- """set progress position in redis"""
- position = request.data.get("position", 0)
- key = f"{request.user.id}:progress:{video_id}"
- message = {"position": position, "youtube_id": video_id}
- RedisArchivist().set_message(key, message, expire=False)
- self.response = request.data
-
- return Response(self.response)
-
- def delete(self, request, video_id):
- """delete progress position"""
- key = f"{request.user.id}:progress:{video_id}"
- RedisArchivist().del_message(key)
- self.response = {"progress-reset": video_id}
-
- return Response(self.response)
-
-
-class VideoSponsorView(ApiBaseView):
- """resolves to /api/video//sponsor/
- handle sponsor block integration
- """
-
- search_base = "ta_video/_doc/"
-
- def get(self, request, video_id):
- """get sponsor info"""
- # pylint: disable=unused-argument
-
- self.get_document(video_id)
- sponsorblock = self.response["data"].get("sponsorblock")
-
- return Response(sponsorblock)
-
- def post(self, request, video_id):
- """post verification and timestamps"""
- if "segment" in request.data:
- response, status_code = self._create_segment(request, video_id)
- elif "vote" in request.data:
- response, status_code = self._vote_on_segment(request)
-
- return Response(response, status=status_code)
-
- @staticmethod
- def _create_segment(request, video_id):
- """create segment in API"""
- start_time = request.data["segment"]["startTime"]
- end_time = request.data["segment"]["endTime"]
- response, status_code = SponsorBlock(request.user.id).post_timestamps(
- video_id, start_time, end_time
- )
-
- return response, status_code
-
- @staticmethod
- def _vote_on_segment(request):
- """validate on existing segment"""
- user_id = request.user.id
- uuid = request.data["vote"]["uuid"]
- vote = request.data["vote"]["yourVote"]
- response, status_code = SponsorBlock(user_id).vote_on_segment(
- uuid, vote
- )
-
- return response, status_code
-
-
-class ChannelApiView(ApiBaseView):
- """resolves to /api/channel//
- GET: returns metadata dict of channel
- """
-
- search_base = "ta_channel/_doc/"
-
- def get(self, request, channel_id):
- # pylint: disable=unused-argument
- """get request"""
- self.get_document(channel_id)
- return Response(self.response, status=self.status_code)
-
-
-class ChannelApiListView(ApiBaseView):
- """resolves to /api/channel/
- GET: returns list of channels
- POST: edit a list of channels
- """
-
- search_base = "ta_channel/_search/"
-
- def get(self, request):
- """get request"""
- self.get_document_list(request)
- self.data.update(
- {"sort": [{"channel_name.keyword": {"order": "asc"}}]}
- )
-
- return Response(self.response)
-
- @staticmethod
- def post(request):
- """subscribe to list of channels"""
- data = request.data
- try:
- to_add = data["data"]
- except KeyError:
- message = "missing expected data key"
- print(message)
- return Response({"message": message}, status=400)
-
- pending = [i["channel_id"] for i in to_add if i["channel_subscribed"]]
- url_str = " ".join(pending)
- subscribe_to.delay(url_str)
-
- return Response(data)
-
-
-class ChannelApiVideoView(ApiBaseView):
- """resolves to /api/channel//video
- GET: returns a list of videos of channel
- """
-
- search_base = "ta_video/_search/"
-
- def get(self, request, channel_id):
- """handle get request"""
- self.data.update(
- {
- "query": {
- "term": {"channel.channel_id": {"value": channel_id}}
- },
- "sort": [{"published": {"order": "desc"}}],
- }
- )
- self.get_document_list(request)
-
- return Response(self.response, status=self.status_code)
-
-
-class PlaylistApiListView(ApiBaseView):
- """resolves to /api/playlist/
- GET: returns list of indexed playlists
- """
-
- search_base = "ta_playlist/_search/"
-
- def get(self, request):
- """handle get request"""
- self.data.update(
- {"sort": [{"playlist_name.keyword": {"order": "asc"}}]}
- )
- self.get_document_list(request)
- return Response(self.response)
-
-
-class PlaylistApiView(ApiBaseView):
- """resolves to /api/playlist//
- GET: returns metadata dict of playlist
- """
-
- search_base = "ta_playlist/_doc/"
-
- def get(self, request, playlist_id):
- # pylint: disable=unused-argument
- """get request"""
- self.get_document(playlist_id)
- return Response(self.response, status=self.status_code)
-
-
-class PlaylistApiVideoView(ApiBaseView):
- """resolves to /api/playlist//video
- GET: returns list of videos in playlist
- """
-
- search_base = "ta_video/_search/"
-
- def get(self, request, playlist_id):
- """handle get request"""
- self.data["query"] = {
- "term": {"playlist.keyword": {"value": playlist_id}}
- }
- self.data.update({"sort": [{"published": {"order": "desc"}}]})
-
- self.get_document_list(request)
- return Response(self.response, status=self.status_code)
-
-
-class DownloadApiView(ApiBaseView):
- """resolves to /api/download//
- GET: returns metadata dict of an item in the download queue
- POST: update status of item to pending or ignore
- DELETE: forget from download queue
- """
-
- search_base = "ta_download/_doc/"
- valid_status = ["pending", "ignore"]
-
- def get(self, request, video_id):
- # pylint: disable=unused-argument
- """get request"""
- self.get_document(video_id)
- return Response(self.response, status=self.status_code)
-
- def post(self, request, video_id):
- """post to video to change status"""
- item_status = request.data["status"]
- if item_status not in self.valid_status:
- message = f"{video_id}: invalid status {item_status}"
- print(message)
- return Response({"message": message}, status=400)
-
- print(f"{video_id}: change status to {item_status}")
- PendingInteract(video_id=video_id, status=item_status).update_status()
- RedisQueue().clear_item(video_id)
-
- return Response(request.data)
-
- @staticmethod
- def delete(request, video_id):
- # pylint: disable=unused-argument
- """delete single video from queue"""
- print(f"{video_id}: delete from queue")
- PendingInteract(video_id=video_id).delete_item()
-
- return Response({"success": True})
-
-
-class DownloadApiListView(ApiBaseView):
- """resolves to /api/download/
- GET: returns latest videos in the download queue
- POST: add a list of videos to download queue
- DELETE: remove items based on query filter
- """
-
- search_base = "ta_download/_search/"
- valid_filter = ["pending", "ignore"]
-
- def get(self, request):
- """get request"""
- query_filter = request.GET.get("filter", False)
- self.data.update({"sort": [{"timestamp": {"order": "asc"}}]})
- if query_filter:
- if query_filter not in self.valid_filter:
- message = f"invalid url query filder: {query_filter}"
- print(message)
- return Response({"message": message}, status=400)
-
- self.data["query"] = {"term": {"status": {"value": query_filter}}}
-
- self.get_document_list(request)
- return Response(self.response)
-
- @staticmethod
- def post(request):
- """add list of videos to download queue"""
- print(f"request meta data: {request.META}")
- data = request.data
- try:
- to_add = data["data"]
- except KeyError:
- message = "missing expected data key"
- print(message)
- return Response({"message": message}, status=400)
-
- pending = [i["youtube_id"] for i in to_add if i["status"] == "pending"]
- url_str = " ".join(pending)
- try:
- youtube_ids = UrlListParser(url_str).process_list()
- except ValueError:
- message = f"failed to parse: {url_str}"
- print(message)
- return Response({"message": message}, status=400)
-
- extrac_dl.delay(youtube_ids)
-
- return Response(data)
-
- def delete(self, request):
- """delete download queue"""
- query_filter = request.GET.get("filter", False)
- if query_filter not in self.valid_filter:
- message = f"invalid url query filter: {query_filter}"
- print(message)
- return Response({"message": message}, status=400)
-
- message = f"delete queue by status: {query_filter}"
- print(message)
- PendingInteract(status=query_filter).delete_by_status()
-
- return Response({"message": message})
-
-
-class PingView(ApiBaseView):
- """resolves to /api/ping/
- GET: test your connection
- """
-
- @staticmethod
- def get(request):
- """get pong"""
- data = {"response": "pong", "user": request.user.id}
- return Response(data)
-
-
-class LoginApiView(ObtainAuthToken):
- """resolves to /api/login/
- POST: return token and username after successful login
- """
-
- def post(self, request, *args, **kwargs):
- """post data"""
- # pylint: disable=no-member
- serializer = self.serializer_class(
- data=request.data, context={"request": request}
- )
- serializer.is_valid(raise_exception=True)
- user = serializer.validated_data["user"]
- token, _ = Token.objects.get_or_create(user=user)
-
- print(f"returning token for user with id {user.pk}")
-
- return Response({"token": token.key, "user_id": user.pk})
diff --git a/tubearchivist/config/__init__.py b/tubearchivist/config/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/config/asgi.py b/tubearchivist/config/asgi.py
deleted file mode 100644
index c31b0ec..0000000
--- a/tubearchivist/config/asgi.py
+++ /dev/null
@@ -1,16 +0,0 @@
-"""
-ASGI config for config project.
-
-It exposes the ASGI callable as a module-level variable named ``application``.
-
-For more information on this file, see
-https://docs.djangoproject.com/en/3.2/howto/deployment/asgi/
-"""
-
-import os
-
-from django.core.asgi import get_asgi_application
-
-os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings")
-
-application = get_asgi_application()
diff --git a/tubearchivist/config/settings.py b/tubearchivist/config/settings.py
deleted file mode 100644
index 8e8bf5a..0000000
--- a/tubearchivist/config/settings.py
+++ /dev/null
@@ -1,157 +0,0 @@
-"""
-Django settings for config project.
-
-Generated by 'django-admin startproject' using Django 3.2.5.
-
-For more information on this file, see
-https://docs.djangoproject.com/en/3.2/topics/settings/
-
-For the full list of settings and their values, see
-https://docs.djangoproject.com/en/3.2/ref/settings/
-"""
-
-import hashlib
-from os import environ, path
-from pathlib import Path
-
-from corsheaders.defaults import default_headers
-from home.src.ta.config import AppConfig
-
-# Build paths inside the project like this: BASE_DIR / 'subdir'.
-BASE_DIR = Path(__file__).resolve().parent.parent
-
-
-# Quick-start development settings - unsuitable for production
-# See https://docs.djangoproject.com/en/3.2/howto/deployment/checklist/
-
-PW_HASH = hashlib.sha256(environ.get("TA_PASSWORD").encode())
-SECRET_KEY = PW_HASH.hexdigest()
-
-# SECURITY WARNING: don't run with debug turned on in production!
-DEBUG = bool(environ.get("DJANGO_DEBUG"))
-
-ALLOWED_HOSTS = ["*"]
-
-
-# Application definition
-
-INSTALLED_APPS = [
- "home.apps.HomeConfig",
- "django.contrib.admin",
- "django.contrib.auth",
- "django.contrib.contenttypes",
- "django.contrib.sessions",
- "django.contrib.messages",
- "corsheaders",
- "whitenoise.runserver_nostatic",
- "django.contrib.staticfiles",
- "django.contrib.humanize",
- "rest_framework",
- "rest_framework.authtoken",
- "api",
-]
-
-MIDDLEWARE = [
- "django.middleware.security.SecurityMiddleware",
- "django.contrib.sessions.middleware.SessionMiddleware",
- "corsheaders.middleware.CorsMiddleware",
- "whitenoise.middleware.WhiteNoiseMiddleware",
- "django.middleware.common.CommonMiddleware",
- "django.middleware.csrf.CsrfViewMiddleware",
- "django.contrib.auth.middleware.AuthenticationMiddleware",
- "django.contrib.messages.middleware.MessageMiddleware",
- "django.middleware.clickjacking.XFrameOptionsMiddleware",
-]
-
-ROOT_URLCONF = "config.urls"
-
-TEMPLATES = [
- {
- "BACKEND": "django.template.backends.django.DjangoTemplates",
- "DIRS": [],
- "APP_DIRS": True,
- "OPTIONS": {
- "context_processors": [
- "django.template.context_processors.debug",
- "django.template.context_processors.request",
- "django.contrib.auth.context_processors.auth",
- "django.contrib.messages.context_processors.messages",
- ],
- },
- },
-]
-
-WSGI_APPLICATION = "config.wsgi.application"
-
-
-# Database
-# https://docs.djangoproject.com/en/3.2/ref/settings/#databases
-
-CACHE_DIR = AppConfig().config["application"]["cache_dir"]
-DB_PATH = path.join(CACHE_DIR, "db.sqlite3")
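-# the sqlite file sits on the /cache volume, so it survives container rebuilds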
-DATABASES = {
- "default": {
- "ENGINE": "django.db.backends.sqlite3",
- "NAME": DB_PATH,
- }
-}
-
-
-# Password validation
-# https://docs.djangoproject.com/en/3.2/ref/settings/#auth-password-validators
-
-AUTH_PASSWORD_VALIDATORS = [
- {
- "NAME": "django.contrib.auth.password_validation.UserAttributeSimilarityValidator", # noqa: E501
- },
- {
- "NAME": "django.contrib.auth.password_validation.MinimumLengthValidator", # noqa: E501
- },
- {
- "NAME": "django.contrib.auth.password_validation.CommonPasswordValidator", # noqa: E501
- },
- {
- "NAME": "django.contrib.auth.password_validation.NumericPasswordValidator", # noqa: E501
- },
-]
-
-AUTH_USER_MODEL = "home.Account"
-
-
-# Internationalization
-# https://docs.djangoproject.com/en/3.2/topics/i18n/
-
-LANGUAGE_CODE = "en-us"
-TIME_ZONE = environ.get("TZ") or "UTC"
-USE_I18N = True
-USE_L10N = True
-USE_TZ = True
-
-
-# Static files (CSS, JavaScript, Images)
-# https://docs.djangoproject.com/en/3.2/howto/static-files/
-
-STATIC_URL = "/static/"
-STATICFILES_DIRS = (str(BASE_DIR.joinpath("static")),)
-STATIC_ROOT = str(BASE_DIR.joinpath("staticfiles"))
-STATICFILES_STORAGE = "whitenoise.storage.CompressedManifestStaticFilesStorage"
-
-# Default primary key field type
-# https://docs.djangoproject.com/en/3.2/ref/settings/#default-auto-field
-
-DEFAULT_AUTO_FIELD = "django.db.models.BigAutoField"
-
-LOGIN_URL = "/login/"
-LOGOUT_REDIRECT_URL = "/login/"
-
-# Cors needed for browser extension
-# background.js makes the request so HTTP_ORIGIN will be from extension
-CORS_ALLOWED_ORIGIN_REGEXES = [r"moz-extension://.*", r"chrome-extension://.*"]
-
-CORS_ALLOW_HEADERS = list(default_headers) + [
- "mode",
-]
-
-# TA application settings
-TA_UPSTREAM = "https://github.com/tubearchivist/tubearchivist"
-TA_VERSION = "v0.1.4"
diff --git a/tubearchivist/config/urls.py b/tubearchivist/config/urls.py
deleted file mode 100644
index 11a1ed7..0000000
--- a/tubearchivist/config/urls.py
+++ /dev/null
@@ -1,23 +0,0 @@
-"""config URL Configuration
-
-The `urlpatterns` list routes URLs to views. For more information please see:
- https://docs.djangoproject.com/en/3.2/topics/http/urls/
-Examples:
-Function views
- 1. Add an import: from my_app import views
- 2. Add a URL to urlpatterns: path('', views.home, name='home')
-Class-based views
- 1. Add an import: from other_app.views import Home
- 2. Add a URL to urlpatterns: path('', Home.as_view(), name='home')
-Including another URLconf
- 1. Import the include() function: from django.urls import include, path
- 2. Add a URL to urlpatterns: path('blog/', include('blog.urls'))
-"""
-from django.contrib import admin
-from django.urls import include, path
-
-urlpatterns = [
- path("", include("home.urls")),
- path("api/", include("api.urls")),
- path("admin/", admin.site.urls),
-]
diff --git a/tubearchivist/config/wsgi.py b/tubearchivist/config/wsgi.py
deleted file mode 100644
index 3d6fa1b..0000000
--- a/tubearchivist/config/wsgi.py
+++ /dev/null
@@ -1,16 +0,0 @@
-"""
-WSGI config for config project.
-
-It exposes the WSGI callable as a module-level variable named ``application``.
-
-For more information on this file, see
-https://docs.djangoproject.com/en/3.2/howto/deployment/wsgi/
-"""
-
-import os
-
-from django.core.wsgi import get_wsgi_application
-
-os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings")
-
-application = get_wsgi_application()
diff --git a/tubearchivist/home/__init__.py b/tubearchivist/home/__init__.py
deleted file mode 100644
index 736385d..0000000
--- a/tubearchivist/home/__init__.py
+++ /dev/null
@@ -1,5 +0,0 @@
-""" handle celery startup """
-
-from .tasks import app as celery_app
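-# standard celery + django pattern: importing the app here ensures it is
-# loaded when django starts, so shared tasks bind to this app instance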
-
-__all__ = ("celery_app",)
diff --git a/tubearchivist/home/admin.py b/tubearchivist/home/admin.py
deleted file mode 100644
index 2bcb701..0000000
--- a/tubearchivist/home/admin.py
+++ /dev/null
@@ -1,36 +0,0 @@
-"""custom admin classes"""
-
-from django.contrib import admin
-from django.contrib.auth.admin import UserAdmin as BaseUserAdmin
-
-from .models import Account
-
-
-class HomeAdmin(BaseUserAdmin):
- """register in admin page"""
-
- list_display = ("name", "is_staff", "is_superuser")
- list_filter = ("is_superuser",)
-
- fieldsets = (
- (None, {"fields": ("is_staff", "is_superuser", "password")}),
- ("Personal info", {"fields": ("name",)}),
- ("Groups", {"fields": ("groups",)}),
- ("Permissions", {"fields": ("user_permissions",)}),
- )
- add_fieldsets = (
- (
- None,
- {"fields": ("is_staff", "is_superuser", "password1", "password2")},
- ),
- ("Personal info", {"fields": ("name",)}),
- ("Groups", {"fields": ("groups",)}),
- ("Permissions", {"fields": ("user_permissions",)}),
- )
-
- search_fields = ("name",)
- ordering = ("name",)
- filter_horizontal = ()
-
-
-admin.site.register(Account, HomeAdmin)
diff --git a/tubearchivist/home/apps.py b/tubearchivist/home/apps.py
deleted file mode 100644
index 1053bc6..0000000
--- a/tubearchivist/home/apps.py
+++ /dev/null
@@ -1,120 +0,0 @@
-"""handle custom startup functions"""
-
-import os
-import sys
-
-from django.apps import AppConfig
-from home.src.es.connect import ElasticWrap
-from home.src.es.index_setup import index_check
-from home.src.ta.config import AppConfig as ArchivistConfig
-from home.src.ta.ta_redis import RedisArchivist
-
-
-class StartupCheck:
- """checks to run at application startup"""
-
- MIN_MAJOR, MAX_MAJOR = 7, 7
- MIN_MINOR = 17
-
- def __init__(self):
- self.config_handler = ArchivistConfig()
- self.redis_con = RedisArchivist()
- self.has_run = self.get_has_run()
-
- def run(self):
- """run all startup checks"""
- print("run startup checks")
- self.es_version_check()
- self.release_lock()
- index_check()
- self.sync_redis_state()
- self.make_folders()
- self.set_has_run()
-
- def get_has_run(self):
- """validate if check has already executed"""
- return self.redis_con.get_message("startup_check")
-
- def set_has_run(self):
- """startup checks run"""
- message = {"status": True}
- self.redis_con.set_message("startup_check", message, expire=120)
-
- def sync_redis_state(self):
- """make sure redis gets new config.json values"""
- print("sync redis")
- self.config_handler.load_new_defaults()
-
- def make_folders(self):
- """make needed cache folders here so docker doesn't mess it up"""
- folders = [
- "download",
- "channels",
- "videos",
- "playlists",
- "import",
- "backup",
- ]
- cache_dir = self.config_handler.config["application"]["cache_dir"]
- for folder in folders:
- folder_path = os.path.join(cache_dir, folder)
- try:
- os.makedirs(folder_path)
- except FileExistsError:
- continue
-
- def release_lock(self):
- """make sure there are no leftover locks set in redis"""
- all_locks = [
- "startup_check",
- "manual_import",
- "downloading",
- "dl_queue",
- "dl_queue_id",
- "rescan",
- ]
- for lock in all_locks:
- response = self.redis_con.del_message(lock)
- if response:
- print("deleted leftover key from redis: " + lock)
-
- def is_invalid(self, version):
- """return true if es version is invalid, false if ok"""
- major, minor = [int(i) for i in version.split(".")[:2]]
- if not self.MIN_MAJOR <= major <= self.MAX_MAJOR:
- return True
-
- if minor >= self.MIN_MINOR:
- return False
-
- return True
-
- def es_version_check(self):
- """check for minimal elasticsearch version"""
- response, _ = ElasticWrap("/").get()
- version = response["version"]["number"]
- invalid = self.is_invalid(version)
-
- if invalid:
- print(
- "required elasticsearch version: "
- + f"{self.MIN_MAJOR}.{self.MIN_MINOR}"
- )
- sys.exit(1)
-
- print("elasticsearch version check passed")
-
-
-class HomeConfig(AppConfig):
- """call startup funcs"""
-
- default_auto_field = "django.db.models.BigAutoField"
- name = "home"
-
- def ready(self):
- startup = StartupCheck()
- if startup.has_run["status"]:
- print("startup checks run in other thread")
- return
-
- startup.run()
diff --git a/tubearchivist/home/config.json b/tubearchivist/home/config.json
deleted file mode 100644
index edb6356..0000000
--- a/tubearchivist/home/config.json
+++ /dev/null
@@ -1,50 +0,0 @@
-{
- "archive": {
- "sort_by": "published",
- "sort_order": "desc",
- "page_size": 12
- },
- "default_view": {
- "home": "grid",
- "channel": "list",
- "downloads": "list",
- "playlist": "grid"
- },
- "subscriptions": {
- "auto_search": false,
- "auto_download": false,
- "channel_size": 50
- },
- "downloads": {
- "limit_count": false,
- "limit_speed": false,
- "sleep_interval": 3,
- "autodelete_days": false,
- "format": false,
- "add_metadata": false,
- "add_thumbnail": false,
- "subtitle": false,
- "subtitle_source": false,
- "subtitle_index": false,
- "throttledratelimit": false,
- "integrate_ryd": false,
- "integrate_sponsorblock": false
- },
- "application": {
- "app_root": "/app",
- "cache_dir": "/cache",
- "videos": "/youtube",
- "file_template": "%(id)s_%(title)s.mp4",
- "colors": "dark",
- "enable_cast": false
- },
- "scheduler": {
- "update_subscribed": false,
- "download_pending": false,
- "check_reindex": {"minute": "0", "hour": "12", "day_of_week": "*"},
- "check_reindex_days": 90,
- "thumbnail_check": {"minute": "0", "hour": "17", "day_of_week": "*"},
- "run_backup": {"minute": "0", "hour": "8", "day_of_week": "0"},
- "run_backup_rotate": 5
- }
-}
diff --git a/tubearchivist/home/migrations/__init__.py b/tubearchivist/home/migrations/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/home/models.py b/tubearchivist/home/models.py
deleted file mode 100644
index d99b2db..0000000
--- a/tubearchivist/home/models.py
+++ /dev/null
@@ -1,53 +0,0 @@
-"""custom models"""
-from django.contrib.auth.models import (
- AbstractBaseUser,
- BaseUserManager,
- PermissionsMixin,
-)
-from django.db import models
-
-
-class AccountManager(BaseUserManager):
- """manage user creation methods"""
-
- use_in_migrations = True
-
- def _create_user(self, name, password, **extra_fields):
- """create regular user private"""
- values = [name, password]
- # zip both field names; self.model.REQUIRED_FIELDS only lists "password"
- field_value_map = dict(zip(["name", "password"], values))
- for field_name, value in field_value_map.items():
- if not value:
- raise ValueError(f"The {field_name} value must be set")
-
- user = self.model(name=name, **extra_fields)
- user.set_password(password)
- user.save(using=self._db)
- return user
-
- def create_user(self, name, password):
- """create regular user public"""
- return self._create_user(name, password)
-
- def create_superuser(self, name, password, **extra_fields):
- """create super user"""
- extra_fields.setdefault("is_staff", True)
- extra_fields.setdefault("is_superuser", True)
-
- if extra_fields.get("is_staff") is not True:
- raise ValueError("Superuser must have is_staff=True.")
- if extra_fields.get("is_superuser") is not True:
- raise ValueError("Superuser must have is_superuser=True.")
-
- return self._create_user(name, password, **extra_fields)
-
-
-class Account(AbstractBaseUser, PermissionsMixin):
- """handle account creation"""
-
- name = models.CharField(max_length=150, unique=True)
- is_staff = models.BooleanField(default=False)
- objects = AccountManager()
-
- USERNAME_FIELD = "name"
- REQUIRED_FIELDS = ["password"]
diff --git a/tubearchivist/home/settings.py b/tubearchivist/home/settings.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/home/src/__init__.py b/tubearchivist/home/src/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/home/src/download/__init__.py b/tubearchivist/home/src/download/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/home/src/download/queue.py b/tubearchivist/home/src/download/queue.py
deleted file mode 100644
index d898b82..0000000
--- a/tubearchivist/home/src/download/queue.py
+++ /dev/null
@@ -1,267 +0,0 @@
-"""
-Functionality:
-- handle download queue
-- linked with ta_download index
-"""
-
-import json
-from datetime import datetime
-
-import yt_dlp
-from home.src.download.subscriptions import (
- ChannelSubscription,
- PlaylistSubscription,
-)
-from home.src.download.thumbnails import ThumbManager
-from home.src.es.connect import ElasticWrap, IndexPaginate
-from home.src.index.playlist import YoutubePlaylist
-from home.src.ta.helper import DurationConverter
-from home.src.ta.ta_redis import RedisArchivist
-
-
-class PendingIndex:
- """base class holding all export methods"""
-
- def __init__(self):
- self.all_pending = False
- self.all_ignored = False
- self.all_videos = False
- self.all_channels = False
- self.channel_overwrites = False
- self.video_overwrites = False
- self.to_skip = False
-
- def get_download(self):
- """get a list of all pending videos in ta_download"""
- data = {
- "query": {"match_all": {}},
- "sort": [{"timestamp": {"order": "asc"}}],
- }
- all_results = IndexPaginate("ta_download", data).get_results()
-
- self.all_pending = []
- self.all_ignored = []
- self.to_skip = []
-
- for result in all_results:
- self.to_skip.append(result["youtube_id"])
- if result["status"] == "pending":
- self.all_pending.append(result)
- elif result["status"] == "ignore":
- self.all_ignored.append(result)
-
- def get_indexed(self):
- """get a list of all videos indexed"""
- data = {
- "query": {"match_all": {}},
- "sort": [{"published": {"order": "desc"}}],
- }
- self.all_videos = IndexPaginate("ta_video", data).get_results()
- for video in self.all_videos:
- self.to_skip.append(video["youtube_id"])
-
- def get_channels(self):
- """get a list of all channels indexed"""
- self.all_channels = []
- self.channel_overwrites = {}
- data = {
- "query": {"match_all": {}},
- "sort": [{"channel_id": {"order": "asc"}}],
- }
- channels = IndexPaginate("ta_channel", data).get_results()
-
- for channel in channels:
- channel_id = channel["channel_id"]
- self.all_channels.append(channel_id)
- if channel.get("channel_overwrites"):
- self.channel_overwrites.update(
- {channel_id: channel.get("channel_overwrites")}
- )
-
- self._map_overwrites()
-
- def _map_overwrites(self):
- """map video ids to channel ids overwrites"""
- self.video_overwrites = {}
- for video in self.all_pending:
- video_id = video["youtube_id"]
- channel_id = video["channel_id"]
- overwrites = self.channel_overwrites.get(channel_id, False)
- if overwrites:
- self.video_overwrites.update({video_id: overwrites})
-
-
-class PendingInteract:
- """interact with items in download queue"""
-
- def __init__(self, video_id=False, status=False):
- self.video_id = video_id
- self.status = status
-
- def delete_item(self):
- """delete single item from pending"""
- path = f"ta_download/_doc/{self.video_id}"
- _, _ = ElasticWrap(path).delete()
-
- def delete_by_status(self):
- """delete all matching item by status"""
- data = {"query": {"term": {"status": {"value": self.status}}}}
- path = "ta_download/_delete_by_query"
- _, _ = ElasticWrap(path).post(data=data)
-
- def update_status(self):
- """update status field of pending item"""
- data = {"doc": {"status": self.status}}
- path = f"ta_download/_update/{self.video_id}"
- _, _ = ElasticWrap(path).post(data=data)
-
-
-class PendingList(PendingIndex):
- """manage the pending videos list"""
-
- def __init__(self, youtube_ids=False):
- super().__init__()
- self.youtube_ids = youtube_ids
- self.to_skip = False
- self.missing_videos = False
-
- def parse_url_list(self):
- """extract youtube ids from list"""
- self.missing_videos = []
- self.get_download()
- self.get_indexed()
- for entry in self.youtube_ids:
- # notify
- mess_dict = {
- "status": "message:add",
- "level": "info",
- "title": "Adding to download queue.",
- "message": "Extracting lists",
- }
- RedisArchivist().set_message("message:add", mess_dict)
- self._process_entry(entry)
-
- def _process_entry(self, entry):
- """process single entry from url list"""
- if entry["type"] == "video":
- self._add_video(entry["url"])
- elif entry["type"] == "channel":
- self._parse_channel(entry["url"])
- elif entry["type"] == "playlist":
- self._parse_playlist(entry["url"])
- new_thumbs = PlaylistSubscription().process_url_str(
- [entry], subscribed=False
- )
- ThumbManager().download_playlist(new_thumbs)
- else:
- raise ValueError(f"invalid url_type: {entry}")
-
- def _add_video(self, url):
- """add video to list"""
- if url not in self.missing_videos and url not in self.to_skip:
- self.missing_videos.append(url)
-
- def _parse_channel(self, url):
- """add all videos of channel to list"""
- video_results = ChannelSubscription().get_last_youtube_videos(
- url, limit=False
- )
- youtube_ids = [i[0] for i in video_results]
- for video_id in youtube_ids:
- self._add_video(video_id)
-
- def _parse_playlist(self, url):
- """add all videos of playlist to list"""
- playlist = YoutubePlaylist(url)
- playlist.build_json()
- video_results = playlist.json_data.get("playlist_entries")
- youtube_ids = [i["youtube_id"] for i in video_results]
- for video_id in youtube_ids:
- self._add_video(video_id)
-
- def add_to_pending(self, status="pending"):
- """add missing videos to pending list"""
- self.get_channels()
- bulk_list = []
-
- thumb_handler = ThumbManager()
- for idx, youtube_id in enumerate(self.missing_videos):
- video_details = self.get_youtube_details(youtube_id)
- if not video_details:
- continue
-
- video_details["status"] = status
- action = {"create": {"_id": youtube_id, "_index": "ta_download"}}
- bulk_list.append(json.dumps(action))
- bulk_list.append(json.dumps(video_details))
-
- thumb_needed = [(youtube_id, video_details["vid_thumb_url"])]
- thumb_handler.download_vid(thumb_needed)
- self._notify_add(idx)
-
- # add last newline
- bulk_list.append("\n")
- query_str = "\n".join(bulk_list)
- _, _ = ElasticWrap("_bulk").post(query_str, ndjson=True)
-
- def _notify_add(self, idx):
- """send notification for adding videos to download queue"""
- progress = f"{idx + 1}/{len(self.missing_videos)}"
- mess_dict = {
- "status": "message:add",
- "level": "info",
- "title": "Adding new videos to download queue.",
- "message": "Progress: " + progress,
- }
- if idx + 1 == len(self.missing_videos):
- RedisArchivist().set_message("message:add", mess_dict, expire=4)
- else:
- RedisArchivist().set_message("message:add", mess_dict)
-
- if (idx + 1) % 25 == 0:
- print("adding to queue progress: " + progress)
-
- def get_youtube_details(self, youtube_id):
- """get details from youtubedl for single pending video"""
- obs = {
- "default_search": "ytsearch",
- "quiet": True,
- "check_formats": "selected",
- "noplaylist": True,
- "writethumbnail": True,
- "simulate": True,
- }
- try:
- vid = yt_dlp.YoutubeDL(obs).extract_info(youtube_id)
- except yt_dlp.utils.DownloadError:
- print("failed to extract info for: " + youtube_id)
- return False
- # stop if video is streaming live now
- if vid["is_live"]:
- return False
-
- return self._parse_youtube_details(vid)
-
- def _parse_youtube_details(self, vid):
- """parse response"""
- vid_id = vid.get("id")
- duration_str = DurationConverter.get_str(vid["duration"])
- if duration_str == "NA":
- print(f"skip extracting duration for: {vid_id}")
- published = datetime.strptime(vid["upload_date"], "%Y%m%d").strftime(
- "%Y-%m-%d"
- )
-
- # build dict
- youtube_details = {
- "youtube_id": vid_id,
- "channel_name": vid["channel"],
- "vid_thumb_url": vid["thumbnail"],
- "title": vid["title"],
- "channel_id": vid["channel_id"],
- "channel_indexed": vid["channel_id"] in self.all_channels,
- "duration": duration_str,
- "published": published,
- "timestamp": int(datetime.now().strftime("%s")),
- }
- return youtube_details
diff --git a/tubearchivist/home/src/download/subscriptions.py b/tubearchivist/home/src/download/subscriptions.py
deleted file mode 100644
index a460af9..0000000
--- a/tubearchivist/home/src/download/subscriptions.py
+++ /dev/null
@@ -1,223 +0,0 @@
-"""
-Functionality:
-- handle channel subscriptions
-- handle playlist subscriptions
-"""
-
-import yt_dlp
-from home.src.download import queue # partial import
-from home.src.es.connect import IndexPaginate
-from home.src.index.channel import YoutubeChannel
-from home.src.index.playlist import YoutubePlaylist
-from home.src.ta.config import AppConfig
-from home.src.ta.ta_redis import RedisArchivist
-
-
-class ChannelSubscription:
- """manage the list of channels subscribed"""
-
- def __init__(self):
- config = AppConfig().config
- self.es_url = config["application"]["es_url"]
- self.es_auth = config["application"]["es_auth"]
- self.channel_size = config["subscriptions"]["channel_size"]
-
- @staticmethod
- def get_channels(subscribed_only=True):
- """get a list of all channels subscribed to"""
- data = {
- "sort": [{"channel_name.keyword": {"order": "asc"}}],
- }
- if subscribed_only:
- data["query"] = {"term": {"channel_subscribed": {"value": True}}}
- else:
- data["query"] = {"match_all": {}}
-
- all_channels = IndexPaginate("ta_channel", data).get_results()
-
- return all_channels
-
- def get_last_youtube_videos(self, channel_id, limit=True):
- """get a list of last videos from channel"""
- url = f"https://www.youtube.com/channel/{channel_id}/videos"
- obs = {
- "default_search": "ytsearch",
- "quiet": True,
- "skip_download": True,
- "extract_flat": True,
- }
- if limit:
- obs["playlistend"] = self.channel_size
-
- try:
- chan = yt_dlp.YoutubeDL(obs).extract_info(url, download=False)
- except yt_dlp.utils.DownloadError:
- print(f"{channel_id}: failed to extract videos, skipping.")
- return False
-
- last_videos = [(i["id"], i["title"]) for i in chan["entries"]]
- return last_videos
-
- def find_missing(self):
- """add missing videos from subscribed channels to pending"""
- all_channels = self.get_channels()
- pending = queue.PendingList()
- pending.get_download()
- pending.get_indexed()
-
- missing_videos = []
-
- for idx, channel in enumerate(all_channels):
- channel_id = channel["channel_id"]
- last_videos = self.get_last_youtube_videos(channel_id)
-
- if last_videos:
- for video in last_videos:
- if video[0] not in pending.to_skip:
- missing_videos.append(video[0])
- # notify
- message = {
- "status": "message:rescan",
- "level": "info",
- "title": "Scanning channels: Looking for new videos.",
- "message": f"Progress: {idx + 1}/{len(all_channels)}",
- }
- if idx + 1 == len(all_channels):
- RedisArchivist().set_message(
- "message:rescan", message=message, expire=4
- )
- else:
- RedisArchivist().set_message("message:rescan", message=message)
-
- return missing_videos
-
- @staticmethod
- def change_subscribe(channel_id, channel_subscribed):
- """subscribe or unsubscribe from channel and update"""
- channel = YoutubeChannel(channel_id)
- channel.build_json()
- channel.json_data["channel_subscribed"] = channel_subscribed
- channel.upload_to_es()
- channel.sync_to_videos()
-
-
-class PlaylistSubscription:
- """manage the playlist download functionality"""
-
- def __init__(self):
- self.config = AppConfig().config
-
- @staticmethod
- def get_playlists(subscribed_only=True):
- """get a list of all active playlists"""
- data = {
- "sort": [{"playlist_channel.keyword": {"order": "desc"}}],
- }
- data["query"] = {
- "bool": {"must": [{"term": {"playlist_active": {"value": True}}}]}
- }
- if subscribed_only:
- data["query"]["bool"]["must"].append(
- {"term": {"playlist_subscribed": {"value": True}}}
- )
-
- all_playlists = IndexPaginate("ta_playlist", data).get_results()
-
- return all_playlists
-
- def process_url_str(self, new_playlists, subscribed=True):
- """process playlist subscribe form url_str"""
- data = {
- "query": {"match_all": {}},
- "sort": [{"published": {"order": "desc"}}],
- }
- all_indexed = IndexPaginate("ta_video", data).get_results()
- all_youtube_ids = [i["youtube_id"] for i in all_indexed]
-
- new_thumbs = []
- for idx, playlist in enumerate(new_playlists):
- url_type = playlist["type"]
- playlist_id = playlist["url"]
- if not url_type == "playlist":
- print(f"{playlist_id} not a playlist, skipping...")
- continue
-
- playlist_h = YoutubePlaylist(playlist_id)
- playlist_h.all_youtube_ids = all_youtube_ids
- playlist_h.build_json()
- playlist_h.json_data["playlist_subscribed"] = subscribed
- playlist_h.upload_to_es()
- playlist_h.add_vids_to_playlist()
- self.channel_validate(playlist_h.json_data["playlist_channel_id"])
- thumb = playlist_h.json_data["playlist_thumbnail"]
- new_thumbs.append((playlist_id, thumb))
- # notify
- message = {
- "status": "message:subplaylist",
- "level": "info",
- "title": "Subscribing to Playlists",
- "message": f"Processing {idx + 1} of {len(new_playlists)}",
- }
- RedisArchivist().set_message(
- "message:subplaylist", message=message
- )
-
- return new_thumbs
-
- @staticmethod
- def channel_validate(channel_id):
- """make sure channel of playlist is there"""
- channel = YoutubeChannel(channel_id)
- channel.build_json()
-
- @staticmethod
- def change_subscribe(playlist_id, subscribe_status):
- """change the subscribe status of a playlist"""
- playlist = YoutubePlaylist(playlist_id)
- playlist.build_json()
- playlist.json_data["playlist_subscribed"] = subscribe_status
- playlist.upload_to_es()
-
- @staticmethod
- def get_to_ignore():
- """get all youtube_ids already downloaded or ignored"""
- pending = queue.PendingList()
- pending.get_download()
- pending.get_indexed()
-
- return pending.to_skip
-
- def find_missing(self):
- """find videos in subscribed playlists not downloaded yet"""
- all_playlists = [i["playlist_id"] for i in self.get_playlists()]
- to_ignore = self.get_to_ignore()
-
- missing_videos = []
- for idx, playlist_id in enumerate(all_playlists):
- size_limit = self.config["subscriptions"]["channel_size"]
- playlist = YoutubePlaylist(playlist_id)
- playlist.update_playlist()
- if not playlist.json_data:
- playlist.deactivate()
- continue
-
- playlist_entries = playlist.json_data["playlist_entries"]
- if size_limit:
- del playlist_entries[size_limit:]
-
- all_missing = [i for i in playlist_entries if not i["downloaded"]]
-
- message = {
- "status": "message:rescan",
- "level": "info",
- "title": "Scanning playlists: Looking for new videos.",
- "message": f"Progress: {idx + 1}/{len(all_playlists)}",
- }
- RedisArchivist().set_message("message:rescan", message=message)
-
- for video in all_missing:
- youtube_id = video["youtube_id"]
- if youtube_id not in to_ignore:
- missing_videos.append(youtube_id)
-
- return missing_videos
diff --git a/tubearchivist/home/src/download/thumbnails.py b/tubearchivist/home/src/download/thumbnails.py
deleted file mode 100644
index d25f4d1..0000000
--- a/tubearchivist/home/src/download/thumbnails.py
+++ /dev/null
@@ -1,347 +0,0 @@
-"""
-functionality:
-- handle download and caching for thumbnails
-- check for missing thumbnails
-"""
-
-import base64
-import os
-from collections import Counter
-from io import BytesIO
-from time import sleep
-
-import requests
-from home.src.download import queue # partial import
-from home.src.download import subscriptions # partial import
-from home.src.ta.config import AppConfig
-from home.src.ta.helper import ignore_filelist
-from home.src.ta.ta_redis import RedisArchivist
-from mutagen.mp4 import MP4, MP4Cover
-from PIL import Image, ImageFilter
-
-
-class ThumbManager:
- """handle thumbnails related functions"""
-
- CONFIG = AppConfig().config
- MEDIA_DIR = CONFIG["application"]["videos"]
- CACHE_DIR = CONFIG["application"]["cache_dir"]
- VIDEO_DIR = os.path.join(CACHE_DIR, "videos")
- CHANNEL_DIR = os.path.join(CACHE_DIR, "channels")
- PLAYLIST_DIR = os.path.join(CACHE_DIR, "playlists")
-
- def get_all_thumbs(self):
- """get all video artwork already downloaded"""
- all_thumb_folders = ignore_filelist(os.listdir(self.VIDEO_DIR))
- all_thumbs = []
- for folder in all_thumb_folders:
- folder_path = os.path.join(self.VIDEO_DIR, folder)
- if os.path.isfile(folder_path):
- self.update_path(folder)
- all_thumbs.append(folder_path)
- continue
- # raise an exception here in a future version
- # raise FileExistsError("video cache dir has files inside")
-
- all_folder_thumbs = ignore_filelist(os.listdir(folder_path))
- all_thumbs.extend(all_folder_thumbs)
-
- return all_thumbs
-
- def update_path(self, file_name):
- """reorganize thumbnails into folders as update path from v0.0.5"""
- folder_name = file_name[0].lower()
- folder_path = os.path.join(self.VIDEO_DIR, folder_name)
- old_file = os.path.join(self.VIDEO_DIR, file_name)
- new_file = os.path.join(folder_path, file_name)
- os.makedirs(folder_path, exist_ok=True)
- os.rename(old_file, new_file)
-
- def get_needed_thumbs(self, missing_only=False):
- """get a list of all missing thumbnails"""
- all_thumbs = self.get_all_thumbs()
-
- pending = queue.PendingList()
- pending.get_download()
- pending.get_indexed()
-
- needed_thumbs = []
- to_check = (
- pending.all_videos + pending.all_pending + pending.all_ignored
- )
- for video in to_check:
- youtube_id = video["youtube_id"]
- thumb_url = video["vid_thumb_url"]
- if missing_only:
- if youtube_id + ".jpg" not in all_thumbs:
- needed_thumbs.append((youtube_id, thumb_url))
- else:
- needed_thumbs.append((youtube_id, thumb_url))
-
- return needed_thumbs
-
- def get_missing_channels(self):
- """get all channel artwork"""
- all_channel_art = os.listdir(self.CHANNEL_DIR)
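- # channel ids are 24 chars; a channel counts as cached only if
- # both its _thumb and _banner files exist (count > 1)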
- files = [i[0:24] for i in all_channel_art]
- cached_channel_ids = [k for (k, v) in Counter(files).items() if v > 1]
- channel_sub = subscriptions.ChannelSubscription()
- channels = channel_sub.get_channels(subscribed_only=False)
-
- missing_channels = []
- for channel in channels:
- channel_id = channel["channel_id"]
- if channel_id not in cached_channel_ids:
- channel_banner = channel["channel_banner_url"]
- channel_thumb = channel["channel_thumb_url"]
- missing_channels.append(
- (channel_id, channel_thumb, channel_banner)
- )
-
- return missing_channels
-
- def get_missing_playlists(self):
- """get all missing playlist artwork"""
- all_downloaded = ignore_filelist(os.listdir(self.PLAYLIST_DIR))
- all_ids_downloaded = [i.replace(".jpg", "") for i in all_downloaded]
- playlist_sub = subscriptions.PlaylistSubscription()
- playlists = playlist_sub.get_playlists(subscribed_only=False)
-
- missing_playlists = []
- for playlist in playlists:
- playlist_id = playlist["playlist_id"]
- if playlist_id not in all_ids_downloaded:
- playlist_thumb = playlist["playlist_thumbnail"]
- missing_playlists.append((playlist_id, playlist_thumb))
-
- return missing_playlists
-
- def get_raw_img(self, img_url, thumb_type):
- """get raw image from youtube and handle 404"""
- try:
- app_root = self.CONFIG["application"]["app_root"]
- except KeyError:
- # lazy KeyError fix to avoid a strange startup
- # race condition between the threads in HomeConfig.ready()
- app_root = "/app"
- default_map = {
- "video": os.path.join(
- app_root, "static/img/default-video-thumb.jpg"
- ),
- "icon": os.path.join(
- app_root, "static/img/default-channel-icon.jpg"
- ),
- "banner": os.path.join(
- app_root, "static/img/default-channel-banner.jpg"
- ),
- }
- if img_url:
- try:
- response = requests.get(img_url, stream=True)
- except requests.exceptions.ConnectionError:
- sleep(5)
- response = requests.get(img_url, stream=True)
- if not response.ok and not response.status_code == 404:
- print("retry thumbnail download for " + img_url)
- sleep(5)
- response = requests.get(img_url, stream=True)
- else:
- response = False
- if not response or response.status_code == 404:
- # use default
- img_raw = Image.open(default_map[thumb_type])
- else:
- # use response
- img_obj = response.raw
- img_raw = Image.open(img_obj)
-
- return img_raw
-
- def download_vid(self, missing_thumbs, notify=True):
- """download all missing thumbnails from list"""
- print(f"downloading {len(missing_thumbs)} thumbnails")
- for idx, (youtube_id, thumb_url) in enumerate(missing_thumbs):
- folder_path = os.path.join(self.VIDEO_DIR, youtube_id[0].lower())
- thumb_path = os.path.join(
- self.CACHE_DIR, self.vid_thumb_path(youtube_id)
- )
-
- os.makedirs(folder_path, exist_ok=True)
- img_raw = self.get_raw_img(thumb_url, "video")
-
- width, height = img_raw.size
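- # center-crop to 16:9: target height is width * 9 / 16,
- # trimming an equal offset from top and bottom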
- if not width / height == 16 / 9:
- new_height = width / 16 * 9
- offset = (height - new_height) / 2
- img_raw = img_raw.crop((0, offset, width, height - offset))
- img_raw.convert("RGB").save(thumb_path)
-
- progress = f"{idx + 1}/{len(missing_thumbs)}"
- if notify:
- mess_dict = {
- "status": "message:add",
- "level": "info",
- "title": "Processing Videos",
- "message": "Downloading Thumbnails, Progress: " + progress,
- }
- if idx + 1 == len(missing_thumbs):
- RedisArchivist().set_message(
- "message:add", mess_dict, expire=4
- )
- else:
- RedisArchivist().set_message("message:add", mess_dict)
-
- if (idx + 1) % 25 == 0:
- print("thumbnail progress: " + progress)
-
- def download_chan(self, missing_channels):
- """download needed artwork for channels"""
- print(f"downloading {len(missing_channels)} channel artwork")
- for channel in missing_channels:
- channel_id, channel_thumb, channel_banner = channel
-
- thumb_path = os.path.join(
- self.CHANNEL_DIR, channel_id + "_thumb.jpg"
- )
- img_raw = self.get_raw_img(channel_thumb, "icon")
- img_raw.convert("RGB").save(thumb_path)
-
- banner_path = os.path.join(
- self.CHANNEL_DIR, channel_id + "_banner.jpg"
- )
- img_raw = self.get_raw_img(channel_banner, "banner")
- img_raw.convert("RGB").save(banner_path)
-
- mess_dict = {
- "status": "message:download",
- "level": "info",
- "title": "Processing Channels",
- "message": "Downloading Channel Art.",
- }
- RedisArchivist().set_message("message:download", mess_dict)
-
- def download_playlist(self, missing_playlists):
- """download needed artwork for playlists"""
- print(f"downloading {len(missing_playlists)} playlist artwork")
- for playlist in missing_playlists:
- playlist_id, playlist_thumb_url = playlist
- thumb_path = os.path.join(self.PLAYLIST_DIR, playlist_id + ".jpg")
- img_raw = self.get_raw_img(playlist_thumb_url, "video")
- img_raw.convert("RGB").save(thumb_path)
-
- mess_dict = {
- "status": "message:download",
- "level": "info",
- "title": "Processing Playlists",
- "message": "Downloading Playlist Art.",
- }
- RedisArchivist().set_message("message:download", mess_dict)
-
- def get_base64_blur(self, youtube_id):
- """return base64 encoded placeholder"""
- img_path = self.vid_thumb_path(youtube_id)
- file_path = os.path.join(self.CACHE_DIR, img_path)
- img_raw = Image.open(file_path)
- img_raw.thumbnail((img_raw.width // 20, img_raw.height // 20))
- img_blur = img_raw.filter(ImageFilter.BLUR)
- buffer = BytesIO()
- img_blur.save(buffer, format="JPEG")
- img_data = buffer.getvalue()
- img_base64 = base64.b64encode(img_data).decode()
- data_url = f"data:image/jpg;base64,{img_base64}"
-
- return data_url
-
- @staticmethod
- def vid_thumb_path(youtube_id):
- """build expected path for video thumbnail from youtube_id"""
- folder_name = youtube_id[0].lower()
- folder_path = os.path.join("videos", folder_name)
- thumb_path = os.path.join(folder_path, youtube_id + ".jpg")
- return thumb_path
-
- def delete_vid_thumb(self, youtube_id):
- """delete video thumbnail if exists"""
- thumb_path = self.vid_thumb_path(youtube_id)
- to_delete = os.path.join(self.CACHE_DIR, thumb_path)
- if os.path.exists(to_delete):
- os.remove(to_delete)
-
- def delete_chan_thumb(self, channel_id):
- """delete all artwork of channel"""
- thumb = os.path.join(self.CHANNEL_DIR, channel_id + "_thumb.jpg")
- banner = os.path.join(self.CHANNEL_DIR, channel_id + "_banner.jpg")
- if os.path.exists(thumb):
- os.remove(thumb)
- if os.path.exists(banner):
- os.remove(banner)
-
- def cleanup_downloaded(self):
- """find downloaded thumbnails without video indexed"""
- all_thumbs = self.get_all_thumbs()
- all_indexed = self.get_needed_thumbs()
- all_needed_thumbs = [i[0] + ".jpg" for i in all_indexed]
- for thumb in all_thumbs:
- if thumb not in all_needed_thumbs:
- # cleanup
- youtube_id = thumb.rstrip(".jpg")
- self.delete_vid_thumb(youtube_id)
-
- def get_thumb_list(self):
- """get list of mediafiles and matching thumbnails"""
- pending = queue.PendingList()
- pending.get_indexed()
-
- video_list = []
- for video in pending.all_videos:
- youtube_id = video["youtube_id"]
- media_url = os.path.join(self.MEDIA_DIR, video["media_url"])
- thumb_path = os.path.join(
- self.CACHE_DIR, self.vid_thumb_path(youtube_id)
- )
- video_list.append(
- {
- "media_url": media_url,
- "thumb_path": thumb_path,
- }
- )
-
- return video_list
-
- @staticmethod
- def write_all_thumbs(video_list):
- """rewrite the thumbnail into media file"""
-
- counter = 1
- for video in video_list:
- # loop through all videos
- media_url = video["media_url"]
- thumb_path = video["thumb_path"]
-
- mutagen_vid = MP4(media_url)
- with open(thumb_path, "rb") as f:
- mutagen_vid["covr"] = [
- MP4Cover(f.read(), imageformat=MP4Cover.FORMAT_JPEG)
- ]
- mutagen_vid.save()
- if counter % 50 == 0:
- print(f"thumbnail write progress {counter}/{len(video_list)}")
- counter = counter + 1
-
-
-def validate_thumbnails():
- """check if all thumbnails are there and organized correctly"""
- handler = ThumbManager()
- thumbs_to_download = handler.get_needed_thumbs(missing_only=True)
- handler.download_vid(thumbs_to_download)
- missing_channels = handler.get_missing_channels()
- handler.download_chan(missing_channels)
- missing_playlists = handler.get_missing_playlists()
- handler.download_playlist(missing_playlists)
- handler.cleanup_downloaded()
diff --git a/tubearchivist/home/src/download/yt_dlp_handler.py b/tubearchivist/home/src/download/yt_dlp_handler.py
deleted file mode 100644
index 31d8ce9..0000000
--- a/tubearchivist/home/src/download/yt_dlp_handler.py
+++ /dev/null
@@ -1,405 +0,0 @@
-"""
-functionality:
-- handle yt_dlp
-- build options and post processor
-- download video files
-- move to archive
-"""
-
-import os
-import shutil
-from datetime import datetime
-from time import sleep
-
-import yt_dlp
-from home.src.download.queue import PendingList
-from home.src.download.subscriptions import PlaylistSubscription
-from home.src.es.connect import ElasticWrap, IndexPaginate
-from home.src.index.channel import YoutubeChannel
-from home.src.index.playlist import YoutubePlaylist
-from home.src.index.video import YoutubeVideo, index_new_video
-from home.src.ta.config import AppConfig
-from home.src.ta.helper import clean_string, ignore_filelist
-from home.src.ta.ta_redis import RedisArchivist, RedisQueue
-
-
-class DownloadPostProcess:
- """handle task to run after download queue finishes"""
-
- def __init__(self, download):
- self.download = download
- self.now = int(datetime.now().strftime("%s"))
- self.pending = False
-
- def run(self):
- """run all functions"""
- self.pending = PendingList()
- self.pending.get_download()
- self.pending.get_channels()
- self.pending.get_indexed()
- self.auto_delete_all()
- self.auto_delete_overwrites()
- self.validate_playlists()
-
- def auto_delete_all(self):
- """handle auto delete"""
- autodelete_days = self.download.config["downloads"]["autodelete_days"]
- if not autodelete_days:
- return
-
- print(f"auto delete older than {autodelete_days} days")
- now_lte = self.now - autodelete_days * 24 * 60 * 60
- data = {
- "query": {"range": {"player.watched_date": {"lte": now_lte}}},
- "sort": [{"player.watched_date": {"order": "asc"}}],
- }
- self._auto_delete_watched(data)
-
- def auto_delete_overwrites(self):
- """handle per channel auto delete from overwrites"""
- for channel_id, value in self.pending.channel_overwrites.items():
- if "autodelete_days" in value:
- autodelete_days = value.get("autodelete_days")
- print(f"{channel_id}: delete older than {autodelete_days}d")
- now_lte = self.now - autodelete_days * 24 * 60 * 60
- must_list = [
- {"range": {"player.watched_date": {"lte": now_lte}}},
- {"term": {"channel.channel_id": {"value": channel_id}}},
- ]
- data = {
- "query": {"bool": {"must": must_list}},
- "sort": [{"player.watched_date": {"order": "desc"}}],
- }
- self._auto_delete_watched(data)
-
- @staticmethod
- def _auto_delete_watched(data):
- """delete watched videos after x days"""
- to_delete = IndexPaginate("ta_video", data).get_results()
- if not to_delete:
- return
-
- for video in to_delete:
- youtube_id = video["youtube_id"]
- print(f"{youtube_id}: auto delete video")
- YoutubeVideo(youtube_id).delete_media_file()
-
- print("add deleted to ignore list")
- vids = [{"type": "video", "url": i["youtube_id"]} for i in to_delete]
- pending = PendingList(youtube_ids=vids)
- pending.parse_url_list()
- pending.add_to_pending(status="ignore")
-
- def validate_playlists(self):
- """look for playlist needing to update"""
- for id_c, channel_id in enumerate(self.download.channels):
- channel = YoutubeChannel(channel_id)
- overwrites = self.pending.channel_overwrites.get(channel_id, False)
- if overwrites and overwrites.get("index_playlists"):
- # validate from remote
- channel.index_channel_playlists()
- continue
-
- # validate from local
- playlists = channel.get_indexed_playlists()
- all_channel_playlist = [i["playlist_id"] for i in playlists]
- self._validate_channel_playlist(all_channel_playlist, id_c)
-
- def _validate_channel_playlist(self, all_channel_playlist, id_c):
- """scan channel for playlist needing update"""
- all_youtube_ids = [i["youtube_id"] for i in self.pending.all_videos]
- for id_p, playlist_id in enumerate(all_channel_playlist):
- playlist = YoutubePlaylist(playlist_id)
- playlist.all_youtube_ids = all_youtube_ids
- playlist.build_json(scrape=True)
- if not playlist.json_data:
- playlist.deactivate()
-
- playlist.add_vids_to_playlist()
- playlist.upload_to_es()
- self._notify_playlist_progress(all_channel_playlist, id_c, id_p)
-
- def _notify_playlist_progress(self, all_channel_playlist, id_c, id_p):
- """notify to UI"""
- title = (
- "Processing playlists for channels: "
- + f"{id_c + 1}/{len(self.download.channels)}"
- )
- message = f"Progress: {id_p + 1}/{len(all_channel_playlist)}"
- mess_dict = {
- "status": "message:download",
- "level": "info",
- "title": title,
- "message": message,
- }
- if id_p + 1 == len(all_channel_playlist):
- RedisArchivist().set_message(
- "message:download", mess_dict, expire=4
- )
- else:
- RedisArchivist().set_message("message:download", mess_dict)
-
-
-class VideoDownloader:
- """
- handle the video download functionality
- if not initiated with list, take from queue
- """
-
- def __init__(self, youtube_id_list=False):
- self.obs = False
- self.video_overwrites = False
- self.youtube_id_list = youtube_id_list
- self.config = AppConfig().config
- self._build_obs()
- self.channels = set()
-
- def run_queue(self):
- """setup download queue in redis loop until no more items"""
- pending = PendingList()
- pending.get_download()
- pending.get_channels()
- self.video_overwrites = pending.video_overwrites
-
- queue = RedisQueue()
-
- limit_queue = self.config["downloads"]["limit_count"]
- if limit_queue:
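- # cap the redis queue at the configured number of items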
- queue.trim(limit_queue - 1)
-
- while True:
- youtube_id = queue.get_next()
- if not youtube_id:
- break
-
- try:
- self._dl_single_vid(youtube_id)
- except yt_dlp.utils.DownloadError:
- print("failed to download " + youtube_id)
- continue
- vid_dict = index_new_video(
- youtube_id, video_overwrites=self.video_overwrites
- )
- self.channels.add(vid_dict["channel"]["channel_id"])
- mess_dict = {
- "status": "message:download",
- "level": "info",
- "title": "Moving....",
- "message": "Moving downloaded file to storage folder",
- }
- RedisArchivist().set_message("message:download", mess_dict, False)
-
- self.move_to_archive(vid_dict)
- mess_dict = {
- "status": "message:download",
- "level": "info",
- "title": "Completed",
- "message": "",
- }
- RedisArchivist().set_message("message:download", mess_dict, 10)
- self._delete_from_pending(youtube_id)
-
- # post processing
- self._add_subscribed_channels()
- DownloadPostProcess(self).run()
-
- @staticmethod
- def add_pending():
- """add pending videos to download queue"""
- mess_dict = {
- "status": "message:download",
- "level": "info",
- "title": "Looking for videos to download",
- "message": "Scanning your download queue.",
- }
- RedisArchivist().set_message("message:download", mess_dict)
- pending = PendingList()
- pending.get_download()
- to_add = [i["youtube_id"] for i in pending.all_pending]
- if not to_add:
- # there is nothing pending
- print("download queue is empty")
- mess_dict = {
- "status": "message:download",
- "level": "error",
- "title": "Download queue is empty",
- "message": "Add some videos to the queue first.",
- }
- RedisArchivist().set_message("message:download", mess_dict)
- return
-
- RedisQueue().add_list(to_add)
-
- @staticmethod
- def _progress_hook(response):
- """process the progress_hooks from yt_dlp"""
- # title
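- # filename is "<11-char video id>_<title>..."; [12:] drops the id prefix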
- path = os.path.split(response["filename"])[-1][12:]
- filename = os.path.splitext(os.path.splitext(path)[0])[0]
- filename_clean = filename.replace("_", " ")
- title = "Downloading: " + filename_clean
- # message
- try:
- percent = response["_percent_str"]
- size = response["_total_bytes_str"]
- speed = response["_speed_str"]
- eta = response["_eta_str"]
- message = f"{percent} of {size} at {speed} - time left: {eta}"
- except KeyError:
- message = "processing"
- mess_dict = {
- "status": "message:download",
- "level": "info",
- "title": title,
- "message": message,
- }
- RedisArchivist().set_message("message:download", mess_dict)
-
- def _build_obs(self):
- """collection to build all obs passed to yt-dlp"""
- self._build_obs_basic()
- self._build_obs_user()
- self._build_obs_postprocessors()
-
- def _build_obs_basic(self):
- """initial obs"""
- self.obs = {
- "default_search": "ytsearch",
- "merge_output_format": "mp4",
- "restrictfilenames": True,
- "outtmpl": (
- self.config["application"]["cache_dir"]
- + "/download/"
- + self.config["application"]["file_template"]
- ),
- "progress_hooks": [self._progress_hook],
- "noprogress": True,
- "quiet": True,
- "continuedl": True,
- "retries": 3,
- "writethumbnail": False,
- "noplaylist": True,
- "check_formats": "selected",
- }
-
- def _build_obs_user(self):
- """build user customized options"""
- if self.config["downloads"]["format"]:
- self.obs["format"] = self.config["downloads"]["format"]
- if self.config["downloads"]["limit_speed"]:
- self.obs["ratelimit"] = (
- self.config["downloads"]["limit_speed"] * 1024
- )
-
- throttle = self.config["downloads"]["throttledratelimit"]
- if throttle:
- self.obs["throttledratelimit"] = throttle * 1024
-
- def _build_obs_postprocessors(self):
- """add postprocessor to obs"""
- postprocessors = []
-
- if self.config["downloads"]["add_metadata"]:
- postprocessors.append(
- {
- "key": "FFmpegMetadata",
- "add_chapters": True,
- "add_metadata": True,
- }
- )
-
- if self.config["downloads"]["add_thumbnail"]:
- postprocessors.append(
- {
- "key": "EmbedThumbnail",
- "already_have_thumbnail": True,
- }
- )
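- # EmbedThumbnail needs the thumbnail file written to disk first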
- self.obs["writethumbnail"] = True
-
- self.obs["postprocessors"] = postprocessors
-
- def get_format_overwrites(self, youtube_id):
- """get overwrites from single video"""
- overwrites = self.video_overwrites.get(youtube_id, False)
- if overwrites:
- return overwrites.get("download_format", False)
-
- return False
-
- def _dl_single_vid(self, youtube_id):
- """download single video"""
- obs = self.obs.copy()
- format_overwrite = self.get_format_overwrites(youtube_id)
- if format_overwrite:
- obs["format"] = format_overwrite
-
- dl_cache = self.config["application"]["cache_dir"] + "/download/"
-
- # check if already in cache to continue from there
- all_cached = ignore_filelist(os.listdir(dl_cache))
- for file_name in all_cached:
- if youtube_id in file_name:
- obs["outtmpl"] = os.path.join(dl_cache, file_name)
-
- with yt_dlp.YoutubeDL(obs) as ydl:
- try:
- ydl.download([youtube_id])
- except yt_dlp.utils.DownloadError:
- print("retry failed download: " + youtube_id)
- sleep(10)
- ydl.download([youtube_id])
-
- if self.obs["writethumbnail"]:
- # webp files don't get cleaned up automatically
- all_cached = ignore_filelist(os.listdir(dl_cache))
- to_clean = [i for i in all_cached if not i.endswith(".mp4")]
- for file_name in to_clean:
- file_path = os.path.join(dl_cache, file_name)
- os.remove(file_path)
-
- def move_to_archive(self, vid_dict):
- """move downloaded video from cache to archive"""
- videos = self.config["application"]["videos"]
- host_uid = self.config["application"]["HOST_UID"]
- host_gid = self.config["application"]["HOST_GID"]
- channel_name = clean_string(vid_dict["channel"]["channel_name"])
- if len(channel_name) <= 3:
- # fall back to channel id
- channel_name = vid_dict["channel"]["channel_id"]
- # make archive folder with correct permissions
- new_folder = os.path.join(videos, channel_name)
- if not os.path.exists(new_folder):
- os.makedirs(new_folder)
- if host_uid and host_gid:
- os.chown(new_folder, host_uid, host_gid)
- # find real filename
- cache_dir = self.config["application"]["cache_dir"]
- all_cached = ignore_filelist(os.listdir(cache_dir + "/download/"))
- for file_str in all_cached:
- if vid_dict["youtube_id"] in file_str:
- old_file = file_str
- old_file_path = os.path.join(cache_dir, "download", old_file)
- new_file_path = os.path.join(videos, vid_dict["media_url"])
- # move media file and fix permission
- shutil.move(old_file_path, new_file_path)
- if host_uid and host_gid:
- os.chown(new_file_path, host_uid, host_gid)
-
- @staticmethod
- def _delete_from_pending(youtube_id):
- """delete downloaded video from pending index if its there"""
- path = f"ta_download/_doc/{youtube_id}"
- _, _ = ElasticWrap(path).delete()
-
- def _add_subscribed_channels(self):
- """add all channels subscribed to refresh"""
- all_subscribed = PlaylistSubscription().get_playlists()
- if not all_subscribed:
- return
-
- channel_ids = [i["playlist_channel_id"] for i in all_subscribed]
- for channel_id in channel_ids:
- self.channels.add(channel_id)
diff --git a/tubearchivist/home/src/es/__init__.py b/tubearchivist/home/src/es/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/home/src/es/connect.py b/tubearchivist/home/src/es/connect.py
deleted file mode 100644
index f976943..0000000
--- a/tubearchivist/home/src/es/connect.py
+++ /dev/null
@@ -1,152 +0,0 @@
-"""
-functionality:
-- wrapper around requests to call elastic search
-- reusable search_after to extract the whole index
-"""
-
-import json
-
-import requests
-from home.src.ta.config import AppConfig
-
-
-class ElasticWrap:
- """makes all calls to elastic search
- returns response json and status code tuple
- """
-
- def __init__(self, path, config=False):
- self.url = False
- self.auth = False
- self.path = path
- self.config = config
- self._get_config()
-
- def _get_config(self):
- """add config if not passed"""
- if not self.config:
- self.config = AppConfig().config
-
- es_url = self.config["application"]["es_url"]
- self.auth = self.config["application"]["es_auth"]
- self.url = f"{es_url}/{self.path}"
-
- def get(self, data=False):
- """get data from es"""
- if data:
- response = requests.get(self.url, json=data, auth=self.auth)
- else:
- response = requests.get(self.url, auth=self.auth)
- if not response.ok:
- print(response.text)
-
- return response.json(), response.status_code
-
- def post(self, data=False, ndjson=False):
- """post data to es"""
- if ndjson:
- headers = {"Content-type": "application/x-ndjson"}
- payload = data
- else:
- headers = {"Content-type": "application/json"}
- payload = json.dumps(data)
-
- if data:
- response = requests.post(
- self.url, data=payload, headers=headers, auth=self.auth
- )
- else:
- response = requests.post(self.url, headers=headers, auth=self.auth)
-
- if not response.ok:
- print(response.text)
-
- return response.json(), response.status_code
-
- def put(self, data, refresh=False):
- """put data to es"""
- if refresh:
- self.url = f"{self.url}/?refresh=true"
- response = requests.put(f"{self.url}", json=data, auth=self.auth)
- if not response.ok:
- print(response.text)
- print(data)
- raise ValueError("failed to add item to index")
-
- return response.json(), response.status_code
-
- def delete(self, data=False):
- """delete document from es"""
- if data:
- response = requests.delete(self.url, json=data, auth=self.auth)
- else:
- response = requests.delete(self.url, auth=self.auth)
-
- if not response.ok:
- print(response.text)
-
- return response.json(), response.status_code
-
-
-class IndexPaginate:
- """use search_after to go through whole index"""
-
- DEFAULT_SIZE = 500
-
- def __init__(self, index_name, data, size=False, keep_source=False):
- self.index_name = index_name
- self.data = data
- self.pit_id = False
- self.size = size
- self.keep_source = keep_source
-
- def get_results(self):
- """get all results"""
- self.get_pit()
- self.validate_data()
- all_results = self.run_loop()
- self.clean_pit()
- return all_results
-
- def get_pit(self):
- """get pit for index"""
- path = f"{self.index_name}/_pit?keep_alive=10m"
- response, _ = ElasticWrap(path).post()
- self.pit_id = response["id"]
-
- def validate_data(self):
- """add pit and size to data"""
- if "sort" not in self.data.keys():
- print(self.data)
- raise ValueError("missing sort key in data")
-
- size = self.size or self.DEFAULT_SIZE
-
- self.data["size"] = size
- self.data["pit"] = {"id": self.pit_id, "keep_alive": "10m"}
-
- def run_loop(self):
- """loop through results until last hit"""
- all_results = []
- while True:
- response, _ = ElasticWrap("_search").get(data=self.data)
- all_hits = response["hits"]["hits"]
- if all_hits:
- for hit in all_hits:
- if self.keep_source:
- source = hit
- else:
- source = hit["_source"]
- search_after = hit["sort"]
- all_results.append(source)
- # update search_after with last hit data
- self.data["search_after"] = search_after
- else:
- break
-
- return all_results
-
- def clean_pit(self):
- """delete pit from elastic search"""
- data = {"id": self.pit_id}
- ElasticWrap("_pit").delete(data=data)
diff --git a/tubearchivist/home/src/es/index_mapping.json b/tubearchivist/home/src/es/index_mapping.json
deleted file mode 100644
index f023eef..0000000
--- a/tubearchivist/home/src/es/index_mapping.json
+++ /dev/null
@@ -1,465 +0,0 @@
-{
- "index_config": [{
- "index_name": "channel",
- "expected_map": {
- "channel_id": {
- "type": "keyword"
- },
- "channel_name": {
- "type": "text",
- "analyzer": "english",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- },
- "search_as_you_type": {
- "type": "search_as_you_type",
- "doc_values": false,
- "max_shingle_size": 3
- }
- }
- },
- "channel_banner_url": {
- "type": "keyword",
- "index": false
- },
- "channel_tvart_url": {
- "type": "keyword",
- "index": false
- },
- "channel_thumb_url": {
- "type": "keyword",
- "index": false
- },
- "channel_description": {
- "type": "text"
- },
- "channel_last_refresh": {
- "type": "date",
- "format": "epoch_second"
- },
- "channel_overwrites": {
- "properties": {
- "download_format": {
- "type": "text"
- },
- "autodelete_days": {
- "type": "long"
- },
- "index_playlists": {
- "type": "boolean"
- },
- "integrate_sponsorblock": {
- "type" : "boolean"
- }
- }
- }
- },
- "expected_set": {
- "analysis": {
- "normalizer": {
- "to_lower": {
- "type": "custom",
- "filter": ["lowercase"]
- }
- }
- },
- "number_of_replicas": "0"
- }
- },
- {
- "index_name": "video",
- "expected_map": {
- "vid_thumb_url": {
- "type": "text",
- "index": false
- },
- "vid_thumb_base64": {
- "type": "text",
- "index": false
- },
- "date_downloaded": {
- "type": "date"
- },
- "channel": {
- "properties": {
- "channel_id": {
- "type": "keyword"
- },
- "channel_name": {
- "type": "text",
- "analyzer": "english",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- },
- "search_as_you_type": {
- "type": "search_as_you_type",
- "doc_values": false,
- "max_shingle_size": 3
- }
- }
- },
- "channel_banner_url": {
- "type": "keyword",
- "index": false
- },
- "channel_tvart_url": {
- "type": "keyword",
- "index": false
- },
- "channel_thumb_url": {
- "type": "keyword",
- "index": false
- },
- "channel_description": {
- "type": "text"
- },
- "channel_last_refresh": {
- "type": "date",
- "format": "epoch_second"
- },
- "channel_overwrites": {
- "properties": {
- "download_format": {
- "type": "text"
- },
- "autodelete_days": {
- "type": "long"
- },
- "index_playlists": {
- "type": "boolean"
- },
- "integrate_sponsorblock": {
- "type" : "boolean"
- }
- }
- }
- }
- },
- "description": {
- "type": "text"
- },
- "media_url": {
- "type": "keyword",
- "index": false
- },
- "tags": {
- "type": "text",
- "analyzer": "english",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256
- }
- }
- },
- "title": {
- "type": "text",
- "analyzer": "english",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- },
- "search_as_you_type": {
- "type": "search_as_you_type",
- "doc_values": false,
- "max_shingle_size": 3
- }
- }
- },
- "vid_last_refresh": {
- "type": "date"
- },
- "youtube_id": {
- "type": "keyword"
- },
- "published": {
- "type": "date"
- },
- "playlist": {
- "type": "text",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- }
- }
- },
- "stats" : {
- "properties" : {
- "average_rating" : {
- "type" : "float"
- },
- "dislike_count" : {
- "type" : "long"
- },
- "like_count" : {
- "type" : "long"
- },
- "view_count" : {
- "type" : "long"
- }
- }
- },
- "subtitles": {
- "properties": {
- "ext": {
- "type": "keyword",
- "index": false
- },
- "lang": {
- "type": "keyword",
- "index": false
- },
- "media_url": {
- "type": "keyword",
- "index": false
- },
- "name": {
- "type": "keyword"
- },
- "source": {
- "type": "keyword"
- },
- "url": {
- "type": "keyword",
- "index": false
- }
- }
- },
- "sponsorblock": {
- "properties": {
- "last_refresh": {
- "type": "date"
- },
- "has_unlocked": {
- "type": "boolean"
- },
- "is_enabled": {
- "type": "boolean"
- },
- "segments" : {
- "properties" : {
- "UUID" : {
- "type": "keyword"
- },
- "actionType" : {
- "type": "keyword"
- },
- "category" : {
- "type": "keyword"
- },
- "locked" : {
- "type" : "short"
- },
- "segment" : {
- "type" : "float"
- },
- "videoDuration" : {
- "type" : "float"
- },
- "votes" : {
- "type" : "long"
- }
- }
- }
- }
- }
- },
- "expected_set": {
- "analysis": {
- "normalizer": {
- "to_lower": {
- "type": "custom",
- "filter": ["lowercase"]
- }
- }
- },
- "number_of_replicas": "0"
- }
- },
- {
- "index_name": "download",
- "expected_map": {
- "timestamp": {
- "type": "date"
- },
- "channel_id": {
- "type": "keyword"
- },
- "channel_name": {
- "type": "text",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- }
- }
- },
- "status": {
- "type": "keyword"
- },
- "title": {
- "type": "text",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- }
- }
- },
- "vid_thumb_url": {
- "type": "keyword"
- },
- "youtube_id": {
- "type": "keyword"
- }
- },
- "expected_set": {
- "analysis": {
- "normalizer": {
- "to_lower": {
- "type": "custom",
- "filter": ["lowercase"]
- }
- }
- },
- "number_of_replicas": "0"
- }
- },
- {
- "index_name": "playlist",
- "expected_map": {
- "playlist_id": {
- "type": "keyword"
- },
- "playlist_description": {
- "type": "text"
- },
- "playlist_name": {
- "type": "text",
- "analyzer": "english",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- },
- "search_as_you_type": {
- "type": "search_as_you_type",
- "doc_values": false,
- "max_shingle_size": 3
- }
- }
- },
- "playlist_channel": {
- "type": "text",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- }
- }
- },
- "playlist_channel_id": {
- "type": "keyword"
- },
- "playlist_thumbnail": {
- "type": "keyword"
- },
- "playlist_last_refresh": {
- "type": "date"
- }
- },
- "expected_set": {
- "analysis": {
- "normalizer": {
- "to_lower": {
- "type": "custom",
- "filter": ["lowercase"]
- }
- }
- },
- "number_of_replicas": "0"
- }
- },
- {
- "index_name": "subtitle",
- "expected_map": {
- "youtube_id": {
- "type": "keyword"
- },
- "title": {
- "type": "text",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- }
- }
- },
- "subtitle_fragment_id": {
- "type": "keyword"
- },
- "subtitle_channel": {
- "type": "text",
- "fields": {
- "keyword": {
- "type": "keyword",
- "ignore_above": 256,
- "normalizer": "to_lower"
- }
- }
- },
- "subtitle_channel_id": {
- "type": "keyword"
- },
- "subtitle_start": {
- "type": "text"
- },
- "subtitle_end": {
- "type": "text"
- },
- "subtitle_last_refresh": {
- "type": "date"
- },
- "subtitle_index": {
- "type" : "long"
- },
- "subtitle_lang": {
- "type": "keyword"
- },
- "subtitle_source": {
- "type": "keyword"
- },
- "subtitle_line": {
- "type" : "text",
- "analyzer": "english"
- }
- },
- "expected_set": {
- "analysis": {
- "normalizer": {
- "to_lower": {
- "type": "custom",
- "filter": ["lowercase"]
- }
- }
- },
- "number_of_replicas": "0"
- }
- }
- ]
-}
\ No newline at end of file
diff --git a/tubearchivist/home/src/es/index_setup.py b/tubearchivist/home/src/es/index_setup.py
deleted file mode 100644
index 85c4e28..0000000
--- a/tubearchivist/home/src/es/index_setup.py
+++ /dev/null
@@ -1,402 +0,0 @@
-"""
-functionality:
-- setup elastic index at first start
-- verify and update index mapping and settings if needed
-- backup and restore metadata
-"""
-
-import json
-import os
-import zipfile
-from datetime import datetime
-
-from home.src.es.connect import ElasticWrap, IndexPaginate
-from home.src.ta.config import AppConfig
-from home.src.ta.helper import ignore_filelist
-
-
-class ElasticIndex:
- """
- handle mapping and settings on elastic search for a given index
- """
-
- def __init__(self, index_name, expected_map, expected_set):
- self.index_name = index_name
- self.expected_map = expected_map
- self.expected_set = expected_set
- self.exists, self.details = self.index_exists()
-
- def index_exists(self):
- """check if index already exists and return mapping if it does"""
- response, status_code = ElasticWrap(f"ta_{self.index_name}").get()
- exists = status_code == 200
- details = response.get(f"ta_{self.index_name}", False)
-
- return exists, details
-
- def validate(self):
- """
- check if all expected mappings and settings match
- returns True when rebuild is needed
- """
-
- if self.expected_map:
- rebuild = self.validate_mappings()
- if rebuild:
- return rebuild
-
- if self.expected_set:
- rebuild = self.validate_settings()
- if rebuild:
- return rebuild
-
- return False
-
- def validate_mappings(self):
- """check if all mappings are as expected"""
- now_map = self.details["mappings"]["properties"]
-
- for key, value in self.expected_map.items():
- # nested
- if list(value.keys()) == ["properties"]:
- for key_n, value_n in value["properties"].items():
- if key not in now_map:
- print(key_n, value_n)
- return True
- if key_n not in now_map[key]["properties"].keys():
- print(key_n, value_n)
- return True
- if not value_n == now_map[key]["properties"][key_n]:
- print(key_n, value_n)
- return True
-
- continue
-
- # not nested
- if key not in now_map.keys():
- print(key, value)
- return True
- if not value == now_map[key]:
- print(key, value)
- return True
-
- return False
-
- def validate_settings(self):
- """check if all settings are as expected"""
-
- now_set = self.details["settings"]["index"]
-
- for key, value in self.expected_set.items():
- if key not in now_set.keys():
- print(key, value)
- return True
-
- if not value == now_set[key]:
- print(key, value)
- return True
-
- return False
-
- def rebuild_index(self):
- """rebuild with new mapping"""
- self.reindex("backup")
- self.delete_index(backup=False)
- self.create_blank()
- self.reindex("restore")
- self.delete_index()
-
- def reindex(self, method):
- """create on elastic search"""
- if method == "backup":
- source = f"ta_{self.index_name}"
- destination = f"ta_{self.index_name}_backup"
- elif method == "restore":
- source = f"ta_{self.index_name}_backup"
- destination = f"ta_{self.index_name}"
-
- data = {"source": {"index": source}, "dest": {"index": destination}}
- _, _ = ElasticWrap("_reindex?refresh=true").post(data=data)
-
- def delete_index(self, backup=True):
- """delete index passed as argument"""
- path = f"ta_{self.index_name}"
- if backup:
- path = path + "_backup"
-
- _, _ = ElasticWrap(path).delete()
-
- def create_blank(self):
- """apply new mapping and settings for blank new index"""
- data = {}
- if self.expected_set:
- data.update({"settings": self.expected_set})
- if self.expected_map:
- data.update({"mappings": {"properties": self.expected_map}})
-
- _, _ = ElasticWrap(f"ta_{self.index_name}").put(data)
-
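-# A minimal sketch of the validate/rebuild cycle (hypothetical arguments,
-# matching the entries in index_mapping.json):
-#
-#   handler = ElasticIndex("video", expected_map, expected_set)
-#   if handler.validate():
-#       # reindex to ta_video_backup, recreate ta_video with the new
-#       # mapping, reindex back, then drop the backup index
-#       handler.rebuild_index()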
-
-class ElasticBackup:
- """dump index to nd-json files for later bulk import"""
-
- def __init__(self, index_config, reason):
- self.config = AppConfig().config
- self.cache_dir = self.config["application"]["cache_dir"]
- self.index_config = index_config
- self.reason = reason
- self.timestamp = datetime.now().strftime("%Y%m%d")
- self.backup_files = []
-
- @staticmethod
- def get_all_documents(index_name):
- """export all documents of a single index"""
- data = {
- "query": {"match_all": {}},
- "sort": [{"_doc": {"order": "desc"}}],
- }
- paginate = IndexPaginate(f"ta_{index_name}", data, keep_source=True)
- all_results = paginate.get_results()
-
- return all_results
-
- @staticmethod
- def build_bulk(all_results):
- """build bulk query data from all_results"""
- bulk_list = []
-
- for document in all_results:
- document_id = document["_id"]
- es_index = document["_index"]
- action = {"index": {"_index": es_index, "_id": document_id}}
- source = document["_source"]
- bulk_list.append(json.dumps(action))
- bulk_list.append(json.dumps(source))
-
- # add last newline
- bulk_list.append("\n")
- file_content = "\n".join(bulk_list)
-
- return file_content
-
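-# build_bulk() returns newline-delimited JSON as expected by the
-# Elasticsearch _bulk API, two lines per document, e.g. (hypothetical
-# values):
-#
-#   {"index": {"_index": "ta_video", "_id": "xxxxxxxxxxx"}}
-#   {"youtube_id": "xxxxxxxxxxx", "title": "some video"}
-#
-# plus a trailing newline, which the _bulk endpoint requires
-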
- def write_es_json(self, file_content, index_name):
- """write nd-json file for es _bulk API to disk"""
- file_name = f"es_{index_name}-{self.timestamp}.json"
- file_path = os.path.join(self.cache_dir, "backup", file_name)
- with open(file_path, "w", encoding="utf-8") as f:
- f.write(file_content)
-
- self.backup_files.append(file_path)
-
- def write_ta_json(self, all_results, index_name):
- """write generic json file to disk"""
- file_name = f"ta_{index_name}-{self.timestamp}.json"
- file_path = os.path.join(self.cache_dir, "backup", file_name)
- to_write = [i["_source"] for i in all_results]
- file_content = json.dumps(to_write)
- with open(file_path, "w", encoding="utf-8") as f:
- f.write(file_content)
-
- self.backup_files.append(file_path)
-
- def zip_it(self):
- """pack it up into single zip file"""
- file_name = f"ta_backup-{self.timestamp}-{self.reason}.zip"
- backup_folder = os.path.join(self.cache_dir, "backup")
- backup_file = os.path.join(backup_folder, file_name)
-
- with zipfile.ZipFile(
- backup_file, "w", compression=zipfile.ZIP_DEFLATED
- ) as zip_f:
- for to_zip in self.backup_files:
- zip_f.write(to_zip, os.path.basename(to_zip))
-
- # cleanup, without shadowing the zip file path in backup_file
- for to_zip in self.backup_files:
- os.remove(to_zip)
-
- def post_bulk_restore(self, file_name):
- """send bulk to es"""
- file_path = os.path.join(self.cache_dir, file_name)
- with open(file_path, "r", encoding="utf-8") as f:
- data = f.read()
-
- if not data.strip():
- return
-
- _, _ = ElasticWrap("_bulk").post(data=data, ndjson=True)
-
- def get_all_backup_files(self):
- """build all available backup files for view"""
- backup_dir = os.path.join(self.cache_dir, "backup")
- backup_files = os.listdir(backup_dir)
- all_backup_files = ignore_filelist(backup_files)
- all_available_backups = [
- i
- for i in all_backup_files
- if i.startswith("ta_") and i.endswith(".zip")
- ]
- all_available_backups.sort(reverse=True)
-
- backup_dicts = []
- for backup_file in all_available_backups:
- file_split = backup_file.split("-")
- if len(file_split) == 2:
- timestamp = file_split[1].strip(".zip")
- reason = False
- elif len(file_split) == 3:
- timestamp = file_split[1]
- reason = file_split[2].strip(".zip")
-
- to_add = {
- "filename": backup_file,
- "timestamp": timestamp,
- "reason": reason,
- }
- backup_dicts.append(to_add)
-
- return backup_dicts
-
- def unpack_zip_backup(self, filename):
- """extract backup zip and return filelist"""
- backup_dir = os.path.join(self.cache_dir, "backup")
- file_path = os.path.join(backup_dir, filename)
-
- with zipfile.ZipFile(file_path, "r") as z:
- zip_content = z.namelist()
- z.extractall(backup_dir)
-
- return zip_content
-
- def restore_json_files(self, zip_content):
- """go through the unpacked files and restore"""
- backup_dir = os.path.join(self.cache_dir, "backup")
-
- for json_f in zip_content:
-
- file_name = os.path.join(backup_dir, json_f)
-
- if not json_f.startswith("es_") or not json_f.endswith(".json"):
- os.remove(file_name)
- continue
-
- print("restoring: " + json_f)
- self.post_bulk_restore(file_name)
- os.remove(file_name)
-
- @staticmethod
- def index_exists(index_name):
- """check if index already exists to skip"""
- _, status_code = ElasticWrap(f"ta_{index_name}").get()
- exists = status_code == 200
-
- return exists
-
- def rotate_backup(self):
- """delete old backups if needed"""
- rotate = self.config["scheduler"]["run_backup_rotate"]
- if not rotate:
- return
-
- all_backup_files = self.get_all_backup_files()
- auto = [i for i in all_backup_files if i["reason"] == "auto"]
-
- if len(auto) <= rotate:
- print("no backup files to rotate")
- return
-
- backup_dir = os.path.join(self.cache_dir, "backup")
-
- all_to_delete = auto[rotate:]
- for to_delete in all_to_delete:
- file_path = os.path.join(backup_dir, to_delete["filename"])
- print(f"remove old backup file: {file_path}")
- os.remove(file_path)
-
-
-def get_mapping():
- """read index_mapping.json and get expected mapping and settings"""
- with open("home/src/es/index_mapping.json", "r", encoding="utf-8") as f:
- index_config = json.load(f).get("index_config")
-
- return index_config
-
-
-def index_check(force_restore=False):
- """check if all indexes are created and have correct mapping"""
-
- backed_up = False
- index_config = get_mapping()
-
- for index in index_config:
- index_name = index["index_name"]
- expected_map = index["expected_map"]
- expected_set = index["expected_set"]
- handler = ElasticIndex(index_name, expected_map, expected_set)
- # force restore
- if force_restore:
- handler.delete_index(backup=False)
- handler.create_blank()
- continue
-
- # create new
- if not handler.exists:
- print(f"create new blank index with name ta_{index_name}...")
- handler.create_blank()
- continue
-
- # validate index
- rebuild = handler.validate()
- if rebuild:
- # make backup before rebuild
- if not backed_up:
- print("running backup first")
- backup_all_indexes(reason="update")
- backed_up = True
-
- print(f"applying new mappings to index ta_{index_name}...")
- handler.rebuild_index()
- continue
-
- # else all good
- print(f"ta_{index_name} index is created and up to date...")
-
-
-def get_available_backups():
- """return dict of available backups for settings view"""
- index_config = get_mapping()
- backup_handler = ElasticBackup(index_config, reason=False)
- all_backup_files = backup_handler.get_all_backup_files()
- return all_backup_files
-
-
-def backup_all_indexes(reason):
- """backup all es indexes to disk"""
- index_config = get_mapping()
- backup_handler = ElasticBackup(index_config, reason)
-
- for index in backup_handler.index_config:
- index_name = index["index_name"]
- print(f"backup: export in progress for {index_name}")
- if not backup_handler.index_exists(index_name):
- continue
- all_results = backup_handler.get_all_documents(index_name)
- file_content = backup_handler.build_bulk(all_results)
- backup_handler.write_es_json(file_content, index_name)
- backup_handler.write_ta_json(all_results, index_name)
-
- backup_handler.zip_it()
-
- if reason == "auto":
- backup_handler.rotate_backup()
-
-
-def restore_from_backup(filename):
- """restore indexes from backup file"""
- # delete
- index_check(force_restore=True)
- # recreate
- index_config = get_mapping()
- backup_handler = ElasticBackup(index_config, reason=False)
- zip_content = backup_handler.unpack_zip_backup(filename)
- backup_handler.restore_json_files(zip_content)
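-
-# A minimal sketch of a manual backup/restore round trip (hypothetical
-# backup filename, following the naming pattern above):
-#
-#   backup_all_indexes(reason="manual")
-#   restore_from_backup("ta_backup-20220101-manual.zip")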
diff --git a/tubearchivist/home/src/frontend/__init__.py b/tubearchivist/home/src/frontend/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/home/src/frontend/api_calls.py b/tubearchivist/home/src/frontend/api_calls.py
deleted file mode 100644
index 26b1730..0000000
--- a/tubearchivist/home/src/frontend/api_calls.py
+++ /dev/null
@@ -1,316 +0,0 @@
-"""
-Functionality:
-- collection of functions and tasks from frontend
-- called via user input
-"""
-
-from home.src.download.queue import PendingInteract
-from home.src.download.subscriptions import (
- ChannelSubscription,
- PlaylistSubscription,
-)
-from home.src.frontend.searching import SearchForm
-from home.src.frontend.watched import WatchState
-from home.src.index.channel import YoutubeChannel
-from home.src.index.playlist import YoutubePlaylist
-from home.src.index.video import YoutubeVideo
-from home.src.ta.helper import UrlListParser
-from home.src.ta.ta_redis import RedisArchivist, RedisQueue
-from home.tasks import (
- download_pending,
- download_single,
- extrac_dl,
- index_channel_playlists,
- kill_dl,
- re_sync_thumbs,
- rescan_filesystem,
- run_backup,
- run_manual_import,
- run_restore_backup,
- subscribe_to,
- update_subscribed,
-)
-
-
-class PostData:
- """
- map frontend http post values to backend funcs
- hand over long-running tasks to celery
- """
-
- def __init__(self, post_dict, current_user):
- self.post_dict = post_dict
- self.to_exec, self.exec_val = list(post_dict.items())[0]
- self.current_user = current_user
-
- def run_task(self):
- """execute and return task result"""
- to_exec = self.exec_map()
- task_result = to_exec()
- return task_result
-
- def exec_map(self):
- """map dict key and return function to execute"""
- exec_map = {
- "watched": self._watched,
- "un_watched": self._un_watched,
- "change_view": self._change_view,
- "rescan_pending": self._rescan_pending,
- "ignore": self._ignore,
- "dl_pending": self._dl_pending,
- "queue": self._queue_handler,
- "unsubscribe": self._unsubscribe,
- "subscribe": self._subscribe,
- "sort_order": self._sort_order,
- "hide_watched": self._hide_watched,
- "show_subed_only": self._show_subed_only,
- "dlnow": self._dlnow,
- "show_ignored_only": self._show_ignored_only,
- "forgetIgnore": self._forget_ignore,
- "addSingle": self._add_single,
- "deleteQueue": self._delete_queue,
- "manual-import": self._manual_import,
- "re-embed": self._re_embed,
- "db-backup": self._db_backup,
- "db-restore": self._db_restore,
- "fs-rescan": self._fs_rescan,
- "multi_search": self._multi_search,
- "delete-video": self._delete_video,
- "delete-channel": self._delete_channel,
- "delete-playlist": self._delete_playlist,
- "find-playlists": self._find_playlists,
- }
-
- return exec_map[self.to_exec]
-
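-# A minimal sketch of how a frontend POST flows through this map
-# (hypothetical payload); only the first key/value pair of the post
-# dict is evaluated:
-#
-#   response = PostData({"watched": "xxxxxxxxxxx"}, "admin").run_task()
-#   # -> {"success": True}
-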
- def _watched(self):
- """mark as watched"""
- WatchState(self.exec_val).mark_as_watched()
- return {"success": True}
-
- def _un_watched(self):
- """mark as unwatched"""
- WatchState(self.exec_val).mark_as_unwatched()
- return {"success": True}
-
- def _change_view(self):
- """process view changes in home, channel, and downloads"""
- origin, new_view = self.exec_val.split(":")
- key = f"{self.current_user}:view:{origin}"
- print(f"change view: {key} to {new_view}")
- RedisArchivist().set_message(key, {"status": new_view}, expire=False)
- return {"success": True}
-
- @staticmethod
- def _rescan_pending():
- """look for new items in subscribed channels"""
- print("rescan subscribed channels")
- update_subscribed.delay()
- return {"success": True}
-
- def _ignore(self):
- """ignore from download queue"""
- video_id = self.exec_val
- print(f"ignore video {video_id}")
- PendingInteract(video_id=video_id, status="ignore").update_status()
- # also clear from redis queue
- RedisQueue().clear_item(video_id)
- return {"success": True}
-
- @staticmethod
- def _dl_pending():
- """start the download queue"""
- print("download pending")
- running = download_pending.delay()
- task_id = running.id
- print("set task id: " + task_id)
- RedisArchivist().set_message("dl_queue_id", task_id, expire=False)
- return {"success": True}
-
- def _queue_handler(self):
- """queue controls from frontend"""
- to_execute = self.exec_val
- if to_execute == "stop":
- print("stopping download queue")
- RedisQueue().clear()
- elif to_execute == "kill":
- task_id = RedisArchivist().get_message("dl_queue_id")
- if not isinstance(task_id, str):
- task_id = False
- else:
- print("brutally killing " + task_id)
- kill_dl(task_id)
-
- return {"success": True}
-
- def _unsubscribe(self):
- """unsubscribe from channels or playlists"""
- id_unsub = self.exec_val
- print("unsubscribe from " + id_unsub)
- to_unsub_list = UrlListParser(id_unsub).process_list()
- for to_unsub in to_unsub_list:
- unsub_type = to_unsub["type"]
- unsub_id = to_unsub["url"]
- if unsub_type == "playlist":
- PlaylistSubscription().change_subscribe(
- unsub_id, subscribe_status=False
- )
- elif unsub_type == "channel":
- ChannelSubscription().change_subscribe(
- unsub_id, channel_subscribed=False
- )
- else:
- raise ValueError("failed to process " + id_unsub)
-
- return {"success": True}
-
- def _subscribe(self):
- """subscribe to channel or playlist, called from js buttons"""
- id_sub = self.exec_val
- print("subscribe to " + id_sub)
- subscribe_to.delay(id_sub)
- return {"success": True}
-
- def _sort_order(self):
- """change the sort between published to downloaded"""
- sort_order = {"status": self.exec_val}
- if self.exec_val in ["asc", "desc"]:
- RedisArchivist().set_message(
- f"{self.current_user}:sort_order", sort_order, expire=False
- )
- else:
- RedisArchivist().set_message(
- f"{self.current_user}:sort_by", sort_order, expire=False
- )
- return {"success": True}
-
- def _hide_watched(self):
- """toggle if to show watched vids or not"""
- key = f"{self.current_user}:hide_watched"
- message = {"status": bool(int(self.exec_val))}
- print(f"toggle {key}: {message}")
- RedisArchivist().set_message(key, message, expire=False)
- return {"success": True}
-
- def _show_subed_only(self):
- """show or hide subscribed channels only on channels page"""
- key = f"{self.current_user}:show_subed_only"
- message = {"status": bool(int(self.exec_val))}
- print(f"toggle {key}: {message}")
- RedisArchivist().set_message(key, message, expire=False)
- return {"success": True}
-
- def _dlnow(self):
- """start downloading single vid now"""
- youtube_id = self.exec_val
- print("downloading: " + youtube_id)
- running = download_single.delay(youtube_id=youtube_id)
- task_id = running.id
- print("set task id: " + task_id)
- RedisArchivist().set_message("dl_queue_id", task_id, expire=False)
- return {"success": True}
-
- def _show_ignored_only(self):
- """switch view on /downloads/ to show ignored only"""
- show_value = self.exec_val
- key = f"{self.current_user}:show_ignored_only"
- value = {"status": show_value}
- print(f"Filter download view ignored only: {show_value}")
- RedisArchivist().set_message(key, value, expire=False)
- return {"success": True}
-
- def _forget_ignore(self):
- """delete from ta_download index"""
- video_id = self.exec_val
- print(f"forgetting from download index: {video_id}")
- PendingInteract(video_id=video_id).delete_item()
- return {"success": True}
-
- def _add_single(self):
- """add single youtube_id to download queue"""
- video_id = self.exec_val
- print(f"add vid to dl queue: {video_id}")
- PendingInteract(video_id=video_id).delete_item()
- video_ids = UrlListParser(video_id).process_list()
- extrac_dl.delay(video_ids)
- return {"success": True}
-
- def _delete_queue(self):
- """delete download queue"""
- status = self.exec_val
- print("deleting from download queue: " + status)
- PendingInteract(status=status).delete_by_status()
- return {"success": True}
-
- @staticmethod
- def _manual_import():
- """run manual import from settings page"""
- print("starting manual import")
- run_manual_import.delay()
- return {"success": True}
-
- @staticmethod
- def _re_embed():
- """rewrite thumbnails into media files"""
- print("start video thumbnail embed process")
- re_sync_thumbs.delay()
- return {"success": True}
-
- @staticmethod
- def _db_backup():
- """backup es to zip from settings page"""
- print("backing up database")
- run_backup.delay("manual")
- return {"success": True}
-
- def _db_restore(self):
- """restore es zip from settings page"""
- print("restoring index from backup zip")
- filename = self.exec_val
- run_restore_backup.delay(filename)
- return {"success": True}
-
- @staticmethod
- def _fs_rescan():
- """start file system rescan task"""
- print("start filesystem scan")
- rescan_filesystem.delay()
- return {"success": True}
-
- def _multi_search(self):
- """search through all indexes"""
- search_query = self.exec_val
- print("searching for: " + search_query)
- search_results = SearchForm().multi_search(search_query)
- return search_results
-
- def _delete_video(self):
- """delete media file, metadata and thumb"""
- youtube_id = self.exec_val
- YoutubeVideo(youtube_id).delete_media_file()
- return {"success": True}
-
- def _delete_channel(self):
- """delete channel and all matching videos"""
- channel_id = self.exec_val
- YoutubeChannel(channel_id).delete_channel()
- return {"success": True}
-
- def _delete_playlist(self):
- """delete playlist, only metadata or incl all videos"""
- playlist_dict = self.exec_val
- playlist_id = playlist_dict["playlist-id"]
- playlist_action = playlist_dict["playlist-action"]
- print(f"{playlist_id}: delete playlist {playlist_action}")
- if playlist_action == "metadata":
- YoutubePlaylist(playlist_id).delete_metadata()
- elif playlist_action == "all":
- YoutubePlaylist(playlist_id).delete_videos_playlist()
-
- return {"success": True}
-
- def _find_playlists(self):
- """add all playlists of a channel"""
- channel_id = self.exec_val
- index_channel_playlists.delay(channel_id)
- return {"success": True}
diff --git a/tubearchivist/home/src/frontend/forms.py b/tubearchivist/home/src/frontend/forms.py
deleted file mode 100644
index 1a25e6a..0000000
--- a/tubearchivist/home/src/frontend/forms.py
+++ /dev/null
@@ -1,215 +0,0 @@
-"""functionality:
-- hold all form classes used in the views
-"""
-
-from django import forms
-from django.contrib.auth.forms import AuthenticationForm
-from django.forms.widgets import PasswordInput, TextInput
-
-
-class CustomAuthForm(AuthenticationForm):
- """better styled login form"""
-
- username = forms.CharField(
- widget=TextInput(
- attrs={
- "placeholder": "Username",
- "autofocus": True,
- "autocomplete": True,
- }
- ),
- label=False,
- )
- password = forms.CharField(
- widget=PasswordInput(attrs={"placeholder": "Password"}), label=False
- )
- remember_me = forms.BooleanField(required=False)
-
-
-class UserSettingsForm(forms.Form):
- """user configurations values"""
-
- CHOICES = [
- ("", "-- change color scheme --"),
- ("dark", "Dark"),
- ("light", "Light"),
- ]
-
- colors = forms.ChoiceField(
- widget=forms.Select, choices=CHOICES, required=False
- )
- page_size = forms.IntegerField(required=False)
-
-
-class ApplicationSettingsForm(forms.Form):
- """handle all application settings"""
-
- METADATA_CHOICES = [
- ("", "-- change metadata embed --"),
- ("0", "don't embed metadata"),
- ("1", "embed metadata"),
- ]
-
- THUMBNAIL_CHOICES = [
- ("", "-- change thumbnail embed --"),
- ("0", "don't embed thumbnail"),
- ("1", "embed thumbnail"),
- ]
-
- RYD_CHOICES = [
- ("", "-- change ryd integrations"),
- ("0", "disable ryd integration"),
- ("1", "enable ryd integration"),
- ]
-
- SP_CHOICES = [
- ("", "-- change sponsorblock integrations"),
- ("0", "disable sponsorblock integration"),
- ("1", "enable sponsorblock integration"),
- ]
-
- CAST_CHOICES = [
- ("", "-- change Cast integration --"),
- ("0", "disable Cast"),
- ("1", "enable Cast"),
- ]
-
- SUBTITLE_SOURCE_CHOICES = [
- ("", "-- change subtitle source settings"),
- ("user", "only download user created"),
- ("auto", "also download auto generated"),
- ]
-
- SUBTITLE_INDEX_CHOICES = [
- ("", "-- change subtitle index settings --"),
- ("0", "disable subtitle index"),
- ("1", "enable subtitle index"),
- ]
-
- subscriptions_channel_size = forms.IntegerField(required=False)
- downloads_limit_count = forms.IntegerField(required=False)
- downloads_limit_speed = forms.IntegerField(required=False)
- downloads_throttledratelimit = forms.IntegerField(required=False)
- downloads_sleep_interval = forms.IntegerField(required=False)
- downloads_autodelete_days = forms.IntegerField(required=False)
- downloads_format = forms.CharField(required=False)
- downloads_add_metadata = forms.ChoiceField(
- widget=forms.Select, choices=METADATA_CHOICES, required=False
- )
- downloads_add_thumbnail = forms.ChoiceField(
- widget=forms.Select, choices=THUMBNAIL_CHOICES, required=False
- )
- downloads_subtitle = forms.CharField(required=False)
- downloads_subtitle_source = forms.ChoiceField(
- widget=forms.Select, choices=SUBTITLE_SOURCE_CHOICES, required=False
- )
- downloads_subtitle_index = forms.ChoiceField(
- widget=forms.Select, choices=SUBTITLE_INDEX_CHOICES, required=False
- )
- downloads_integrate_ryd = forms.ChoiceField(
- widget=forms.Select, choices=RYD_CHOICES, required=False
- )
- downloads_integrate_sponsorblock = forms.ChoiceField(
- widget=forms.Select, choices=SP_CHOICES, required=False
- )
- application_enable_cast = forms.ChoiceField(
- widget=forms.Select, choices=CAST_CHOICES, required=False
- )
-
-
-class SchedulerSettingsForm(forms.Form):
- """handle scheduler settings"""
-
- update_subscribed = forms.CharField(required=False)
- download_pending = forms.CharField(required=False)
- check_reindex = forms.CharField(required=False)
- check_reindex_days = forms.IntegerField(required=False)
- thumbnail_check = forms.CharField(required=False)
- run_backup = forms.CharField(required=False)
- run_backup_rotate = forms.IntegerField(required=False)
-
-
-class MultiSearchForm(forms.Form):
- """multi search form for /search/"""
-
- searchInput = forms.CharField(
- label="",
- widget=forms.TextInput(
- attrs={
- "autocomplete": "off",
- "oninput": "searchMulti(this.value)",
- "autofocus": True,
- }
- ),
- )
- home = forms.CharField(widget=forms.HiddenInput())
- channel = forms.CharField(widget=forms.HiddenInput())
- playlist = forms.CharField(widget=forms.HiddenInput())
-
-
-class AddToQueueForm(forms.Form):
- """text area form to add to downloads"""
-
- vid_url = forms.CharField(
- label=False,
- widget=forms.Textarea(
- attrs={
- "rows": 4,
- "placeholder": "Enter Video Urls or IDs here...",
- }
- ),
- )
-
-
-class SubscribeToChannelForm(forms.Form):
- """text area form to subscribe to multiple channels"""
-
- subscribe = forms.CharField(
- label="Subscribe to channels",
- widget=forms.Textarea(
- attrs={
- "rows": 3,
- "placeholder": "Input channel ID, URL or Video of a channel",
- }
- ),
- )
-
-
-class SubscribeToPlaylistForm(forms.Form):
- """text area form to subscribe to multiple playlists"""
-
- subscribe = forms.CharField(
- label="Subscribe to playlists",
- widget=forms.Textarea(
- attrs={
- "rows": 3,
- "placeholder": "Input playlist IDs or URLs",
- }
- ),
- )
-
-
-class ChannelOverwriteForm(forms.Form):
- """custom overwrites for channel settings"""
-
- PLAYLIST_INDEX = [
- ("", "-- change playlist index --"),
- ("0", "Disable playlist index"),
- ("1", "Enable playlist index"),
- ]
-
- SP_CHOICES = [
- ("", "-- change sponsorblock integrations"),
- ("disable", "disable sponsorblock integration"),
- ("1", "enable sponsorblock integration"),
- ("0", "unset sponsorblock integration"),
- ]
-
- download_format = forms.CharField(label=False, required=False)
- autodelete_days = forms.IntegerField(label=False, required=False)
- index_playlists = forms.ChoiceField(
- widget=forms.Select, choices=PLAYLIST_INDEX, required=False
- )
- integrate_sponsorblock = forms.ChoiceField(
- widget=forms.Select, choices=SP_CHOICES, required=False
- )
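-
-# A minimal sketch of validating one of these forms in a view
-# (hypothetical request data); non-empty cleaned values map to the
-# per channel overwrites handled in home.src.index.channel:
-#
-#   form = ChannelOverwriteForm(request.POST)
-#   if form.is_valid():
-#       overwrites = {k: v for k, v in form.cleaned_data.items() if v}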
diff --git a/tubearchivist/home/src/frontend/searching.py b/tubearchivist/home/src/frontend/searching.py
deleted file mode 100644
index bca2742..0000000
--- a/tubearchivist/home/src/frontend/searching.py
+++ /dev/null
@@ -1,203 +0,0 @@
-"""
-Functionality:
-- handle search to populate results to view
-- cache youtube video thumbnails and channel artwork
-- parse values in hit_cleanup for frontend
-- calculate pagination values
-"""
-
-import urllib.parse
-from datetime import datetime
-
-from home.src.download.thumbnails import ThumbManager
-from home.src.es.connect import ElasticWrap
-from home.src.ta.config import AppConfig
-
-
-class SearchHandler:
- """search elastic search"""
-
- def __init__(self, path, config, data=False):
- self.max_hits = None
- self.path = path
- self.config = config
- self.data = data
-
- def get_data(self):
- """get the data"""
- response, _ = ElasticWrap(self.path, config=self.config).get(self.data)
-
- if "hits" in response.keys():
- self.max_hits = response["hits"]["total"]["value"]
- return_value = response["hits"]["hits"]
- else:
- # simulate list for single result to reuse rest of class
- return_value = [response]
-
- # stop if empty
- if not return_value:
- return False
-
- all_videos = []
- all_channels = []
- for idx, hit in enumerate(return_value):
- return_value[idx] = self.hit_cleanup(hit)
- if hit["_index"] == "ta_video":
- video_dict, channel_dict = self.vid_cache_link(hit)
- if video_dict not in all_videos:
- all_videos.append(video_dict)
- if channel_dict not in all_channels:
- all_channels.append(channel_dict)
- elif hit["_index"] == "ta_channel":
- channel_dict = self.channel_cache_link(hit)
- if channel_dict not in all_channels:
- all_channels.append(channel_dict)
-
- return return_value
-
- @staticmethod
- def vid_cache_link(hit):
- """download thumbnails into cache"""
- vid_thumb = hit["source"]["vid_thumb_url"]
- youtube_id = hit["source"]["youtube_id"]
- channel_id_hit = hit["source"]["channel"]["channel_id"]
- chan_thumb = hit["source"]["channel"]["channel_thumb_url"]
- try:
- chan_banner = hit["source"]["channel"]["channel_banner_url"]
- except KeyError:
- chan_banner = False
- video_dict = {"youtube_id": youtube_id, "vid_thumb": vid_thumb}
- channel_dict = {
- "channel_id": channel_id_hit,
- "chan_thumb": chan_thumb,
- "chan_banner": chan_banner,
- }
- return video_dict, channel_dict
-
- @staticmethod
- def channel_cache_link(hit):
- """build channel thumb links"""
- channel_id_hit = hit["source"]["channel_id"]
- chan_thumb = hit["source"]["channel_thumb_url"]
- try:
- chan_banner = hit["source"]["channel_banner_url"]
- except KeyError:
- chan_banner = False
- channel_dict = {
- "channel_id": channel_id_hit,
- "chan_thumb": chan_thumb,
- "chan_banner": chan_banner,
- }
- return channel_dict
-
- @staticmethod
- def hit_cleanup(hit):
- """clean up and parse data from a single hit"""
- hit["source"] = hit.pop("_source")
- hit_keys = hit["source"].keys()
- if "media_url" in hit_keys:
- parsed_url = urllib.parse.quote(hit["source"]["media_url"])
- hit["source"]["media_url"] = parsed_url
-
- if "published" in hit_keys:
- published = hit["source"]["published"]
- date_pub = datetime.strptime(published, "%Y-%m-%d")
- date_str = datetime.strftime(date_pub, "%d %b, %Y")
- hit["source"]["published"] = date_str
-
- if "vid_last_refresh" in hit_keys:
- vid_last_refresh = hit["source"]["vid_last_refresh"]
- date_refresh = datetime.fromtimestamp(vid_last_refresh)
- date_str = datetime.strftime(date_refresh, "%d %b, %Y")
- hit["source"]["vid_last_refresh"] = date_str
-
- if "playlist_last_refresh" in hit_keys:
- playlist_last_refresh = hit["source"]["playlist_last_refresh"]
- date_refresh = datetime.fromtimestamp(playlist_last_refresh)
- date_str = datetime.strftime(date_refresh, "%d %b, %Y")
- hit["source"]["playlist_last_refresh"] = date_str
-
- if "vid_thumb_url" in hit_keys:
- youtube_id = hit["source"]["youtube_id"]
- thumb_path = ThumbManager().vid_thumb_path(youtube_id)
- hit["source"]["vid_thumb_url"] = thumb_path
-
- if "channel_last_refresh" in hit_keys:
- refreshed = hit["source"]["channel_last_refresh"]
- date_refresh = datetime.fromtimestamp(refreshed)
- date_str = datetime.strftime(date_refresh, "%d %b, %Y")
- hit["source"]["channel_last_refresh"] = date_str
-
- if "channel" in hit_keys:
- channel_keys = hit["source"]["channel"].keys()
- if "channel_last_refresh" in channel_keys:
- refreshed = hit["source"]["channel"]["channel_last_refresh"]
- date_refresh = datetime.fromtimestamp(refreshed)
- date_str = datetime.strftime(date_refresh, "%d %b, %Y")
- hit["source"]["channel"]["channel_last_refresh"] = date_str
-
- return hit
-
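-# hit_cleanup() reshapes a raw es hit for the templates, e.g. a
-# (hypothetical) hit {"_source": {"published": "2022-01-01"}}
-# becomes {"source": {"published": "01 Jan, 2022"}}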
-
-class SearchForm:
- """build query from search form data"""
-
- CONFIG = AppConfig().config
-
- def multi_search(self, search_query):
- """searching through index"""
- path = "ta_video,ta_channel,ta_playlist/_search"
- data = {
- "size": 30,
- "query": {
- "multi_match": {
- "query": search_query,
- "type": "bool_prefix",
- "operator": "and",
- "fuzziness": "auto",
- "fields": [
- "category",
- "channel_description",
- "channel_name._2gram",
- "channel_name._3gram",
- "channel_name.search_as_you_type",
- "playlist_description",
- "playlist_name._2gram",
- "playlist_name._3gram",
- "playlist_name.search_as_you_type",
- "tags",
- "title._2gram",
- "title._3gram",
- "title.search_as_you_type",
- ],
- }
- },
- }
- look_up = SearchHandler(path, config=self.CONFIG, data=data)
- search_results = look_up.get_data()
- all_results = self.build_results(search_results)
-
- return {"results": all_results}
-
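-# A minimal usage sketch (hypothetical query string):
-#
-#   hits = SearchForm().multi_search("linux tut")
-#   videos = hits["results"]["video_results"]
-#
-# bool_prefix matches the search_as_you_type subfields defined in
-# index_mapping.json, so partially typed words already return hits
-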
- @staticmethod
- def build_results(search_results):
- """build the all_results dict"""
- video_results = []
- channel_results = []
- playlist_results = []
- if search_results:
- for result in search_results:
- if result["_index"] == "ta_video":
- video_results.append(result)
- elif result["_index"] == "ta_channel":
- channel_results.append(result)
- elif result["_index"] == "ta_playlist":
- playlist_results.append(result)
-
- all_results = {
- "video_results": video_results,
- "channel_results": channel_results,
- "playlist_results": playlist_results,
- }
-
- return all_results
diff --git a/tubearchivist/home/src/frontend/watched.py b/tubearchivist/home/src/frontend/watched.py
deleted file mode 100644
index 85aa1ab..0000000
--- a/tubearchivist/home/src/frontend/watched.py
+++ /dev/null
@@ -1,98 +0,0 @@
-"""
-functionality:
-- handle watched state for videos, channels and playlists
-"""
-
-from datetime import datetime
-
-from home.src.es.connect import ElasticWrap
-from home.src.ta.helper import UrlListParser
-
-
-class WatchState:
- """handle watched checkbox for videos and channels"""
-
- def __init__(self, youtube_id):
- self.youtube_id = youtube_id
- # timestamp() instead of strftime("%s"), which is not portable
- self.stamp = int(datetime.now().timestamp())
-
- def mark_as_watched(self):
- """update es with new watched value"""
- url_type = self.detect_type()
- if url_type == "video":
- self.mark_vid_watched()
- elif url_type == "channel":
- self.mark_channel_watched()
- elif url_type == "playlist":
- self.mark_playlist_watched()
-
- print(f"{self.youtube_id}: marked as watched")
-
- def mark_as_unwatched(self):
- """revert watched state to false"""
- url_type = self.detect_type()
- if url_type == "video":
- self.mark_vid_watched(revert=True)
-
- print(f"{self.youtube_id}: revert as unwatched")
-
- def detect_type(self):
- """find youtube id type"""
- print(self.youtube_id)
- url_process = UrlListParser(self.youtube_id).process_list()
- url_type = url_process[0]["type"]
- return url_type
-
- def mark_vid_watched(self, revert=False):
- """change watched status of single video"""
- path = f"ta_video/_update/{self.youtube_id}"
- data = {
- "doc": {"player": {"watched": True, "watched_date": self.stamp}}
- }
- if revert:
- data["doc"]["player"]["watched"] = False
-
- response, status_code = ElasticWrap(path).post(data=data)
- if status_code != 200:
- print(response)
- raise ValueError("failed to mark video as watched")
-
- def mark_channel_watched(self):
- """change watched status of every video in channel"""
- path = "ta_video/_update_by_query"
- must_list = [
- {"term": {"channel.channel_id": {"value": self.youtube_id}}},
- {"term": {"player.watched": {"value": False}}},
- ]
- data = {
- "query": {"bool": {"must": must_list}},
- "script": {
- "source": "ctx._source.player['watched'] = true",
- "lang": "painless",
- },
- }
-
- response, status_code = ElasticWrap(path).post(data=data)
- if status_code != 200:
- print(response)
- raise ValueError("failed mark channel as watched")
-
- def mark_playlist_watched(self):
- """change watched state of all videos in playlist"""
- path = "ta_video/_update_by_query"
- must_list = [
- {"term": {"playlist.keyword": {"value": self.youtube_id}}},
- {"term": {"player.watched": {"value": False}}},
- ]
- data = {
- "query": {"bool": {"must": must_list}},
- "script": {
- "source": "ctx._source.player['watched'] = true",
- "lang": "painless",
- },
- }
-
- response, status_code = ElasticWrap(path).post(data=data)
- if status_code != 200:
- print(response)
- raise ValueError("failed mark playlist as watched")
diff --git a/tubearchivist/home/src/index/__init__.py b/tubearchivist/home/src/index/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/home/src/index/channel.py b/tubearchivist/home/src/index/channel.py
deleted file mode 100644
index 06d0086..0000000
--- a/tubearchivist/home/src/index/channel.py
+++ /dev/null
@@ -1,375 +0,0 @@
-"""
-functionality:
-- get metadata from youtube for a channel
-- index and update in es
-"""
-
-import json
-import os
-import re
-from datetime import datetime
-
-import requests
-import yt_dlp
-from bs4 import BeautifulSoup
-from home.src.download import queue # partial import
-from home.src.download.thumbnails import ThumbManager
-from home.src.es.connect import ElasticWrap, IndexPaginate
-from home.src.index.generic import YouTubeItem
-from home.src.index.playlist import YoutubePlaylist
-from home.src.ta.helper import clean_string, requests_headers
-from home.src.ta.ta_redis import RedisArchivist
-
-
-class ChannelScraper:
- """custom scraper using bs4 to scrape channel about page
- can be integrated into yt-dlp
- once #2237 and #2350 are merged upstream
- """
-
- def __init__(self, channel_id):
- self.channel_id = channel_id
- self.soup = False
- self.yt_json = False
- self.json_data = False
-
- def get_json(self):
- """main method to return channel dict"""
- self.get_soup()
- self._extract_yt_json()
- self._parse_channel_main()
- self._parse_channel_meta()
- return self.json_data
-
- def get_soup(self):
- """return soup from youtube"""
- print(f"{self.channel_id}: scrape channel data from youtube")
- url = f"https://www.youtube.com/channel/{self.channel_id}/about?hl=en"
- cookies = {"CONSENT": "YES+xxxxxxxxxxxxxxxxxxxxxxxxxxx"}
- response = requests.get(
- url, cookies=cookies, headers=requests_headers()
- )
- if response.ok:
- channel_page = response.text
- else:
- print(f"{self.channel_id}: failed to extract channel info")
- raise ConnectionError
- self.soup = BeautifulSoup(channel_page, "html.parser")
-
- def _extract_yt_json(self):
- """parse soup and get ytInitialData json"""
- script_content = False
- all_scripts = self.soup.find("body").find_all("script")
- for script in all_scripts:
- if "var ytInitialData = " in str(script):
- script_content = str(script)
- break
-
- if not script_content:
- raise ValueError("failed to find ytInitialData in channel page")
-
- # extract payload, cut off the trailing closing script tag
- json_raw = script_content.split("var ytInitialData = ")[1]
- json_raw = json_raw.split(";</script>")[0]
- self.yt_json = json.loads(json_raw)
-
- def _parse_channel_main(self):
- """extract maintab values from scraped channel json data"""
- main_tab = self.yt_json["header"]["c4TabbedHeaderRenderer"]
- # build and return dict
- self.json_data = {
- "channel_active": True,
- "channel_last_refresh": int(datetime.now().strftime("%s")),
- "channel_subs": self._get_channel_subs(main_tab),
- "channel_name": main_tab["title"],
- "channel_banner_url": self._get_thumbnails(main_tab, "banner"),
- "channel_tvart_url": self._get_thumbnails(main_tab, "tvBanner"),
- "channel_id": self.channel_id,
- "channel_subscribed": False,
- }
-
- @staticmethod
- def _get_thumbnails(main_tab, thumb_name):
- """extract banner url from main_tab"""
- try:
- all_banners = main_tab[thumb_name]["thumbnails"]
- banner = sorted(all_banners, key=lambda k: k["width"])[-1]["url"]
- except KeyError:
- banner = False
-
- return banner
-
- @staticmethod
- def _get_channel_subs(main_tab):
- """process main_tab to get channel subs as int"""
- try:
- sub_text_simple = main_tab["subscriberCountText"]["simpleText"]
- sub_text = sub_text_simple.split(" ")[0]
- if sub_text[-1] == "K":
- channel_subs = int(float(sub_text.replace("K", "")) * 1000)
- elif sub_text[-1] == "M":
- channel_subs = int(float(sub_text.replace("M", "")) * 1000000)
- elif int(sub_text) >= 0:
- channel_subs = int(sub_text)
- else:
- print(f"{sub_text} not dealt with")
- channel_subs = 0
- except (KeyError, ValueError):
- channel_subs = 0
-
- return channel_subs
-
- def _parse_channel_meta(self):
- """extract meta tab values from channel payload"""
- # meta tab
- meta_tab = self.yt_json["metadata"]["channelMetadataRenderer"]
- all_thumbs = meta_tab["avatar"]["thumbnails"]
- thumb_url = sorted(all_thumbs, key=lambda k: k["width"])[-1]["url"]
- # stats tab
- renderer = "twoColumnBrowseResultsRenderer"
- all_tabs = self.yt_json["contents"][renderer]["tabs"]
- about_tab = False
- for tab in all_tabs:
- if "tabRenderer" in tab.keys():
- if tab["tabRenderer"]["title"] == "About":
- about_tab = tab["tabRenderer"]["content"][
- "sectionListRenderer"
- ]["contents"][0]["itemSectionRenderer"]["contents"][0][
- "channelAboutFullMetadataRenderer"
- ]
- break
- try:
- channel_views_text = about_tab["viewCountText"]["simpleText"]
- channel_views = int(re.sub(r"\D", "", channel_views_text))
- except (KeyError, TypeError):
- channel_views = 0
-
- self.json_data.update(
- {
- "channel_description": meta_tab["description"],
- "channel_thumb_url": thumb_url,
- "channel_views": channel_views,
- }
- )
-
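-# A minimal usage sketch (hypothetical channel ID):
-#
-#   scraper = ChannelScraper("UCxxxxxxxxxxxxxxxxxxxxxx")
-#   json_data = scraper.get_json()
-#   print(json_data["channel_name"], json_data["channel_subs"])
-#
-# get_json() fetches the /about page once and parses the embedded
-# ytInitialData payload, no YouTube API key needed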
-
-class YoutubeChannel(YouTubeItem):
- """represents a single youtube channel"""
-
- es_path = False
- index_name = "ta_channel"
- yt_base = "https://www.youtube.com/channel/"
-
- def __init__(self, youtube_id):
- super().__init__(youtube_id)
- self.es_path = f"{self.index_name}/_doc/{youtube_id}"
- self.all_playlists = False
-
- def build_json(self, upload=False):
- """get from es or from youtube"""
- self.get_from_es()
- if self.json_data:
- return
-
- self.get_from_youtube()
- if upload:
- self.upload_to_es()
- return
-
- def get_from_youtube(self):
- """use bs4 to scrape channel about page"""
- self.json_data = ChannelScraper(self.youtube_id).get_json()
- self.get_channel_art()
-
- def get_channel_art(self):
- """download channel art for new channels"""
- channel_id = self.youtube_id
- channel_thumb = self.json_data["channel_thumb_url"]
- channel_banner = self.json_data["channel_banner_url"]
- ThumbManager().download_chan(
- [(channel_id, channel_thumb, channel_banner)]
- )
-
- def sync_to_videos(self):
- """sync new channel_dict to all videos of channel"""
- # add ingest pipeline
- processors = []
- for field, value in self.json_data.items():
- line = {"set": {"field": "channel." + field, "value": value}}
- processors.append(line)
- data = {"description": self.youtube_id, "processors": processors}
- ingest_path = f"_ingest/pipeline/{self.youtube_id}"
- _, _ = ElasticWrap(ingest_path).put(data)
- # apply pipeline
- data = {"query": {"match": {"channel.channel_id": self.youtube_id}}}
- update_path = f"ta_video/_update_by_query?pipeline={self.youtube_id}"
- _, _ = ElasticWrap(update_path).post(data)
-
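-# the generated pipeline is one "set" processor per channel field,
-# e.g. (hypothetical values):
-#
-#   {"description": "UCxxxxxxxxxxxxxxxxxxxxxx", "processors": [
-#       {"set": {"field": "channel.channel_name", "value": "example"}},
-#       {"set": {"field": "channel.channel_subs", "value": 1000}}
-#   ]}
-#
-# _update_by_query?pipeline=<id> then rewrites the nested channel
-# object on every matching video document
-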
- def get_folder_path(self):
- """get folder where media files get stored"""
- channel_name = self.json_data["channel_name"]
- folder_name = clean_string(channel_name)
- if len(folder_name) <= 3:
- # fall back to channel id
- folder_name = self.json_data["channel_id"]
- folder_path = os.path.join(self.app_conf["videos"], folder_name)
- return folder_path
-
- def delete_es_videos(self):
- """delete all channel documents from elasticsearch"""
- data = {
- "query": {
- "term": {"channel.channel_id": {"value": self.youtube_id}}
- }
- }
- _, _ = ElasticWrap("ta_video/_delete_by_query").post(data)
-
- def delete_playlists(self):
- """delete all indexed playlist from es"""
- all_playlists = self.get_indexed_playlists()
- for playlist in all_playlists:
- playlist_id = playlist["playlist_id"]
- YoutubePlaylist(playlist_id).delete_metadata()
-
- def delete_channel(self):
- """delete channel and all videos"""
- print(f"{self.youtube_id}: delete channel")
- self.get_from_es()
- folder_path = self.get_folder_path()
- print(f"{self.youtube_id}: delete all media files")
- try:
- all_videos = os.listdir(folder_path)
- for video in all_videos:
- video_path = os.path.join(folder_path, video)
- os.remove(video_path)
- os.rmdir(folder_path)
- except FileNotFoundError:
- print(f"no videos found for {folder_path}")
-
- print(f"{self.youtube_id}: delete indexed playlists")
- self.delete_playlists()
- print(f"{self.youtube_id}: delete indexed videos")
- self.delete_es_videos()
- self.del_in_es()
-
- def index_channel_playlists(self):
- """add all playlists of channel to index"""
- print(f"{self.youtube_id}: index all playlists")
- self.get_from_es()
- channel_name = self.json_data["channel_name"]
- mess_dict = {
- "status": "message:playlistscan",
- "level": "info",
- "title": "Looking for playlists",
- "message": f"{channel_name}: Scanning channel in progress",
- }
- RedisArchivist().set_message("message:playlistscan", mess_dict)
- self.get_all_playlists()
- if not self.all_playlists:
- print(f"{self.youtube_id}: no playlists found.")
- return
-
- all_youtube_ids = self.get_all_video_ids()
- for idx, playlist in enumerate(self.all_playlists):
- self._notify_single_playlist(idx, playlist)
- self._index_single_playlist(playlist, all_youtube_ids)
-
- def _notify_single_playlist(self, idx, playlist):
- """send notification"""
- channel_name = self.json_data["channel_name"]
- mess_dict = {
- "status": "message:playlistscan",
- "level": "info",
- "title": f"{channel_name}: Scanning channel for playlists",
- "message": f"Progress: {idx + 1}/{len(self.all_playlists)}",
- }
- RedisArchivist().set_message("message:playlistscan", mess_dict)
- print("add playlist: " + playlist[1])
-
- @staticmethod
- def _index_single_playlist(playlist, all_youtube_ids):
- """add single playlist if needed"""
- playlist = YoutubePlaylist(playlist[0])
- playlist.all_youtube_ids = all_youtube_ids
- playlist.build_json()
- if not playlist.json_data:
- return
-
- entries = playlist.json_data["playlist_entries"]
- downloaded = [i for i in entries if i["downloaded"]]
- if not downloaded:
- return
-
- playlist.upload_to_es()
- playlist.add_vids_to_playlist()
- playlist.get_playlist_art()
-
- @staticmethod
- def get_all_video_ids():
- """match all playlists with videos"""
- handler = queue.PendingList()
- handler.get_download()
- handler.get_indexed()
- all_youtube_ids = [i["youtube_id"] for i in handler.all_videos]
-
- return all_youtube_ids
-
- def get_all_playlists(self):
- """get all playlists owned by this channel"""
- url = (
- f"https://www.youtube.com/channel/{self.youtube_id}"
- + "/playlists?view=1&sort=dd&shelf_id=0"
- )
- obs = {
- "quiet": True,
- "skip_download": True,
- "extract_flat": True,
- }
- playlists = yt_dlp.YoutubeDL(obs).extract_info(url)
- all_entries = [(i["id"], i["title"]) for i in playlists["entries"]]
- self.all_playlists = all_entries
-
- def get_indexed_playlists(self):
- """get all indexed playlists from channel"""
- data = {
- "query": {
- "term": {"playlist_channel_id": {"value": self.youtube_id}}
- },
- "sort": [{"playlist_channel.keyword": {"order": "desc"}}],
- }
- all_playlists = IndexPaginate("ta_playlist", data).get_results()
- return all_playlists
-
- def get_overwrites(self):
- """get all per channel overwrites"""
- return self.json_data.get("channel_overwrites", False)
-
- def set_overwrites(self, overwrites):
- """set per channel overwrites"""
- valid_keys = [
- "download_format",
- "autodelete_days",
- "index_playlists",
- "integrate_sponsorblock",
- ]
-
- to_write = self.json_data.get("channel_overwrites", {})
- for key, value in overwrites.items():
- if key not in valid_keys:
- raise ValueError(f"invalid overwrite key: {key}")
- if value == "disable":
- to_write[key] = False
- continue
- if value in [0, "0"]:
- to_write.pop(key, None)
- continue
- if value == "1":
- to_write[key] = True
- continue
- if value:
- to_write.update({key: value})
-
- self.json_data["channel_overwrites"] = to_write
-
-
-def channel_overwrites(channel_id, overwrites):
- """collection to overwrite settings per channel"""
- channel = YoutubeChannel(channel_id)
- channel.build_json()
- channel.set_overwrites(overwrites)
- channel.upload_to_es()
- channel.sync_to_videos()
diff --git a/tubearchivist/home/src/index/filesystem.py b/tubearchivist/home/src/index/filesystem.py
deleted file mode 100644
index c6f1caa..0000000
--- a/tubearchivist/home/src/index/filesystem.py
+++ /dev/null
@@ -1,313 +0,0 @@
-"""
-Functionality:
-- reindexing old documents
-- syncing updated values between indexes
-- scan the filesystem to delete or index
-"""
-
-import json
-import os
-import re
-import shutil
-import subprocess
-
-from home.src.download.queue import PendingList
-from home.src.download.yt_dlp_handler import VideoDownloader
-from home.src.es.connect import ElasticWrap
-from home.src.index.reindex import Reindex
-from home.src.index.video import index_new_video
-from home.src.ta.config import AppConfig
-from home.src.ta.helper import clean_string, ignore_filelist
-from home.src.ta.ta_redis import RedisArchivist
-
-
-class FilesystemScanner:
- """handle scanning and fixing from filesystem"""
-
- CONFIG = AppConfig().config
- VIDEOS = CONFIG["application"]["videos"]
-
- def __init__(self):
- self.all_downloaded = self.get_all_downloaded()
- self.all_indexed = self.get_all_indexed()
- self.mismatch = None
- self.to_rename = None
- self.to_index = None
- self.to_delete = None
-
- def get_all_downloaded(self):
- """get a list of all video files downloaded"""
- channels = os.listdir(self.VIDEOS)
- all_channels = ignore_filelist(channels)
- all_channels.sort()
- all_downloaded = []
- for channel_name in all_channels:
- channel_path = os.path.join(self.VIDEOS, channel_name)
- channel_files = os.listdir(channel_path)
- channel_files_clean = ignore_filelist(channel_files)
- all_videos = [i for i in channel_files_clean if i.endswith(".mp4")]
- for video in all_videos:
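- # filename convention: YYYYMMDD_<11 char id>_<title>.mp4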
- youtube_id = video[9:20]
- all_downloaded.append((channel_name, video, youtube_id))
-
- return all_downloaded
-
- @staticmethod
- def get_all_indexed():
- """get a list of all indexed videos"""
- index_handler = PendingList()
- index_handler.get_download()
- index_handler.get_indexed()
-
- all_indexed = []
- for video in index_handler.all_videos:
- youtube_id = video["youtube_id"]
- media_url = video["media_url"]
- published = video["published"]
- title = video["title"]
- all_indexed.append((youtube_id, media_url, published, title))
- return all_indexed
-
- def list_comparison(self):
- """compare the lists to figure out what to do"""
- self.find_unindexed()
- self.find_missing()
- self.find_bad_media_url()
-
- def find_unindexed(self):
- """find video files without a matching document indexed"""
- all_indexed_ids = [i[0] for i in self.all_indexed]
- to_index = []
- for downloaded in self.all_downloaded:
- if downloaded[2] not in all_indexed_ids:
- to_index.append(downloaded)
-
- self.to_index = to_index
-
- def find_missing(self):
- """find indexed videos without matching media file"""
- all_downloaded_ids = [i[2] for i in self.all_downloaded]
- to_delete = []
- for video in self.all_indexed:
- youtube_id = video[0]
- if youtube_id not in all_downloaded_ids:
- to_delete.append(video)
-
- self.to_delete = to_delete
-
- def find_bad_media_url(self):
- """rename media files not matching the indexed title"""
- to_fix = []
- to_rename = []
- for downloaded in self.all_downloaded:
- channel, filename, downloaded_id = downloaded
- # find in indexed
- for indexed in self.all_indexed:
- indexed_id, media_url, published, title = indexed
- if indexed_id == downloaded_id:
- # found it
- title_c = clean_string(title)
- pub = published.replace("-", "")
- expected_filename = f"{pub}_{indexed_id}_{title_c}.mp4"
- new_url = os.path.join(channel, expected_filename)
- if expected_filename != filename:
- # file to rename
- to_rename.append(
- (channel, filename, expected_filename)
- )
- if media_url != new_url:
- # media_url to update in es
- to_fix.append((indexed_id, new_url))
-
- break
-
- self.mismatch = to_fix
- self.to_rename = to_rename
-
- def rename_files(self):
- """rename media files as identified by find_bad_media_url"""
- for bad_filename in self.to_rename:
- channel, filename, expected_filename = bad_filename
- print(f"renaming [{filename}] to [{expected_filename}]")
- old_path = os.path.join(self.VIDEOS, channel, filename)
- new_path = os.path.join(self.VIDEOS, channel, expected_filename)
- os.rename(old_path, new_path)
-
- def send_mismatch_bulk(self):
- """build bulk update"""
- bulk_list = []
- for video_mismatch in self.mismatch:
- youtube_id, media_url = video_mismatch
- print(f"{youtube_id}: fixing media url {media_url}")
- action = {"update": {"_id": youtube_id, "_index": "ta_video"}}
- source = {"doc": {"media_url": media_url}}
- bulk_list.append(json.dumps(action))
- bulk_list.append(json.dumps(source))
- # add last newline
- bulk_list.append("\n")
- data = "\n".join(bulk_list)
- _, _ = ElasticWrap("_bulk").post(data=data, ndjson=True)
-
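-    # the ndjson payload pairs one action line with one doc line per
-    # update, for example (illustrative values only):
-    #   {"update": {"_id": "dQw4w9WgXcQ", "_index": "ta_video"}}
-    #   {"doc": {"media_url": "channel/20210101_dQw4w9WgXcQ_title.mp4"}}
-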
- def delete_from_index(self):
- """find indexed but deleted mediafile"""
- for indexed in self.to_delete:
- youtube_id = indexed[0]
- print(f"deleting {youtube_id} from index")
- path = f"ta_video/_doc/{youtube_id}"
- _, _ = ElasticWrap(path).delete()
-
-
-class ManualImport:
- """import and indexing existing video files"""
-
- CONFIG = AppConfig().config
- CACHE_DIR = CONFIG["application"]["cache_dir"]
- IMPORT_DIR = os.path.join(CACHE_DIR, "import")
-
- def __init__(self):
- self.identified = self.import_folder_parser()
-
- def import_folder_parser(self):
- """detect files in import folder"""
- import_files = os.listdir(self.IMPORT_DIR)
- to_import = ignore_filelist(import_files)
- to_import.sort()
- video_files = [i for i in to_import if not i.endswith(".json")]
-
- identified = []
-
- for file_path in video_files:
-
- file_dict = {"video_file": file_path}
- file_name, _ = os.path.splitext(file_path)
-
- matching_json = [
- i
- for i in to_import
- if i.startswith(file_name) and i.endswith(".json")
- ]
- if matching_json:
- json_file = matching_json[0]
- youtube_id = self.extract_id_from_json(json_file)
- file_dict.update({"json_file": json_file})
- else:
- youtube_id = self.extract_id_from_filename(file_name)
- file_dict.update({"json_file": False})
-
- file_dict.update({"youtube_id": youtube_id})
- identified.append(file_dict)
-
- return identified
-
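-    # with an assumed import folder holding "clip [dQw4w9WgXcQ].mkv" and
-    # "clip [dQw4w9WgXcQ].json", the parser pairs both files and yields:
-    #   [{"video_file": "clip [dQw4w9WgXcQ].mkv",
-    #     "json_file": "clip [dQw4w9WgXcQ].json",
-    #     "youtube_id": "dQw4w9WgXcQ"}]
-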
- @staticmethod
- def extract_id_from_filename(file_name):
- """
- look at the file name for the youtube id
- expects filename ending in [].
- """
- id_search = re.search(r"\[([a-zA-Z0-9_-]{11})\]$", file_name)
- if id_search:
- youtube_id = id_search.group(1)
- return youtube_id
-
- print("failed to extract youtube id for: " + file_name)
- raise Exception
-
- def extract_id_from_json(self, json_file):
- """open json file and extract id"""
- json_path = os.path.join(self.CACHE_DIR, "import", json_file)
- with open(json_path, "r", encoding="utf-8") as f:
- json_content = f.read()
-
- youtube_id = json.loads(json_content)["id"]
-
- return youtube_id
-
- def process_import(self):
- """go through identified media files"""
-
- all_videos_added = []
-
- for media_file in self.identified:
- json_file = media_file["json_file"]
- video_file = media_file["video_file"]
- youtube_id = media_file["youtube_id"]
-
- video_path = os.path.join(self.CACHE_DIR, "import", video_file)
-
- self.move_to_cache(video_path, youtube_id)
-
- # identify and archive
- vid_dict = index_new_video(youtube_id)
- VideoDownloader([youtube_id]).move_to_archive(vid_dict)
- youtube_id = vid_dict["youtube_id"]
- thumb_url = vid_dict["vid_thumb_url"]
- all_videos_added.append((youtube_id, thumb_url))
-
- # cleanup
- if os.path.exists(video_path):
- os.remove(video_path)
- if json_file:
- json_path = os.path.join(self.CACHE_DIR, "import", json_file)
- os.remove(json_path)
-
- return all_videos_added
-
- def move_to_cache(self, video_path, youtube_id):
- """move identified video file to cache, convert to mp4"""
- file_name = os.path.split(video_path)[-1]
- video_file, ext = os.path.splitext(file_name)
-
- # make sure youtube_id is in filename
- if youtube_id not in video_file:
- video_file = f"{video_file}_{youtube_id}"
-
- # move, convert if needed
- if ext == ".mp4":
- new_file = video_file + ext
- dest_path = os.path.join(self.CACHE_DIR, "download", new_file)
- shutil.move(video_path, dest_path)
- else:
- print(f"processing with ffmpeg: {video_file}")
- new_file = video_file + ".mp4"
- dest_path = os.path.join(self.CACHE_DIR, "download", new_file)
-            # global options go before -i, ffmpeg ignores trailing
-            # options placed after the output file
-            subprocess.run(
-                [
-                    "ffmpeg",
-                    "-loglevel",
-                    "warning",
-                    "-stats",
-                    "-i",
-                    video_path,
-                    dest_path,
-                ],
-                check=True,
-            )
-
-
-def scan_filesystem():
- """grouped function to delete and update index"""
- filesystem_handler = FilesystemScanner()
-    filesystem_handler.list_comparison()
- if filesystem_handler.to_rename:
- print("renaming files")
- filesystem_handler.rename_files()
- if filesystem_handler.mismatch:
- print("fixing media urls in index")
- filesystem_handler.send_mismatch_bulk()
- if filesystem_handler.to_delete:
- print("delete metadata from index")
- filesystem_handler.delete_from_index()
- if filesystem_handler.to_index:
- print("index new videos")
- for missing_vid in filesystem_handler.to_index:
- youtube_id = missing_vid[2]
- index_new_video(youtube_id)
-
-
-def reindex_old_documents():
- """daily refresh of old documents"""
- handler = Reindex()
- handler.check_outdated()
- handler.reindex()
- RedisArchivist().set_message("last_reindex", handler.now, expire=False)
diff --git a/tubearchivist/home/src/index/generic.py b/tubearchivist/home/src/index/generic.py
deleted file mode 100644
index 709dde9..0000000
--- a/tubearchivist/home/src/index/generic.py
+++ /dev/null
@@ -1,150 +0,0 @@
-"""
-functionality:
-- generic base class to inherit from for video, channel and playlist
-"""
-
-import math
-
-import yt_dlp
-from home.src.es.connect import ElasticWrap
-from home.src.ta.config import AppConfig
-from home.src.ta.ta_redis import RedisArchivist
-
-
-class YouTubeItem:
- """base class for youtube"""
-
- es_path = False
- index_name = False
- yt_base = False
- yt_obs = {
- "quiet": True,
- "default_search": "ytsearch",
- "skip_download": True,
- "check_formats": "selected",
- "noplaylist": True,
- }
-
- def __init__(self, youtube_id):
- self.youtube_id = youtube_id
- self.config = False
- self.app_conf = False
- self.youtube_meta = False
- self.json_data = False
- self._get_conf()
-
- def _get_conf(self):
- """read user conf"""
- self.config = AppConfig().config
- self.app_conf = self.config["application"]
-
- def get_from_youtube(self):
- """use yt-dlp to get meta data from youtube"""
- print(f"{self.youtube_id}: get metadata from youtube")
- try:
- yt_item = yt_dlp.YoutubeDL(self.yt_obs)
- response = yt_item.extract_info(self.yt_base + self.youtube_id)
- except (
- yt_dlp.utils.ExtractorError,
- yt_dlp.utils.DownloadError,
- ):
- print(f"{self.youtube_id}: failed to get info from youtube")
- response = False
-
- self.youtube_meta = response
-
- def get_from_es(self):
- """get indexed data from elastic search"""
- print(f"{self.youtube_id}: get metadata from es")
- response, _ = ElasticWrap(f"{self.es_path}").get()
- source = response.get("_source")
- self.json_data = source
-
- def upload_to_es(self):
- """add json_data to elastic"""
- _, _ = ElasticWrap(self.es_path).put(self.json_data, refresh=True)
-
- def deactivate(self):
- """deactivate document in es"""
- print(f"{self.youtube_id}: deactivate document")
- key_match = {
- "ta_video": "active",
- "ta_channel": "channel_active",
- "ta_playlist": "playlist_active",
- }
- update_path = f"{self.index_name}/_update/{self.youtube_id}"
- data = {
- "script": f"ctx._source.{key_match.get(self.index_name)} = false"
- }
- _, _ = ElasticWrap(update_path).post(data)
-
- def del_in_es(self):
- """delete item from elastic search"""
- print(f"{self.youtube_id}: delete from es")
- _, _ = ElasticWrap(self.es_path).delete()
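-
-    # subclasses only set es_path, index_name and yt_base, then reuse
-    # these helpers, as YoutubeVideo and YoutubePlaylist do below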
-
-
-class Pagination:
- """
- figure out the pagination based on page size and total_hits
- """
-
- def __init__(self, page_get, user_id, search_get=False):
- self.user_id = user_id
- self.page_size = self.get_page_size()
- self.page_get = page_get
- self.search_get = search_get
- self.pagination = self.first_guess()
-
- def get_page_size(self):
- """get default or user modified page_size"""
- key = f"{self.user_id}:page_size"
- page_size = RedisArchivist().get_message(key)["status"]
- if not page_size:
- config = AppConfig().config
- page_size = config["archive"]["page_size"]
-
- return page_size
-
- def first_guess(self):
- """build first guess before api call"""
- page_get = self.page_get
- if page_get in [0, 1]:
- page_from = 0
- prev_pages = False
- elif page_get > 1:
- page_from = (page_get - 1) * self.page_size
- prev_pages = [
- i for i in range(page_get - 1, page_get - 6, -1) if i > 1
- ]
- prev_pages.reverse()
- pagination = {
- "page_size": self.page_size,
- "page_from": page_from,
- "prev_pages": prev_pages,
- "current_page": page_get,
- "max_hits": False,
- }
- if self.search_get:
- pagination.update({"search_get": self.search_get})
- return pagination
-
- def validate(self, total_hits):
- """validate pagination with total_hits after making api call"""
- page_get = self.page_get
- max_pages = math.ceil(total_hits / self.page_size)
- if total_hits >= 10000:
-            # es returns at most 10000 results
- self.pagination["max_hits"] = True
- max_pages = max_pages - 1
-
- if page_get < max_pages and max_pages > 1:
- self.pagination["last_page"] = max_pages
- else:
- self.pagination["last_page"] = False
- next_pages = [
- i for i in range(page_get + 1, page_get + 6) if 1 < i < max_pages
- ]
-
- self.pagination["next_pages"] = next_pages
- self.pagination["total_hits"] = total_hits
diff --git a/tubearchivist/home/src/index/playlist.py b/tubearchivist/home/src/index/playlist.py
deleted file mode 100644
index fce019d..0000000
--- a/tubearchivist/home/src/index/playlist.py
+++ /dev/null
@@ -1,208 +0,0 @@
-"""
-functionality:
-- get metadata from youtube for a playlist
-- index and update in es
-"""
-
-import json
-from datetime import datetime
-
-from home.src.download.thumbnails import ThumbManager
-from home.src.es.connect import ElasticWrap
-from home.src.index.generic import YouTubeItem
-from home.src.index.video import YoutubeVideo
-
-
-class YoutubePlaylist(YouTubeItem):
- """represents a single youtube playlist"""
-
- es_path = False
- index_name = "ta_playlist"
- yt_obs = {
- "default_search": "ytsearch",
- "quiet": True,
- "skip_download": True,
- "extract_flat": True,
- }
- yt_base = "https://www.youtube.com/playlist?list="
-
- def __init__(self, youtube_id):
- super().__init__(youtube_id)
- self.es_path = f"{self.index_name}/_doc/{youtube_id}"
- self.all_members = False
- self.nav = False
- self.all_youtube_ids = []
-
- def build_json(self, scrape=False):
- """collection to create json_data"""
- self.get_from_es()
- if self.json_data:
- subscribed = self.json_data.get("playlist_subscribed")
- else:
- subscribed = False
-
- if scrape or not self.json_data:
- self.get_from_youtube()
- self.process_youtube_meta()
- self.get_entries()
- self.json_data["playlist_entries"] = self.all_members
- self.get_playlist_art()
- self.json_data["playlist_subscribed"] = subscribed
-
- def process_youtube_meta(self):
- """extract relevant fields from youtube"""
- self.json_data = {
- "playlist_id": self.youtube_id,
- "playlist_active": True,
- "playlist_name": self.youtube_meta["title"],
- "playlist_channel": self.youtube_meta["channel"],
- "playlist_channel_id": self.youtube_meta["channel_id"],
- "playlist_thumbnail": self.youtube_meta["thumbnails"][-1]["url"],
- "playlist_description": self.youtube_meta["description"] or False,
- "playlist_last_refresh": int(datetime.now().strftime("%s")),
- }
-
- def get_entries(self, playlistend=False):
- """get all videos in playlist"""
-        if playlistend:
-            # TODO: limiting the playlist end is not implemented yet
-            print(playlistend)
- all_members = []
- for idx, entry in enumerate(self.youtube_meta["entries"]):
- if self.all_youtube_ids:
- downloaded = entry["id"] in self.all_youtube_ids
- else:
- downloaded = False
- if not entry["uploader"]:
- continue
- to_append = {
- "youtube_id": entry["id"],
- "title": entry["title"],
- "uploader": entry["uploader"],
- "idx": idx,
- "downloaded": downloaded,
- }
- all_members.append(to_append)
-
- self.all_members = all_members
-
- @staticmethod
- def get_playlist_art():
- """download artwork of playlist"""
- thumbnails = ThumbManager()
- missing_playlists = thumbnails.get_missing_playlists()
- thumbnails.download_playlist(missing_playlists)
-
- def add_vids_to_playlist(self):
- """sync the playlist id to videos"""
- script = (
- 'if (!ctx._source.containsKey("playlist")) '
- + "{ctx._source.playlist = [params.playlist]} "
- + "else if (!ctx._source.playlist.contains(params.playlist)) "
- + "{ctx._source.playlist.add(params.playlist)} "
- + "else {ctx.op = 'none'}"
- )
-
- bulk_list = []
- for entry in self.json_data["playlist_entries"]:
- video_id = entry["youtube_id"]
- action = {"update": {"_id": video_id, "_index": "ta_video"}}
- source = {
- "script": {
- "source": script,
- "lang": "painless",
- "params": {"playlist": self.youtube_id},
- }
- }
- bulk_list.append(json.dumps(action))
- bulk_list.append(json.dumps(source))
-
- # add last newline
- bulk_list.append("\n")
- query_str = "\n".join(bulk_list)
-
- ElasticWrap("_bulk").post(query_str, ndjson=True)
-
- def update_playlist(self):
- """update metadata for playlist with data from YouTube"""
- self.get_from_es()
- subscribed = self.json_data["playlist_subscribed"]
- self.get_from_youtube()
- if not self.json_data:
- # return false to deactivate
- return False
-
- self.json_data["playlist_subscribed"] = subscribed
- self.upload_to_es()
- return True
-
- def build_nav(self, youtube_id):
- """find next and previous in playlist of a given youtube_id"""
- all_entries_available = self.json_data["playlist_entries"]
- all_entries = [i for i in all_entries_available if i["downloaded"]]
- current = [i for i in all_entries if i["youtube_id"] == youtube_id]
- # stop if not found or playlist of 1
-        if not current or len(all_entries) < 2:
- return
-
- current_idx = all_entries.index(current[0])
- if current_idx == 0:
- previous_item = False
- else:
- previous_item = all_entries[current_idx - 1]
- prev_thumb = ThumbManager().vid_thumb_path(
- previous_item["youtube_id"]
- )
- previous_item["vid_thumb"] = prev_thumb
-
- if current_idx == len(all_entries) - 1:
- next_item = False
- else:
- next_item = all_entries[current_idx + 1]
- next_thumb = ThumbManager().vid_thumb_path(next_item["youtube_id"])
- next_item["vid_thumb"] = next_thumb
-
- self.nav = {
- "playlist_meta": {
- "current_idx": current[0]["idx"],
- "playlist_id": self.youtube_id,
- "playlist_name": self.json_data["playlist_name"],
- "playlist_channel": self.json_data["playlist_channel"],
- },
- "playlist_previous": previous_item,
- "playlist_next": next_item,
- }
- return
-
- def delete_metadata(self):
- """delete metadata for playlist"""
- script = (
- "ctx._source.playlist.removeAll("
- + "Collections.singleton(params.playlist)) "
- )
- data = {
- "query": {
- "term": {"playlist.keyword": {"value": self.youtube_id}}
- },
- "script": {
- "source": script,
- "lang": "painless",
- "params": {"playlist": self.youtube_id},
- },
- }
- _, _ = ElasticWrap("ta_video/_update_by_query").post(data)
- self.del_in_es()
-
- def delete_videos_playlist(self):
- """delete playlist with all videos"""
- print(f"{self.youtube_id}: delete playlist")
- self.get_from_es()
- all_youtube_id = [
- i["youtube_id"]
- for i in self.json_data["playlist_entries"]
- if i["downloaded"]
- ]
- for youtube_id in all_youtube_id:
- YoutubeVideo(youtube_id).delete_media_file()
-
- self.delete_metadata()
diff --git a/tubearchivist/home/src/index/reindex.py b/tubearchivist/home/src/index/reindex.py
deleted file mode 100644
index ed29e89..0000000
--- a/tubearchivist/home/src/index/reindex.py
+++ /dev/null
@@ -1,233 +0,0 @@
-"""
-functionality:
-- periodically refresh documents
-- index and update in es
-"""
-
-from datetime import datetime
-from math import ceil
-from time import sleep
-
-from home.src.download.queue import PendingList
-from home.src.download.thumbnails import ThumbManager
-from home.src.es.connect import ElasticWrap
-from home.src.index.channel import YoutubeChannel
-from home.src.index.playlist import YoutubePlaylist
-from home.src.index.video import YoutubeVideo
-from home.src.ta.config import AppConfig
-
-
-class Reindex:
- """check for outdated documents and refresh data from youtube"""
-
- MATCH_FIELD = {
- "ta_video": "active",
- "ta_channel": "channel_active",
- "ta_playlist": "playlist_active",
- }
- MULTIPLY = 1.2
-
- def __init__(self):
- # config
- self.now = int(datetime.now().strftime("%s"))
- self.config = AppConfig().config
- self.interval = self.config["scheduler"]["check_reindex_days"]
- # scan
- self.all_youtube_ids = False
- self.all_channel_ids = False
- self.all_playlist_ids = False
-
- def _get_daily(self):
- """get daily refresh values"""
- total_videos = self._get_total_hits("ta_video")
- video_daily = ceil(total_videos / self.interval * self.MULTIPLY)
- if video_daily >= 10000:
- video_daily = 9999
-
- total_channels = self._get_total_hits("ta_channel")
- channel_daily = ceil(total_channels / self.interval * self.MULTIPLY)
- total_playlists = self._get_total_hits("ta_playlist")
- playlist_daily = ceil(total_playlists / self.interval * self.MULTIPLY)
- return (video_daily, channel_daily, playlist_daily)
-
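-    # example with assumed numbers: 10000 active videos on a 90 day
-    # interval gives ceil(10000 / 90 * 1.2) = 134 videos per daily run
-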
- def _get_total_hits(self, index):
- """get total hits from index"""
- match_field = self.MATCH_FIELD[index]
- path = f"{index}/_search?filter_path=hits.total"
- data = {"query": {"match": {match_field: True}}}
- response, _ = ElasticWrap(path).post(data=data)
- total_hits = response["hits"]["total"]["value"]
- return total_hits
-
- def _get_unrated_vids(self):
- """get max 200 videos without rating if ryd integration is enabled"""
- data = {
- "size": 200,
- "query": {
- "bool": {
- "must_not": [{"exists": {"field": "stats.average_rating"}}]
- }
- },
- }
- response, _ = ElasticWrap("ta_video/_search").get(data=data)
-
- missing_rating = [i["_id"] for i in response["hits"]["hits"]]
- self.all_youtube_ids = self.all_youtube_ids + missing_rating
-
- def _get_outdated_vids(self, size):
- """get daily videos to refresh"""
- now_lte = self.now - self.interval * 24 * 60 * 60
- must_list = [
- {"match": {"active": True}},
- {"range": {"vid_last_refresh": {"lte": now_lte}}},
- ]
- data = {
- "size": size,
- "query": {"bool": {"must": must_list}},
- "sort": [{"vid_last_refresh": {"order": "asc"}}],
- "_source": False,
- }
- response, _ = ElasticWrap("ta_video/_search").get(data=data)
-
- all_youtube_ids = [i["_id"] for i in response["hits"]["hits"]]
- return all_youtube_ids
-
- def _get_outdated_channels(self, size):
- """get daily channels to refresh"""
- now_lte = self.now - self.interval * 24 * 60 * 60
- must_list = [
- {"match": {"channel_active": True}},
- {"range": {"channel_last_refresh": {"lte": now_lte}}},
- ]
- data = {
- "size": size,
- "query": {"bool": {"must": must_list}},
- "sort": [{"channel_last_refresh": {"order": "asc"}}],
- "_source": False,
- }
- response, _ = ElasticWrap("ta_channel/_search").get(data=data)
-
- all_channel_ids = [i["_id"] for i in response["hits"]["hits"]]
- return all_channel_ids
-
- def _get_outdated_playlists(self, size):
- """get daily outdated playlists to refresh"""
- now_lte = self.now - self.interval * 24 * 60 * 60
- must_list = [
- {"match": {"playlist_active": True}},
- {"range": {"playlist_last_refresh": {"lte": now_lte}}},
- ]
- data = {
- "size": size,
- "query": {"bool": {"must": must_list}},
- "sort": [{"playlist_last_refresh": {"order": "asc"}}],
- "_source": False,
- }
- response, _ = ElasticWrap("ta_playlist/_search").get(data=data)
-
- all_playlist_ids = [i["_id"] for i in response["hits"]["hits"]]
- return all_playlist_ids
-
- def check_outdated(self):
- """add missing vids and channels"""
- video_daily, channel_daily, playlist_daily = self._get_daily()
- self.all_youtube_ids = self._get_outdated_vids(video_daily)
- self.all_channel_ids = self._get_outdated_channels(channel_daily)
- self.all_playlist_ids = self._get_outdated_playlists(playlist_daily)
-
- integrate_ryd = self.config["downloads"]["integrate_ryd"]
- if integrate_ryd:
- self._get_unrated_vids()
-
- @staticmethod
- def _reindex_single_video(youtube_id):
- """refresh data for single video"""
- video = YoutubeVideo(youtube_id)
-
- # read current state
- video.get_from_es()
- player = video.json_data["player"]
- date_downloaded = video.json_data["date_downloaded"]
- channel_dict = video.json_data["channel"]
- playlist = video.json_data.get("playlist")
-
- # get new
- video.build_json()
- if not video.youtube_meta:
- video.deactivate()
- return
-
- video.delete_subtitles()
- video.check_subtitles()
-
- # add back
- video.json_data["player"] = player
- video.json_data["date_downloaded"] = date_downloaded
- video.json_data["channel"] = channel_dict
- if playlist:
- video.json_data["playlist"] = playlist
-
- video.upload_to_es()
-
- thumb_handler = ThumbManager()
- thumb_handler.delete_vid_thumb(youtube_id)
- to_download = (youtube_id, video.json_data["vid_thumb_url"])
- thumb_handler.download_vid([to_download], notify=False)
- return
-
- @staticmethod
- def _reindex_single_channel(channel_id):
- """refresh channel data and sync to videos"""
- channel = YoutubeChannel(channel_id)
- channel.get_from_es()
- subscribed = channel.json_data["channel_subscribed"]
- overwrites = channel.json_data.get("channel_overwrites", False)
- channel.get_from_youtube()
- channel.json_data["channel_subscribed"] = subscribed
- if overwrites:
- channel.json_data["channel_overwrites"] = overwrites
- channel.upload_to_es()
- channel.sync_to_videos()
-
- @staticmethod
- def _reindex_single_playlist(playlist_id, all_indexed_ids):
- """refresh playlist data"""
- playlist = YoutubePlaylist(playlist_id)
- playlist.get_from_es()
- subscribed = playlist.json_data["playlist_subscribed"]
- playlist.all_youtube_ids = all_indexed_ids
- playlist.build_json(scrape=True)
- if not playlist.json_data:
- playlist.deactivate()
- return
-
- playlist.json_data["playlist_subscribed"] = subscribed
- playlist.upload_to_es()
- return
-
- def reindex(self):
- """reindex what's needed"""
- sleep_interval = self.config["downloads"]["sleep_interval"]
- # videos
- print(f"reindexing {len(self.all_youtube_ids)} videos")
- for youtube_id in self.all_youtube_ids:
- self._reindex_single_video(youtube_id)
- if sleep_interval:
- sleep(sleep_interval)
- # channels
- print(f"reindexing {len(self.all_channel_ids)} channels")
- for channel_id in self.all_channel_ids:
- self._reindex_single_channel(channel_id)
- if sleep_interval:
- sleep(sleep_interval)
- # playlist
- print(f"reindexing {len(self.all_playlist_ids)} playlists")
- if self.all_playlist_ids:
- handler = PendingList()
- handler.get_download()
- handler.get_indexed()
- all_indexed_ids = [i["youtube_id"] for i in handler.all_videos]
- for playlist_id in self.all_playlist_ids:
- self._reindex_single_playlist(playlist_id, all_indexed_ids)
- if sleep_interval:
- sleep(sleep_interval)
diff --git a/tubearchivist/home/src/index/video.py b/tubearchivist/home/src/index/video.py
deleted file mode 100644
index 290e0ce..0000000
--- a/tubearchivist/home/src/index/video.py
+++ /dev/null
@@ -1,611 +0,0 @@
-"""
-functionality:
-- get metadata from youtube for a video
-- index and update in es
-"""
-
-import json
-import os
-from datetime import datetime
-
-import requests
-from django.conf import settings
-from home.src.es.connect import ElasticWrap
-from home.src.index import channel as ta_channel
-from home.src.index.generic import YouTubeItem
-from home.src.ta.helper import (
- DurationConverter,
- clean_string,
- randomizor,
- requests_headers,
-)
-from home.src.ta.ta_redis import RedisArchivist
-from ryd_client import ryd_client
-
-
-class YoutubeSubtitle:
- """handle video subtitle functionality"""
-
- def __init__(self, video):
- self.video = video
- self.languages = False
-
- def _sub_conf_parse(self):
- """add additional conf values to self"""
- languages_raw = self.video.config["downloads"]["subtitle"]
- if languages_raw:
- self.languages = [i.strip() for i in languages_raw.split(",")]
-
- def get_subtitles(self):
- """check what to do"""
- self._sub_conf_parse()
- if not self.languages:
- # no subtitles
- return False
-
- relevant_subtitles = []
- for lang in self.languages:
- user_sub = self._get_user_subtitles(lang)
- if user_sub:
- relevant_subtitles.append(user_sub)
- continue
-
- if self.video.config["downloads"]["subtitle_source"] == "auto":
- auto_cap = self._get_auto_caption(lang)
- if auto_cap:
- relevant_subtitles.append(auto_cap)
-
- return relevant_subtitles
-
- def _get_auto_caption(self, lang):
- """get auto_caption subtitles"""
- print(f"{self.video.youtube_id}-{lang}: get auto generated subtitles")
- all_subtitles = self.video.youtube_meta.get("automatic_captions")
-
- if not all_subtitles:
- return False
-
- video_media_url = self.video.json_data["media_url"]
- media_url = video_media_url.replace(".mp4", f"-{lang}.vtt")
- all_formats = all_subtitles.get(lang)
- if not all_formats:
- return False
-
- subtitle = [i for i in all_formats if i["ext"] == "json3"][0]
- subtitle.update(
- {"lang": lang, "source": "auto", "media_url": media_url}
- )
-
- return subtitle
-
- def _normalize_lang(self):
- """normalize country specific language keys"""
- all_subtitles = self.video.youtube_meta.get("subtitles")
- if not all_subtitles:
- return False
-
- all_keys = list(all_subtitles.keys())
- for key in all_keys:
- lang = key.split("-")[0]
- old = all_subtitles.pop(key)
- if lang == "live_chat":
- continue
- all_subtitles[lang] = old
-
- return all_subtitles
-
- def _get_user_subtitles(self, lang):
- """get subtitles uploaded from channel owner"""
- print(f"{self.video.youtube_id}-{lang}: get user uploaded subtitles")
- all_subtitles = self._normalize_lang()
- if not all_subtitles:
- return False
-
- video_media_url = self.video.json_data["media_url"]
- media_url = video_media_url.replace(".mp4", f"-{lang}.vtt")
- all_formats = all_subtitles.get(lang)
- if not all_formats:
- # no user subtitles found
- return False
-
- subtitle = [i for i in all_formats if i["ext"] == "json3"][0]
- subtitle.update(
- {"lang": lang, "source": "user", "media_url": media_url}
- )
-
- return subtitle
-
- def download_subtitles(self, relevant_subtitles):
- """download subtitle files to archive"""
- videos_base = self.video.config["application"]["videos"]
- for subtitle in relevant_subtitles:
- dest_path = os.path.join(videos_base, subtitle["media_url"])
- source = subtitle["source"]
- lang = subtitle.get("lang")
- response = requests.get(
- subtitle["url"], headers=requests_headers()
- )
- if not response.ok:
- print(f"{self.video.youtube_id}: failed to download subtitle")
- print(response.text)
- continue
-
- parser = SubtitleParser(response.text, lang, source)
- parser.process()
- subtitle_str = parser.get_subtitle_str()
- self._write_subtitle_file(dest_path, subtitle_str)
- if self.video.config["downloads"]["subtitle_index"]:
- query_str = parser.create_bulk_import(self.video, source)
- self._index_subtitle(query_str)
-
- @staticmethod
- def _write_subtitle_file(dest_path, subtitle_str):
- """write subtitle file to disk"""
- # create folder here for first video of channel
- os.makedirs(os.path.split(dest_path)[0], exist_ok=True)
- with open(dest_path, "w", encoding="utf-8") as subfile:
- subfile.write(subtitle_str)
-
- @staticmethod
- def _index_subtitle(query_str):
- """send subtitle to es for indexing"""
- _, _ = ElasticWrap("_bulk").post(data=query_str, ndjson=True)
-
-
-class SubtitleParser:
- """parse subtitle str from youtube"""
-
- def __init__(self, subtitle_str, lang, source):
- self.subtitle_raw = json.loads(subtitle_str)
- self.lang = lang
- self.source = source
- self.all_cues = False
-
- def process(self):
- """extract relevant que data"""
- all_events = self.subtitle_raw.get("events")
- if self.source == "auto":
- all_events = self._flat_auto_caption(all_events)
-
- self.all_cues = []
- for idx, event in enumerate(all_events):
- if "dDurationMs" not in event:
- # some events won't have a duration
- print(f"failed to parse event without duration: {event}")
- continue
-
- cue = {
- "start": self._ms_conv(event["tStartMs"]),
- "end": self._ms_conv(event["tStartMs"] + event["dDurationMs"]),
- "text": "".join([i.get("utf8") for i in event["segs"]]),
- "idx": idx + 1,
- }
- self.all_cues.append(cue)
-
- @staticmethod
- def _flat_auto_caption(all_events):
- """flatten autocaption segments"""
- flatten = []
- for event in all_events:
- if "segs" not in event.keys():
- continue
- text = "".join([i.get("utf8") for i in event.get("segs")])
- if not text.strip():
- continue
-
- if flatten:
- # fix overlapping retiming issue
- if "dDurationMs" not in flatten[-1]:
- # some events won't have a duration
- print(f"failed to parse event without duration: {event}")
- continue
-
- last_end = flatten[-1]["tStartMs"] + flatten[-1]["dDurationMs"]
- if event["tStartMs"] < last_end:
- joined = flatten[-1]["segs"][0]["utf8"] + "\n" + text
- flatten[-1]["segs"][0]["utf8"] = joined
- continue
-
- event.update({"segs": [{"utf8": text}]})
- flatten.append(event)
-
- return flatten
-
- @staticmethod
- def _ms_conv(ms):
- """convert ms to timestamp"""
- hours = str((ms // (1000 * 60 * 60)) % 24).zfill(2)
- minutes = str((ms // (1000 * 60)) % 60).zfill(2)
- secs = str((ms // 1000) % 60).zfill(2)
- millis = str(ms % 1000).zfill(3)
-
- return f"{hours}:{minutes}:{secs}.{millis}"
-
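-    # example: _ms_conv(3725500) -> "01:02:05.500"
-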
- def get_subtitle_str(self):
- """create vtt text str from cues"""
- subtitle_str = f"WEBVTT\nKind: captions\nLanguage: {self.lang}"
-
- for cue in self.all_cues:
- stamp = f"{cue.get('start')} --> {cue.get('end')}"
- cue_text = f"\n\n{cue.get('idx')}\n{stamp}\n{cue.get('text')}"
- subtitle_str = subtitle_str + cue_text
-
- return subtitle_str
-
- def create_bulk_import(self, video, source):
- """subtitle lines for es import"""
- documents = self._create_documents(video, source)
- bulk_list = []
-
- for document in documents:
- document_id = document.get("subtitle_fragment_id")
- action = {"index": {"_index": "ta_subtitle", "_id": document_id}}
- bulk_list.append(json.dumps(action))
- bulk_list.append(json.dumps(document))
-
- bulk_list.append("\n")
- query_str = "\n".join(bulk_list)
-
- return query_str
-
- def _create_documents(self, video, source):
- """process documents"""
- documents = self._chunk_list(video.youtube_id)
- channel = video.json_data.get("channel")
- meta_dict = {
- "youtube_id": video.youtube_id,
- "title": video.json_data.get("title"),
- "subtitle_channel": channel.get("channel_name"),
- "subtitle_channel_id": channel.get("channel_id"),
- "subtitle_last_refresh": int(datetime.now().strftime("%s")),
- "subtitle_lang": self.lang,
- "subtitle_source": source,
- }
-
-        for document in documents:
-            document.update(meta_dict)
-
- return documents
-
- def _chunk_list(self, youtube_id):
- """join cues for bulk import"""
- chunk_list = []
-
- chunk = {}
- for cue in self.all_cues:
- if chunk:
- text = f"{chunk.get('subtitle_line')} {cue.get('text')}\n"
- chunk["subtitle_line"] = text
- else:
- idx = len(chunk_list) + 1
- chunk = {
- "subtitle_index": idx,
- "subtitle_line": cue.get("text"),
- "subtitle_start": cue.get("start"),
- }
-
- chunk["subtitle_fragment_id"] = f"{youtube_id}-{self.lang}-{idx}"
-
- if cue["idx"] % 5 == 0:
- chunk["subtitle_end"] = cue.get("end")
- chunk_list.append(chunk)
- chunk = {}
-
- return chunk_list
-
-
-class SponsorBlock:
- """handle sponsor block integration"""
-
- API = "https://sponsor.ajay.app/api"
-
- def __init__(self, user_id=False):
- self.user_id = user_id
- self.user_agent = f"{settings.TA_UPSTREAM} {settings.TA_VERSION}"
- self.last_refresh = int(datetime.now().strftime("%s"))
-
- def get_sb_id(self):
- """get sponsorblock userid or generate if needed"""
- if not self.user_id:
- print("missing request user id")
- raise ValueError
-
- key = f"{self.user_id}:id_sponsorblock"
- sb_id = RedisArchivist().get_message(key)
- if not sb_id["status"]:
- sb_id = {"status": randomizor(32)}
- RedisArchivist().set_message(key, sb_id, expire=False)
-
- return sb_id
-
- def get_timestamps(self, youtube_id):
- """get timestamps from the API"""
- url = f"{self.API}/skipSegments?videoID={youtube_id}"
- headers = {"User-Agent": self.user_agent}
- print(f"{youtube_id}: get sponsorblock timestamps")
- response = requests.get(url, headers=headers)
- if not response.ok:
- print(f"{youtube_id}: sponsorblock failed: {response.text}")
- sponsor_dict = {
- "last_refresh": self.last_refresh,
- "is_enabled": True,
- "segments": [],
- }
- else:
- all_segments = response.json()
- sponsor_dict = self._get_sponsor_dict(all_segments)
-
- return sponsor_dict
-
- def _get_sponsor_dict(self, all_segments):
- """format and process response"""
- has_unlocked = False
- cleaned_segments = []
- for segment in all_segments:
- if not segment["locked"]:
- has_unlocked = True
- del segment["userID"]
- del segment["description"]
- cleaned_segments.append(segment)
-
- sponsor_dict = {
- "last_refresh": self.last_refresh,
- "has_unlocked": has_unlocked,
- "is_enabled": True,
- "segments": cleaned_segments,
- }
- return sponsor_dict
-
- def post_timestamps(self, youtube_id, start_time, end_time):
- """post timestamps to api"""
- user_id = self.get_sb_id().get("status")
- data = {
- "videoID": youtube_id,
- "startTime": start_time,
- "endTime": end_time,
- "category": "sponsor",
- "userID": user_id,
- "userAgent": self.user_agent,
- }
- url = f"{self.API}/skipSegments?videoID={youtube_id}"
- print(f"post: {data}")
- print(f"to: {url}")
-
- return {"success": True}, 200
-
- def vote_on_segment(self, uuid, vote):
- """send vote on existing segment"""
- user_id = self.get_sb_id().get("status")
- data = {
- "UUID": uuid,
- "userID": user_id,
- "type": vote,
- }
- url = f"{self.API}/api/voteOnSponsorTime"
- print(f"post: {data}")
- print(f"to: {url}")
-
- return {"success": True}, 200
-
-
-class YoutubeVideo(YouTubeItem, YoutubeSubtitle):
- """represents a single youtube video"""
-
- es_path = False
- index_name = "ta_video"
- yt_base = "https://www.youtube.com/watch?v="
-
- def __init__(self, youtube_id, video_overwrites=False):
- super().__init__(youtube_id)
- self.channel_id = False
- self.video_overwrites = video_overwrites
- self.es_path = f"{self.index_name}/_doc/{youtube_id}"
-
- def build_json(self):
- """build json dict of video"""
- self.get_from_youtube()
- if not self.youtube_meta:
- return
-
- self._process_youtube_meta()
- self._add_channel()
- self._add_stats()
- self.add_file_path()
- self.add_player()
- if self.config["downloads"]["integrate_ryd"]:
- self._get_ryd_stats()
-
- if self._check_get_sb():
- self._get_sponsorblock()
-
- return
-
- def _check_get_sb(self):
- """check if need to run sponsor block"""
- integrate = self.config["downloads"]["integrate_sponsorblock"]
-
- if self.video_overwrites:
- single_overwrite = self.video_overwrites.get(self.youtube_id)
- if not single_overwrite:
- return integrate
-
- if "integrate_sponsorblock" in single_overwrite:
- return single_overwrite.get("integrate_sponsorblock")
-
- return integrate
-
- def _process_youtube_meta(self):
- """extract relevant fields from youtube"""
- # extract
- self.channel_id = self.youtube_meta["channel_id"]
- upload_date = self.youtube_meta["upload_date"]
- upload_date_time = datetime.strptime(upload_date, "%Y%m%d")
- published = upload_date_time.strftime("%Y-%m-%d")
- last_refresh = int(datetime.now().strftime("%s"))
- # base64_blur = ThumbManager().get_base64_blur(self.youtube_id)
- base64_blur = False
- # build json_data basics
- self.json_data = {
- "title": self.youtube_meta["title"],
- "description": self.youtube_meta["description"],
- "category": self.youtube_meta["categories"],
- "vid_thumb_url": self.youtube_meta["thumbnail"],
- "vid_thumb_base64": base64_blur,
- "tags": self.youtube_meta["tags"],
- "published": published,
- "vid_last_refresh": last_refresh,
- "date_downloaded": last_refresh,
- "youtube_id": self.youtube_id,
- "active": True,
- }
-
- def _add_channel(self):
- """add channel dict to video json_data"""
- channel = ta_channel.YoutubeChannel(self.channel_id)
- channel.build_json(upload=True)
- self.json_data.update({"channel": channel.json_data})
-
- def _add_stats(self):
- """add stats dicst to json_data"""
- # likes
- like_count = self.youtube_meta.get("like_count", 0)
- dislike_count = self.youtube_meta.get("dislike_count", 0)
- self.json_data.update(
- {
- "stats": {
- "view_count": self.youtube_meta["view_count"],
- "like_count": like_count,
- "dislike_count": dislike_count,
- "average_rating": self.youtube_meta["average_rating"],
- }
- }
- )
-
- def build_dl_cache_path(self):
- """find video path in dl cache"""
- cache_dir = self.app_conf["cache_dir"]
- cache_path = f"{cache_dir}/download/"
- all_cached = os.listdir(cache_path)
- for file_cached in all_cached:
- if self.youtube_id in file_cached:
- vid_path = os.path.join(cache_path, file_cached)
- return vid_path
-
- raise FileNotFoundError
-
- def add_player(self):
- """add player information for new videos"""
- try:
- # when indexing from download task
- vid_path = self.build_dl_cache_path()
- except FileNotFoundError as err:
- # when reindexing needs to handle title rename
- channel = os.path.split(self.json_data["media_url"])[0]
- channel_dir = os.path.join(self.app_conf["videos"], channel)
- all_files = os.listdir(channel_dir)
- for file in all_files:
- if self.youtube_id in file and file.endswith(".mp4"):
- vid_path = os.path.join(channel_dir, file)
- break
- else:
- raise FileNotFoundError("could not find video file") from err
-
- duration_handler = DurationConverter()
- duration = duration_handler.get_sec(vid_path)
- duration_str = duration_handler.get_str(duration)
- self.json_data.update(
- {
- "player": {
- "watched": False,
- "duration": duration,
- "duration_str": duration_str,
- }
- }
- )
-
- def add_file_path(self):
- """build media_url for where file will be located"""
- channel_name = self.json_data["channel"]["channel_name"]
- clean_channel_name = clean_string(channel_name)
- if len(clean_channel_name) <= 3:
- # fall back to channel id
- clean_channel_name = self.json_data["channel"]["channel_id"]
-
- timestamp = self.json_data["published"].replace("-", "")
- youtube_id = self.json_data["youtube_id"]
- title = self.json_data["title"]
- clean_title = clean_string(title)
- filename = f"{timestamp}_{youtube_id}_{clean_title}.mp4"
- media_url = os.path.join(clean_channel_name, filename)
- self.json_data["media_url"] = media_url
-
- def delete_media_file(self):
- """delete video file, meta data"""
- self.get_from_es()
- video_base = self.app_conf["videos"]
- to_del = [self.json_data.get("media_url")]
-
- all_subtitles = self.json_data.get("subtitles")
- if all_subtitles:
- to_del = to_del + [i.get("media_url") for i in all_subtitles]
-
- for media_url in to_del:
- file_path = os.path.join(video_base, media_url)
- try:
- os.remove(file_path)
- except FileNotFoundError:
- print(f"{self.youtube_id}: failed {media_url}, continue.")
-
- self.del_in_es()
- self.delete_subtitles()
-
- def _get_ryd_stats(self):
- """get optional stats from returnyoutubedislikeapi.com"""
- try:
- print(f"{self.youtube_id}: get ryd stats")
- result = ryd_client.get(self.youtube_id)
- except requests.exceptions.ConnectionError:
- print(f"{self.youtube_id}: failed to query ryd api, skipping")
- return False
-
- if result["status"] == 404:
- return False
-
- dislikes = {
- "dislike_count": result["dislikes"],
- "average_rating": result["rating"],
- }
- self.json_data["stats"].update(dislikes)
-
- return True
-
- def _get_sponsorblock(self):
- """get optional sponsorblock timestamps from sponsor.ajay.app"""
- sponsorblock = SponsorBlock().get_timestamps(self.youtube_id)
- if sponsorblock:
- self.json_data["sponsorblock"] = sponsorblock
-
- def check_subtitles(self):
- """optionally add subtitles"""
- handler = YoutubeSubtitle(self)
- subtitles = handler.get_subtitles()
- if subtitles:
- self.json_data["subtitles"] = subtitles
- handler.download_subtitles(relevant_subtitles=subtitles)
-
- def delete_subtitles(self):
- """delete indexed subtitles"""
- path = "ta_subtitle/_delete_by_query?refresh=true"
- data = {"query": {"term": {"youtube_id": {"value": self.youtube_id}}}}
- _, _ = ElasticWrap(path).post(data=data)
-
-
-def index_new_video(youtube_id, video_overwrites=False):
- """combined classes to create new video in index"""
- video = YoutubeVideo(youtube_id, video_overwrites=video_overwrites)
- video.build_json()
- if not video.json_data:
- raise ValueError("failed to get metadata for " + youtube_id)
-
- video.check_subtitles()
- video.upload_to_es()
- return video.json_data
diff --git a/tubearchivist/home/src/ta/__init__.py b/tubearchivist/home/src/ta/__init__.py
deleted file mode 100644
index e69de29..0000000
diff --git a/tubearchivist/home/src/ta/config.py b/tubearchivist/home/src/ta/config.py
deleted file mode 100644
index 4b98c4a..0000000
--- a/tubearchivist/home/src/ta/config.py
+++ /dev/null
@@ -1,273 +0,0 @@
-"""
-Functionality:
-- read and write config
-- load config variables into redis
-"""
-
-import json
-import os
-import re
-
-from celery.schedules import crontab
-from home.src.ta.ta_redis import RedisArchivist
-
-
-class AppConfig:
- """handle user settings and application variables"""
-
- def __init__(self, user_id=False):
- self.user_id = user_id
- self.config = self.get_config()
- self.colors = self.get_colors()
-
- def get_config(self):
- """get config from default file or redis if changed"""
- config = self.get_config_redis()
- if not config:
- config = self.get_config_file()
-
- if self.user_id:
- key = f"{self.user_id}:page_size"
- page_size = RedisArchivist().get_message(key)["status"]
- if page_size:
- config["archive"]["page_size"] = page_size
-
- config["application"].update(self.get_config_env())
- return config
-
- def get_config_file(self):
- """read the defaults from config.json"""
- with open("home/config.json", "r", encoding="utf-8") as f:
- config_file = json.load(f)
-
- config_file["application"].update(self.get_config_env())
-
- return config_file
-
- @staticmethod
- def get_config_env():
- """read environment application variables"""
- host_uid_env = os.environ.get("HOST_UID")
- if host_uid_env:
- host_uid = int(host_uid_env)
- else:
- host_uid = False
-
- host_gid_env = os.environ.get("HOST_GID")
- if host_gid_env:
- host_gid = int(host_gid_env)
- else:
- host_gid = False
-
- es_pass = os.environ.get("ELASTIC_PASSWORD")
- es_user = os.environ.get("ELASTIC_USER", default="elastic")
-
- application = {
- "REDIS_HOST": os.environ.get("REDIS_HOST"),
- "es_url": os.environ.get("ES_URL"),
- "es_auth": (es_user, es_pass),
- "HOST_UID": host_uid,
- "HOST_GID": host_gid,
- }
-
- return application
-
- @staticmethod
- def get_config_redis():
- """read config json set from redis to overwrite defaults"""
- config = RedisArchivist().get_message("config")
- if not list(config.values())[0]:
- return False
-
- return config
-
- def update_config(self, form_post):
- """update config values from settings form"""
- for key, value in form_post.items():
- if not value and not isinstance(value, int):
- continue
-
- if value in ["0", 0]:
- to_write = False
- elif value == "1":
- to_write = True
- else:
- to_write = value
-
- config_dict, config_value = key.split("_", maxsplit=1)
- self.config[config_dict][config_value] = to_write
-
- RedisArchivist().set_message("config", self.config, expire=False)
-
- @staticmethod
- def set_user_config(form_post, user_id):
- """set values in redis for user settings"""
- for key, value in form_post.items():
- if not value:
- continue
-
- message = {"status": value}
- redis_key = f"{user_id}:{key}"
- RedisArchivist().set_message(redis_key, message, expire=False)
-
- def get_colors(self):
- """overwrite config if user has set custom values"""
- colors = False
- if self.user_id:
- col_dict = RedisArchivist().get_message(f"{self.user_id}:colors")
- colors = col_dict["status"]
-
- if not colors:
- colors = self.config["application"]["colors"]
-
- self.config["application"]["colors"] = colors
- return colors
-
- def load_new_defaults(self):
- """check config.json for missing defaults"""
- default_config = self.get_config_file()
- redis_config = self.get_config_redis()
-
- # check for customizations
- if not redis_config:
- return
-
- needs_update = False
-
- for key, value in default_config.items():
- # missing whole main key
- if key not in redis_config:
- redis_config.update({key: value})
- needs_update = True
- continue
-
- # missing nested values
- for sub_key, sub_value in value.items():
- if sub_key not in redis_config[key].keys():
- redis_config[key].update({sub_key: sub_value})
- needs_update = True
-
- if needs_update:
- RedisArchivist().set_message("config", redis_config, expire=False)
-
-
-class ScheduleBuilder:
- """build schedule dicts for beat"""
-
- SCHEDULES = {
- "update_subscribed": "0 8 *",
- "download_pending": "0 16 *",
- "check_reindex": "0 12 *",
- "thumbnail_check": "0 17 *",
- "run_backup": "0 18 0",
- }
- CONFIG = ["check_reindex_days", "run_backup_rotate"]
-
- def __init__(self):
- self.config = AppConfig().config
-
- def update_schedule_conf(self, form_post):
- """process form post"""
- print("processing form, restart container for changes to take effect")
- redis_config = self.config
- for key, value in form_post.items():
- if key in self.SCHEDULES and value:
- try:
- to_write = self.value_builder(key, value)
- except ValueError:
- print(f"failed: {key} {value}")
- mess_dict = {
- "status": "message:setting",
- "level": "error",
- "title": "Scheduler update failed.",
- "message": "Invalid schedule input",
- }
- RedisArchivist().set_message("message:setting", mess_dict)
- return
-
- redis_config["scheduler"][key] = to_write
- if key in self.CONFIG and value:
- redis_config["scheduler"][key] = int(value)
- RedisArchivist().set_message("config", redis_config, expire=False)
- mess_dict = {
- "status": "message:setting",
- "level": "info",
- "title": "Scheduler changed.",
- "message": "Please restart container for changes to take effect",
- }
- RedisArchivist().set_message("message:setting", mess_dict)
-
- def value_builder(self, key, value):
- """validate single cron form entry and return cron dict"""
- print(f"change schedule for {key} to {value}")
- if value == "0":
- # deactivate this schedule
- return False
- if re.search(r"[\d]{1,2}\/[\d]{1,2}", value):
- # number/number cron format will fail in celery
- print("number/number schedule formatting not supported")
- raise ValueError
-
- keys = ["minute", "hour", "day_of_week"]
- if value == "auto":
- # set to sensible default
- values = self.SCHEDULES[key].split()
- else:
- values = value.split()
-
- if len(keys) != len(values):
- print(f"failed to parse {value} for {key}")
- raise ValueError("invalid input")
-
- to_write = dict(zip(keys, values))
- self._validate_cron(to_write)
-
- return to_write
-
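-    # example: value_builder("check_reindex", "0 12 *") returns
-    # {"minute": "0", "hour": "12", "day_of_week": "*"}, the keyword
-    # arguments crontab() expects in build_schedule below
-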
- @staticmethod
- def _validate_cron(to_write):
- """validate all fields, raise value error for impossible schedule"""
- all_hours = list(re.split(r"\D+", to_write["hour"]))
- for hour in all_hours:
- if hour.isdigit() and int(hour) > 23:
- print("hour can not be greater than 23")
- raise ValueError("invalid input")
-
- all_days = list(re.split(r"\D+", to_write["day_of_week"]))
- for day in all_days:
- if day.isdigit() and int(day) > 6:
- print("day can not be greater than 6")
- raise ValueError("invalid input")
-
- if not to_write["minute"].isdigit():
- print("too frequent: only number in minutes are supported")
- raise ValueError("invalid input")
-
- if int(to_write["minute"]) > 59:
- print("minutes can not be greater than 59")
- raise ValueError("invalid input")
-
- def build_schedule(self):
- """build schedule dict as expected by app.conf.beat_schedule"""
- schedule_dict = {}
-
- for schedule_item in self.SCHEDULES:
- item_conf = self.config["scheduler"][schedule_item]
- if not item_conf:
- continue
-
- minute = item_conf["minute"]
- hour = item_conf["hour"]
- day_of_week = item_conf["day_of_week"]
- schedule_name = f"schedule_{schedule_item}"
- to_add = {
- schedule_name: {
- "task": schedule_item,
- "schedule": crontab(
- minute=minute, hour=hour, day_of_week=day_of_week
- ),
- }
- }
- schedule_dict.update(to_add)
-
- return schedule_dict
diff --git a/tubearchivist/home/src/ta/helper.py b/tubearchivist/home/src/ta/helper.py
deleted file mode 100644
index c572ccc..0000000
--- a/tubearchivist/home/src/ta/helper.py
+++ /dev/null
@@ -1,252 +0,0 @@
-"""
-Loose collection of helper functions
-- don't import AppConfig class here to avoid circular imports
-"""
-
-import random
-import re
-import string
-import subprocess
-import unicodedata
-from datetime import datetime
-from urllib.parse import parse_qs, urlparse
-
-import yt_dlp
-
-
-def clean_string(file_name):
- """clean string to only asci characters"""
- whitelist = "-_.() " + string.ascii_letters + string.digits
- normalized = unicodedata.normalize("NFKD", file_name)
- ascii_only = normalized.encode("ASCII", "ignore").decode().strip()
- white_listed = "".join(c for c in ascii_only if c in whitelist)
- cleaned = re.sub(r"[ ]{2,}", " ", white_listed)
- return cleaned
-
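-# example: clean_string("Café: Tüt / 100% *Video*") -> "Cafe Tut 100 Video"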
-
-def ignore_filelist(filelist):
- """ignore temp files for os.listdir sanitizer"""
- to_ignore = ["Icon\r\r", "Temporary Items", "Network Trash Folder"]
- cleaned = []
- for file_name in filelist:
- if file_name.startswith(".") or file_name in to_ignore:
- continue
-
- cleaned.append(file_name)
-
- return cleaned
-
-
-def randomizor(length):
- """generate random alpha numeric string"""
- pool = string.digits + string.ascii_letters
- return "".join(random.choice(pool) for i in range(length))
-
-
-def requests_headers():
- """build header with random user agent for requests outside of yt-dlp"""
-
- chrome_versions = (
- "90.0.4430.212",
- "90.0.4430.24",
- "90.0.4430.70",
- "90.0.4430.72",
- "90.0.4430.85",
- "90.0.4430.93",
- "91.0.4472.101",
- "91.0.4472.106",
- "91.0.4472.114",
- "91.0.4472.124",
- "91.0.4472.164",
- "91.0.4472.19",
- "91.0.4472.77",
- "92.0.4515.107",
- "92.0.4515.115",
- "92.0.4515.131",
- "92.0.4515.159",
- "92.0.4515.43",
- "93.0.4556.0",
- "93.0.4577.15",
- "93.0.4577.63",
- "93.0.4577.82",
- "94.0.4606.41",
- "94.0.4606.54",
- "94.0.4606.61",
- "94.0.4606.71",
- "94.0.4606.81",
- "94.0.4606.85",
- "95.0.4638.17",
- "95.0.4638.50",
- "95.0.4638.54",
- "95.0.4638.69",
- "95.0.4638.74",
- "96.0.4664.18",
- "96.0.4664.45",
- "96.0.4664.55",
- "96.0.4664.93",
- "97.0.4692.20",
- )
- template = (
- "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
- + "AppleWebKit/537.36 (KHTML, like Gecko) "
- + f"Chrome/{random.choice(chrome_versions)} Safari/537.36"
- )
-
- return {"User-Agent": template}
-
-
-def date_praser(timestamp):
- """return formatted date string"""
- if isinstance(timestamp, int):
- date_obj = datetime.fromtimestamp(timestamp)
- elif isinstance(timestamp, str):
- date_obj = datetime.strptime(timestamp, "%Y-%m-%d")
-
- return datetime.strftime(date_obj, "%d %b, %Y")
-
-
-class UrlListParser:
- """take a multi line string and detect valid youtube ids"""
-
- def __init__(self, url_str):
- self.url_list = [i.strip() for i in url_str.split()]
-
- def process_list(self):
- """loop through the list"""
- youtube_ids = []
- for url in self.url_list:
- parsed = urlparse(url)
- print(f"processing: {url}")
- print(parsed)
- if not parsed.netloc:
- # is not a url
- id_type = self.find_valid_id(url)
- youtube_id = url
- elif "youtube.com" not in url and "youtu.be" not in url:
- raise ValueError(f"{url} is not a youtube link")
- elif parsed.path:
- # is a url
- youtube_id, id_type = self.detect_from_url(parsed)
- else:
- # not detected
- raise ValueError(f"failed to detect {url}")
-
- youtube_ids.append({"url": youtube_id, "type": id_type})
-
- return youtube_ids
-
- def detect_from_url(self, parsed):
- """detect from parsed url"""
- if parsed.netloc == "youtu.be":
- # shortened
- youtube_id = parsed.path.strip("/")
- _ = self.find_valid_id(youtube_id)
- return youtube_id, "video"
-
- if parsed.query:
- # detect from query string
- query_parsed = parse_qs(parsed.query)
- if "v" in query_parsed.keys():
- youtube_id = query_parsed["v"][0]
- _ = self.find_valid_id(youtube_id)
- return youtube_id, "video"
-
- if "list" in query_parsed.keys():
- youtube_id = query_parsed["list"][0]
- return youtube_id, "playlist"
-
- if parsed.path.startswith("/channel/"):
- # channel id in url
- youtube_id = parsed.path.split("/")[2]
- _ = self.find_valid_id(youtube_id)
- return youtube_id, "channel"
-
-        # detect channel with yt_dlp
- youtube_id = self.extract_channel_name(parsed.geturl())
- return youtube_id, "channel"
-
- @staticmethod
- def find_valid_id(id_str):
- """dedect valid id from length of string"""
- str_len = len(id_str)
- if str_len == 11:
- id_type = "video"
- elif str_len == 24:
- id_type = "channel"
- elif str_len in [34, 18]:
- id_type = "playlist"
- else:
- # unable to parse
- raise ValueError("not a valid id_str: " + id_str)
-
- return id_type
-
- @staticmethod
- def extract_channel_name(url):
- """find channel id from channel name with yt-dlp help"""
- obs = {
- "default_search": "ytsearch",
- "quiet": True,
- "skip_download": True,
- "extract_flat": True,
- "playlistend": 0,
- }
- url_info = yt_dlp.YoutubeDL(obs).extract_info(url, download=False)
- try:
- channel_id = url_info["channel_id"]
- except KeyError as error:
- print(f"failed to extract channel id from {url}")
- raise ValueError from error
-
- return channel_id
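-
-    # example: UrlListParser("https://youtu.be/dQw4w9WgXcQ").process_list()
-    # returns [{"url": "dQw4w9WgXcQ", "type": "video"}]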
-
-
-class DurationConverter:
- """
-    using ffprobe to get and parse duration from filepath
- """
-
- @staticmethod
- def get_sec(file_path):
- """read duration from file"""
- duration = subprocess.run(
- [
- "ffprobe",
- "-v",
- "error",
- "-show_entries",
- "format=duration",
- "-of",
- "default=noprint_wrappers=1:nokey=1",
- file_path,
- ],
- capture_output=True,
- check=True,
- )
- duration_raw = duration.stdout.decode().strip()
- if duration_raw == "N/A":
- return 0
-
- duration_sec = int(float(duration_raw))
- return duration_sec
-
- @staticmethod
- def get_str(duration_sec):
- """takes duration in sec and returns clean string"""
- if not duration_sec:
- # failed to extract
- return "NA"
-
- hours = duration_sec // 3600
- minutes = (duration_sec - (hours * 3600)) // 60
- secs = duration_sec - (hours * 3600) - (minutes * 60)
-
- duration_str = str()
- if hours:
- duration_str = str(hours).zfill(2) + ":"
- if minutes:
- duration_str = duration_str + str(minutes).zfill(2) + ":"
- else:
- duration_str = duration_str + "00:"
- duration_str = duration_str + str(secs).zfill(2)
- return duration_str
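-
-    # examples: get_str(3725) -> "01:02:05"; get_str(185) -> "03:05"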
diff --git a/tubearchivist/home/src/ta/ta_redis.py b/tubearchivist/home/src/ta/ta_redis.py
deleted file mode 100644
index 0e9c23f..0000000
--- a/tubearchivist/home/src/ta/ta_redis.py
+++ /dev/null
@@ -1,161 +0,0 @@
-"""
-functionality:
-- interact with redis
-- hold temporary download queue in redis
-"""
-
-import json
-import os
-
-import redis
-from home.src.ta.helper import ignore_filelist
-
-
-class RedisBase:
- """connection base for redis"""
-
- REDIS_HOST = os.environ.get("REDIS_HOST")
- REDIS_PORT = os.environ.get("REDIS_PORT") or 6379
- NAME_SPACE = "ta:"
-
- def __init__(self):
- self.conn = redis.Redis(host=self.REDIS_HOST, port=self.REDIS_PORT)
-
-
-class RedisArchivist(RedisBase):
- """collection of methods to interact with redis"""
-
- CHANNELS = [
- "download",
- "add",
- "rescan",
- "subchannel",
- "subplaylist",
- "playlistscan",
- "setting",
- ]
-
- def set_message(self, key, message, expire=True):
- """write new message to redis"""
- self.conn.execute_command(
- "JSON.SET", self.NAME_SPACE + key, ".", json.dumps(message)
- )
-
- if expire:
- if isinstance(expire, bool):
- secs = 20
- else:
- secs = expire
- self.conn.execute_command("EXPIRE", self.NAME_SPACE + key, secs)
-
- def get_message(self, key):
- """get message dict from redis"""
- reply = self.conn.execute_command("JSON.GET", self.NAME_SPACE + key)
- if reply:
- json_str = json.loads(reply)
- else:
- json_str = {"status": False}
-
- return json_str
-
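-    # usage sketch with an assumed key and message:
-    #   RedisArchivist().set_message("message:download", {"status": "ok"})
-    #   RedisArchivist().get_message("message:download")
-    #   # -> {"status": "ok"}, key expires after 20 seconds by default
-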
- def list_items(self, query):
- """list all matches"""
- reply = self.conn.execute_command(
- "KEYS", self.NAME_SPACE + query + "*"
- )
-        # slice off the prefix, lstrip() would strip a character set
-        all_matches = [i.decode()[len(self.NAME_SPACE):] for i in reply]
- all_results = []
- for match in all_matches:
- json_str = self.get_message(match)
- all_results.append(json_str)
-
- return all_results
-
- def del_message(self, key):
- """delete key from redis"""
- response = self.conn.execute_command("DEL", self.NAME_SPACE + key)
- return response
-
- def get_lock(self, lock_key):
- """handle lock for task management"""
- redis_lock = self.conn.lock(self.NAME_SPACE + lock_key)
- return redis_lock
-
- def get_progress(self):
- """get a list of all progress messages"""
- all_messages = []
- for channel in self.CHANNELS:
- key = "message:" + channel
- reply = self.conn.execute_command(
- "JSON.GET", self.NAME_SPACE + key
- )
- if reply:
- json_str = json.loads(reply)
- all_messages.append(json_str)
-
- return all_messages
-
- @staticmethod
- def monitor_cache_dir(cache_dir):
- """
- look at download cache dir directly as alternative progress info
- """
- dl_cache = os.path.join(cache_dir, "download")
- all_cache_file = os.listdir(dl_cache)
- cache_file = ignore_filelist(all_cache_file)
- if cache_file:
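- # cache filenames look like "<video_id>_<title>.<ext>"; the slice strips the 11-char ID plus "_"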
- filename = cache_file[0][12:].replace("_", " ").split(".")[0]
- mess_dict = {
- "status": "message:download",
- "level": "info",
- "title": "Downloading: " + filename,
- "message": "",
- }
- else:
- return False
-
- return mess_dict
-
-
-class RedisQueue(RedisBase):
- """dynamically interact with the download queue in redis"""
-
- def __init__(self):
- super().__init__()
- self.key = self.NAME_SPACE + "dl_queue"
-
- def get_all(self):
- """return all elements in list"""
- result = self.conn.execute_command("LRANGE", self.key, 0, -1)
- all_elements = [i.decode() for i in result]
- return all_elements
-
- def add_list(self, to_add):
- """add list to queue"""
- self.conn.execute_command("RPUSH", self.key, *to_add)
-
- def add_priority(self, to_add):
- """add single video to front of queue"""
- self.clear_item(to_add)
- self.conn.execute_command("LPUSH", self.key, to_add)
-
- def get_next(self):
- """return next element in the queue, False if none"""
- result = self.conn.execute_command("LPOP", self.key)
- if not result:
- return False
-
- next_element = result.decode()
- return next_element
-
- def clear(self):
- """delete list from redis"""
- self.conn.execute_command("DEL", self.key)
-
- def clear_item(self, to_clear):
- """remove single item from list if it's there"""
- self.conn.execute_command("LREM", self.key, 0, to_clear)
-
- def trim(self, size):
- """trim the queue based on settings amount"""
- self.conn.execute_command("LTRIM", self.key, 0, size)
diff --git a/tubearchivist/home/tasks.py b/tubearchivist/home/tasks.py
deleted file mode 100644
index b8f419b..0000000
--- a/tubearchivist/home/tasks.py
+++ /dev/null
@@ -1,278 +0,0 @@
-"""
-Functionality:
-- initiate celery app
-- collect tasks
-- user config changes won't get applied here
- because tasks are initiated at application start
-"""
-
-import os
-
-from celery import Celery, shared_task
-from home.apps import StartupCheck
-from home.src.download.queue import PendingList
-from home.src.download.subscriptions import (
- ChannelSubscription,
- PlaylistSubscription,
-)
-from home.src.download.thumbnails import ThumbManager, validate_thumbnails
-from home.src.download.yt_dlp_handler import VideoDownloader
-from home.src.es.index_setup import backup_all_indexes, restore_from_backup
-from home.src.index.channel import YoutubeChannel
-from home.src.index.filesystem import (
- ManualImport,
- reindex_old_documents,
- scan_filesystem,
-)
-from home.src.ta.config import AppConfig, ScheduleBuilder
-from home.src.ta.helper import UrlListParser
-from home.src.ta.ta_redis import RedisArchivist, RedisQueue
-
-CONFIG = AppConfig().config
-REDIS_HOST = os.environ.get("REDIS_HOST")
-REDIS_PORT = os.environ.get("REDIS_PORT") or 6379
-
-os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings")
-app = Celery("tasks", broker=f"redis://{REDIS_HOST}:{REDIS_PORT}")
-app.config_from_object("django.conf:settings", namespace="ta:")
-app.autodiscover_tasks()
-app.conf.timezone = os.environ.get("TZ") or "UTC"
-
-
-@shared_task(name="update_subscribed")
-def update_subscribed():
- """look for missing videos and add to pending"""
- message = {
- "status": "message:rescan",
- "level": "info",
- "title": "Rescanning channels and playlists.",
- "message": "Looking for new videos.",
- }
- RedisArchivist().set_message("message:rescan", message)
-
- have_lock = False
- my_lock = RedisArchivist().get_lock("rescan")
-
- try:
- have_lock = my_lock.acquire(blocking=False)
- if have_lock:
- channel_handler = ChannelSubscription()
- missing_from_channels = channel_handler.find_missing()
- playlist_handler = PlaylistSubscription()
- missing_from_playlists = playlist_handler.find_missing()
- missing = missing_from_channels + missing_from_playlists
- if missing:
- youtube_ids = [{"type": "video", "url": i} for i in missing]
- pending_handler = PendingList(youtube_ids=youtube_ids)
- pending_handler.parse_url_list()
- pending_handler.add_to_pending()
-
- else:
- print("Did not acquire rescan lock.")
-
- finally:
- if have_lock:
- my_lock.release()
-
-
-@shared_task(name="download_pending")
-def download_pending():
- """download latest pending videos"""
- have_lock = False
- my_lock = RedisArchivist().get_lock("downloading")
-
- try:
- have_lock = my_lock.acquire(blocking=False)
- if have_lock:
- downloader = VideoDownloader()
- downloader.add_pending()
- downloader.run_queue()
- else:
- print("Did not acquire download lock.")
-
- finally:
- if have_lock:
- my_lock.release()
-
-
-@shared_task
-def download_single(youtube_id):
- """start download single video now"""
- queue = RedisQueue()
- queue.add_priority(youtube_id)
- print("Added to queue with priority: " + youtube_id)
- # start queue if needed
- have_lock = False
- my_lock = RedisArchivist().get_lock("downloading")
-
- try:
- have_lock = my_lock.acquire(blocking=False)
- if have_lock:
- mess_dict = {
- "status": "message:download",
- "level": "info",
- "title": "Download single video",
- "message": "processing",
- }
- RedisArchivist().set_message("message:download", mess_dict)
- VideoDownloader().run_queue()
- else:
- print("Download queue already running.")
-
- finally:
- # release if only single run
- if have_lock and not queue.get_next():
- my_lock.release()
-
-
-@shared_task
-def extrac_dl(youtube_ids):
- """parse list passed and add to pending"""
- pending_handler = PendingList(youtube_ids=youtube_ids)
- pending_handler.parse_url_list()
- pending_handler.add_to_pending()
-
-
-@shared_task(name="check_reindex")
-def check_reindex():
- """run the reindex main command"""
- reindex_old_documents()
-
-
-@shared_task
-def run_manual_import():
- """called from settings page, to go through import folder"""
- print("starting media file import")
- have_lock = False
- my_lock = RedisArchivist().get_lock("manual_import")
-
- try:
- have_lock = my_lock.acquire(blocking=False)
- if have_lock:
- import_handler = ManualImport()
- if import_handler.identified:
- all_videos_added = import_handler.process_import()
- ThumbManager().download_vid(all_videos_added)
- else:
- print("Did not acquire lock form import.")
-
- finally:
- if have_lock:
- my_lock.release()
-
-
-@shared_task(name="run_backup")
-def run_backup(reason="auto"):
- """called from settings page, dump backup to zip file"""
- backup_all_indexes(reason)
- print("backup finished")
-
-
-@shared_task
-def run_restore_backup(filename):
- """called from settings page, dump backup to zip file"""
- restore_from_backup(filename)
- print("index restore finished")
-
-
-def kill_dl(task_id):
- """kill download worker task by ID"""
- if task_id:
- app.control.revoke(task_id, terminate=True)
-
- _ = RedisArchivist().del_message("dl_queue_id")
- RedisQueue().clear()
-
- # clear cache
- cache_dir = os.path.join(CONFIG["application"]["cache_dir"], "download")
- for cached in os.listdir(cache_dir):
- to_delete = os.path.join(cache_dir, cached)
- os.remove(to_delete)
-
- # notify
- mess_dict = {
- "status": "message:download",
- "level": "error",
- "title": "Canceling download process",
- "message": "Canceling download queue now.",
- }
- RedisArchivist().set_message("message:download", mess_dict)
-
-
-@shared_task
-def rescan_filesystem():
- """check the media folder for mismatches"""
- scan_filesystem()
- validate_thumbnails()
-
-
-@shared_task(name="thumbnail_check")
-def thumbnail_check():
- """validate thumbnails"""
- validate_thumbnails()
-
-
-@shared_task
-def re_sync_thumbs():
- """sync thumbnails to mediafiles"""
- handler = ThumbManager()
- video_list = handler.get_thumb_list()
- handler.write_all_thumbs(video_list)
-
-
-@shared_task
-def subscribe_to(url_str):
- """take a list of urls to subscribe to"""
- to_subscribe_list = UrlListParser(url_str).process_list()
- counter = 1
- for item in to_subscribe_list:
- to_sub_id = item["url"]
- if item["type"] == "playlist":
- new_thumbs = PlaylistSubscription().process_url_str([item])
- if new_thumbs:
- ThumbManager().download_playlist(new_thumbs)
- continue
-
- if item["type"] == "video":
- vid_details = PendingList().get_youtube_details(to_sub_id)
- channel_id_sub = vid_details["channel_id"]
- elif item["type"] == "channel":
- channel_id_sub = to_sub_id
- else:
- raise ValueError("failed to subscribe to: " + to_sub_id)
-
- ChannelSubscription().change_subscribe(
- channel_id_sub, channel_subscribed=True
- )
- # notify for channels
- message = {
- "status": "message:subchannel",
- "level": "info",
- "title": "Subscribing to Channels",
- "message": f"Processing {counter} of {len(to_subscribe_list)}",
- }
- RedisArchivist().set_message("message:subchannel", message=message)
- counter = counter + 1
-
-
-@shared_task
-def index_channel_playlists(channel_id):
- """add all playlists of channel to index"""
- channel = YoutubeChannel(channel_id)
- # notify
- mess_dict = {
- "status": "message:playlistscan",
- "level": "info",
- "title": "Looking for playlists",
- "message": f'Scanning channel "{channel.youtube_id}" in progress',
- }
- RedisArchivist().set_message("message:playlistscan", mess_dict)
- channel.index_channel_playlists()
-
-
-try:
- app.conf.beat_schedule = ScheduleBuilder().build_schedule()
-except KeyError:
- # update path from v0.0.8 to v0.0.9 to load new defaults
- StartupCheck().sync_redis_state()
- app.conf.beat_schedule = ScheduleBuilder().build_schedule()
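
Every long-running task above follows the same guard: acquire a non-blocking Redis lock, skip the run if another worker already holds it, and release in `finally`. A minimal sketch of that pattern with plain redis-py, assuming a local Redis; `process_queue` is an illustrative stand-in for the real work:

```python
import redis

conn = redis.Redis(host="localhost", port=6379)
my_lock = conn.lock("ta:downloading")


def process_queue():
    """illustrative stand-in for VideoDownloader().run_queue()"""
    print("processing download queue")


have_lock = False
try:
    # blocking=False: if another worker holds the lock, skip instead of waiting
    have_lock = my_lock.acquire(blocking=False)
    if have_lock:
        process_queue()
    else:
        print("Did not acquire download lock.")
finally:
    # only release a lock this worker actually acquired
    if have_lock:
        my_lock.release()
```
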
diff --git a/tubearchivist/home/templates/home/about.html b/tubearchivist/home/templates/home/about.html
deleted file mode 100644
index 23e01a8..0000000
--- a/tubearchivist/home/templates/home/about.html
+++ /dev/null
@@ -1,19 +0,0 @@
-{% extends "home/base.html" %}
-{% load static %}
-{% block content %}
-
-
-
- About The Tube Archivist
-
-
-
- Useful Links
-
- This project is in active and constant development; take a look at the roadmap for an overview.
-
- For any questions on what a button or a function does, you can find the up-to-date user documentation on GitHub.
-
- All contributions are welcome: open an issue for any bugs and errors, and join us on Discord to discuss details. The contributing page is a good place to get started.
-
-
-
- Donate
-
- Here are some links if you want to buy the developer a coffee. Thank you for your support!
- `;
- }
- }
- }
- var videoProgress = getVideoProgress(videoId).position;
- var videoName = videoData.data.title;
-
- var videoTag = createVideoTag(videoData, videoProgress);
-
- var playlist = '';
- var videoPlaylists = videoData.data.playlist; // Array of playlists the video is in
- if (typeof(videoPlaylists) != 'undefined') {
- var subbedPlaylists = getSubbedPlaylists(videoPlaylists); // Array of playlist the video is in that are subscribed
- if (subbedPlaylists.length != 0) {
- var playlistData = getPlaylistData(subbedPlaylists[0]); // Playlist data for first subscribed playlist
- var playlistId = playlistData.playlist_id;
- var playlistName = playlistData.playlist_name;
- var playlist = `
- `;
- const divPlayer = document.getElementById("player");
- divPlayer.innerHTML = markup;
-}
-
-// Add video tag to video page when passed a video id, function loaded on page load `video.html (115-117)`
-function insertVideoTag(videoData, videoProgress) {
- var videoTag = createVideoTag(videoData, videoProgress);
- var videoMain = document.getElementsByClassName("video-main");
- videoMain[0].innerHTML = videoTag;
-}
-
-// Generates a video tag with subtitles when passed videoData and videoProgress.
-function createVideoTag(videoData, videoProgress) {
- var videoId = videoData.data.youtube_id;
- var videoUrl = videoData.data.media_url;
- var videoThumbUrl = videoData.data.vid_thumb_url;
- var subtitles = '';
- var videoSubtitles = videoData.data.subtitles; // Array of subtitles
- if (typeof(videoSubtitles) != 'undefined' && videoData.config.downloads.subtitle) {
- for (var i = 0; i < videoSubtitles.length; i++) {
- let label = videoSubtitles[i].name;
- if (videoSubtitles[i].source == "auto") {
- label += " - auto";
- }
- subtitles += `