
![Tube Archivist](assets/tube-archivist-banner.jpg?raw=true "Tube Archivist Banner")  

<center><h1>Your self hosted YouTube media server</h1></center>

## Table of contents:
* [Wiki](https://github.com/bbilly1/tubearchivist/wiki) for detailed documentation
* [Core functionality](#core-functionality)
* [Screenshots](#screenshots)
* [Problem Tube Archivist tries to solve](#problem-tube-archivist-tries-to-solve)
* [Installing and updating](#installing-and-updating)
* [Getting Started](#getting-started)
* [Potential pitfalls](#potential-pitfalls)
* [Roadmap](#roadmap)
* [Known limitations](#known-limitations)
* [Donate](#donate)

------------------------

## Core functionality
* Subscribe to your favorite YouTube channels
* Download Videos using **yt-dlp**
* Index and make videos searchable
* Play videos
* Keep track of viewed and unviewed videos

## Screenshots
![home screenshot](assets/tube-archivist-screenshot-home.png?raw=true "Tube Archivist Home")  
*Home Page*

![channels screenshot](assets/tube-archivist-screenshot-channels.png?raw=true "Tube Archivist Channels")  
*All Channels*

![single channel screenshot](assets/tube-archivist-screenshot-single-channel.png?raw=true "Tube Archivist Single Channel")  
*Single Channel*

![video page screenshot](assets/tube-archivist-screenshot-video.png?raw=true "Tube Archivist Video Page")  
*Video Page*

![downloads page screenshot](assets/tube-archivist-screenshot-download.png?raw=true "Tube Archivist Downloads Page")  
*Downloads Page*
  
## Problem Tube Archivist tries to solve
Once your YouTube video collection grows, it becomes hard to search and find a specific video. That's where Tube Archivist comes in: By indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos without hassle offline through a convenient web interface.

## Installing and updating
Take a look at the example `docker-compose.yml` file provided. Tube Archivist depends on three main components split up into separate docker containers:  

### Tube Archivist
The main Python application that displays and serves your video collection, built with Django.
  - Serves the interface on port `8000`
  - Needs a volume for the video archive at **/youtube**
  - And another volume to save application data at **/cache**.
  - The environment variables `ES_URL` and `REDIS_HOST` are needed to tell Tube Archivist where Elasticsearch and Redis respectively are located.
  - The environment variables `HOST_UID` and `HOST_GID` allow Tube Archivist to `chown` the video files to the main host system user instead of the container user. Both variables are optional; not setting them disables that functionality, which might be needed if the underlying filesystem, such as *NFS*, doesn't support `chown`.
  - Change the environment variables `TA_USERNAME` and `TA_PASSWORD` to create the initial credentials. 
  - `ELASTIC_PASSWORD` sets the password for Elasticsearch. The environment variable `ELASTIC_USER` is optional, should you want to change the username from the default *elastic*.
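
The variables listed above can be cross-checked against your compose file before the first start. The following is a minimal sketch, assuming a POSIX shell; the function name `missing_ta_vars` is hypothetical and not part of Tube Archivist, and treating these five variables as the non-optional ones is an assumption drawn from the list above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Tube Archivist: print every variable
# described above that is not mentioned anywhere in the given compose file.
missing_ta_vars() {
    compose_file="$1"
    for name in ES_URL REDIS_HOST TA_USERNAME TA_PASSWORD ELASTIC_PASSWORD; do
        # grep -q prints nothing and only reports through its exit status.
        grep -q "$name" "$compose_file" || echo "$name"
    done
}

# Check the compose file in the current directory, if there is one.
if [ -f docker-compose.yml ]; then
    missing_ta_vars docker-compose.yml
fi
```

An empty result means every name appears somewhere in the file. It does not verify that the values are correct.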

### Elasticsearch
Stores video metadata and makes everything searchable. Also keeps track of the download queue.
  - Needs to be accessible over the default port `9200`
  - Needs a volume at **/usr/share/elasticsearch/data** to store data

Follow the [documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html) for additional installation details.

### Redis JSON
Functions as a cache and temporary link between the application and the file system. Used to store and display messages and configuration variables.
  - Needs to be accessible over the default port `6379`
  - Needs a volume at **/data** to make your configuration changes permanent.

### Redis on a custom port
For some architectures it might be required to run Redis JSON on a nonstandard port. For example, to change the Redis port to **6380**, set the following values:
- Set the environment variable `REDIS_PORT=6380` on the *tubearchivist* service.
- For the *archivist-redis* service, change the ports to `6380:6380`.
- Additionally, set the following value on the *archivist-redis* service: `command: --port 6380 --loadmodule /usr/lib/redis/modules/rejson.so`

### Updating Tube Archivist
You will see the current version number of **Tube Archivist** in the footer of the interface so you can compare it with the latest release to make sure you are running the *latest and greatest*.  
* There can be breaking changes between updates. Particularly as the application grows, new environment variables or settings might be required in your docker-compose file. *Always* check the **release notes**: any breaking changes will be marked there.  
* All testing and development is done with the Elasticsearch version number as mentioned in the provided *docker-compose.yml* file. This will be updated when a new release of Elasticsearch is available. Running an older version of Elasticsearch is most likely not going to result in any issues, but it's still recommended to run the same version as mentioned.

### Alternative installation instructions:
- **arm64**: The newest Tube Archivist container is multi-arch, and so is Elasticsearch. RedisJSON doesn't offer arm builds, but you can use `bbilly1/rejson`, an unofficial rebuild for arm64.
  - NOTE: This is untested, looking for feedback.
- **Synology**: There is a [discussion thread](https://github.com/bbilly1/tubearchivist/discussions/48) with Synology installation instructions.
- **Unraid**: The three containers needed are all in the Community Applications. First install `TubeArchivist RedisJSON`, followed by `TubeArchivist ES`, and finally `TubeArchivist`. If you have Unraid-specific issues, report those to the [support thread](https://forums.unraid.net/topic/114073-support-crocs-tube-archivist/ "support thread").


## Potential pitfalls
### vm.max_map_count
**Elasticsearch** in Docker requires the host machine's kernel setting `vm.max_map_count` to be set to at least 262144.

To temporarily set the value, run:  
```
sudo sysctl -w vm.max_map_count=262144
```  

How to apply the change permanently depends on your host operating system:  
- On Ubuntu Server, for example, add `vm.max_map_count = 262144` to the file */etc/sysctl.conf*.
- On Arch based systems, create a file */etc/sysctl.d/max_map_count.conf* with the content `vm.max_map_count = 262144`. 
- On any other platform, look up in the documentation how to set kernel parameters.
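
To confirm that the setting took effect, for example after a reboot, the live value can be compared against the minimum. This is a minimal sketch, assuming a Linux host that exposes the value under `/proc/sys/vm/max_map_count`; the function name `map_count_ok` is hypothetical.

```shell
#!/bin/sh
REQUIRED_MAP_COUNT=262144

# Succeeds when the given value meets the minimum quoted above.
map_count_ok() {
    [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# Read the live value on Linux; fall back to 0 where /proc is unavailable.
current=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)

if map_count_ok "$current"; then
    echo "vm.max_map_count=$current is high enough"
else
    echo "vm.max_map_count=$current is below $REQUIRED_MAP_COUNT"
fi
```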

### Permissions for elasticsearch
If you see a message similar to `AccessDeniedException[/usr/share/elasticsearch/data/nodes]` when initially starting Elasticsearch, that means the container is not allowed to write files to the volume.  
That's most likely the case when you run `docker-compose` as an unprivileged user. To fix that issue, shut down the container and on your host machine run:
```
chown 1000:0 /path/to/mount/point
```
This will match the permissions with the **UID** and **GID** of Elasticsearch within the container and should fix the issue.
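
To see which owner the mount point currently has, before or after running `chown`, the numeric IDs can be printed. This is a minimal sketch, assuming a POSIX `ls` and `awk`; the function name `owner_ids` is hypothetical and the path is the same placeholder used above.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric owner of a path as UID:GID.
owner_ids() {
    # -d lists the directory itself, -n prints numeric IDs instead of names.
    ls -ldn "$1" | awk '{ print $3 ":" $4 }'
}

# Replace the placeholder path with your real Elasticsearch mount point.
data_dir="/path/to/mount/point"
if [ -d "$data_dir" ]; then
    echo "$data_dir is owned by $(owner_ids "$data_dir"), expected 1000:0"
fi
```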

### Disk usage
The Elasticsearch index turns *read only* once the disk usage of the container goes above 95% and stays that way until the usage drops below 90% again. Similarly, Tube Archivist will misbehave in unpredictable ways when it runs out of disk space. Some error messages appear in the logs when that happens, but it's best to make sure you have enough disk space before starting to download.
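
A quick check of the current usage can help catch this early. This is a minimal sketch, assuming a POSIX `df` and `awk`; the function name `disk_used_percent` is hypothetical, and the 90% warning level is simply borrowed from the figure above rather than taken from any Tube Archivist setting.

```shell
#!/bin/sh
# Hypothetical helper: print the used-space percentage, digits only, of the
# filesystem that holds the given path.
disk_used_percent() {
    # -P selects the portable output format: one line per filesystem,
    # with the usage percentage in the fifth column.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Point this at the host path backing your Elasticsearch volume.
target="${1:-.}"
used=$(disk_used_percent "$target")

if [ "$used" -ge 90 ]; then
    echo "warning: $target is ${used}% full"
else
    echo "$target is ${used}% full"
fi
```

Pass the path of the volume as the first argument; without one, the script checks the current directory.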

## Getting Started
1. Go through the **settings** page and look at the available options. In particular, set *Download Format* to your desired video quality before downloading. **Tube Archivist** downloads the best available quality by default. To support iOS or macOS, a compatible format must be specified, for example:
```
bestvideo[vcodec*=avc1]+bestaudio[acodec*=mp4a]/mp4
```
2. Subscribe to some of your favorite YouTube channels on the **channels** page. 
3. On the **downloads** page, click on *Rescan subscriptions* to add videos from the subscribed channels to your Download queue or click on *Add to download queue* to manually add Video IDs, links, channels or playlists.
4. Click on *Start download* and let **Tube Archivist** do its thing. 
5. Enjoy your archived collection!
  
## Roadmap
This should be considered a **minimal viable product**; there is an extensive list of future functions and improvements planned.

### Functionality
- [ ] User roles
- [ ] Create playlists
- [ ] Podcast mode to serve channel as mp3
- [ ] Implement [PyFilesystem](https://github.com/PyFilesystem/pyfilesystem2) for flexible video storage
- [ ] Optional automatic deletion of watched items after a specified time
- [ ] Subtitle download & indexing
- [X] Access control [2021-11-01]
- [X] Delete videos and channel [2021-10-16]
- [X] Add thumbnail embed option [2021-10-16]
- [X] Un-ignore videos [2021-10-03]
- [X] Dynamic download queue [2021-09-26]
- [X] Backup and restore [2021-09-22]
- [X] Scan your file system to index already downloaded videos [2021-09-14]

### UI
- [ ] Show similar videos on video page
- [ ] Multi language support
- [ ] Show total video downloaded vs total videos available in channel
- [X] Grid and list view for both channel and video list pages [2021-10-03]
- [X] Create a github wiki for user documentation [2021-10-03]


## Known limitations
- Video files created by Tube Archivist need to be **mp4** video files for best browser compatibility.
- Every limitation of **yt-dlp** will also be present in Tube Archivist. If **yt-dlp** can't download or extract a video for any reason, Tube Archivist won't be able to either.
- For now this is meant to be run in a trusted network environment. Not everything is properly authenticated.
- There is currently no flexibility in naming of the media files.


## Donate
The best donation to **Tube Archivist** is your time. Take a look at the [contribution page](CONTRIBUTING.md) to get started.  
The second best way to support the development is to provide for caffeinated beverages:
* [Paypal.me](https://paypal.me/bbilly1) for a one-time coffee
* [Paypal Subscription](https://www.paypal.com/webapps/billing/plans/subscribe?plan_id=P-03770005GR991451KMFGVPMQ) for a monthly coffee
* [ko-fi.com](https://ko-fi.com/bbilly1) for an alternative platform