merge testing into master before repo migration

2022-04-15 21:16:28 +07:00 · b819c2f723
parent 5f90d21234 fe610fdaca
commit b819c2f723
13 changed files with 94 additions and 22 deletions
--- a/README.md
+++ b/README.md
@@ -47,8 +47,8 @@
 Once your YouTube video collection grows, it becomes hard to search and find a specific video. That's where Tube Archivist comes in: By indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos without hassle offline through a convenient web interface.

 ## Connect
-- [Discord](https://discord.gg/AFwz8nE7BK): Connect with us on our brand new Discord server.
-- [r/TubeArchivist](https://www.reddit.com/r/TubeArchivist/): Join our brand new Subreddit.
+- [Discord](https://discord.gg/AFwz8nE7BK): Connect with us on our Discord server.
+- [r/TubeArchivist](https://www.reddit.com/r/TubeArchivist/): Join our Subreddit.

 ## Installing and updating
 Take a look at the example `docker-compose.yml` file provided. Use the *latest* or the named semantic version tag. The *unstable* tag is for intermediate testing and, as the name implies, is **unstable** and should not be used on your main installation but in a [testing environment](CONTRIBUTING.md).  
@@ -76,7 +76,9 @@ Should that not be an option, the Tube Archivist container takes these two addit
 Changing any of these two environment variables will change the files *nginx.conf* and *uwsgi.ini* at startup using `sed` in your container.

 ### Elasticsearch
-**Note**: Newest Tube Archivist depends on Elasticsearch version 7.17 to provide an automatic updatepath. 
+**Note**: Newest Tube Archivist depends on Elasticsearch version 7.17 to provide an automatic updatepath in the future. 
+
+Use `bbilly1/tubearchivist-es` to automatically get the recommended version, or use the official image with the version tag in the docker-compose file.

 Stores video meta data and makes everything searchable. Also keeps track of the download queue.
  - Needs to be accessible over the default port `9200`
@@ -98,7 +100,7 @@ For some architectures it might be required to run Redis JSON on a nonstandard p
 ### Updating Tube Archivist
 You will see the current version number of **Tube Archivist** in the footer of the interface so you can compare it with the latest release to make sure you are running the *latest and greatest*.  
 * There can be breaking changes between updates. Particularly as the application grows, new environment variables or settings might be required for you to set in your docker-compose file. *Always* check the **release notes**: Any breaking changes will be marked there.  
-* All testing and development is done with the Elasticsearch version number as mentioned in the provided *docker-compose.yml* file. This will be updated when a new release of Elasticsearch is available. Running an older version of Elasticsearch is most likely not going to result in any issues, but it's still recommended to run the same version as mentioned.
+* All testing and development is done with the Elasticsearch version number as mentioned in the provided *docker-compose.yml* file. This will be updated when a new release of Elasticsearch is available. Running an older version of Elasticsearch is most likely not going to result in any issues, but it's still recommended to run the same version as mentioned. Use `bbilly1/tubearchivist-es` to automatically get the recommended version.

 ### Alternative installation instructions:
 - **arm64**: The Tube Archivist container is multi arch, so is Elasticsearch. RedisJSON doesn't offer arm builds, you can use `bbilly1/rejson`, an unofficial rebuild for arm64.
--- a/deploy.sh
+++ b/deploy.sh
@@ -103,8 +103,9 @@ function validate {
 # update latest tag compatible es for set and forget
 function sync_latest_es {

-    printf "\nsync new es version:\n"
-    read -r VERSION
+    VERSION=$(grep "bbilly1/tubearchivist-es" docker-compose.yml | awk '{print $NF}')
+    printf "\nsync new ES version %s\nContinue?\n" "$VERSION"
+    read -rn 1

    if [[ $(systemctl is-active docker) != 'active' ]]; then
        echo "starting docker"
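
Note on the `grep | awk '{print $NF}'` change above: the version is now read from the last whitespace-separated field of the `bbilly1/tubearchivist-es` line in `docker-compose.yml`, which in this commit is the trailing comment ending in `7.17.2`. A minimal Python sketch of the same extraction, for illustration only (the function name is hypothetical and not part of the repository; unlike the shell pipeline it returns only the first match):

```python
from typing import Optional


def last_field_of_matching_line(text: str, needle: str) -> Optional[str]:
    """Return the last whitespace-separated field of the first line containing needle."""
    for line in text.splitlines():
        if needle in line:
            fields = line.split()
            return fields[-1] if fields else None
    return None


# The image line as it appears in docker-compose.yml after this commit.
compose_line = (
    "    image: bbilly1/tubearchivist-es"
    "         # only for amd64, or use official es 7.17.2"
)

print(last_field_of_matching_line(compose_line, "bbilly1/tubearchivist-es"))
# prints "7.17.2"
```

This only yields a version number as long as the comment on that line ends with the version, so rewording the comment would change what `sync_latest_es` picks up.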
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -11,19 +11,19 @@ services:
      - media:/youtube
      - cache:/cache
    environment:
-      - ES_URL=http://archivist-es:9200
-      - REDIS_HOST=archivist-redis
+      - ES_URL=http://archivist-es:9200     # needs protocol e.g. http and port
+      - REDIS_HOST=archivist-redis          # don't add protocol
      - HOST_UID=1000
      - HOST_GID=1000
-      - TA_USERNAME=tubearchivist
-      - TA_PASSWORD=verysecret
-      - ELASTIC_PASSWORD=verysecret
-      - TZ=America/New_York
+      - TA_USERNAME=tubearchivist           # your initial TA credentials
+      - TA_PASSWORD=verysecret              # your initial TA credentials
+      - ELASTIC_PASSWORD=verysecret         # set password for Elasticsearch
+      - TZ=America/New_York                 # set your time zone
    depends_on:
      - archivist-es
      - archivist-redis
  archivist-redis:
-    image: redislabs/rejson:latest      # For arm64 just update this line with bbilly1/rejson:latest
+    image: redislabs/rejson:latest          # for arm64 use bbilly1/rejson
    container_name: archivist-redis
    restart: always
    expose:
@@ -33,12 +33,12 @@ services:
    depends_on:
      - archivist-es
  archivist-es:
-    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.1
+    image: bbilly1/tubearchivist-es         # only for amd64, or use official es 7.17.2
    container_name: archivist-es
    restart: always
    environment:
      - "xpack.security.enabled=true"
-      - "ELASTIC_PASSWORD=verysecret"
+      - "ELASTIC_PASSWORD=verysecret"       # matching Elasticsearch password
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
@@ -46,7 +46,7 @@ services:
        soft: -1
        hard: -1
    volumes:
-      - es:/usr/share/elasticsearch/data
+      - es:/usr/share/elasticsearch/data    # check for permission error when using bind mount, see readme
    expose:
      - "9200"

--- a/docs/Channels.md
+++ b/docs/Channels.md
@@ -28,4 +28,5 @@ Each channel will get a dedicated channel detail page accessible at `/channel/<c
 Clicking on the *Configure* button will open a form with options to configure settings on a per channel basis. Any configurations here will overwrite your settings from the [settings](Settings) page.
 - **Download Format**: Overwrite the download qualities for videos from this channel.
 - **Auto Delete**: Automatically delete watched videos from this channel after selected days.
-- **Index Playlists**: Automatically add all Playlists with at least a video downloaded to your index. Only do this for channels where you care about playlists as this will slow down indexing new videos for having to check which playlist this belongs to.
+- **Index Playlists**: Automatically add all Playlists with at least a video downloaded to your index. Only do this for channels where you care about playlists as this will slow down indexing new videos for having to check which playlist this belongs to.
+- **SponsorBlock**: Using [SponsorBlock](https://sponsor.ajay.app/) to get and skip sponsored content. Customize per channel: You can *disable* or *enable* SponsorBlock for certain channels only to overwrite the behavior set on the [Settings](settings) page. Selecting *unset* will remove the overwrite and your setting will fall back to the default on the settings page.
--- a/docs/Settings.md
+++ b/docs/Settings.md
@@ -36,6 +36,7 @@ Additional settings passed to yt-dlp.
 All third party integrations of TubeArchivist will **always** be *opt in*.
 - **API**: Your access token for the Tube Archivist API.
 - **returnyoutubedislike.com**: This will get dislikes and average ratings for each video by integrating with the API from [returnyoutubedislike.com](https://www.returnyoutubedislike.com/).
+- **SponsorBlock**: Using [SponsorBlock](https://sponsor.ajay.app/) to get and skip sponsored content. If a video doesn't have timestamps, or has unlocked timestamps, use the browser addon to contribute to this excellent project. Can also be activated and deactivated as a per [channel overwrite](Settings#channel-customize).
 - **Cast**: Enabling the cast integration in the settings page will load an additional JS library from **Google**.
    * Requirements
        - HTTPS
--- a/tubearchivist/home/src/download/yt_dlp_handler.py
+++ b/tubearchivist/home/src/download/yt_dlp_handler.py
@@ -181,7 +181,22 @@ class VideoDownloader:
                youtube_id, video_overwrites=self.video_overwrites
            )
            self.channels.add(vid_dict["channel"]["channel_id"])
+            mess_dict = {
+                "status": "message:download",
+                "level": "info",
+                "title": "Moving....",
+                "message": "Moving downloaded file to storage folder",
+            }
+            RedisArchivist().set_message("message:download", mess_dict, False)
+
            self.move_to_archive(vid_dict)
+            mess_dict = {
+                "status": "message:download",
+                "level": "info",
+                "title": "Completed",
+                "message": "",
+            }
+            RedisArchivist().set_message("message:download", mess_dict, 4)
            self._delete_from_pending(youtube_id)

        # post processing
--- a/tubearchivist/home/src/es/index_mapping.json
+++ b/tubearchivist/home/src/es/index_mapping.json
@@ -193,6 +193,22 @@
                        }
                    }
                },
+                "stats" : {
+                    "properties" : {
+                        "average_rating" : {
+                            "type" : "float"
+                        },
+                        "dislike_count" : {
+                            "type" : "long"
+                        },
+                        "like_count" : {
+                            "type" : "long"
+                        },
+                        "view_count" : {
+                            "type" : "long"
+                        }
+                    }
+                },
                "subtitles": {
                    "properties": {
                        "ext": {
@@ -229,6 +245,31 @@
                        },
                        "is_enabled": {
                            "type": "boolean"
+                        },
+                        "segments" : {
+                            "properties" : {
+                                "UUID" : {
+                                    "type": "keyword"
+                                },
+                                "actionType" : {
+                                    "type": "keyword"
+                                },
+                                "category" : {
+                                    "type": "keyword"
+                                },
+                                "locked" : {
+                                    "type" : "short"
+                                },
+                                "segment" : {
+                                    "type" : "float"
+                                },
+                                "videoDuration" : {
+                                    "type" : "float"
+                                },
+                                "votes" : {
+                                    "type" : "long"
+                                }
+                            }
                        }
                    }
                }
--- a/tubearchivist/home/src/es/index_setup.py
+++ b/tubearchivist/home/src/es/index_setup.py
@@ -377,6 +377,7 @@ def backup_all_indexes(reason):

    for index in backup_handler.index_config:
        index_name = index["index_name"]
+        print(f"backup: export in progress for {index_name}")
        if not backup_handler.index_exists(index_name):
            continue
        all_results = backup_handler.get_all_documents(index_name)
--- a/tubearchivist/home/src/frontend/forms.py
+++ b/tubearchivist/home/src/frontend/forms.py
@@ -202,6 +202,7 @@ class ChannelOverwriteForm(forms.Form):
        ("", "-- change sponsorblock integrations"),
        ("disable", "disable sponsorblock integration"),
        ("1", "enable sponsorblock integration"),
+        ("0", "unset sponsorblock integration"),
    ]

    download_format = forms.CharField(label=False, required=False)
--- a/tubearchivist/home/src/index/generic.py
+++ b/tubearchivist/home/src/index/generic.py
@@ -133,7 +133,7 @@ class Pagination:
        """validate pagination with total_hits after making api call"""
        page_get = self.page_get
        max_pages = math.ceil(total_hits / self.page_size)
-        if total_hits > 10000:
+        if total_hits >= 10000:
            # es returns maximal 10000 results
            self.pagination["max_hits"] = True
            max_pages = max_pages - 1
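
Note on the `>` to `>=` change above: by default Elasticsearch reports at most 10000 as the total hit count, so if `total_hits` comes from that reported value it can equal 10000 but never exceed it, and the old `>` check would not fire. Whether that is the reason for this change is an assumption. A standalone sketch of the validation step as it reads after the commit (the function name, return shape, and page size of 12 are hypothetical; the real method stores the result on `self.pagination`):

```python
import math

ES_MAX_HITS = 10000  # limit named in the diff comment: "es returns maximal 10000 results"


def validate_max_pages(total_hits: int, page_size: int) -> dict:
    """Compute the page count and flag result sets that hit the Elasticsearch limit."""
    max_pages = math.ceil(total_hits / page_size)
    max_hits = False
    if total_hits >= ES_MAX_HITS:
        # the last page would reach past the result window, so drop it
        max_hits = True
        max_pages = max_pages - 1
    return {"max_pages": max_pages, "max_hits": max_hits}


print(validate_max_pages(9999, 12))   # below the limit: no flag, 834 pages
print(validate_max_pages(10000, 12))  # at the limit: flagged, 833 pages
```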
--- a/tubearchivist/home/src/index/reindex.py
+++ b/tubearchivist/home/src/index/reindex.py
@@ -41,6 +41,9 @@ class Reindex:
        """get daily refresh values"""
        total_videos = self._get_total_hits("ta_video")
        video_daily = ceil(total_videos / self.interval * self.MULTIPLY)
+        if video_daily >= 10000:
+            video_daily = 9999
+
        total_channels = self._get_total_hits("ta_channel")
        channel_daily = ceil(total_channels / self.interval * self.MULTIPLY)
        total_playlists = self._get_total_hits("ta_playlist")
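
Note on the clamp added above: the daily video refresh count is now limited to 9999, presumably so a single query for that many documents stays under the 10000 result limit mentioned in `generic.py`. A standalone sketch of the calculation, with hypothetical values for `interval` and `MULTIPLY` since the diff does not show them:

```python
from math import ceil


def daily_refresh_count(total: int, interval: int, multiply: float) -> int:
    """Number of documents to refresh per day, clamped below 10000 as in the diff."""
    daily = ceil(total / interval * multiply)
    if daily >= 10000:
        daily = 9999
    return daily


# interval=90 and multiply=2 are illustrative values, not taken from the source.
print(daily_refresh_count(900, 90, 2))        # small archive: 20
print(daily_refresh_count(1_000_000, 90, 2))  # large archive: clamped to 9999
```

In the commit the clamp is applied to `video_daily` only; the channel and playlist counts are left as computed.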
--- a/tubearchivist/home/src/index/video.py
+++ b/tubearchivist/home/src/index/video.py
@@ -170,6 +170,11 @@ class SubtitleParser:

        self.all_cues = []
        for idx, event in enumerate(all_events):
+            if "dDurationMs" not in event:
+                # some events won't have a duration
+                print(f"failed to parse event without duration: {event}")
+                continue
+
            cue = {
                "start": self._ms_conv(event["tStartMs"]),
                "end": self._ms_conv(event["tStartMs"] + event["dDurationMs"]),
@@ -412,16 +417,15 @@ class YoutubeVideo(YouTubeItem, YoutubeSubtitle):

    def _check_get_sb(self):
        """check if need to run sponsor block"""
-        integrate = False
-        if self.config["downloads"]["integrate_sponsorblock"]:
-            integrate = True
+        integrate = self.config["downloads"]["integrate_sponsorblock"]

        if self.video_overwrites:
            single_overwrite = self.video_overwrites.get(self.youtube_id)
            if not single_overwrite:
                return integrate

-            integrate = single_overwrite.get("integrate_sponsorblock", False)
+            if "integrate_sponsorblock" in single_overwrite:
+                return single_overwrite.get("integrate_sponsorblock")

        return integrate
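
Note on the `_check_get_sb` change above: before this commit, a video with any overwrite entry that lacked the `integrate_sponsorblock` key resolved to `False`, even when SponsorBlock was enabled globally. After the change, the global setting applies unless the key is explicitly present. A standalone sketch of the new resolution order (the free function and its parameters are hypothetical; the real method reads `self.config` and `self.video_overwrites`):

```python
def resolve_sponsorblock(global_setting, video_overwrites, youtube_id):
    """Return the per-video overwrite if one is set, otherwise the global setting."""
    integrate = global_setting

    if video_overwrites:
        single_overwrite = video_overwrites.get(youtube_id)
        if not single_overwrite:
            return integrate

        # only an explicitly stored key overrides the global setting
        if "integrate_sponsorblock" in single_overwrite:
            return single_overwrite.get("integrate_sponsorblock")

    return integrate


# "abc" is a placeholder video id; the overwrite dicts are illustrative.
print(resolve_sponsorblock(True, None, "abc"))                                      # True
print(resolve_sponsorblock(True, {"abc": {"integrate_sponsorblock": False}}, "abc"))  # False
print(resolve_sponsorblock(True, {"abc": {"download_format": "720"}}, "abc"))         # True
```

The third call is the case the commit fixes: an overwrite that only sets another option no longer switches SponsorBlock off.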

--- a/tubearchivist/home/templates/home/channel_id.html
+++ b/tubearchivist/home/templates/home/channel_id.html
@@ -93,6 +93,8 @@
                    <p>Enable <a href="https://sponsor.ajay.app/" target="_blank">SponsorBlock</a>: <span class="settings-current">
                        {% if channel_info.channel_overwrites.integrate_sponsorblock %}
                            {{ channel_info.channel_overwrites.integrate_sponsorblock }}
+                        {% elif channel_info.channel_overwrites.integrate_sponsorblock == False %}
+                            Disabled
                        {% else %}
                            False
                        {% endif %}</span></p>