Fails on large channels #71
Comments
I've been looking for a way to make the metadata step smaller, because it includes a lot of extra information that Yark's archive format doesn't use; I'll look into download-archive. At least there's Can you send the error of the failed download? That might be a separate bug |
Thanks, that sounds great! There was a yt-dlp update today that might have helped, as I'm not seeing anything since tonight. The last instance was:
PS: {name} probably needs a |
Yup, this is really not worth a PR, so here's the line: Line 637 in d616ae9
Needs a |
Are you on yark v1.2.3? This should be fixed as of last night
Whoops yep, will add |
Yes, I updated yesterday but thought the error persisted. Sorry if that was wrong! Regardless, some sort of chunked metadata + download stage would definitely be a nice addition to reduce memory consumption and make everything smoother. BTW, I guess YouTube doesn't like parallel downloads. I don't know your stance towards lots of external dependencies, but my experience with the
before performing yt-dlp options to prevent multiple instances. |
Yep, definitely. I've purposefully let yt-dlp download using default values so far to reduce complexity in these early versions, but chunking + parallelism (if YouTube can do it) is needed. I don't mind having extra dependencies as long as they're worth it compared to the download size and the vuln risk. When downloads are being processed, Yark generates a full list of the videos to download and pipes it into yt-dlp, so hopefully it'll be easy to parallelise using yt-dlp's options or otherwise. Downloading videos is safe to stop at any time, so I think metadata is the main concern when tackling this issue, because it's all or nothing and has that issue with RAM. |
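The chunking idea discussed above could look roughly like the following. This is only a minimal sketch, not Yark's actual code: `chunked` and `process_in_chunks` are hypothetical names, and `handle_batch` stands in for whatever step would hand a batch of video IDs to yt-dlp.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def chunked(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def process_in_chunks(
    video_ids: Iterable[str],
    size: int,
    handle_batch: Callable[[List[str]], None],
) -> int:
    """Pass each chunk to `handle_batch` (hypothetical download step).

    Returns the number of batches completed, so a caller can record
    progress after every batch instead of only at the very end.
    """
    done = 0
    for batch in chunked(video_ids, size):
        handle_batch(batch)
        done += 1
    return done
```

Because `chunked` only ever holds one batch in memory, the same pattern might also help with the metadata stage, assuming the metadata can be fetched per batch rather than for the whole channel at once.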
If I have time, this should hopefully be in v1.3 in a month's time :) |
Hi, thanks for this very nice project. It's really polished and takes a lot of complexity out of yt-dlp, which is great.
I tried running yark on a couple of large-ish channels (tens of thousands of videos), and it seems to have some issues that yt-dlp also exhibits (if I recall correctly): the initial metadata download takes several hours and requires just short of 10 GB of RAM, then the subsequent downloads fail after only a handful of videos.
I haven't looked into the details, but this might be due to some download tokens expiring, or perhaps it's just insufficient retries or so. In any case, it would benefit yark enormously to keep some sort of record regarding the videos that were already downloaded and to then continue archival in chunks, instead of trying to do all in one. yt-dlp has some of this functionality with --download-archive, but that doesn't have any "comfort features", i.e. no checking, pruning, displaying, or automatic management of that resume file.
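To make the "record of already downloaded videos" idea concrete, here is a small sketch of reading such a resume file and working out what is still pending. It assumes the file holds one `extractor video_id` pair per line (for example `youtube abc123`), which is how yt-dlp's download archive is commonly laid out; treat that format, and the helper names `read_archive` and `pending_ids`, as assumptions rather than anything Yark or yt-dlp provides.

```python
from pathlib import Path
from typing import Iterable, List, Set


def read_archive(path: Path) -> Set[str]:
    """Return the video IDs recorded in an archive file.

    Assumed format: one "<extractor> <video_id>" pair per line.
    Blank or malformed lines are skipped rather than treated as errors.
    """
    if not path.exists():
        return set()
    ids: Set[str] = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if len(parts) == 2:
            ids.add(parts[1])
    return ids


def pending_ids(all_ids: Iterable[str], archive: Path) -> List[str]:
    """Return the IDs from `all_ids` that are not yet in the archive,
    preserving their original order."""
    done = read_archive(archive)
    return [video_id for video_id in all_ids if video_id not in done]
```

A wrapper along these lines could back the "checking" and "displaying" features mentioned above, since the set returned by `read_archive` can be compared against the channel's full video list at any time.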