Please include any additional information about how to reproduce the problem:
Expected Behavior
Ripping should take anti-ripping rate limiting into account: let you adjust the download rate to prevent 429 errors, notify you when files are not downloaded because of a rate-limit block (429 error), and let you re-download a URL to fetch the files that were missed when the rate limit was exceeded.
Actual Behavior
None of these three things seems to occur. Rate limiting and 429 errors are common when ripping sites, so a good ripping tool should have functions to prevent or work around 429 rate limiting.
Downloading URLs from 4chan's thebarchive gets rate limited quickly, so on average only 2 out of 3 pictures are fetched when ripping a dozen threads.
The rate-limited files are declared 'unretrievable', even though a simple wait and retry would fetch them fine; I am not sure why they are marked unretrievable. They are also tagged as completed in the final result list, even though they were never downloaded.
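For illustration, here is a minimal wait-and-retry sketch of the behaviour being asked for. This is not Ripme's actual code; the name fetchWithRetry and the fake request are hypothetical, and a real implementation would honor the server's Retry-After header rather than a fixed delay.

```java
import java.util.function.IntSupplier;

// Minimal sketch (hypothetical names, not Ripme's code): retry a request
// that returns HTTP 429, sleeping between attempts, instead of giving up.
public class RetryOn429 {
    static final int HTTP_TOO_MANY_REQUESTS = 429;

    // Calls `request` until it returns a non-429 status or attempts run out.
    static int fetchWithRetry(IntSupplier request, int maxAttempts, long waitMillis) {
        int status = request.getAsInt();
        for (int attempt = 1; attempt < maxAttempts && status == HTTP_TOO_MANY_REQUESTS; attempt++) {
            try {
                Thread.sleep(waitMillis); // wait out the rate limit before retrying
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            status = request.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) {
        // Fake server: rate-limits the first two attempts, then succeeds.
        int[] calls = {0};
        IntSupplier fakeRequest = () -> ++calls[0] <= 2 ? HTTP_TOO_MANY_REQUESTS : 200;
        System.out.println(fetchWithRetry(fakeRequest, 10, 10) + " after " + calls[0] + " attempts");
        // prints: 200 after 3 attempts
    }
}
```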
After the scrape is over, all the 429-blocked 'unretrievable' files are placed in the 'completed' history list together with the files that actually downloaded, so the tool reports that, say, 150 files succeeded. Unless you check the log or have debug mode on, you will think everything downloaded properly when it did not.
If you try to fix this by check-marking the threads and clicking the re-download button, all the files missed because of 429s are ignored: the log says "Already downloaded" even though they were not, and there is no option to re-download them.
In the configuration, the only change that reduces 429 rate-limit errors is lowering the thread count to 1, and that is not enough. I suggest adding a configurable delay between downloads, similar to gallery-dl's --sleep or --sleep-request options; that would let users stay under the 429 rate limit.
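The suggested per-download delay could be sketched roughly as below. This is an illustrative sketch only: the RequestThrottle class and its acquire method are hypothetical names, not part of Ripme; gallery-dl's --sleep is the real option being referenced.

```java
// Hypothetical sketch of the suggested feature: enforce a minimum delay
// between consecutive downloads, similar to gallery-dl's --sleep.
// RequestThrottle and acquire() are illustrative names, not Ripme's API.
public class RequestThrottle {
    private final long minIntervalMillis;
    private long lastRequestAt = 0; // 0 means no request has been made yet

    public RequestThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // Blocks until at least minIntervalMillis has passed since the last call.
    public synchronized void acquire() {
        long wait = lastRequestAt + minIntervalMillis - System.currentTimeMillis();
        if (wait > 0) {
            try {
                Thread.sleep(wait);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        lastRequestAt = System.currentTimeMillis();
    }

    public static void main(String[] args) {
        RequestThrottle throttle = new RequestThrottle(100); // e.g. a 100 ms delay setting
        long start = System.currentTimeMillis();
        for (int i = 0; i < 3; i++) {
            throttle.acquire();
            // ...perform one download here...
        }
        long elapsed = System.currentTimeMillis() - start;
        // Two of the three calls had to wait, so at least ~200 ms elapsed.
        System.out.println("3 requests took at least 200 ms: " + (elapsed >= 200));
    }
}
```

With every downloader thread sharing one throttle, requests are spaced out regardless of the thread count, which is why this works even without dropping threads to 1.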
Also, the retry option is set to 10 retries, but no retry seems to happen, so I am not sure that option is working.
Oh, my bad. I am using 2.1.9-7, the latest release; I misread it as 2.1.7 (you can confirm by looking at the green version text in my last screenshot).
stubkan changed the title from "bug/request functionality to bypass or work around rate limiting 429 errors when getting urls is absent" to "bug/request functionality to bypass or work around rate limiting 429 errors" on Apr 6, 2024.
Ripme version: 2.1.9-7 (latest release)
Java version: openjdk 17.0.10
Operating system: Ubuntu 22.04
Exact URL you were trying to rip when the problem occurred: thebarchive multiple threads