panicked at 'internal error: entered unreachable code: received unknown error (timeout) #120
Comments
Also, to be clear: it doesn't matter if I'm unable to scrape that specific page. I just want to keep geckodriver from dying. |
Huh, that's interesting. The WebDriver spec does say that "timeout" is a valid error code, specifically with the meaning:
What operation were you trying to do when this error occurred? |
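The panic in the title suggests the client hit a branch that treated "timeout" as an error code it never expected to see. A minimal sketch of the alternative, assuming a hypothetical `WdErrorKind` enum (this is not fantoccini's actual type): map the error-code strings defined by the WebDriver spec to variants, and keep anything unrecognized in a catch-all variant instead of reaching `unreachable!()`.

```rust
/// Hypothetical subset of WebDriver error codes (illustrative only,
/// not fantoccini's real error type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum WdErrorKind {
    Timeout,
    ScriptTimeout,
    NoSuchElement,
    StaleElementReference,
    InvalidSessionId,
    UnknownError,
    /// Any code this client does not know about is preserved verbatim.
    Other(String),
}

impl WdErrorKind {
    /// Map the `error` string of a WebDriver error response to a variant.
    /// Unrecognized codes become `Other` rather than causing a panic.
    fn from_code(code: &str) -> Self {
        match code {
            "timeout" => Self::Timeout,
            "script timeout" => Self::ScriptTimeout,
            "no such element" => Self::NoSuchElement,
            "stale element reference" => Self::StaleElementReference,
            "invalid session id" => Self::InvalidSessionId,
            "unknown error" => Self::UnknownError,
            other => Self::Other(other.to_owned()),
        }
    }

    /// Assumption for this sketch: only an invalid session id means the
    /// session itself is gone; everything else is reported to the caller.
    fn is_session_fatal(&self) -> bool {
        matches!(self, Self::InvalidSessionId)
    }
}
```

With this shape, `WdErrorKind::from_code("timeout")` yields `Timeout`, which a caller can match on and then decide whether to skip the page or retry.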
Sorry, at this time I don't know exactly which operation causes it, but these are the only ones I use:
client.goto(url).await?;
client.find_all(Locator::Css("a")).await?;
// Then for each <a> tag
link.attr("href").await?; |
The error suggests to me that it's the browser window that basically ends up hanging. What do you see in the window? |
Interesting thought, I'll try running it non-headless. |
Hello! I finally had some time for this again; very sorry for the late update. When running non-headless, I saw a download dialog pop up, asking me where to save a file. After that, the browser just sits there, and eventually fantoccini times out the connection to the webdriver. The webdriver itself, however, stays alive and well until my crawler reaches the finish line. So the "error" is clearly not caused by fantoccini; it simply waits until it times out because the webdriver did not respond in time. Still, if at all possible, I'd be very happy to hear some ideas on how to work around this. Could you for example:
Thanks in advance! |
After looking around a bit, I've found no clear way to disable it. In fact, it gets worse: apparently this happens with any sort of browser prompt, e.g. push notifications, downloads, printing, HTTP auth, and so on. Not all of these can be disabled (from what I've found), so whenever one of these prompts appears, fantoccini waits for a response and stays stuck until it decides to time out the connection. My ideas:
|
Okay, I've narrowed down which timeout causes fantoccini to drop the connection. It's the pageLoad timeout, which defaults to 5 minutes. I've set it to 5 seconds for testing, and I get the same error after those 5 seconds. Unless fantoccini is really being thrown out when that timeout hits, this might be a bug. The geckodriver debug output is clean. If it is a bug, would it be possible to make fantoccini simply return the error without destroying the connection? As far as I can tell, geckodriver and the session within it are still running, though I can't be sure the session is still usable. I'm only assuming (well, hoping) that a subsequent .goto() would invalidate the previous request. I've been unable to test this because I can't reconnect to the same session, but I saw #100, which I might try. |
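The behaviour asked for here (report the timeout as an ordinary error and move on to the next page) can be illustrated independently of any WebDriver client. The following is a minimal sketch using only the standard library. `fake_goto`, `goto_with_deadline`, `crawl`, and `CrawlError` are hypothetical names invented for the illustration, and `fake_goto` merely simulates a page that hangs on a prompt; none of this is fantoccini's API.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum CrawlError {
    Timeout,
    Failed(String),
}

/// Hypothetical stand-in for a navigation call. URLs containing
/// "download" simulate a page that hangs on a browser prompt.
fn fake_goto(url: &str) -> Result<String, String> {
    if url.contains("download") {
        thread::sleep(Duration::from_secs(2));
    }
    Ok(format!("loaded {}", url))
}

/// Run the navigation on a worker thread and wait at most `deadline`.
/// A missed deadline becomes `CrawlError::Timeout` instead of a panic.
fn goto_with_deadline(url: &str, deadline: Duration) -> Result<String, CrawlError> {
    let (tx, rx) = mpsc::channel();
    let owned = url.to_owned();
    thread::spawn(move || {
        // The receiver may already be gone if the deadline passed.
        let _ = tx.send(fake_goto(&owned));
    });
    match rx.recv_timeout(deadline) {
        Ok(Ok(page)) => Ok(page),
        Ok(Err(msg)) => Err(CrawlError::Failed(msg)),
        Err(RecvTimeoutError::Timeout) => Err(CrawlError::Timeout),
        Err(RecvTimeoutError::Disconnected) => {
            Err(CrawlError::Failed("worker exited".to_owned()))
        }
    }
}

/// Visit every URL, collecting loaded pages and skipped URLs separately,
/// so one hung page does not end the whole crawl.
fn crawl(urls: &[&str], deadline: Duration) -> (Vec<String>, Vec<String>) {
    let mut loaded = Vec::new();
    let mut skipped = Vec::new();
    for url in urls {
        match goto_with_deadline(url, deadline) {
            Ok(page) => loaded.push(page),
            Err(_) => skipped.push(url.to_string()),
        }
    }
    (loaded, skipped)
}
```

In an async client the same shape is usually expressed by wrapping the navigation future in a timeout combinator such as `tokio::time::timeout`. Whether the WebDriver session is still usable after such a timeout is exactly the open question in this thread, so the sketch says nothing about it.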
While crawling a website, I get this when the crawler reaches a certain page:
geckodriver also seems to die at this point, since I get the following on the subsequent pages.
Is there a way to circumvent this error? Anything I could do about it?