
Handle timeout exception from selenium and still return the page #58

Open
michelts opened this issue Mar 3, 2020 · 3 comments


michelts commented Mar 3, 2020

Hi @clemfromspace

I'm using wait_time and wait_until to wait for a page to be rendered, but sometimes the page renders in a way I'm not expecting. If I don't use wait_time, I still get the rendered content (when rendering was fast enough), but with wait_time, Selenium raises a timeout exception and Scrapy never parses the result.

I'm not sure that behavior is ever useful. I think the approach should be the opposite: handle the exception and still return whatever content was found to Scrapy, so I can at least see the snapshot or the HTML content.
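Something along these lines (a rough, untested sketch written against the middleware in the traceback below; the subclass name and the log message are mine, and the cookie/screenshot/script handling of the original middleware is omitted for brevity):

```python
from scrapy.http import HtmlResponse
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait

from scrapy_selenium import SeleniumRequest
from scrapy_selenium.middlewares import SeleniumMiddleware


class TimeoutTolerantSeleniumMiddleware(SeleniumMiddleware):
    """Like SeleniumMiddleware, but a failed wait_until still returns the page."""

    def process_request(self, request, spider):
        # Only SeleniumRequest instances are handled; everything else
        # falls through to the regular download handler.
        if not isinstance(request, SeleniumRequest):
            return None

        self.driver.get(request.url)

        if request.wait_until:
            try:
                WebDriverWait(self.driver, request.wait_time).until(
                    request.wait_until
                )
            except TimeoutException:
                # The condition never became true, but the driver still
                # holds whatever did render, so fall through and return it.
                spider.logger.warning(
                    'wait_until timed out for %s, returning the page anyway',
                    request.url,
                )

        body = str.encode(self.driver.page_source)
        request.meta.update({'driver': self.driver})
        return HtmlResponse(
            self.driver.current_url,
            body=body,
            encoding='utf-8',
            request=request,
        )
```

Registering this class in DOWNLOADER_MIDDLEWARES in place of scrapy_selenium.SeleniumMiddleware would then hand the spider the partial page instead of a failure.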

@michelts michelts changed the title Parse timeout exception from selenium and still return the page Handle timeout exception from selenium and still return the page Mar 3, 2020

michelts commented Mar 3, 2020

Just to note, the exception raised through Scrapy is:

Traceback (most recent call last):
  File ".../lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File ".../lib/python3.6/site-packages/scrapy/core/downloader/middleware.py", line 38, in process_request
    response = yield method(request=request, spider=spider)
  File ".../lib/python3.6/site-packages/scrapy_selenium/middlewares.py", line 115, in process_request
    request.wait_until
  File ".../lib/python3.6/site-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: 


dustinmichels commented Jan 29, 2021

I am also wondering how to correctly handle the TimeoutException so I can still parse the page with Scrapy even if the content doesn't load.
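One partial workaround that at least lets the spider see the failure (an untested sketch; the URL, the element id, and the errback name are placeholders): pass an errback to the SeleniumRequest. If no middleware handles the TimeoutException, it reaches the errback wrapped in a twisted Failure, so you can log it, give up gracefully, or re-yield the request. Note this does not give you the partially rendered body, which is exactly what this issue is asking for:

```python
from scrapy import Spider
from scrapy_selenium import SeleniumRequest
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC


class ExampleSpider(Spider):
    name = 'example'

    def start_requests(self):
        yield SeleniumRequest(
            url='https://example.com',
            callback=self.parse,
            errback=self.errback_timeout,
            wait_time=10,
            wait_until=EC.presence_of_element_located((By.ID, 'content')),
        )

    def parse(self, response):
        self.logger.info('Page rendered: %s', response.url)

    def errback_timeout(self, failure):
        # The TimeoutException raised inside the middleware arrives
        # here as a twisted Failure.
        if failure.check(TimeoutException):
            self.logger.warning('Timed out waiting for %s', failure.request.url)
```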


aivoric commented Sep 24, 2021

I have the same issue. In my case I want to retry a request that hit a selenium.common.exceptions.TimeoutException, but that doesn't seem to work either: Scrapy doesn't know there was a timeout, so it can't pass a response object to the retry middleware.
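A possible way around that (an untested sketch; 'myproject.middlewares' is a placeholder for your own module, and EXCEPTIONS_TO_RETRY is the attribute in the Scrapy versions current at the time of this thread): the built-in RetryMiddleware retries not only on bad response codes but also on the exception types listed in its EXCEPTIONS_TO_RETRY tuple, so a subclass that adds the selenium TimeoutException should get these requests retried even though no response object exists:

```python
# middlewares.py
from scrapy.downloadermiddlewares.retry import RetryMiddleware
from selenium.common.exceptions import TimeoutException


class SeleniumRetryMiddleware(RetryMiddleware):
    # Treat the selenium timeout like any other retryable download error.
    EXCEPTIONS_TO_RETRY = RetryMiddleware.EXCEPTIONS_TO_RETRY + (TimeoutException,)
```

```python
# settings.py
DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.retry.RetryMiddleware': None,
    'myproject.middlewares.SeleniumRetryMiddleware': 550,
    'scrapy_selenium.SeleniumMiddleware': 800,
}
```

Keeping the retry middleware at a lower priority number than the selenium middleware matters here: an exception raised in SeleniumMiddleware.process_request is only offered to the process_exception hooks of middlewares that sit before it in the chain.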
