Description
In the JavaScript version, PuppeteerCrawler has gotoOptions, which I believe lets you choose which wait_until state the navigation waits for.
https://crawlee.dev/js/api/puppeteer-crawler#PuppeteerGoToOptions
In the Python version, PlaywrightCrawler just calls page.goto with its default options, so wait_until defaults to "load".
https://github.com/apify/crawlee-python/blob/9d4ae6439c301abe7439281a5786b8f166d67623/src/crawlee/crawlers/_playwright/_playwright_crawler.py#L300C1-L301C1
Some sites take ages to load, and I would like my request_handler to run after "domcontentloaded", since I don't need the full page load to get what I need. As it is now, my request_handler is never called, because an issue on the site prevents the page from ever finishing loading.
I don't want to just increase the timeout; I want to be able to specify which options _navigate passes to goto.
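To illustrate the kind of thing I have in mind, here is a rough sketch. It is not crawlee's code: the `navigate` helper and the `goto_options` parameter are hypothetical names I made up for this example. The only real API it relies on is Playwright's `page.goto(url, wait_until=..., timeout=..., referer=...)`, where `wait_until` can be "commit", "domcontentloaded", "load", or "networkidle".

```python
import asyncio
from typing import Any, Optional


async def navigate(page: Any, url: str, goto_options: Optional[dict] = None) -> Any:
    """Hypothetical helper (not part of crawlee): forward user options to page.goto."""
    # Start from Playwright's default, made explicit here for readability.
    options = {"wait_until": "load"}
    # User-supplied options (e.g. wait_until, timeout, referer) override the default.
    options.update(goto_options or {})
    return await page.goto(url, **options)


async def main() -> None:
    # Imported lazily so the helper above does not require Playwright at import time.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        # Return as soon as the DOM is parsed instead of waiting for the load event.
        await navigate(page, "https://crawlee.dev", {"wait_until": "domcontentloaded"})
        print(await page.title())
        await browser.close()
```

Running `asyncio.run(main())` would exercise it against a real browser. In the crawler, the same idea would presumably mean accepting the options once at construction time and passing them through wherever _navigate calls goto, but how that is exposed is up to the maintainers.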