Solving Cloudflare, block_images detection #170
Comments
Your code example includes the username and password for your proxy. If they're not just example values, make sure to change them in your proxy provider settings, or people can abuse them.
Completely forgot to remove that part, thank you.
Hello, seems to be an issue with
@sebhansen I am not familiar with Zyte. Did you have images disabled on their browser as well?
I can't seem to find any specifics on whether or not they block images.
I can now press the checkbox manually when testing with the browser visible. Before, it just threw me straight back to the same challenge page. My question now: is Camoufox able to find that box and click it itself, or will I have to find the element's coordinates manually, which doesn't seem dynamic at all?
Unfortunately, support for searching within iframes is limited right now due to restrictions in how Camoufox handles content isolation (to prevent detection, cross-process iframe isolation is kept enabled). However, it should be possible to find the checkbox coordinates given the offset of the Turnstile frame or, as a more foolproof solution, to find the checkbox in a screenshot using OpenCV.
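As a rough illustration of the offset approach described above, the sketch below turns a frame's bounding box into absolute page coordinates for the checkbox. It assumes the box has the `x`/`y`/`width`/`height` shape that Playwright's `bounding_box()` returns. The helper name `checkbox_point` and the default offsets (about 30 px from the left edge, vertically centered) are guesses for illustration, not values confirmed anywhere in this thread.

```python
from typing import Mapping, Tuple


def checkbox_point(
    frame_box: Mapping[str, float],
    offset_x: float = 30.0,  # assumed distance of the checkbox from the left edge
    rel_y: float = 0.5,      # assumed vertical position as a fraction of the height
) -> Tuple[float, float]:
    """Return absolute (x, y) page coordinates for a click inside a frame."""
    if frame_box["width"] <= 0 or frame_box["height"] <= 0:
        raise ValueError("frame has no visible area")
    # Clamp both offsets so the click point never leaves the frame.
    dx = min(max(offset_x, 0.0), frame_box["width"])
    dy = frame_box["height"] * min(max(rel_y, 0.0), 1.0)
    return frame_box["x"] + dx, frame_box["y"] + dy


# Example: a widget whose top-left corner sits at (100, 200) and measures 300x65 px.
x, y = checkbox_point({"x": 100, "y": 200, "width": 300, "height": 65})
print(x, y)  # 130.0 232.5
```

The result could then be handed to `page.mouse.click(x, y)`, assuming the bounding box of the widget's container can be read in the first place.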
@daijro Do you have anything you've done with coordinates before? I just can't seem to find anything that I can use with Camoufox. I have tried using boundingBox and similar, but it isn't supported. I can show you what I have tried so far.
The issue is that it doesn't seem to behave exactly like Playwright.
Also, I just tested a new website I had issues with: https://www.nettiauto.com/vaihtoautot?posted_by=dealer I get to the same page with the Cloudflare checkbox, but this one sends me straight back to the same Cloudflare challenge, using the same config as for the other site. Here, too, Zyte is able to go straight through (Zyte is a scraping API that sends back JSON, HTML, or whatever you want). They maintain Scrapy, if you know what that is.
From my own testing, Turnstile captchas usually autosolve, but you can just click on the Turnstile frame if they don’t.
Or if the turnstile itself is not centered in the frame, you can click on a position relative to the top right of the frame:
block_images also breaks Turnstile: with it enabled, the challenge fails even if you click on it.
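A minimal sketch of the relative-click idea, not the commenter's original snippet. One caveat: Playwright's `position` option for `click()` is measured from the top-left corner of the element, so an offset counted from the right edge has to be converted first. The helper name `position_from_right` and the sample sizes are hypothetical.

```python
from typing import Dict


def position_from_right(
    width: float, height: float, from_right: float, from_top: float
) -> Dict[str, float]:
    """Convert an offset measured from the top-right corner of an element
    into a position dict measured from its top-left corner."""
    x = width - from_right
    # Reject offsets that would place the click outside the element.
    if not (0 <= x <= width and 0 <= from_top <= height):
        raise ValueError("offset falls outside the element")
    return {"x": x, "y": from_top}


# Example: a 300x65 px container, target 270 px left of the right edge.
pos = position_from_right(width=300, height=65, from_right=270, from_top=32)
print(pos)  # {'x': 30, 'y': 32}
```

With Playwright's Python API this dict could be passed as `locator.click(position=pos)`. Whether such a click actually registers inside the Turnstile frame under Camoufox is the question the rest of this thread leaves open.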
@sadikhan918 I'm not quite sure how you are able to give the locator a click function. It just doesn't want to click for me at all. If you have a full piece of code, I'd love to see how it works!
Sorry, I went back and checked my tests and noticed that I wasn't clicking on the iframe itself, but the parent container. That's why the positioning can be important, since the parent can span across the entire width of the page but the turnstile element is only a portion of it. This is some working code to solve it:
I agree with the theory behind this, but it just doesn't seem to work on anything other than that exact site. Have you tried https://sergiodemo.com/security/challenge/legacy-challenge? It's more like a real-life scenario. Sadly, everything useful is inside the #shadow-root, and I just can't seem to make it press anything or find any locators I can use.
I see what you mean. When I click on the turnstile by selecting the parent div in your example, it “clicks” but doesn’t solve the challenge. I’m not sure how to fix that, but I know that the challenge can be skipped altogether with a good proxy.
Oh for sure, sadly those will be way too expensive to keep up haha
I'll close this, as I don't think it'll be possible to solve the challenge without just clicking popular spots for the checkbox by coordinates. If anyone figures out a better way, feel free to let me know.
Website detecting Camoufox:
The "website" I am trying to reach is moneysupermarket.com. It's a link to a specific car's JSON data. In my normal code I'd simply take all the text and scrape it so I can build on it elsewhere, but it just doesn't get past Cloudflare.
Screenshots:
To Reproduce:
Simply trying to scrape all text from the link provided. The code provided just shows that we don't get past Cloudflare.
Other questions:
Is it because the WebGL lib isn't very good right now, and will this be fixed in the next release with the updated WebGL lib?
Yes, I am using datacenter proxies. This shouldn't be an issue, however, since Zyte is able to get through every time using the same proxies.
It works. It shows the Cloudflare page for a second and then goes on to show me the correct page.
Every time.
Win 11
Version:
Pip package: v0.4.9
Camoufox: v134.0.2-beta.20 (Up to date!)