Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a ChromoteSession$nagivate() method? #177

Open
gadenbuie opened this issue Sep 12, 2024 · 6 comments
Open

Create a ChromoteSession$nagivate() method? #177

gadenbuie opened this issue Sep 12, 2024 · 6 comments

Comments

@gadenbuie
Copy link
Member

gadenbuie commented Sep 12, 2024

As described in the README under Loading a page reliably, one cannot use the following pattern to reliably wait for the page to load.

# Not reliable
b$Page$navigate("https://www.r-project.org/")
b$Page$loadEventFired()  # Block until page has loaded
b$screenshot("browser.png")

Instead, we suggest an synchronous or asynchronous approach that involves correctly threading the $Page$loadEventFired() and $Page$navigate() commands.

In practice, however, most people just sys.sleep() for some reasonable or acceptable amount of time (#176).

Considering that we have a few convenience methods already, e.g. ChromoteSession$screenshot(), I'd like to propose that we offer a similar convenience method, ChromoteSession$navigate(), that automatically waits for the page load event and awaits the value (wait_ = TRUE) or returns the promise (wait_ = FALSE).

In the base case, that would simplify the above example code to just these two lines:

b$navigate("https://www.r-project.org/")
b$screenshot("browser.png")
@jonthegeek
Copy link

I think this would be very helpful! FWIW, if the page is making API calls, $Page$loadEventFired() might not be enough for the page to really be "done". I don't have it fully loaded in my memory right now, but I remember that being the source of a lot of the complexity in {apisniffer}. I never felt like what I was doing was absolutely correct, but I was using callbacks for session$Network$requestWillBeSent and session$Network$responseReceived to make sure all of the calls that fired off had received a response (or a maximum wait time was reached) before considering the page fully loaded. That might be beyond the scope of this ticket, but I wanted to bring it up (because if you CAN handle that piece, too, that would be very cool 😊 )!

@gadenbuie
Copy link
Member Author

gadenbuie commented Sep 12, 2024

Great point @jonthegeek. I like what pupeteer does in taking a predefined lifecycle event. In R would be a wait_until argument that takes one of the following values

wait_until Event
"load" Waits for the load event, i.e. b$Page$loadEventFired().
"DOMContentLoaded" Waits for the DOMContentLoaded event.
"networkIdle0" Waits until there are no network connections for at least 500ms.
"networkIdle2" Waits until there are no more than 2 network connections for at least 500ms.

I also really like the page.waitFor*() methods (listed here) and would love to see some of those make it into chromote. In particular

  • waitForSelector() – waits for a CSS selector to have matching elements
  • waitForFunction() – waits for a function to return a truthy value
  • waitForNetworkIdle() – see above
  • waitForNavigation() – wait for the page to navigate to a new URL
  • waitForRequest() – wait for a request to a URL (or request identified by a predicate function)
  • waitForResponse() – wait for a response from a URL (or a response identified by a predicate function)

@jonthegeek
Copy link

Oooh, I like waitForNetworkIdle()! I think for my specific use case I also want to wait for responses to sent events (with a timeout), but I think mixing in network idle would help deal with part of my "but is that everything?" problem.

@jonthegeek
Copy link

Oh, ha, I just saw waitForResponse() is in there, too!

@gadenbuie
Copy link
Member Author

Good to know! I hadn't included waitForResponse() in my smaller list because I wasn't sure if that would be immediately useful to chromote users. Glad to get that feedback right away. (I'll edit the list above.) I do think waitForRequest() and waitForResponse() are a package deal.

@jonthegeek
Copy link

Yeah, definitely! I'd have to understand how it works a little better; what I do right now is log responses until a certain time has passed (simulating waitForNetworkIdle(), kinda), and then check that there are corresponding response events for each of those requests. I think what I'd want to do is fire off a waitForResponse() (with a timeout) as a callback on session$Network$requestWillBeSent(), disabling that callback with waitForNetworkIdle(). That assumes waitForResponse() ties (or at least can tie) to a particular request.

I very much doubt that my use case will fall neatly onto the standard path, but having the helpers would definitely make my code feel less hacky!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants