Add info about using loadEventFired. Closes #102
wch committed May 6, 2024
1 parent 5a630b4 commit a900c46
Showing 2 changed files with 166 additions and 114 deletions.
125 changes: 74 additions & 51 deletions README.Rmd
@@ -273,6 +273,8 @@ $timestamp
[1] 699232.3
```

> **Note:** This sequence of commands, `Page$navigate()` followed by `Page$loadEventFired()`, will not work 100% of the time. See [Loading a Page Reliably](#loading-a-page-reliably) for more information.
> **Technical note:** Chromote insulates the user from some of the details of how the CDP implements event notifications. Event notifications are not sent from the browser to the R process by default; you must first send a command to enable event notifications for a domain. For example `Page.enable` enables event notifications for the `Page` domain -- the browser will send messages for _all_ `Page` events. (See the Events section in [this page](https://chromedevtools.github.io/devtools-protocol/tot/Page/)). These notifications will continue to be sent until the browser receives a `Page.disable` command.
>
> By default, Chromote hides this implementation detail. When you call `b$Page$loadEventFired()`, Chromote sends a `Page.enable` command automatically, and then waits until it receives the `Page.loadEventFired` event notification. Then it sends a `Page.disable` command.
@@ -702,51 +704,27 @@ b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)$

There will be a short delay after running the code before the value is printed.

If you want to schedule a chain of promises and then wait for them to resolve, you can once again use the `wait_for()` method. For example:
However, even this is not perfectly reliable, because in some cases the browser will navigate to the page before it receives the `loadEventFired` command from Chromote. If that happens, the load event will have already happened before the browser starts waiting for it, and the wait will hang. The correct way to deal with this is to issue the `loadEventFired` command _before_ navigating to the page, and then wait for the `loadEventFired` promise to resolve.

```R
p <- b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)$
then(function(value) {
b$Page$loadEventFired(wait_ = FALSE)
})

# wait_for returns the last value in the chain, so we can call str() on it
str(b$wait_for(p))
#> List of 1
#> $ timestamp: num 683
```

This particular example has a twist to it: After sending the `Page.navigate` command, the R process doesn't really need to wait for browser's response before it starts waiting for the `Page.loadEventFired` event. So instead of chaining, you could just do this:
# This is the correct way to wait for a page to load with async and then chain more commands
p <- b$Page$loadEventFired(wait_ = FALSE)
b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)

```R
p <- promise(function(resolve, reject) {
b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)
resolve(b$Page$loadEventFired(wait_ = FALSE))
# A promise chain of more commands after the page has loaded
p$then(function(value) {
  str(value)
})

str(b$wait_for(p))
#> List of 1
#> $ timestamp: num 683
```

Essentially, the `Page.navigate` command gets sent off and we don't need to wait for the browser's reply. We can tell R to immediately start waiting for the `Page.loadEventFired` event.

We can simplify it by not wrapping both method calls in a promise. We can just fire off the navigation command, and then directly use the promise that's returned by the event method:
If you want to block until the page has loaded, you can once again use `wait_for()`. For example:

```R
b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)
p <- b$Page$loadEventFired(wait_ = FALSE)
str(b$wait_for(p))
#> List of 1
#> $ timestamp: num 683
```

And we can make it yet simpler by firing off the navigation command and then calling `b$Page$loadEventFired()` in synchronous mode (with the default `wait_=TRUE`), which already calls `wait_for()`.

```R
b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)
x <- b$Page$loadEventFired()
str(x)

# wait_for returns the last value in the chain, so we can call str() on it
str(b$wait_for(p))
#> List of 1
#> $ timestamp: num 683
```
@@ -865,12 +843,13 @@ b$screenshot("browser_string.png", selector = ".string-major")

When you use `$view()` on the remote browser, your local browser may block scripts for security reasons, which means that you won't be able to view the remote browser. If your local browser is Chrome, there will be a shield-shaped icon in the location bar that you can click in order to enable loading the scripts. (Note: Some browsers don't seem to work at all with the viewer.)

**Technical note:** There seem to be some timing issues with remote browsers. In the example above, the browser may finish navigating to the web site before the R process receives the response message for `$navigate()`, and therefore before R starts waiting for `Page.loadEventFired`. In order to avoid these timing problems, it may be better to write code like this:
**Technical note:** There seem to be some timing issues with remote browsers. In the example above, the browser may finish navigating to the web site before the R process receives the response message for `$navigate()`, and therefore before R starts waiting for `Page.loadEventFired`. In order to avoid these timing problems, it is better to write code like this:

```R
{
p <- b$Page$loadEventFired(wait_ = FALSE)
b$Page$navigate("https://www.whatismybrowser.com/", wait_ = FALSE)
b$Page$loadEventFired()
b$wait_for(p)
}
b$screenshot("browser.png")
```
@@ -911,24 +890,68 @@ b$Runtime$evaluate('alert("this is the first tab")')

## Examples

### Loading a page reliably

In many cases, calling `Page$navigate()` and then `Page$loadEventFired()` will not reliably block until the page loads. For example:

```R
# Not reliable
b$Page$navigate("https://www.r-project.org/")
b$Page$loadEventFired() # Block until page has loaded
```

This is because the browser might successfully navigate to the page before it receives the `loadEventFired` command from R.

In order to navigate to a page reliably, you must issue the `loadEventFired` command first in async mode, then issue the `navigate` command, and then wait for the `loadEventFired` promise to resolve. (If it has already resolved at this point, then the code will continue.)

```R
# Reliable method 1: for use with synchronous API
p <- b$Page$loadEventFired(wait_ = FALSE) # Get the promise for the loadEventFired
b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)

# Block until p resolves
b$wait_for(p)

# Add more synchronous commands here
b$screenshot("browser.png")
```

The above code uses the async API to do the waiting, but then assumes that you want to write subsequent code with the synchronous API.

If you want to go fully async, then instead of calling `wait_for(p)`, you would simply chain more promises from `p`, using `$then()`.

```R
# Reliable method 2: for use with asynchronous API
p <- b$Page$loadEventFired(wait_ = FALSE) # Get the promise for the loadEventFired
b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)

# Chain more async commands after the page has loaded
p$then(function(value) {
  b$screenshot("browser.png", wait_ = FALSE)
})
```

This is explained in more detail in the [Async Events](#async-events) section.
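
The pattern above could be wrapped in a pair of small helpers. This is only a sketch: `navigate_async()` and `navigate_and_wait()` are hypothetical names that are not part of chromote, and the code assumes `b` is a `ChromoteSession` that provides the `Page$loadEventFired()`, `Page$navigate()`, and `wait_for()` methods used in the examples above.

```R
# Hypothetical helpers (not part of chromote), sketched from the pattern above.

# Async variant: returns the loadEventFired promise so that callers can
# chain more work with $then().
navigate_async <- function(b, url) {
  # Start listening for the load event BEFORE navigating, so the event
  # cannot fire before anything is waiting for it.
  p <- b$Page$loadEventFired(wait_ = FALSE)
  b$Page$navigate(url, wait_ = FALSE)
  p
}

# Synchronous variant: blocks until the load event promise resolves and
# returns the value of the event.
navigate_and_wait <- function(b, url) {
  p <- navigate_async(b, url)
  b$wait_for(p)
}

# Usage (assumes chromote is installed and a Chrome browser is available):
# b <- chromote::ChromoteSession$new()
# navigate_and_wait(b, "https://www.r-project.org/")
# b$screenshot("browser.png")
```

Keeping the ordering inside one function makes it harder to accidentally call `navigate` before the `loadEventFired` promise has been created.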


### Taking a screenshot of a web page

Take a screenshot of the viewport and display it using the [showimage](https://github.com/r-lib/showimage#readme) package. This uses Chromote's `$screenshot()` method, which wraps up many calls to the Chrome DevTools Protocol.

```R
b <- ChromoteSession$new()

# ==== Synchronous version ====
# ==== Semi-synchronous version ====
# Run the next two lines together, without any delay in between.
b$Page$navigate("https://www.r-project.org/")
b$Page$loadEventFired()

p <- b$Page$loadEventFired(wait_ = FALSE)
b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)
b$wait_for(p)
b$screenshot(show = TRUE) # Saves to screenshot.png and displays in viewer

# ==== Async version ====
p <- b$Page$loadEventFired(wait_ = FALSE)
b$Page$navigate("https://www.r-project.org/", wait_ = FALSE)
b$Page$loadEventFired(wait_ = FALSE)$
then(function(value) {
p$then(function(value) {
  b$screenshot(show = TRUE)
})
```
@@ -1012,9 +1035,9 @@ screenshot_p <- function(url, filename = NULL) {
}

b <- ChromoteSession$new()
p <- b$Page$loadEventFired(wait_ = FALSE)
b$Page$navigate(url, wait_ = FALSE)
b$Page$loadEventFired(wait_ = FALSE)$
then(function(value) {
p$then(function(value) {
b$screenshot(filename, wait_ = FALSE)
})$
then(function(value) {
@@ -1093,9 +1116,9 @@ b$screenshot(show = TRUE)

# ==== Async version ====
b$Network$setUserAgentOverride(userAgent = "My fake browser", wait_ = FALSE)
p <- b$Page$loadEventFired(wait_ = FALSE)
b$Page$navigate("http://scooterlabs.com/echo", wait_ = FALSE)
b$Page$loadEventFired(wait_ = FALSE)$
then(function(value) {
p$then(function(value) {
  b$screenshot(show = TRUE)
})
```
@@ -1116,9 +1139,9 @@ x$result$value


# ==== Async version ====
p <- b$Page$loadEventFired(wait_ = FALSE)
b$Page$navigate("https://www.whatismybrowser.com/", wait_ = FALSE)
b$Page$loadEventFired(wait_ = FALSE)$
then(function(value) {
p$then(function(value) {
b$Runtime$evaluate(
'document.querySelector(".corset .string-major a").innerText'
)
@@ -1141,9 +1164,9 @@ b$DOM$getOuterHTML(x$nodeId)


# ==== Async version ====
p <- b$Page$loadEventFired(wait_ = FALSE)
b$Page$navigate("https://www.whatismybrowser.com/", wait_ = FALSE)
b$Page$loadEventFired(wait_ = FALSE)$
then(function(value) {
p$then(function(value) {
b$DOM$getDocument()
})$
then(function(value) {