Problem with chromote package #154

Open
saleforecast1 opened this issue May 5, 2024 · 12 comments
Comments

@saleforecast1

saleforecast1 commented May 5, 2024

Dear community,

I have created a Shiny app that scrapes the page title from the Google home page ("https://google.com"). To do this, I use the R chromote package. The app works fine when run on the desktop. However, once it is hosted on shinyapps.io, two users cannot use it concurrently. The code for the app is below:

library(shiny)
library(curl)
library(chromote)
library(pagedown)

ui <- fluidPage(
  textOutput("result")
)

server <- function(session, input, output) {
  driver <- ChromoteSession$new()
  driver$Page$navigate("https://google.com") # open Google page
  Sys.sleep(7)

  output$result <- renderText(
    # scrape Google title 
    driver$Runtime$evaluate('document.querySelector("title").innerText')$result$value
  ) 
}

shinyApp(ui = ui, server = server)

Output:

  1. Click on: https://sale4cast.shinyapps.io/findGoogleTitle/
  2. Wait 5 seconds.
  3. Get the google title "Google"

Question: How can two users access the app concurrently via shinyapps.io?

Best Regards,

SaleForecast

@wch
Collaborator

wch commented May 6, 2024

I think the problem is the use of Sys.sleep(). That will block the entire R process.

You should do something like this:

library(shiny)
library(chromote)

ui <- fluidPage(
  textOutput("result")
)

server <- function(session, input, output) {
  driver <- ChromoteSession$new()
  p <- driver$Page$loadEventFired(wait_ = FALSE)
  driver$Page$navigate("https://google.com", wait_ = FALSE)

  output$result <- renderText({
    p$then(function(value) {
      # scrape Google title 
      driver$Runtime$evaluate('document.querySelector("title").innerText')$result$value
    })
  }) 
}

shinyApp(ui, server)

To properly navigate to a page and wait for it to load without blocking the R process, see this section of the README:
https://github.com/rstudio/chromote?tab=readme-ov-file#taking-a-screenshot-of-a-web-page

The example above also makes use of Promises in Shiny. See here for more information:
https://rstudio.github.io/promises/articles/promises_06_shiny.html

@saleforecast1
Author

saleforecast1 commented May 6, 2024 via email

@gadenbuie
Member

@saleforecast1 Your app currently does work on shinyapps.io. Maybe your example doesn't completely reproduce your issue or I don't understand what you mean by "work" or "two users can not use it concurrently". But if I open https://sale4cast.shinyapps.io/findGoogleTitle/ in two different tabs or browsers, they both eventually (after about 7 seconds) show me the word "Google".

@aronatkins

This question was also cross-posted to https://forum.posit.co/t/problem-with-chromote-package/186346

@saleforecast1
Author

saleforecast1 commented May 8, 2024 via email

@wch
Collaborator

wch commented May 8, 2024

Have you tried the code that I provided? The problem is that your Sys.sleep() blocks the entire process.

@gadenbuie
Member

Oh, in that case, what Winston said is exactly right:

I think the problem is the use of Sys.sleep(). That will block the entire R process.

If you put Sys.sleep(7) in your app, it causes your app to wait 7 seconds. Sys.sleep() blocks R from doing anything until it finishes. If you open a second tab with the app while the first tab is processing, the second tab has to wait for the first user's app to finish loading, and then has to wait 7 more seconds.

Here's a simple diagram outlining the interaction.

sequenceDiagram
    User 1->>+Shiny: Opens app
    User 2-->Shiny: Opens app
    Shiny-->>-User 1: responds after 7s
    activate Shiny
    Note over Shiny: starts user 2 request
    Shiny-->>-User 2: responds after 7+s 

To fix it please follow Winston's guidance:

To properly navigate to a page and wait for it to load without blocking the R process, see this section of the README: rstudio/chromote#taking-a-screenshot-of-a-web-page

The example above also makes use of Promises in Shiny. See here for more information: rstudio.github.io/promises/articles/promises_06_shiny.html
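
To illustrate the difference between blocking and scheduling, here is a minimal sketch that uses only the later package (no Shiny, no Chrome). The "user 1" and "user 2" labels are purely illustrative and stand in for the two sessions in the diagram. Instead of sleeping, the first piece of work is scheduled for later, so the second piece is handled right away.

```r
library(later)

events <- character()

# "User 1": schedule the follow-up work 0.2 seconds from now instead of
# calling Sys.sleep(), so the R process stays free in the meantime.
later::later(function() {
  events <<- c(events, "user 1 done")
}, delay = 0.2)

# "User 2" is handled immediately because nothing is blocking.
events <- c(events, "user 2 served")

# In a plain script nothing runs the event loop automatically, so drain it
# by hand here. Inside a running Shiny app this step is not needed.
while (!later::loop_empty()) later::run_now(timeoutSecs = 1)

print(events)
```

With `Sys.sleep(0.2)` in place of the `later::later()` call, "user 2" could not be recorded until the sleep had finished, which is the situation the diagram describes.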

@saleforecast1
Author

saleforecast1 commented May 12, 2024

library(shiny)
library(curl)
library(chromote)
library(pagedown)

ui <- fluidPage(
  textOutput("result")
)

server <- function(session, input, output) {
  driver <- ChromoteSession$new()
  p <- driver$Page$loadEventFired(wait_ = FALSE)
  driver$Page$navigate("https://google.com", wait_ = FALSE)
  
  p$then(function(value){
    googleSearchText <- "4 star hotel in barcelona"
    driver$Runtime$evaluate(paste0('document.querySelector("textarea").value = "', googleSearchText,'"'))
    driver$Runtime$evaluate('document.querySelector("input[aria-label=\'Google Search\']").click()')
  })$then(function(value){
    print(driver$Runtime$evaluate('document.querySelector("title").innerText'))
  })
}

shinyApp(ui, server)

@wch, could you please explain why this code doesn't return the title? It returns the error "TypeError: Cannot read properties of null (reading 'innerText')\n at :1:32"

@wch
Collaborator

wch commented May 14, 2024

It sounds like the document.querySelector('title') isn't returning anything.

I think the problem is that clicking on the search button causes another page load, and when you grab the <title> in the middle of that page load, it might be happening too early, before there is a <title> element.

I believe that you'll have to wait for another loadEventFired inside of the promise chain.

library(shiny)
library(chromote)

ui <- fluidPage(
  textOutput("result")
)

server <- function(session, input, output) {
  driver <- ChromoteSession$new()
  p <- driver$Page$loadEventFired(wait_ = FALSE)
  driver$Page$navigate("https://google.com", wait_ = FALSE)

  p$then(function(value){
    googleSearchText <- "4 star hotel in barcelona"
    p2 <- driver$Page$loadEventFired(wait_ = FALSE)
    driver$Runtime$evaluate(paste0('document.querySelector("textarea").value = "', googleSearchText,'"'))
    driver$Runtime$evaluate('document.querySelector("input[aria-label=\'Google Search\']").click()')
    p2
  })$then(function(value){
    v <- driver$Runtime$evaluate('document.querySelector("title").innerText')
    print(v)
  })
}

shinyApp(ui, server)

Note that p2 is created inside the first $then() function, and then it is returned from that function. The way that promises work, this means that the next function that's chained with $then() will wait until that promise resolves before it runs. See the docs for the promises package for more information on how promises work. The API is very similar to JavaScript promises.
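
To make the chaining rule concrete, here is a minimal sketch that uses only the promises and later packages, with no browser involved. `fake_load_event()` is a hypothetical stand-in for `driver$Page$loadEventFired(wait_ = FALSE)`; it simply resolves after a short delay. The point is that the second `$then()` callback does not run until the promise returned from the first callback has resolved.

```r
library(promises)
library(later)

# Hypothetical stand-in for a chromote event promise: resolves with `label`
# after `secs` seconds, without blocking R.
fake_load_event <- function(label, secs) {
  promise(function(resolve, reject) {
    later::later(function() resolve(label), delay = secs)
  })
}

seen <- character()

p <- fake_load_event("first page", 0.1)
chain <- p$then(function(value) {
  seen <<- c(seen, paste("loaded:", value))
  p2 <- fake_load_event("second page", 0.2)
  p2  # returning the promise makes the next $then() wait for it
})$then(function(value) {
  # `value` is whatever p2 resolved with
  seen <<- c(seen, paste("loaded:", value))
})

# Drain the event loop by hand; a running Shiny app does this itself.
while (!later::loop_empty()) later::run_now(timeoutSecs = 1)

print(seen)
```

If the first callback did not return `p2`, the second callback would run as soon as the first one finished, without waiting for the second "page load".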

One other thing I want to mention: the code you started with uses a mix of sync and async programming, and calls to synchronous Chromote functions inside of asynchronous functions. It works in this case but might do unexpected things for more complicated code. It's probably best to stick to just async code for complex tasks, but that will require a good understanding of how these promises work.

@saleforecast1
Author

saleforecast1 commented May 18, 2024

Thanks for your response, @wch. I really appreciate your answer, and it works great. However, I still get an error when I run this app from multiple devices via shinyapps.io.

Error:
"Unhandled promise error: Chromote: timed out waiting for response to command Page.disable"
"Unhandled promise error: Chromote: timed out waiting for event Page.loadEventFired"

Code:

library(shiny)
library(curl)
library(chromote)
library(pagedown)

ui <- fluidPage(
  tableOutput("result")
)

server <- function(session, input, output) {
  driver <- ChromoteSession$new()

  p <- driver$Page$loadEventFired(wait_ = FALSE)
  driver$Page$navigate("https://google.com", wait_ = FALSE)
  
  p$then(function(value){
    googleSearchText <- "4 star hotel in barcelona"
    p2 <- driver$Page$loadEventFired(wait_ = FALSE)
    
    driver$Runtime$evaluate(paste0('document.querySelector("textarea").value = "', googleSearchText,'"'))
    driver$Runtime$evaluate('document.querySelector("input[aria-label=\'Google Search\']").click()')
    p2
  })$then(function(value){
    p3 <- driver$Page$loadEventFired(wait_ = FALSE)
    driver$Runtime$evaluate('document.querySelector("div.R2w7Jd").click()')
    driver$Runtime$evaluate('document.querySelector("div.JWXKNd").click()')
    p3
  })$then(function(value){
    priceElement <- driver$Runtime$evaluate(
      'var elements = document.querySelectorAll(".K1smNd > c-wiz[jsrenderer=\'hAbFdb\'] .PwV1Ac");
                 var elementPrices = [];
                 elements.forEach(function(element) {
                   elementPrices.push(element.innerText);
                 });
                 elementPrices.join("@");'
    )
    print(priceElement)
  })
}


shinyApp(ui, server)

Can you please help me sort out the problem?

@wch
Collaborator

wch commented May 22, 2024

I don't know for sure, but my guess would be that there's not enough time between the two click() commands in the block with p3.

@saleforecast1
Author

Thanks for your response, @wch. Can you please explain how to ensure enough time between the two click() events?
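
One possible way to put a pause between two steps without blocking R is to turn the pause into a promise and chain it between them. The sketch below is an assumption-laden illustration, not a confirmed fix for the timeouts above: `delay()` and `click_with_pause()` are hypothetical helpers (neither is part of chromote or promises), and the two "clicks" are stand-in functions so the example runs without Chrome.

```r
library(promises)
library(later)

# Hypothetical helper: a promise that resolves after `secs` seconds
# without blocking the R process.
delay <- function(secs) {
  promise(function(resolve, reject) {
    later::later(function() resolve(NULL), delay = secs)
  })
}

# Hypothetical helper: run first_click(), pause, then run second_click().
# Each argument is a function that returns a value or a promise.
click_with_pause <- function(first_click, second_click, pause_secs) {
  p <- promise_resolve(first_click())
  p <- p$then(function(value) delay(pause_secs))
  p$then(function(value) second_click())
}

# Stand-in for a real click: records when it was called.
times <- numeric()
fake_click <- function() {
  times <<- c(times, as.numeric(Sys.time()))
  TRUE
}

done <- click_with_pause(fake_click, fake_click, pause_secs = 0.3)

# Drain the event loop by hand; a running Shiny app does this itself.
while (!later::loop_empty()) later::run_now(timeoutSecs = 1)

gap <- diff(times)
print(gap)  # time between the two stand-in clicks, in seconds
```

In the app, the stand-ins would presumably be replaced by functions such as `function() driver$Runtime$evaluate('document.querySelector("div.R2w7Jd").click()', wait_ = FALSE)`, assuming `wait_ = FALSE` returns a promise there as it does for `navigate()` and `loadEventFired()` earlier in this thread, and the promise returned by `click_with_pause()` would be returned from the enclosing `$then()` callback. A fixed pause is fragile, so waiting on a page event where one exists is likely the more robust choice.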


4 participants