Add browser-use CDP extraction as copyable tool #124

@wu-changxing

Context

Our browser element extraction uses JS injection (extract_elements.js) plus injected data-browser-agent-id attributes. This is simple and reliable, but it misses some cases that browser-use handles via the Chrome DevTools Protocol (CDP).
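For readers unfamiliar with the attribute-based approach, here is a minimal sketch of how an injected data-browser-agent-id could be turned into a CSS locator. The helper name locator_for and the assumption that ids are strings are illustrative only; the actual format written by extract_elements.js is not shown in this issue.

```python
def locator_for(agent_id: str) -> str:
    """Build a CSS attribute selector for an injected data-browser-agent-id.

    Hypothetical helper: the real id format used by extract_elements.js
    is an assumption here.
    """
    # Escape backslashes and double quotes so the value stays inside the
    # quoted attribute selector.
    escaped = agent_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'[data-browser-agent-id="{escaped}"]'
```

With Playwright's Python API, the resulting string could presumably be passed to page.locator(...), which accepts CSS selectors in Chromium, Firefox, and WebKit alike.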

What browser-use does better

  • Event listener detection — catches React/Vue dynamically attached handlers
  • Paint order occlusion — filters elements hidden behind other elements
  • Tree structure — gives LLM parent/child context
  • New element tracking — marks elements that appeared since last step
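
Two of the ideas above, paint order occlusion and new element tracking, can be illustrated with a small self-contained sketch. This is not browser-use's implementation, which works from CDP data; the Element record, its key field, and the single-rectangle coverage test are simplifying assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    key: str          # hypothetical stable identity for an element
    x: float
    y: float
    width: float
    height: float
    paint_order: int  # higher means painted later (on top)


def covers(top: Element, bottom: Element) -> bool:
    """True if top's rectangle fully contains bottom's rectangle."""
    return (
        top.x <= bottom.x
        and top.y <= bottom.y
        and top.x + top.width >= bottom.x + bottom.width
        and top.y + top.height >= bottom.y + bottom.height
    )


def visible_elements(elements: list[Element]) -> list[Element]:
    """Drop elements fully covered by a single element painted later."""
    return [
        el
        for el in elements
        if not any(
            other.paint_order > el.paint_order and covers(other, el)
            for other in elements
        )
    ]


def mark_new(previous_keys: set[str], current: list[Element]) -> dict[str, bool]:
    """Map each current element key to whether it appeared since the last step."""
    return {el.key: el.key not in previous_keys for el in current}
```

A real occlusion filter would also need to handle partial overlap, transparency, and pointer-events; the sketch only shows the basic shape of the check.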

What our approach does better

  • Simpler (one JS file + one Python file)
  • Reliable CSS locators (data-browser-agent-id)
  • Cross-browser (works with Firefox/WebKit via Playwright)

Proposal

Add browser-use's dom/ module as a tool that users can copy with co copy:

co copy browser_use_extractor

Users who need advanced extraction (complex SPAs, heavy shadow DOM usage) can swap it in. Keep our current approach as default.
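
One possible shape for the swap, sketched under assumptions: every name below (Extractor, js_injection_extractor, cdp_extractor, extract) is hypothetical and does not describe the project's actual API. The point is only that the current approach stays the default and the copied extractor is opt-in.

```python
from typing import Callable, Optional

# Hypothetical type: an extractor takes a page handle and returns a list
# of element records.
Extractor = Callable[[object], list]


def js_injection_extractor(page: object) -> list:
    """Placeholder standing in for the current default (extract_elements.js)."""
    return [{"source": "js_injection"}]


def cdp_extractor(page: object) -> list:
    """Placeholder standing in for the copied browser-use dom/ extraction."""
    return [{"source": "cdp"}]


def extract(page: object, extractor: Optional[Extractor] = None) -> list:
    """Use the caller's extractor when given, else fall back to the default."""
    return (extractor or js_injection_extractor)(page)
```

Callers who never pass an extractor see no behavior change, which keeps the cross-browser default intact for Firefox and WebKit users.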

Already Fixed

  • Shadow DOM traversal (v0.8.8) — covers LinkedIn modals
  • contenteditable detection (v0.8.8)
  • Post-click wait for DOM changes (v0.8.8)

Reference

  • browser-use/browser-use — CDP-based extraction
  • Their key files: browser_use/dom/, browser_use/dom/buildDomTree.js
