Context
Our browser element extraction uses JS injection (extract_elements.js) + injected data-browser-agent-id attributes. This is simple and reliable but misses some cases that browser-use handles via Chrome DevTools Protocol (CDP).
What browser-use does better
- Event listener detection — catches React/Vue dynamically attached handlers
- Paint order occlusion — filters elements hidden behind other elements
- Tree structure — gives LLM parent/child context
- New element tracking — marks elements that appeared since last step
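Two of the items above, paint order occlusion and new element tracking, can be illustrated without a browser. The following is a minimal sketch, not browser-use's actual implementation: the `Box` record, its fields, and both function names are hypothetical, and it only treats an element as hidden when a single later-painted rectangle fully covers it (no handling of transparency or partial overlap).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical element record: bounding rectangle plus paint order."""
    element_id: str
    x: float
    y: float
    width: float
    height: float
    paint_order: int  # assumption: higher values are painted later (on top)


def covers(top: Box, bottom: Box) -> bool:
    """Return True if top's rectangle fully contains bottom's rectangle."""
    return (
        top.x <= bottom.x
        and top.y <= bottom.y
        and top.x + top.width >= bottom.x + bottom.width
        and top.y + top.height >= bottom.y + bottom.height
    )


def filter_occluded(boxes: list[Box]) -> list[Box]:
    """Drop every box that is fully covered by one box painted later."""
    return [
        b for b in boxes
        if not any(o.paint_order > b.paint_order and covers(o, b) for o in boxes)
    ]


def new_element_ids(previous: set[str], current: set[str]) -> set[str]:
    """Return ids present in the current step but not in the previous one."""
    return current - previous


boxes = [
    Box("button", 10, 10, 50, 20, paint_order=1),   # sits underneath the modal
    Box("modal", 0, 0, 200, 200, paint_order=2),
    Box("close", 180, 5, 15, 15, paint_order=3),    # painted on top of the modal
]
visible_ids = [b.element_id for b in filter_occluded(boxes)]
```

Here `button` is dropped because `modal` is painted later and fully covers it, while `modal` and `close` are kept.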
What our approach does better
- Simpler (one JS file + one Python file)
- Reliable CSS locators (data-browser-agent-id)
- Cross-browser (works with Firefox/WebKit via Playwright)
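As a rough illustration of why the injected attribute makes locators reliable, the selector can be built directly from the id with no dependence on page structure. This is a sketch only: `locator_for` is a hypothetical helper name, and the id format is assumed to be an arbitrary string.

```python
ATTRIBUTE = "data-browser-agent-id"


def locator_for(agent_id: str) -> str:
    """Build a CSS attribute selector for an injected agent id."""
    # Escape backslashes and double quotes so the value stays a valid CSS string.
    escaped = agent_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'[{ATTRIBUTE}="{escaped}"]'


selector = locator_for("42")
```

The resulting string can be passed to Playwright's `page.locator(...)`, which accepts CSS selectors in Chromium, Firefox, and WebKit alike.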
Proposal
Add browser-use's dom/ module as a tool that users can copy in with co copy:
co copy browser_use_extractor
Users who need advanced extraction (complex SPAs, heavy shadow DOM usage) can swap it in. Keep our current approach as default.
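One way the swap could look in code is a small registry that falls back to the current extractor unless another one is named. This is a sketch under assumptions: the registry, the function names, and the placeholder bodies are all hypothetical and do not reflect an existing API in this project or in browser-use.

```python
from typing import Callable, Optional

# Assumption: an extractor takes a page-like object and returns a list of elements.
Extractor = Callable[[object], list]


def js_injection_extractor(page: object) -> list:
    """Placeholder for the current default (extract_elements.js injection)."""
    return []


def browser_use_extractor(page: object) -> list:
    """Placeholder for the copied-in extractor based on browser-use's dom/ module."""
    return []


EXTRACTORS = {
    "default": js_injection_extractor,
    "browser_use_extractor": browser_use_extractor,
}


def select_extractor(name: Optional[str] = None) -> Extractor:
    """Return the named extractor, or the default when no name is given."""
    key = name or "default"
    if key not in EXTRACTORS:
        raise ValueError(f"unknown extractor: {key!r}")
    return EXTRACTORS[key]
```

Keeping the default as the fallback means existing users see no change unless they opt in by name.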
Already Fixed
- Shadow DOM traversal (v0.8.8) — covers LinkedIn modals
- contenteditable detection (v0.8.8)
- Post-click wait for DOM changes (v0.8.8)
Reference
browser_use/dom/, browser_use/dom/buildDomTree.js