
fix: detect custom clickable elements in take_snapshot #452

Merged
shivammittal274 merged 1 commit into main from fix/snapshot-missing-interactive-elements
Mar 16, 2026

@shivammittal274
Contributor

Summary

  • Merges cursor-interactive detection (cursor:pointer, onclick, tabindex) into take_snapshot so the agent sees custom components — upvote buttons, dropdown triggers, clickable cards, etc.
  • Adds DisclosureTriangle to INTERACTIVE_ROLES so <summary> elements are captured
  • Uses aria-label as text fallback in cursor detection for icon-only buttons (e.g. close ✕)
  • Fixes dedup bug in enhancedSnapshot that silently dropped all cursor-detected elements
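
The detection signals and the aria-label fallback listed above could be sketched roughly as follows. This is a simplified illustration, not the code from this PR: `ElementInfo`, `isCursorInteractive`, and `labelFor` are hypothetical names, the real detection runs as injected JS against live DOM elements, and treating any explicit tabindex as a signal is an assumption.

```typescript
// Hypothetical, simplified description of a DOM element. The real code
// inspects live elements in the page; this shape exists only for the sketch.
interface ElementInfo {
  cursor: string;           // computed CSS cursor value
  hasOnClick: boolean;      // an onclick attribute or handler is present
  tabIndex: number | null;  // null when no tabindex attribute is set
  text: string;             // visible text content
  ariaLabel: string | null; // aria-label attribute, if any
}

// Assumption: an element counts as cursor-interactive when any one of the
// three signals named in the summary (cursor:pointer, onclick, tabindex) fires.
function isCursorInteractive(el: ElementInfo): boolean {
  return el.cursor === "pointer" || el.hasOnClick || el.tabIndex !== null;
}

// Prefer visible text; fall back to aria-label for icon-only controls.
function labelFor(el: ElementInfo): string {
  const text = el.text.trim();
  if (text.length > 0) return text;
  return (el.ariaLabel ?? "").trim();
}
```

With this shape, an upvote div styled with cursor:pointer is detected even though it has no ARIA role, and an icon-only close button with empty text still gets a readable label from its aria-label.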

Before → After

take_snapshot on a page with custom components:

Before (16/31 elements):

[16] button "Standard Button"
[17] link "Standard Link"
[1]  textbox "Email"

After (28/31 elements — adds cursor-detected elements):

[16] button "Standard Button"
[17] link "Standard Link"
[1]  textbox "Email"
[38] DisclosureTriangle "Expand me" (collapsed)
[52] clickable "Clickable Div"
[58] clickable "▲ 42"
[62] clickable "Select option ▾"
[63] clickable "Dark mode"

Test plan

  • Live CDP test: all standard HTML, ARIA, and custom elements detected
  • Verified DisclosureTriangle captures <summary> elements
  • Verified aria-label fallback captures icon-only buttons
  • Verified dedup fix: cursor elements no longer silently dropped
  • Full server pipeline test via chat API confirmed all elements visible
  • Typecheck passes

take_snapshot only used the AX tree, which misses custom components
(cursor:pointer divs, onclick handlers, etc.) that lack ARIA roles.
These elements appeared as role="generic" and were invisible to the agent.
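
To illustrate why such elements were invisible, here is a minimal sketch of role-based filtering over AX nodes. The `AxNode` shape, the `renderInteractive` function, and the four-entry role set are assumptions for illustration; the actual INTERACTIVE_ROLES set in snapshot.ts is not reproduced here.

```typescript
// Assumed subset of roles, for illustration only.
const INTERACTIVE_ROLES = new Set<string>([
  "button",
  "link",
  "textbox",
  "DisclosureTriangle", // the role added so that <summary> elements are captured
]);

// Hypothetical minimal shape of an accessibility-tree node.
interface AxNode {
  id: number;
  role: string;
  name: string;
}

// Keep only nodes whose role is in the set and render them in the
// `[id] role "name"` format shown in the Before/After output.
function renderInteractive(nodes: AxNode[]): string[] {
  return nodes
    .filter((node) => INTERACTIVE_ROLES.has(node.role))
    .map((node) => `[${node.id}] ${node.role} "${node.name}"`);
}
```

A clickable div that surfaces as role "generic" never passes this filter, which is why a second, DOM-based detection pass is needed to catch it.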

Changes:
- Merge findCursorInteractiveElements into snapshot() so take_snapshot
  catches cursor:pointer, onclick, and tabindex elements
- Add DisclosureTriangle to INTERACTIVE_ROLES for <summary> elements
- Use aria-label as text fallback in cursor detection for icon-only buttons
- Fix dedup bug in enhancedSnapshot that was silently dropping all
  cursor-detected elements by checking against all AX node IDs instead
  of only already-included output IDs
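
The dedup fix described in the last item can be sketched as a before/after pair. The function names and the `CursorElement` shape are hypothetical, and the sketch assumes cursor-detected elements and AX nodes share one ID space and that rendered lines start with a bracketed ID such as `[16]`.

```typescript
// Hypothetical shape of a cursor-detected element.
interface CursorElement {
  id: number;
  text: string;
}

// Collect the IDs that were actually rendered, from lines like
// `[16] button "Standard Button"`.
function parseIncludedIds(lines: string[]): Set<number> {
  const ids = new Set<number>();
  for (const line of lines) {
    const match = /^\s*\[(\d+)\]/.exec(line);
    if (match) ids.add(Number(match[1]));
  }
  return ids;
}

// Old behaviour as described: an element is skipped if its ID appears
// anywhere in the AX tree, even on a node that was never rendered.
function dedupAgainstAllAxIds(cursor: CursorElement[], allAxIds: Set<number>): CursorElement[] {
  return cursor.filter((el) => !allAxIds.has(el.id));
}

// Fixed behaviour: an element is skipped only if its ID is already in the output.
function dedupAgainstOutput(cursor: CursorElement[], lines: string[]): CursorElement[] {
  const included = parseIncludedIds(lines);
  return cursor.filter((el) => !included.has(el.id));
}
```

Because a custom component usually still has an AX node (with a non-interactive role), the old check matched nearly every cursor-detected element and dropped it; the fixed check keeps it unless it is already listed.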
@greptile-apps
Contributor

greptile-apps bot commented Mar 16, 2026

Greptile Summary

This PR extends take_snapshot to detect custom clickable elements (cursor:pointer, onclick, tabindex) that the accessibility tree misses, bringing it to feature parity with enhancedSnapshot. It also fixes a dedup bug in enhancedSnapshot: cursor-detected elements were silently dropped because the old code compared them against all AX tree node IDs (including non-interactive ones) rather than only the IDs actually rendered in the output.

  • Adds cursor-interactive element detection to snapshot(), matching the existing behavior in enhancedSnapshot()
  • Fixes dedup in enhancedSnapshot() by matching against rendered output lines instead of the full AX node list
  • Adds DisclosureTriangle to INTERACTIVE_ROLES for <summary> element capture
  • Adds aria-label fallback in cursor detection JS for icon-only buttons
  • The cursor-detection merging logic is now duplicated across snapshot() and enhancedSnapshot() — could benefit from extraction into a shared helper

Confidence Score: 4/5

  • This PR is safe to merge — it adds best-effort detection with proper fallback and fixes a real dedup bug.
  • The changes are well-scoped and correct. The dedup bug fix is a clear improvement, and the new cursor detection in snapshot() mirrors established patterns from enhancedSnapshot(). The silent catch in snapshot() is appropriate since cursor detection is additive. Minor style concern about duplicated logic between the two snapshot methods, but no functional issues found.
  • No files require special attention — both changes are straightforward and follow existing patterns.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/browseros-agent/apps/server/src/browser/browser.ts | Adds cursor-interactive detection to snapshot() and fixes the dedup bug in enhancedSnapshot() by matching against rendered output lines instead of the full AX tree. Logic is correct but introduces near-duplicate cursor-merging blocks. |
| packages/browseros-agent/apps/server/src/browser/snapshot.ts | Adds DisclosureTriangle to INTERACTIVE_ROLES and aria-label fallback for icon-only buttons in cursor detection JS. Both changes are minimal and correct. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[take_snapshot / enhancedSnapshot called] --> B[Fetch AX Tree via CDP]
    B --> C{AX nodes empty?}
    C -- Yes --> D[Return empty string]
    C -- No --> E[Build interactive/enhanced tree lines]
    E --> F[findCursorInteractiveElements via JS injection]
    F --> G{Cursor elements found?}
    G -- No --> J[Return tree lines]
    G -- Yes --> H[Parse rendered line IDs into includedIds set]
    H --> I[Append non-duplicate cursor elements as 'clickable']
    I --> J
    F -- Error --> J
Prompt To Fix All With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/server/src/browser/browser.ts
Line: 395-413

Comment:
**Duplicated cursor-detection merging logic**

The cursor-element merging block in `snapshot()` (lines 395–413) is nearly identical to the one in `enhancedSnapshot()` (lines 462–490), differing only in the output format string and error handling. Consider extracting a shared helper (e.g., `mergeCursorElements(lines, session, formatFn)`) to avoid maintaining the same dedup logic in two places. This would reduce the risk of future divergence between the two code paths.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: d4e0a30
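
One possible shape for the shared helper suggested in the review comment is sketched below. It is not code from the repository: unlike the suggested `mergeCursorElements(lines, session, formatFn)` signature, this version takes already-fetched cursor elements instead of a session so that it stays self-contained, and `CursorElement`, `FormatFn`, and `plainFormat` are hypothetical names.

```typescript
// Hypothetical shape of a cursor-detected element.
interface CursorElement {
  id: number;
  text: string;
}

// Each snapshot method would supply its own output format.
type FormatFn = (el: CursorElement) => string;

// Append cursor-detected elements that are not already rendered in `lines`.
// Assumes rendered lines start with a bracketed ID such as `[16]`.
function mergeCursorElements(
  lines: string[],
  cursorElements: CursorElement[],
  formatFn: FormatFn,
): string[] {
  const includedIds = new Set<number>();
  for (const line of lines) {
    const match = /^\s*\[(\d+)\]/.exec(line);
    if (match) includedIds.add(Number(match[1]));
  }

  const merged = [...lines];
  for (const el of cursorElements) {
    if (includedIds.has(el.id)) continue; // already in the output
    includedIds.add(el.id);               // also guards against repeated cursor hits
    merged.push(formatFn(el));
  }
  return merged;
}

// Example formatter matching the `clickable` lines in the After output.
const plainFormat: FormatFn = (el) => `[${el.id}] clickable "${el.text}"`;
```

Under this design, snapshot() and enhancedSnapshot() would differ only in the formatter they pass and in how they handle a failed detection call, so the dedup rule would live in one place.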

@shivammittal274 merged commit 2d51c82 into main on Mar 16, 2026
3 checks passed
@github-actions bot locked and limited conversation to collaborators on Mar 16, 2026