@ctate (Collaborator) commented Jan 25, 2026

This PR addresses issue #259

vercel bot (Contributor) commented Jan 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| agent-browser | Ready | Preview, Comment | Jan 25, 2026 7:23pm |

vercel bot (Contributor) left a comment

Additional Suggestion:

The launchSchema Zod schema is missing profile and storageState fields, causing CLI --profile and --state flags to be silently discarded when sent as explicit launch commands.

@ctate ctate merged commit 79863a5 into main Jan 25, 2026
11 checks passed
34f2aww added a commit to 34f2aww/agent-browser that referenced this pull request Jan 27, 2026
* Add Browserbase support for remote browser over CDP (vercel-labs#3)

* Add Browserbase support for remote browser over CDP

When BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID env vars are set,
connect to a Browserbase session via CDP instead of launching a local browser.

* Update URLs to browserbase repo

* Add Browserbase support for remote browser over CDP

When BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID env vars are set,
connect to a Browserbase session via CDP instead of launching a local browser.

* Update link to Browserbase Dashboard in README

* bump browserbase sdk to latest version

* remove sdk as a dep

* change name back to vercel labs

* added try catch blocks, functions to close session

* revert package names

* remove extra if statement

---------

Co-authored-by: Kylejeong2 <[email protected]>

* feat: add Browser Use cloud browser as available provider (vercel-labs#138)

* feat: add Browser Use cloud browser
  integration

* feat: enhance Browser Use integration with provider flag support

- Updated README to reflect new usage instructions for enabling Browser Use with the `-p` flag.
- Modified CLI to parse and handle the `-p` flag for specifying the provider.
- Implemented logic in the main application to launch with the specified cloud provider.
- Adjusted BrowserManager to connect to Browser Use based on the provider flag or environment variable.
- Updated types and protocol schemas to include provider information.

* feat: add validation for mutually exclusive CLI options

- Implemented checks to prevent the use of both --cdp and --provider flags simultaneously.
- Added validation to ensure --extension cannot be used with the --provider flag.
- Enhanced error handling to provide clear feedback in both JSON and console output formats.
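
The mutual-exclusion checks described above could look roughly like the sketch below. It is illustrative only: the real checks live in the CLI, and the `Flags` shape, the error wording, and the JSON error format are all assumptions, not the project's actual code.

```typescript
// Hypothetical subset of parsed CLI flags (assumed shape, for illustration).
interface Flags {
  cdp?: string;
  provider?: string;
  extensions?: string[];
}

// Returns a message for the first conflict found, or null if the flags are compatible.
function findFlagConflict(flags: Flags): string | null {
  if (flags.cdp && flags.provider) {
    return "Cannot use --cdp and --provider together";
  }
  if (flags.extensions && flags.extensions.length > 0 && flags.provider) {
    return "Cannot use --extension with --provider";
  }
  return null;
}

// Formats the error for JSON or console output (the JSON shape is an assumption).
function formatError(message: string, json: boolean): string {
  return json
    ? JSON.stringify({ success: false, error: message })
    : `Error: ${message}`;
}

const conflict = findFlagConflict({ cdp: "9222", provider: "kernel" });
if (conflict !== null) {
  console.error(formatError(conflict, false));
}
```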

* feat: support remote CDP WebSocket URLs in --cdp flag (vercel-labs#99)

Previously, the --cdp flag only accepted a port number and connected via
http://localhost:{port}. This made it impossible to connect to remote
browser services like Kernel, Browserless, etc. that provide WebSocket URLs.

The --cdp flag now accepts either:
- A port number (e.g., 9222) for local connections
- A full WebSocket URL (e.g., wss://...) for remote browser services

Changes:
- Added cdpUrl field to LaunchCommand type
- Updated protocol validation to accept URL format with scheme validation
- Modified connectViaCDP to detect and handle both formats
- Handle numeric strings for JSON serialization edge cases
- Updated CLI to send cdpUrl or cdpPort based on input format
- Updated README with examples for remote connections
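
As a rough sketch of the port-or-URL detection described above: `cdpPort` and `cdpUrl` are the field names from this commit, while `parseCdpArg`, the result type, and the exact list of accepted schemes are assumptions made for illustration.

```typescript
// Assumed result shape: exactly one of the two fields named in the commit.
type CdpTarget = { cdpPort: number } | { cdpUrl: string };

// Assumption: the commit only shows wss://; the other schemes are a guess.
const ALLOWED_SCHEMES = ["ws:", "wss:", "http:", "https:"];

function parseCdpArg(value: string | number): CdpTarget {
  const text = String(value).trim();

  // Digits only (including numeric strings that survived JSON serialization)
  // are treated as a local port.
  if (/^\d+$/.test(text)) {
    const port = Number(text);
    if (port < 1 || port > 65535) {
      throw new Error(`Invalid CDP port: ${text}`);
    }
    return { cdpPort: port };
  }

  // Anything else must parse as a URL with an allowed scheme.
  let url: URL;
  try {
    url = new URL(text);
  } catch {
    throw new Error(`Invalid CDP value: ${text}`);
  }
  if (!ALLOWED_SCHEMES.includes(url.protocol)) {
    throw new Error(`Unsupported CDP URL scheme: ${url.protocol}`);
  }
  return { cdpUrl: text };
}
```

With this sketch, `parseCdpArg("9222")` yields a `cdpPort`, and a `wss://` URL yields a `cdpUrl`.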

Co-authored-by: Claude Opus 4.5 <[email protected]>

* feat: add browser launch --args, --user-agent,  --proxy-bypass configuration support. (vercel-labs#35)

* feat: add browser launch args, user-agent, and proxy configuration support

* fix: User Agent env need added

* fix: command pass error

---------

Co-authored-by: Chris Tate <[email protected]>

* add missing flag (vercel-labs#203)

* add missing flag

* clean up tests

* fix: add .exe extension for Windows source binary path (vercel-labs#188)

The copy-native.js script was looking for 'agent-browser' but on Windows
the compiled binary is 'agent-browser.exe', causing the copy to fail.

Co-authored-by: jiazhuangai <[email protected]>
Co-authored-by: Claude <[email protected]>

* fix: use ~/.agent-browser for socket files instead of TMPDIR (vercel-labs#180)

* fix: use ~/.agent-browser for socket files instead of TMPDIR

This fixes issue vercel-labs#163 where different TMPDIR values (common with
tmux/screen/VSCode/IntelliJ) caused the CLI and daemon to use
different socket paths.

Socket directory priority:
1. AGENT_BROWSER_SOCKET_DIR (explicit override)
2. $XDG_RUNTIME_DIR/agent-browser (Linux standard)
3. ~/.agent-browser (fallback, like Docker Desktop)

Both CLI (Rust) and daemon (Node.js) now use the same logic.
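
A minimal sketch of the priority order above, written for the Node.js side. `resolveSocketDir` is a hypothetical name, and the temp-directory fallback path is an assumption. Empty strings are treated as unset.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Treat unset and empty-string values the same way.
function nonEmpty(value: string | undefined): string | undefined {
  return value !== undefined && value.length > 0 ? value : undefined;
}

// env and homeDir are parameters so the order can be exercised without
// touching the real environment.
function resolveSocketDir(
  env: Record<string, string | undefined> = process.env,
  homeDir: string = os.homedir(),
): string {
  // 1. Explicit override.
  const explicit = nonEmpty(env.AGENT_BROWSER_SOCKET_DIR);
  if (explicit) return explicit;

  // 2. XDG runtime directory (Linux standard).
  const runtimeDir = nonEmpty(env.XDG_RUNTIME_DIR);
  if (runtimeDir) return path.join(runtimeDir, "agent-browser");

  // 3. Home directory fallback. Using the temp dir when no home directory is
  //    known is an assumption about the exact fallback path.
  const base = nonEmpty(homeDir) ?? os.tmpdir();
  return path.join(base, ".agent-browser");
}
```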

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* fix: session list now looks in correct socket directory

- Make get_socket_dir() public in connection.rs
- Update session list to use get_socket_dir() instead of temp_dir()
- Update pid file pattern from agent-browser-{session}.pid to {session}.pid
- Add tmpdir fallback to daemon.ts when homedir is unavailable

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* test: add unit tests for socket directory resolution

Add comprehensive tests for get_socket_dir/getSocketDir to verify:
- AGENT_BROWSER_SOCKET_DIR takes priority
- Empty strings are ignored (fixes Rust/TypeScript consistency)
- XDG_RUNTIME_DIR fallback works correctly
- Home directory fallback when env vars unset

Co-Authored-By: Claude Opus 4.5 <[email protected]>

---------

Co-authored-by: Claude Opus 4.5 <[email protected]>

* errors doc more descriptive (vercel-labs#190)

* docs: add Claude Code marketplace plugin installation instructions (vercel-labs#181)

Document the recommended way to install the agent-browser skill using the /plugin marketplace commands introduced in PR vercel-labs#106.

* Add --profile flag for persistent browser profiles (vercel-labs#68)

* Add --profile flag for persistent browser profiles

Adds support for persistent browser profiles that preserve cookies,
localStorage, and login sessions across browser restarts.

Changes:
- Add --profile <path> CLI flag (flags.rs)
- Add AGENT_BROWSER_PROFILE environment variable support
- Add profile field to LaunchCommand type (types.ts)
- Use launchPersistentContext when profile is specified (browser.ts)
- Update help text and README with documentation

Usage:
  agent-browser --profile ~/.myapp-profile open myapp.com

This enables AI agents to maintain authenticated sessions across
browser restarts without re-authenticating each time.

* Expand tilde in profile path to home directory

* fix: add missing profile field to test Flags struct
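
The tilde expansion mentioned above can be sketched as follows. `expandTilde` is a hypothetical helper, not the project's function, and handling only `~` and `~/...` is an assumption about the intended scope.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Expands a leading "~" to the home directory. "~user" forms and all other
// paths are returned unchanged.
function expandTilde(p: string, homeDir: string = os.homedir()): string {
  if (p === "~") return homeDir;
  if (p.startsWith("~/")) return path.join(homeDir, p.slice(2));
  return p;
}

console.log(expandTilde("~/.myapp-profile"));
```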

---------

Co-authored-by: Chris Tate <[email protected]>

* fix: support WebSocket URLs in connect command (vercel-labs#205)

* fix: support WebSocket URLs in connect command

* address feedback

* feat: add download CLI commands with ref support  (vercel-labs#183)

* feat: add download and waitfordownload CLI commands

Add CLI support for the existing download functionality in the daemon:

- `download <selector> <path>`: Click an element to trigger download
  and save to specified path
- `wait --download [path] [--timeout ms]`: Wait for any download to
  complete, optionally save to path with configurable timeout

Includes comprehensive unit tests and help documentation.

* fix: download command ref support and output message

- Fix handleDownload to use browser.getLocator() for ref selector support
- Fix CLI output to show "Downloaded to" instead of "Screenshot saved"

Co-Authored-By: Claude Opus 4.5 <[email protected]>

---------

Co-authored-by: Claude <[email protected]>
Co-authored-by: Chris Tate <[email protected]>

* docs: update agent-browser skill documentation (vercel-labs#164)

* fix(screenshot): support refs and improve error messages (vercel-labs#141)

* fix(screenshot): support refs and improve error messages

* fix(cli): support selector argument in screenshot command

* Fix CSS class selectors being treated as file paths

* fix(test): update screenshot test assertions

---------

Co-authored-by: Chris Tate <[email protected]>

* feat(skills): Add hierarchical structure with references and templates (vercel-labs#157)

* feat(skills): Add hierarchical structure with references and templates

Adds modular documentation and executable templates to the agent-browser skill
for better AI agent consumption and progressive disclosure.

## Added

### References (deep-dive documentation)
- `references/snapshot-refs.md` - Ref lifecycle, invalidation, troubleshooting
- `references/session-management.md` - Parallel sessions, state persistence
- `references/authentication.md` - Login flows, OAuth, 2FA patterns
- `references/video-recording.md` - Recording for debugging/docs
- `references/proxy-support.md` - Proxy configuration, geo-testing

### Templates (ready-to-use workflows)
- `templates/form-automation.sh` - Form filling with validation
- `templates/authenticated-session.sh` - Login once, reuse state
- `templates/capture-workflow.sh` - Content extraction with screenshots

## Modified
- `SKILL.md` - Added reference tables linking to new documentation

## Benefits
- Progressive disclosure: Load overview first, deep dives on demand
- Reduced context: Smaller chunks for better LLM token efficiency
- Ready workflows: Copy-paste templates for common patterns

* fix(templates): Make authenticated-session.sh runnable out-of-box

Addresses review feedback: login actions were commented but verification
wasn't, causing script to fail when run as-is.

New approach:
- DISCOVERY MODE runs first (shows form structure)
- LOGIN FLOW section is fully commented as a unit
- User runs once to see refs, then customizes

┌─────────────────────────────────────────────────────────────┐
│ LOGIN FORM STRUCTURE                                        │
├─────────────────────────────────────────────────────────────┤
│ @e1 [input type="email"]                                    │
│ @e2 [input type="password"]                                 │
│ @e3 [button] "Sign In"                                      │
└─────────────────────────────────────────────────────────────┘

* fix(cli): correct output messages for state load and path-based actions (vercel-labs#109)

* Add files via upload

fix(cli): correct output messages for state load and path-based actions

* Add files via upload

* Update output.rs

* fix crlf

---------

Co-authored-by: Chris Tate <[email protected]>

* auto-release (vercel-labs#216)

* auto-release

* fixes

* fix secret name

* update provider flag (vercel-labs#217)

* auto-release

* fixes

* fix secret name

* update provider flag

* v0.7.0 docs (vercel-labs#218)

* auto-release

* fixes

* fix secret name

* update provider flag

* v0.7.0 changelog

* chore: add changeset for v0.7.0 release (vercel-labs#219)

* chore: version packages (vercel-labs#220)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: download artifacts to temp directory to avoid naming conflict (vercel-labs#221)

The download-artifact action creates directories named after each artifact.
When downloading to bin/, this caused conflicts because the artifact directory
names matched the binary names (e.g., bin/agent-browser-darwin-arm64/agent-browser-darwin-arm64).

Fix by downloading to artifacts/ first, then using find to move the binaries to bin/.

* fix docs (vercel-labs#223)

* fix: download artifacts to temp directory to avoid naming conflict

The download-artifact action creates directories named after each artifact.
When downloading to bin/, this caused conflicts because the artifact directory
names matched the binary names (e.g., bin/agent-browser-darwin-arm64/agent-browser-darwin-arm64).

Fix by downloading to artifacts/ first, then using find to move the binaries to bin/.

* fix docs

* fix bin (vercel-labs#224)

* chore: release v0.7.1 (vercel-labs#225)

Fix native binary distribution in npm package. Binaries are now built
before publishing to npm, ensuring all platforms work on installation.

* chore: version packages (vercel-labs#226)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* chore: add changeset for binary distribution fix (vercel-labs#227)

* chore: version packages (vercel-labs#228)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: ensure binary is executable after npm install (vercel-labs#229)

* chore: version packages (vercel-labs#231)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: handle existing GitHub releases in workflow (vercel-labs#232)

* chore: version packages (vercel-labs#233)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: allow null selector in screenshot command schema (vercel-labs#236)

The screenshot command was failing with 'Validation error: selector: Expected string, received null' when only a path was provided (e.g., 'agent-browser screenshot ~/Desktop/test.png').

The Rust CLI serializes None values as null in JSON, but the Zod schema only allowed undefined (via .optional()), not null. Changed selector field to use .nullish() which accepts both null and undefined.

Fixes issue where screenshot command without selector fails validation.
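
To illustrate the null-versus-undefined mismatch, the sketch below uses two hand-written checks as stand-ins for the "optional" and "nullish" behaviors described above. It does not use Zod itself. The payload's `action` and `path` fields are assumptions for illustration.

```typescript
// Stand-ins for the two validation modes (not Zod's API).
type Check = (value: unknown) => boolean;

// "optional": undefined or a string.
const optionalString: Check = (v) => v === undefined || typeof v === "string";

// "nullish": undefined, null, or a string.
const nullishString: Check = (v) =>
  v === undefined || v === null || typeof v === "string";

// What arrives when the sender serializes a missing value as null.
const payload = JSON.parse(
  '{"action":"screenshot","path":"test.png","selector":null}',
);

console.log(optionalString(payload.selector)); // false
console.log(nullishString(payload.selector)); // true
```

After a JSON round trip, a key that was omitted reads as `undefined`, but a key explicitly set to `null` stays `null`. A validator that only allows `undefined` therefore rejects it.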

* chore: add patch changeset for release (vercel-labs#242)

* chore: version packages (vercel-labs#243)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* feat: add support for ignoring HTTPS certificate errors (vercel-labs#93)

* feat: add support for ignoring HTTPS certificate errors

* fix: update warning message for already running daemon to include ignore HTTPS errors option

* docs: add documentation for --ignore-https-errors option in README and SKILL.md

* feat: initialize ignore_https_errors flag in command context

* fix: change launch_cmd to mutable for cdp value handling

* Add CLI flags for cookie URL, domain, path, httpOnly, secure, and expires (vercel-labs#266)

* Add CLI flags for cookie URL, domain, path, httpOnly, secure, and expires

Extends the `cookies set` command to support setting cookies with additional parameters before loading a page, solving authentication workflows where cookies need to be set for different domains.

**Key changes:**
- Added CLI flags: `--url`, `--domain`, `--path`, `--httpOnly`, `--secure`, `--sameSite`, `--expires`
- Added comprehensive test coverage for all new flags and combinations
- Updated help documentation with usage examples
- No daemon changes needed - it already supported these parameters

**Example usage:**
```bash
agent-browser cookies set session_id "abc123" --url https://app.example.com --httpOnly --secure
```

This allows setting cookies for a URL before opening the page, eliminating the need for workarounds in cross-domain authentication scenarios.

Fixes vercel-labs#261

* Update lock

* Fix compilation error

* Fix: CLI: state load / profile persistence not usable in v0.7.6 (vercel-labs#268)

* Fix: CLI: state load / profile persistence not usable in v0.7.6

This PR addresses issue vercel-labs#259

* Fix issues identified in code review

* generic placeholder for cloud browser provider (vercel-labs#260)

* docs: use generic placeholder for cloud browser provider

* docs: clarify available cloud browser providers

* Fix: set device does not apply deviceScaleFactor - HiDPI screenshots not possible (vercel-labs#270)

Fixes vercel-labs#255

* Fix: check command hangs indefinitely (vercel-labs#272)

Fixes vercel-labs#257

* feat: add Kernel as cloud browser provider (vercel-labs#200)

Add Kernel (https://kernel.sh) as a third-party cloud browser provider,
following the same pattern as Browserbase and Browser Use integrations.

Features:
- Launch browser with `-p kernel` flag or `AGENT_BROWSER_PROVIDER=kernel`
- Configurable via environment variables:
  - KERNEL_API_KEY (required)
  - KERNEL_HEADLESS (default: false)
  - KERNEL_STEALTH (default: true)
  - KERNEL_TIMEOUT_SECONDS (default: 300)
  - KERNEL_PROFILE_NAME (optional, for persistent sessions)
- Profile find-or-create: automatically creates profile if it doesn't exist
- Profile persistence: cookies/logins saved back to profile on session close
- Uses raw fetch() calls for API communication (no SDK dependency)
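
A sketch of how the environment variables listed above might be read with their defaults. The variable names and defaults come from this commit. `KernelConfig`, `readKernelConfig`, the boolean parsing rules, and the error wording are assumptions.

```typescript
interface KernelConfig {
  apiKey: string;
  headless: boolean;
  stealth: boolean;
  timeoutSeconds: number;
  profileName?: string;
}

// Assumed parsing rule: "true"/"1" and "false"/"0" are recognized
// (case-insensitive); anything else falls back to the default.
function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value === "") return fallback;
  const v = value.toLowerCase();
  if (v === "true" || v === "1") return true;
  if (v === "false" || v === "0") return false;
  return fallback;
}

function readKernelConfig(
  env: Record<string, string | undefined> = process.env,
): KernelConfig {
  const apiKey = env.KERNEL_API_KEY;
  if (!apiKey) {
    throw new Error("KERNEL_API_KEY is required for the kernel provider");
  }

  const timeout = Number(env.KERNEL_TIMEOUT_SECONDS);
  return {
    apiKey,
    headless: parseBool(env.KERNEL_HEADLESS, false),
    stealth: parseBool(env.KERNEL_STEALTH, true),
    // Unset, empty, or non-positive values fall back to 300 seconds.
    timeoutSeconds: Number.isFinite(timeout) && timeout > 0 ? timeout : 300,
    profileName: env.KERNEL_PROFILE_NAME || undefined,
  };
}
```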

Co-authored-by: Claude Opus 4.5 <[email protected]>

* Security: Reject cross-origin connections to daemon and stream server (vercel-labs#274)

* Fix tab list command not recognizing new pages opened via clicks (vercel-labs#275)

## Summary

Fixed an issue where the `tab list` command couldn't recognize new pages that were opened externally (e.g., via `target="_blank"` links or popup windows). The problem occurred because context-level page tracking wasn't properly set up for all browser launch methods, causing new pages created outside of explicit `newTab()` calls to go untracked.

## Changes

- Added `setupContextTracking(context)` calls to `launch()`, `launchIncognito()`, and other context creation methods to ensure all contexts listen for new page events
- Added duplicate page checks (`!this.pages.includes(page)`) in `setupContextTracking()`, `newTab()`, and `launchIncognito()` to prevent the same page from being tracked multiple times
- Fixed `activePageIndex` calculation in `launch()` to properly set the active page index
- Enhanced comments to clarify that `setupContextTracking()` handles externally created pages (popups, new tabs from links)

## Implementation Details

The fix ensures that when a user clicks an element that opens a new tab/window, the browser context's 'page' event listener will automatically detect and track the new page. The duplicate prevention logic handles cases where both the context listener and manual page creation might try to add the same page.
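
The tracking and duplicate-prevention logic can be sketched with a plain Node.js `EventEmitter` standing in for the browser context. `setupContextTracking`, `newTab`, `pages`, and `activePageIndex` are the names from this commit. The `PageTracker` class, the `Page` shape, and the method bodies are assumptions, not the actual implementation.

```typescript
import { EventEmitter } from "node:events";

// Minimal stand-in for a page object.
interface Page {
  url: string;
}

class PageTracker {
  pages: Page[] = [];
  activePageIndex = 0;

  // Listen for pages the context creates on its own
  // (popups, target="_blank" links).
  setupContextTracking(context: EventEmitter): void {
    context.on("page", (page: Page) => this.track(page));
  }

  // Pages created explicitly go through the same duplicate check, so a page
  // already reported by the context listener is not added twice.
  newTab(page: Page): void {
    this.track(page);
    this.activePageIndex = this.pages.indexOf(page);
  }

  track(page: Page): void {
    if (!this.pages.includes(page)) {
      this.pages.push(page);
    }
  }
}

const context = new EventEmitter();
const tracker = new PageTracker();
tracker.setupContextTracking(context);

const popup: Page = { url: "https://example.com/popup" };
context.emit("page", popup); // tracked via the context listener
tracker.newTab(popup); // same page again, not added a second time
console.log(tracker.pages.length); // 1
```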

Fixes vercel-labs#273

* chore(cli): sync Cargo.toml version to 0.7.6 (vercel-labs#276)

* chore(cli): save screenshots to tmp dir when no path provided (vercel-labs#247)

* fix(cli): save screenshots to tmp dir when no path provided

Instead of outputting base64 to stdout (which is not useful for most CLI use cases),
screenshots without a path now save to ~/.agent-browser/tmp/screenshots/ with a
generated filename and return the path.

This makes the behavior more ergonomic for AI agents and CLI users alike.

* cleanup

* cleanup

* just revert the cargo.lock version for now

* refactor: extract getAppDir() from getSocketDir()

* docs: improve screenshot help text consistency

* chore: add minor changeset for release (vercel-labs#280)

* chore: version packages (vercel-labs#281)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* v0.8.0 changelog (vercel-labs#282)

* header (vercel-labs#283)

* fix: CLI binary not executable when postinstall is skipped (pnpm, bun) (vercel-labs#285)

* fix binary

* check binary in CI

* chore: add patch changeset for release (vercel-labs#286)

* chore: version packages (vercel-labs#288)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* ci: add retry logic to flaky Windows integration test (vercel-labs#287)

* durable windows ci

* more

* ci: add test for Windows CMD wrapper (vercel-labs#289)

* ci: add test for Windows CMD wrapper

This test will fail until the CMD wrapper is fixed to call the native binary.

* fix: Windows CMD wrapper calls native binary instead of missing index.js

* chore: add patch changeset for release (vercel-labs#291)

* chore: version packages (vercel-labs#292)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* test: add Windows npm global install CI test (reproduces vercel-labs#262) (vercel-labs#293)

* test: add Windows npm global install CI test (reproduces vercel-labs#262)

This test packs the package and installs it globally with npm,
then runs agent-browser --version. This reproduces the issue where
npm-generated shims on Windows try to invoke /bin/sh which doesn't exist.

The bin/agent-browser.js wrapper is added but not yet wired up,
so this commit should fail CI to confirm the issue.

* fix: Windows npm global install and npx support

The shell script wrapper (bin/agent-browser) with #!/bin/sh shebang
causes npm to generate Windows shims that try to invoke /bin/sh,
which doesn't exist on Windows.

This fix uses a hybrid approach:

1. Node.js wrapper (bin/agent-browser.js) as bin entry
   - Makes npx work on all platforms
   - ~100ms overhead (acceptable since npx has its own overhead)

2. postinstall patches bin entries for global installs
   - Windows: Overwrites .cmd/.ps1 shims to invoke .exe directly
   - Mac/Linux: Replaces symlink to point to native binary
   - Zero overhead for `npm i -g agent-browser` users on all platforms

Also fixes PowerShell glob expansion in CI test.

Fixes vercel-labs#262

* fix: Windows npm global install and npx support

The shell script wrapper (bin/agent-browser) with #!/bin/sh shebang
causes npm to generate Windows shims that try to invoke /bin/sh,
which doesn't exist on Windows.

This fix uses a hybrid approach:

1. Node.js wrapper (bin/agent-browser.js) as bin entry
   - Makes npx work on all platforms
   - ~100ms overhead (acceptable since npx has its own overhead)

2. postinstall patches bin entries for global installs
   - Windows: Overwrites .cmd/.ps1 shims to invoke .exe directly
   - Mac/Linux: Replaces symlink to point to native binary
   - Zero overhead for `npm i -g agent-browser` users on all platforms

Also adds cross-platform CI tests for npm global install to catch
regressions on all platforms (Ubuntu, macOS, Windows).

Fixes vercel-labs#262

* test global install

* remove dead code

* chore: add patch changeset for release (vercel-labs#294)

* chore: version packages (vercel-labs#295)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* resolve rebase conflicts in src/daemon.ts

---------

Co-authored-by: Paul Klein <[email protected]>
Co-authored-by: Kylejeong2 <[email protected]>
Co-authored-by: Aitor <[email protected]>
Co-authored-by: Rafael <[email protected]>
Co-authored-by: Claude Opus 4.5 <[email protected]>
Co-authored-by: Oanakiaja <[email protected]>
Co-authored-by: Chris Tate <[email protected]>
Co-authored-by: mypengpengli <[email protected]>
Co-authored-by: jiazhuangai <[email protected]>
Co-authored-by: mmhiyoko <[email protected]>
Co-authored-by: Shpeedle <[email protected]>
Co-authored-by: Tom Dale <[email protected]>
Co-authored-by: Lindsey Simon <[email protected]>
Co-authored-by: Namish pruthi <[email protected]>
Co-authored-by: Márk Magyar <[email protected]>
Co-authored-by: Danila Poyarkov <[email protected]>
Co-authored-by: Yonatan <[email protected]>
Co-authored-by: TimWhite <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Zach Warunek <[email protected]>
Co-authored-by: Zhiwei Li <[email protected]>
Co-authored-by: shawn pana <[email protected]>
Co-authored-by: n33pm <[email protected]>
Co-authored-by: Li Yang <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: 34f2aww <[email protected]>
34f2aww added a commit to 34f2aww/agent-browser that referenced this pull request Jan 28, 2026

Fix by downloading to artifacts/ first, then using find to move the binaries to bin/.
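
The flattening step described above can be sketched as a pure path mapping. This is only an illustration, not the workflow's actual step (which uses `find`): the helper name `planBinaryMoves` and the `agent-browser-linux-x64` artifact name are assumptions.

```typescript
// Hypothetical helper: map each downloaded artifact file to its destination in bin/.
interface Move {
  from: string;
  to: string;
}

function planBinaryMoves(artifactFiles: string[], binDir = "bin"): Move[] {
  return artifactFiles.map((from) => {
    // Keep only the file name, dropping the per-artifact directory.
    const name = from.split("/").pop();
    if (!name) throw new Error(`not a file path: ${from}`);
    return { from, to: `${binDir}/${name}` };
  });
}

// download-artifact nests each binary in a directory named after the artifact.
const moves = planBinaryMoves([
  "artifacts/agent-browser-darwin-arm64/agent-browser-darwin-arm64",
  "artifacts/agent-browser-linux-x64/agent-browser-linux-x64", // assumed name
]);
```

Because the nested directories live under artifacts/ rather than bin/, a directory and a binary with the same name never compete for the same path.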

* fix docs (vercel-labs#223)

* fix: download artifacts to temp directory to avoid naming conflict

The download-artifact action creates directories named after each artifact.
When downloading to bin/, this caused conflicts because the artifact directory
names matched the binary names (e.g., bin/agent-browser-darwin-arm64/agent-browser-darwin-arm64).

Fix by downloading to artifacts/ first, then using find to move the binaries to bin/.

* fix docs

* fix bin (vercel-labs#224)

* chore: release v0.7.1 (vercel-labs#225)

Fix native binary distribution in npm package. Binaries are now built
before publishing to npm, ensuring all platforms work on installation.

* chore: version packages (vercel-labs#226)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* chore: add changeset for binary distribution fix (vercel-labs#227)

* chore: version packages (vercel-labs#228)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: ensure binary is executable after npm install (vercel-labs#229)

* chore: version packages (vercel-labs#231)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: handle existing GitHub releases in workflow (vercel-labs#232)

* chore: version packages (vercel-labs#233)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: allow null selector in screenshot command schema (vercel-labs#236)

The screenshot command was failing with 'Validation error: selector: Expected string, received null' when only a path was provided (e.g., 'agent-browser screenshot ~/Desktop/test.png').

The Rust CLI serializes None values as null in JSON, but the Zod schema only allowed undefined (via .optional()), not null. Changed the selector field to use .nullish(), which accepts both null and undefined.

Fixes issue where screenshot command without selector fails validation.
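
The null versus undefined distinction can be sketched without Zod. The validators below are hand-rolled stand-ins that only mimic the documented behavior of `.optional()` and `.nullish()`; the payload shape is an assumption for illustration, not the actual wire format.

```typescript
// Hypothetical stand-ins for Zod's string(), .optional(), and .nullish().
type Check = (value: unknown) => boolean;

const isString: Check = (v) => typeof v === "string";
// optional: undefined is allowed, null is not.
const optional = (inner: Check): Check => (v) => v === undefined || inner(v);
// nullish: both null and undefined are allowed.
const nullish = (inner: Check): Check => (v) =>
  v === null || v === undefined || inner(v);

// A None value on the Rust side arrives as an explicit null in the JSON.
const payload = JSON.parse(
  '{"action":"screenshot","selector":null,"path":"test.png"}'
);

const acceptedByOptional = optional(isString)(payload.selector); // false
const acceptedByNullish = nullish(isString)(payload.selector); // true
```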

* chore: add patch changeset for release (vercel-labs#242)

* chore: version packages (vercel-labs#243)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* feat: add support for ignoring HTTPS certificate errors (vercel-labs#93)

* feat: add support for ignoring HTTPS certificate errors

* fix: update warning message for already running daemon to include ignore HTTPS errors option

* docs: add documentation for --ignore-https-errors option in README and SKILL.md

* feat: initialize ignore_https_errors flag in command context

* fix: change launch_cmd to mutable for cdp value handling

* Add CLI flags for cookie URL, domain, path, httpOnly, secure, and expires (vercel-labs#266)

* Add CLI flags for cookie URL, domain, path, httpOnly, secure, and expires

Extends the `cookies set` command to support setting cookies with additional parameters before loading a page, solving authentication workflows where cookies need to be set for different domains.

**Key changes:**
- Added CLI flags: `--url`, `--domain`, `--path`, `--httpOnly`, `--secure`, `--sameSite`, `--expires`
- Added comprehensive test coverage for all new flags and combinations
- Updated help documentation with usage examples
- No daemon changes needed - it already supported these parameters

**Example usage:**
```bash
agent-browser cookies set session_id "abc123" --url https://app.example.com --httpOnly --secure
```

This allows setting cookies for a URL before opening the page, eliminating the need for workarounds in cross-domain authentication scenarios.

Fixes vercel-labs#261

* Update lock

* Fix compilation error

* Fix: CLI: state load / profile persistence not usable in v0.7.6 (vercel-labs#268)

* Fix: CLI: state load / profile persistence not usable in v0.7.6

This PR addresses issue vercel-labs#259

* Fix issues identified in code review

* generic placeholder for cloud browser provider (vercel-labs#260)

* docs: use generic placeholder for cloud browser provider

* docs: clarify available cloud browser providers

* Fix: set device does not apply deviceScaleFactor - HiDPI screenshots not possible (vercel-labs#270)

Fixes vercel-labs#255

* Fix: check command hangs indefinitely (vercel-labs#272)

Fixes vercel-labs#257

* feat: add Kernel as cloud browser provider (vercel-labs#200)

Add Kernel (https://kernel.sh) as a third-party cloud browser provider,
following the same pattern as Browserbase and Browser Use integrations.

Features:
- Launch browser with `-p kernel` flag or `AGENT_BROWSER_PROVIDER=kernel`
- Configurable via environment variables:
  - KERNEL_API_KEY (required)
  - KERNEL_HEADLESS (default: false)
  - KERNEL_STEALTH (default: true)
  - KERNEL_TIMEOUT_SECONDS (default: 300)
  - KERNEL_PROFILE_NAME (optional, for persistent sessions)
- Profile find-or-create: automatically creates profile if it doesn't exist
- Profile persistence: cookies/logins saved back to profile on session close
- Uses raw fetch() calls for API communication (no SDK dependency)
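
A minimal sketch of how the environment variables and defaults listed above could be read. The function and type names are hypothetical, and treating `"1"` as true is an assumption; the provider's actual parsing may differ.

```typescript
// Hypothetical config shape; field names are illustrative.
interface KernelConfig {
  apiKey: string;
  headless: boolean;
  stealth: boolean;
  timeoutSeconds: number;
  profileName: string | undefined;
}

type Env = Record<string, string | undefined>;

function parseBool(raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined || raw === "") return fallback;
  // Assumption: "true" (any case) and "1" count as true, everything else as false.
  return raw.toLowerCase() === "true" || raw === "1";
}

function readKernelConfig(env: Env): KernelConfig {
  const apiKey = env.KERNEL_API_KEY;
  if (!apiKey) throw new Error("KERNEL_API_KEY is required");

  const rawTimeout = env.KERNEL_TIMEOUT_SECONDS;
  const timeoutSeconds = rawTimeout ? Number(rawTimeout) : 300;
  if (!Number.isFinite(timeoutSeconds) || timeoutSeconds <= 0) {
    throw new Error("KERNEL_TIMEOUT_SECONDS must be a positive number");
  }

  return {
    apiKey,
    headless: parseBool(env.KERNEL_HEADLESS, false), // default: false
    stealth: parseBool(env.KERNEL_STEALTH, true), // default: true
    timeoutSeconds, // default: 300
    profileName: env.KERNEL_PROFILE_NAME || undefined, // optional
  };
}
```

Taking the environment as a parameter keeps the function testable; a caller would typically pass `process.env`.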

Co-authored-by: Claude Opus 4.5 <[email protected]>

* Security: Reject cross-origin connections to daemon and stream server (vercel-labs#274)

* Fix tab list command not recognizing new pages opened via clicks (vercel-labs#275)

## Summary

Fixed an issue where the `tab list` command couldn't recognize new pages that were opened externally (e.g., via `target="_blank"` links or popup windows). The problem occurred because context-level page tracking wasn't properly set up for all browser launch methods, causing new pages created outside of explicit `newTab()` calls to go untracked.

## Changes

- Added `setupContextTracking(context)` calls to `launch()`, `launchIncognito()`, and other context creation methods to ensure all contexts listen for new page events
- Added duplicate page checks (`!this.pages.includes(page)`) in `setupContextTracking()`, `newTab()`, and `launchIncognito()` to prevent the same page from being tracked multiple times
- Fixed `activePageIndex` calculation in `launch()` to properly set the active page index
- Enhanced comments to clarify that `setupContextTracking()` handles externally created pages (popups, new tabs from links)

## Implementation Details

The fix ensures that when a user clicks an element that opens a new tab/window, the browser context's 'page' event listener will automatically detect and track the new page. The duplicate prevention logic handles cases where both the context listener and manual page creation might try to add the same page.
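
The interaction between the context listener and the duplicate check can be sketched with stand-in types. `FakeContext` and `PageTracker` are hypothetical and only imitate the shape of a browser context's 'page' event; this is not the project's actual class.

```typescript
// Minimal stand-ins for a page and a browser context.
interface PageLike {
  url: string;
}
type PageListener = (page: PageLike) => void;

class FakeContext {
  private listeners: PageListener[] = [];
  on(_event: "page", listener: PageListener): void {
    this.listeners.push(listener);
  }
  // Simulates the context announcing a page it created (popup, _blank link).
  emitPage(page: PageLike): void {
    for (const listener of this.listeners) listener(page);
  }
}

class PageTracker {
  readonly pages: PageLike[] = [];
  activePageIndex = 0;

  // Catches pages opened outside of newTab().
  setupContextTracking(context: FakeContext): void {
    context.on("page", (page) => this.track(page));
  }

  newTab(context: FakeContext, url: string): PageLike {
    const page: PageLike = { url };
    context.emitPage(page); // the listener sees the new page first
    this.track(page); // manual tracking must not add it a second time
    this.activePageIndex = this.pages.indexOf(page);
    return page;
  }

  private track(page: PageLike): void {
    if (!this.pages.includes(page)) this.pages.push(page);
  }
}
```

Without the `includes` check, a page created through `newTab()` would be recorded twice: once by the listener and once by the manual push.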

Fixes vercel-labs#273

* chore(cli): sync Cargo.toml version to 0.7.6 (vercel-labs#276)

* chore(cli): save screenshots to tmp dir when no path provided (vercel-labs#247)

* fix(cli): save screenshots to tmp dir when no path provided

Instead of outputting base64 to stdout (which is not useful for most CLI use cases),
screenshots without a path now save to ~/.agent-browser/tmp/screenshots/ with a
generated filename and return the path.

This makes the behavior more ergonomic for AI agents and CLI users alike.
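
A minimal sketch of the fallback path generation. The directory comes from the message above; the `screenshot-<timestamp>.png` naming scheme and the function name are assumptions, and the path is joined with "/" for brevity rather than with a platform-aware helper.

```typescript
// Hypothetical helper: build a default path when the user gives none.
function defaultScreenshotPath(homeDir: string, now: Date): string {
  // ISO timestamps contain ":" and ".", which are awkward in file names.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return [
    homeDir,
    ".agent-browser",
    "tmp",
    "screenshots",
    `screenshot-${stamp}.png`,
  ].join("/");
}

// Use the explicit path if one was provided, otherwise fall back.
function resolveScreenshotPath(
  requested: string | null | undefined,
  homeDir: string,
  now: Date
): string {
  return requested ?? defaultScreenshotPath(homeDir, now);
}
```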

* cleanup

* cleanup

* just revert the cargo.lock version for now

* refactor: extract getAppDir() from getSocketDir()

* docs: improve screenshot help text consistency

* chore: add minor changeset for release (vercel-labs#280)

* chore: version packages (vercel-labs#281)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* v0.8.0 changelog (vercel-labs#282)

* header (vercel-labs#283)

* fix: CLI binary not executable when postinstall is skipped (pnpm, bun) (vercel-labs#285)

* fix binary

* check binary in CI

* chore: add patch changeset for release (vercel-labs#286)

* chore: version packages (vercel-labs#288)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* ci: add retry logic to flaky Windows integration test (vercel-labs#287)

* durable windows ci

* more

* ci: add test for Windows CMD wrapper (vercel-labs#289)

* ci: add test for Windows CMD wrapper

This test will fail until the CMD wrapper is fixed to call the native binary.

* fix: Windows CMD wrapper calls native binary instead of missing index.js

* chore: add patch changeset for release (vercel-labs#291)

* chore: version packages (vercel-labs#292)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* test: add Windows npm global install CI test (reproduces vercel-labs#262) (vercel-labs#293)

* test: add Windows npm global install CI test (reproduces vercel-labs#262)

This test packs the package and installs it globally with npm,
then runs agent-browser --version. This reproduces the issue where
npm-generated shims on Windows try to invoke /bin/sh which doesn't exist.

The bin/agent-browser.js wrapper is added but not yet wired up,
so this commit should fail CI to confirm the issue.

* fix: Windows npm global install and npx support

The shell script wrapper (bin/agent-browser) with #!/bin/sh shebang
causes npm to generate Windows shims that try to invoke /bin/sh,
which doesn't exist on Windows.

This fix uses a hybrid approach:

1. Node.js wrapper (bin/agent-browser.js) as bin entry
   - Makes npx work on all platforms
   - ~100ms overhead (acceptable since npx has its own overhead)

2. postinstall patches bin entries for global installs
   - Windows: Overwrites .cmd/.ps1 shims to invoke .exe directly
   - Mac/Linux: Replaces symlink to point to native binary
   - Zero overhead for `npm i -g agent-browser` users on all platforms

Also fixes PowerShell glob expansion in CI test.

Fixes vercel-labs#262
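
Both the Node.js wrapper and the postinstall patch need to locate the native binary for the current platform. A sketch of that lookup follows, assuming the `agent-browser-<platform>-<arch>` pattern seen in `agent-browser-darwin-arm64` holds for every target and that Windows builds add `.exe`; the function name is hypothetical.

```typescript
// Hypothetical helper: derive the native binary file name for a platform.
function nativeBinaryName(platform: string, arch: string): string {
  // Assumption: only Windows binaries carry an extension.
  const suffix = platform === "win32" ? ".exe" : "";
  return `agent-browser-${platform}-${arch}${suffix}`;
}

const macBinary = nativeBinaryName("darwin", "arm64");
const windowsBinary = nativeBinaryName("win32", "x64");
```

In a real wrapper the arguments would typically come from `process.platform` and `process.arch`.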

* fix: Windows npm global install and npx support

The shell script wrapper (bin/agent-browser) with #!/bin/sh shebang
causes npm to generate Windows shims that try to invoke /bin/sh,
which doesn't exist on Windows.

This fix uses a hybrid approach:

1. Node.js wrapper (bin/agent-browser.js) as bin entry
   - Makes npx work on all platforms
   - ~100ms overhead (acceptable since npx has its own overhead)

2. postinstall patches bin entries for global installs
   - Windows: Overwrites .cmd/.ps1 shims to invoke .exe directly
   - Mac/Linux: Replaces symlink to point to native binary
   - Zero overhead for `npm i -g agent-browser` users on all platforms

Also adds cross-platform CI tests for npm global install to catch
regressions on all platforms (Ubuntu, macOS, Windows).

Fixes vercel-labs#262

* test global install

* remove dead code

* chore: add patch changeset for release (vercel-labs#294)

* chore: version packages (vercel-labs#295)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix daemon not found (vercel-labs#299)

* ci(version): add version sync check between package.json and Cargo.toml (vercel-labs#277)

Add automated verification that package.json and cli/Cargo.toml versions
stay in sync. This prevents version drift between the npm package and
Rust CLI binary.

- Add CI job to check version sync on push/PR
- Update pre-commit hook to sync versions automatically
- Update ci:version script to include version sync step
- Add check-version-sync.js script for CI validation
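
The core comparison in such a check can be sketched as below. This is not the contents of check-version-sync.js; the function names are hypothetical and the Cargo.toml parsing is a deliberately small line scan rather than a full TOML parser.

```typescript
function packageJsonVersion(text: string): string {
  const version: unknown = JSON.parse(text).version;
  if (typeof version !== "string") {
    throw new Error("package.json has no version");
  }
  return version;
}

function cargoTomlVersion(text: string): string {
  // Only read the [package] table so dependency versions are ignored.
  let inPackage = false;
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed.startsWith("[")) {
      inPackage = trimmed === "[package]";
      continue;
    }
    if (!inPackage) continue;
    const found = /^version\s*=\s*"([^"]+)"/.exec(trimmed)?.[1];
    if (found) return found;
  }
  throw new Error("Cargo.toml has no [package] version");
}

function versionsInSync(packageJson: string, cargoToml: string): boolean {
  return packageJsonVersion(packageJson) === cargoTomlVersion(cargoToml);
}
```

A CI job would read both files, call `versionsInSync`, and exit with a non-zero status when it returns false.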

* v0.8.4 changeset (vercel-labs#300)

* chore: version packages (vercel-labs#301)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: sync Cargo.lock when version changes (vercel-labs#302)

Update sync-version.js to also run `cargo update -p agent-browser` after
updating Cargo.toml, keeping Cargo.lock in sync. Also update pre-commit
hook to stage Cargo.lock along with Cargo.toml.

This commit also brings Cargo.lock up to date (was stuck at 0.7.6).

* Resolve version and launch conflicts

---------

Co-authored-by: Paul Klein <[email protected]>
Co-authored-by: Kylejeong2 <[email protected]>
Co-authored-by: Aitor <[email protected]>
Co-authored-by: Rafael <[email protected]>
Co-authored-by: Claude Opus 4.5 <[email protected]>
Co-authored-by: Oanakiaja <[email protected]>
Co-authored-by: Chris Tate <[email protected]>
Co-authored-by: mypengpengli <[email protected]>
Co-authored-by: jiazhuangai <[email protected]>
Co-authored-by: mmhiyoko <[email protected]>
Co-authored-by: Shpeedle <[email protected]>
Co-authored-by: Tom Dale <[email protected]>
Co-authored-by: Lindsey Simon <[email protected]>
Co-authored-by: Namish pruthi <[email protected]>
Co-authored-by: Márk Magyar <[email protected]>
Co-authored-by: Danila Poyarkov <[email protected]>
Co-authored-by: Yonatan <[email protected]>
Co-authored-by: TimWhite <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Zach Warunek <[email protected]>
Co-authored-by: Zhiwei Li <[email protected]>
Co-authored-by: shawn pana <[email protected]>
Co-authored-by: n33pm <[email protected]>
Co-authored-by: Li Yang <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: opencode-agent[bot] <opencode-agent[bot]@users.noreply.github.com>