4 changes: 4 additions & 0 deletions mcp-server/.gitignore
@@ -0,0 +1,4 @@
node_modules/
dist/
*.log
.DS_Store
20 changes: 20 additions & 0 deletions mcp-server/.npmignore
@@ -0,0 +1,20 @@
# Source files
src/
test/
*.test.ts

# Config files
tsconfig.json
vitest.config.ts
.prettierrc

# Build artifacts
node_modules/
*.log

# Git
.git/
.gitignore

# Development
PULL_REQUEST_TEMPLATE.md
41 changes: 41 additions & 0 deletions mcp-server/CHANGELOG.md
@@ -0,0 +1,41 @@
# Changelog

All notable changes to agent-browser-mcp-server will be documented in this file.

## [2.0.0] - 2026-01-22

### Added
- 🎉 **50+ browser automation tools** for LLMs
- Direct `BrowserManager` integration (no subprocess overhead)
- Full navigation support (navigate, back, forward, reload)
- Complete interaction tools (click, fill, type, hover, select, etc.)
- Information gathering (snapshot, get text, screenshots, etc.)
- Multi-tab support (new, switch, close, list)
- Cookie management (get, set, clear)
- Storage management (localStorage, sessionStorage)
- Frame handling (switch to iframe, back to main)
- Dialog handling (accept, dismiss)
- Network request tracking
- Viewport and geolocation settings
- Console and error monitoring
- Session management for parallel browsers

### Changed
- Architecture: Direct import instead of CLI subprocess
- Headed mode by default (browser window visible)
- Improved error handling and reporting
- Better TypeScript types

### Documentation
- Complete README with 50+ tools documented
- SETUP.md with step-by-step installation
- Example config files for Cursor and Claude Desktop
- Troubleshooting guide

## [1.0.0] - Initial Release

### Added
- Basic MCP server implementation
- CLI subprocess approach
- Core browser commands
- README documentation
209 changes: 209 additions & 0 deletions mcp-server/README.md
@@ -0,0 +1,209 @@
# agent-browser MCP Server

[Model Context Protocol](https://modelcontextprotocol.io) server for [agent-browser](https://github.com/vercel-labs/agent-browser). Enables LLMs to control browsers through 50+ automation tools.

## Quick Start

### Prerequisites

```bash
npm install -g agent-browser
agent-browser install
```

### Option 1: NPX (Recommended)

No separate install of the server is needed; `npx` fetches it on first run. Add it to your MCP client config:

**Cursor** (`~/.cursor/mcp.json`):
```json
{
  "mcpServers": {
    "agent-browser": {
      "command": "npx",
      "args": ["-y", "@agent-browser/mcp-server"]
    }
  }
}
```

**Claude Desktop**:
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`

```json
{
  "mcpServers": {
    "agent-browser": {
      "command": "npx",
      "args": ["-y", "@agent-browser/mcp-server"]
    }
  }
}
```

### Option 2: Global Install

```bash
npm install -g @agent-browser/mcp-server
```

**Configuration**:
```json
{
  "mcpServers": {
    "agent-browser": {
      "command": "agent-browser-mcp"
    }
  }
}
```

### Option 3: Local Development

```bash
cd mcp-server
npm install
npm run build
```

**Configuration**:
```json
{
  "mcpServers": {
    "agent-browser": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-server/dist/index.js"]
    }
  }
}
```
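
All three options register the server under the same `mcpServers` key, so adding it to a config file that already lists other servers is a merge rather than an overwrite. The sketch below illustrates that merge on plain objects. `addMcpServer` and the `other` entry are hypothetical names used only for this example; they are not part of this package or of any MCP client.

```javascript
// Hypothetical helper: returns a copy of an MCP client config
// with one server entry added, leaving existing entries untouched.
function addMcpServer(config, name, entry) {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

// An existing config that already registers another (made-up) server.
const existing = { mcpServers: { other: { command: "other-mcp" } } };

const updated = addMcpServer(existing, "agent-browser", {
  command: "npx",
  args: ["-y", "@agent-browser/mcp-server"],
});

// Print the merged config in the same shape as the JSON snippets above.
console.log(JSON.stringify(updated, null, 2));
```

The helper copies rather than mutates, so the original object can still be written back unchanged if the merged result is rejected.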

## Usage

Once configured, the LLM can drive the browser from natural-language prompts such as:

```
"Open https://example.com and take a screenshot"
"Navigate to GitHub and click the login button"
"Fill the email field with [email protected]"
```

## Tools

The server provides 50+ browser automation tools:

### Navigation
- `browser_navigate`, `browser_back`, `browser_forward`, `browser_reload`

### Interactions
- `browser_click`, `browser_fill`, `browser_type`, `browser_press`
- `browser_hover`, `browser_select`, `browser_check`, `browser_scroll`
- `browser_drag`, `browser_upload`

### Information
- `browser_snapshot` - Get accessibility tree with refs (AI-optimized)
- `browser_get_text`, `browser_get_html`, `browser_get_value`
- `browser_get_title`, `browser_get_url`, `browser_screenshot`

### Tabs
- `browser_tab_new`, `browser_tab_switch`, `browser_tab_close`, `browser_tab_list`

### Storage
- `browser_cookies_get`, `browser_cookies_set`, `browser_cookies_clear`
- `browser_storage_get`, `browser_storage_set`, `browser_storage_clear`

### Advanced
- `browser_evaluate` - Execute JavaScript
- `browser_frame_switch` - Work with iframes
- `browser_dialog_accept` - Handle alerts/confirms
- `browser_network_requests` - Track network activity

[View complete tool list](SETUP.md#available-tools-50)

## Workflow Pattern

```javascript
// 1. Navigate
browser_navigate({ url: "https://example.com" })

// 2. Get page structure
browser_snapshot({ interactive: true, compact: true })

// 3. Interact using refs
browser_click({ selector: "@e5" }) // Use ref from snapshot
browser_fill({ selector: "@e3", value: "text" })

// 4. Extract data
browser_get_text({ selector: "@e1" })
browser_screenshot({ fullPage: false })
```
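
Step 3 above depends on picking the right ref out of the snapshot. As a rough illustration, the sketch below looks up a ref by role and accessible name. It assumes snapshot lines shaped like `- button "Sign in" [ref=e5]`; that line format, the sample text, and the `findRef` helper are assumptions made for this example, so check them against real `browser_snapshot` output before relying on them.

```javascript
// Assumed snapshot line shape: - role "accessible name" [ref=eN]
const REF_LINE = /^\s*-\s+(\w+)\s+"([^"]*)".*\[ref=(e\d+)\]/;

// Hypothetical helper: return the "@eN" selector for the first line
// matching the given role and name, or null if there is none.
function findRef(snapshotText, role, name) {
  for (const line of snapshotText.split("\n")) {
    const match = REF_LINE.exec(line);
    if (match && match[1] === role && match[2] === name) {
      return "@" + match[3];
    }
  }
  return null;
}

// Made-up snapshot text in the assumed format.
const sample = [
  '- heading "Example Domain" [ref=e1]',
  '- textbox "Email" [ref=e3]',
  '- button "Sign in" [ref=e5]',
].join("\n");

console.log(findRef(sample, "button", "Sign in")); // "@e5"
console.log(findRef(sample, "link", "Missing"));   // null
```

The returned string can be passed as the `selector` argument shown in the workflow, for example `browser_click({ selector: "@e5" })`.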

## Architecture

```
LLM (Claude/Cursor)
    ↓ MCP Protocol
MCP Server (this package)
    ↓ Direct import
BrowserManager (agent-browser)
    ↓
Playwright
```

This server imports `BrowserManager` from agent-browser directly instead of spawning the CLI as a subprocess, which avoids subprocess overhead on each tool call.
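
To make the in-process layering concrete, here is a minimal sketch of how a tool name could be routed to a manager method without spawning a process. It is not the server's actual code: `fakeManager` is a stand-in object, and its `navigate` and `click` methods are invented for illustration rather than taken from the real `BrowserManager` API.

```javascript
// Stand-in for the real manager; these method names are invented for the sketch.
const fakeManager = {
  async navigate(url) {
    return `navigated to ${url}`;
  },
  async click(selector) {
    return `clicked ${selector}`;
  },
};

// Tool name -> handler that forwards the tool arguments to the manager in-process.
const handlers = new Map([
  ["browser_navigate", (args) => fakeManager.navigate(args.url)],
  ["browser_click", (args) => fakeManager.click(args.selector)],
]);

// Look up the handler for a tool call and run it; unknown tools reject.
async function callTool(name, args) {
  const handler = handlers.get(name);
  if (!handler) {
    throw new Error(`Unknown tool: ${name}`);
  }
  return handler(args);
}

callTool("browser_navigate", { url: "https://example.com" }).then(console.log);
```

A `Map` is used for the lookup so that names such as `toString` are treated as unknown tools rather than resolving to inherited object properties.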

## Development

```bash
npm run dev # Watch mode
npm run build # Production build
npm run format # Format code with Prettier
npm test # Run tests
```

## Configuration Options

### Headless Mode

The browser runs headed (window visible) by default. To run headless, edit `src/index.ts` (around line 680) and rebuild:

```typescript
headless: false, // Change to true for headless mode
```

### Session Management

Multiple browser instances:

```javascript
browser_navigate({ url: "https://site1.com", session: "task1" })
browser_navigate({ url: "https://site2.com", session: "task2" })
```
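
One way to picture the `session` parameter is as a key into a registry of independent browser managers, each created the first time its name is used. The sketch below shows that idea only; `createManager`, `getSession`, and the `"default"` fallback name are hypothetical and do not describe how this server actually stores sessions.

```javascript
// Hypothetical factory; a real server would create a browser manager here.
function createManager(sessionName) {
  return { sessionName, history: [] };
}

// Session name -> manager instance.
const sessions = new Map();

// Return the manager for a session, creating it on first use.
// The "default" fallback name is an assumption for this sketch.
function getSession(name = "default") {
  if (!sessions.has(name)) {
    sessions.set(name, createManager(name));
  }
  return sessions.get(name);
}

// Two sessions keep separate state, mirroring the two calls above.
getSession("task1").history.push("https://site1.com");
getSession("task2").history.push("https://site2.com");

console.log(getSession("task1").history);
console.log(sessions.size);
```

Creating managers lazily means a session costs nothing until a tool call names it.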

## Troubleshooting

**agent-browser not found**
```bash
npm install -g agent-browser
agent-browser install
```

**Module not found**
- Ensure agent-browser is installed globally
- Check the path in MCP config is correct

**Linux dependencies**
```bash
agent-browser install --with-deps
```

See [SETUP.md](SETUP.md) for detailed troubleshooting.

## License

Apache-2.0

## Credits

Built on [agent-browser](https://github.com/vercel-labs/agent-browser) by Vercel Labs.