Control web browsers through CLI commands for automation tasks.
hyper-agent-browser (hab) is a browser automation CLI that lets you:
- Navigate web pages
- Interact with elements (click, fill, type)
- Extract page information via snapshots
- Maintain login sessions across invocations
- Download files with authentication
- Handle login detection and captchas
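| Skill | Purpose |
|---|---|
| login-detection.md | Smart login detection; automatically switches to headed mode |
| retry-strategy.md | Retry strategy for failed operations |
| download-guide.md | File download guide |
| captcha-handling.md | Captcha handling |
| web-monitoring.md | Web page monitoring |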
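Default behavior: always use the default session so that login state persists.
- Do not pass the `-s` option unless the user explicitly asks for a new session or a specifically named session.
- This reuses existing login state (cookies, LocalStorage) and avoids repeated logins.
- Use `-s <name>` only when the user makes an explicit request such as "create a new session", "use the xxx session", or "an isolated browser".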
# ✅ Default usage (reuses login state)
hab open https://example.com
hab snapshot -i
# ✅ Use a named session only when the user explicitly asks for one
hab -s gmail open https://mail.google.com # user said "use the gmail session"
hab -s isolated open https://example.com # user said "create a new isolated environment"

- Open a page: `hab open <url>`
- Get a snapshot: `hab snapshot -i` to see interactive elements
- Analyze the snapshot: find target element references (`@e1`, `@e2`, etc.)
- Execute an action: `hab click @e5` or `hab fill @e3 "text"`
- Repeat until the task is complete
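The "analyze snapshot" step can be scripted. The following is a minimal sketch, not part of hab itself: `find_ref` is a hypothetical helper, and it assumes the snapshot lists one element per line in the form `@eN [role] "label"`, modeled on the sample output shown in the login example. The real `hab snapshot -i` output may be laid out differently, so adjust the parsing accordingly.

```shell
#!/bin/sh
# find_ref ROLE LABEL  (hypothetical helper, not a hab command)
# Reads snapshot text on stdin and prints the first element reference whose
# role and label match exactly. Exits non-zero when nothing matches.
# ASSUMPTION: one element per line, formatted as: @eN [role] "label"
find_ref() {
  awk -v role="[$1]" -v label="\"$2\"" '
    $1 ~ /^@e[0-9]+$/ && $2 == role {
      # Everything after "] " is treated as the quoted label.
      rest = substr($0, index($0, "] ") + 2)
      if (rest == label) { print $1; found = 1; exit }
    }
    END { exit (found ? 0 : 1) }
  '
}

# Possible usage, assuming `hab snapshot -i` writes the snapshot to stdout:
#   ref=$(hab snapshot -i | find_ref button "Sign in") && hab click "$ref"
```

Matching on role plus exact label keeps the lookup from picking up a similarly named element; labels containing backslashes would need extra care because `awk -v` interprets escape sequences.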
- `hab open <url>` - Open URL
- `hab reload` - Refresh page
- `hab back` / `hab forward` - Navigate history

- `hab click <selector>` - Click element
- `hab fill <selector> <value>` - Fill input (clears first)
- `hab type <selector> <text>` - Type text (no clear)
- `hab press <key>` - Press key (Enter, Tab, Escape, etc.)

- `hab snapshot -i` - Get interactive elements (MOST IMPORTANT)
- `hab url` - Get current URL
- `hab title` - Get page title

- `hab download <selector>` - Download by clicking element
- `hab download <selector> -o <path>` - Specify output path
- `hab download-url <url>` - Download directly from URL

- `hab --session <name> <cmd>` - Use named session
- `hab sessions` - List sessions
- `hab close` - Close browser

- `@e1`, `@e2`, ... - Element references from snapshot (preferred)
- `css=.class` - CSS selector
- `text=Click me` - Text content match
- `xpath=//button` - XPath
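When a script builds selectors dynamically, it can help to check which of the four forms above a string uses before handing it to hab. The sketch below is illustrative only: `selector_kind` is a hypothetical helper, and the check is a loose prefix match rather than a full validation of what hab accepts.

```shell
#!/bin/sh
# selector_kind SELECTOR  (hypothetical helper, not a hab command)
# Prints which selector form the argument appears to use, based on its prefix.
# This is a loose check: "@e" followed by a digit counts as a snapshot ref.
selector_kind() {
  case "$1" in
    @e[0-9]*) echo ref ;;
    css=?*)   echo css ;;
    text=?*)  echo text ;;
    xpath=?*) echo xpath ;;
    *)        echo unknown ;;
  esac
}

# Possible usage: warn when a script is not using the preferred snapshot refs.
#   [ "$(selector_kind "$sel")" = ref ] || echo "note: $sel is not a snapshot ref" >&2
```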
hab --headed -s mysite open https://example.com/login
hab snapshot -i
# Output shows: @e3 [textbox] "Email", @e4 [textbox] "Password", @e5 [button] "Sign in"
hab fill @e3 "user@example.com"
hab fill @e4 "password123"
hab click @e5
hab wait navigation
hab snapshot -i
# Verify login success
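
The final "verify login success" step could also be checked from the URL instead of reading the snapshot by eye. This is a rough sketch under two assumptions that the text above does not confirm: that `hab url` prints the current URL to stdout, and that the site's login pages contain `/login` in their path. `looks_logged_in` is a hypothetical helper name.

```shell
#!/bin/sh
# looks_logged_in URL  (hypothetical helper, not a hab command)
# Returns 0 when the URL no longer looks like a login page, 1 otherwise.
# ASSUMPTION: login pages contain "/login" in the URL; adapt per site.
looks_logged_in() {
  case "$1" in
    "")       return 1 ;;  # no URL at all: treat as not logged in
    */login*) return 1 ;;  # still on (or redirected back to) a login page
    *)        return 0 ;;
  esac
}

# Possible usage after `hab wait navigation`:
#   if looks_logged_in "$(hab url)"; then
#     echo "login appears to have succeeded"
#   else
#     echo "still on the login page" >&2
#   fi
```

A URL check is only a heuristic; sites that show login errors without changing the URL, or that keep `/login` in post-login redirects, need a snapshot-based check instead.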