Browser automation actions for Jido AI agents.
JidoBrowser provides a set of Jido Actions for web browsing, enabling AI agents to navigate, interact with, and extract content from web pages. It uses an adapter pattern to support multiple browser automation backends.
Add jido_browser to your dependencies:
def deps do
  [
    {:jido_browser, "~> 0.1.0"}
  ]
end

JidoBrowser supports multiple browser backends via adapters:
Vibium (Recommended)

npm install -g vibium

chrismccord/web

# Download from https://github.com/chrismccord/web
# Or build from source
git clone https://github.com/chrismccord/web
cd web && make && sudo cp web /usr/local/bin/

# Start a browser session
{:ok, session} = JidoBrowser.start_session()
# Navigate to a page
{:ok, _} = JidoBrowser.navigate(session, "https://example.com")
# Click an element
{:ok, _} = JidoBrowser.click(session, "button#submit")
# Type into an input
{:ok, _} = JidoBrowser.type(session, "input#search", "hello world")
# Take a screenshot
{:ok, %{bytes: png_data}} = JidoBrowser.screenshot(session)
# Extract page content as markdown (great for LLMs)
{:ok, %{content: markdown}} = JidoBrowser.extract_content(session)
# End session
:ok = JidoBrowser.end_session(session)

JidoBrowser actions integrate seamlessly with Jido agents:
defmodule MyBrowsingAgent do
  use Jido.Agent,
    name: "web_browser",
    description: "An agent that can browse the web",
    tools: [
      JidoBrowser.Actions.Navigate,
      JidoBrowser.Actions.Click,
      JidoBrowser.Actions.Type,
      JidoBrowser.Actions.Screenshot,
      JidoBrowser.Actions.ExtractContent
    ]

  # Inject browser session via on_before_cmd hook
  def on_before_cmd(_agent, _cmd, context) do
    {:ok, session} = JidoBrowser.start_session()
    {:ok, Map.put(context, :tool_context, %{session: session})}
  end
end

config :jido_browser,
  adapter: JidoBrowser.Adapters.Vibium,
  timeout: 30_000

# Vibium-specific options
config :jido_browser, :vibium,
  binary_path: "/usr/local/bin/vibium",
  port: 9515

# Web adapter options
config :jido_browser, :web,
  binary_path: "/usr/local/bin/web",
  profile: "default"

Vibium:

- WebDriver BiDi protocol (standards-based)
- Automatic Chrome download
- ~10MB Go binary
- Built-in MCP server

chrismccord/web:

- Firefox-based via Selenium
- Built-in HTML to Markdown conversion
- Phoenix LiveView-aware
- Session persistence with profiles
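
Because the adapter is plain application config, it can be chosen per environment. The following is a minimal sketch for `config/runtime.exs`, not a documented setup: the `BROWSER_ADAPTER` environment variable is hypothetical, and `JidoBrowser.Adapters.Web` is an assumed module name (only `JidoBrowser.Adapters.Vibium` appears in this README).

```elixir
# config/runtime.exs
import Config

# BROWSER_ADAPTER is a hypothetical environment variable used for illustration.
adapter =
  case System.get_env("BROWSER_ADAPTER", "vibium") do
    # Assumed module name; only JidoBrowser.Adapters.Vibium is shown in this README.
    "web" -> JidoBrowser.Adapters.Web
    _ -> JidoBrowser.Adapters.Vibium
  end

config :jido_browser,
  adapter: adapter,
  timeout: 30_000
```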

| Action | Description |
|---|---|
| StartSession | Start a new browser session |
| EndSession | End the current session |
| GetStatus | Get session status (url, title, alive) |
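
Every started session should eventually be ended, including when the work in between fails. A minimal sketch of a wrapper that guarantees this, relying only on `JidoBrowser.start_session/0` and `JidoBrowser.end_session/1` as used earlier in this README. The `BrowserSessionHelper` module and its `browser` parameter are hypothetical; the parameter exists only so a stub module can stand in for JidoBrowser.

```elixir
defmodule BrowserSessionHelper do
  @moduledoc """
  Illustrative helper (not part of JidoBrowser): runs a function with a
  browser session and always ends the session afterwards.
  """

  # `browser` defaults to JidoBrowser; it is a parameter only so the helper
  # can be exercised with a stub module.
  def with_session(fun, browser \\ JidoBrowser) when is_function(fun, 1) do
    with {:ok, session} <- browser.start_session() do
      try do
        fun.(session)
      after
        # Runs even if `fun` raises, so the session is not leaked.
        browser.end_session(session)
      end
    end
  end
end
```

A call such as `BrowserSessionHelper.with_session(fn session -> JidoBrowser.extract_content(session) end)` returns whatever the function returns. If the session cannot be started, the non-matching result of `start_session/0` is returned unchanged.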

| Action | Description |
|---|---|
| Navigate | Navigate to a URL |
| Back | Go back in browser history |
| Forward | Go forward in browser history |
| Reload | Reload current page |
| GetUrl | Get current page URL |
| GetTitle | Get current page title |

| Action | Description |
|---|---|
| Click | Click an element by CSS selector |
| Type | Type text into an input element |
| Hover | Hover over an element |
| Focus | Focus on an element |
| Scroll | Scroll page or element |
| SelectOption | Select option from dropdown |
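
Interactions are usually chained, for example typing into a field and then clicking a button, and a failed step should stop the chain. A minimal sketch of such a runner, assuming only the `click/2` and `type/3` functions shown earlier and that success is reported as `{:ok, _}`. The `BrowserSteps` module, the step tuples, and the `browser` parameter are hypothetical names introduced for illustration; the shape of error results is not specified in this README, so any non-`{:ok, _}` value is treated as a failure.

```elixir
defmodule BrowserSteps do
  @moduledoc """
  Illustrative helper (not part of JidoBrowser): runs interaction steps in
  order and stops at the first one that does not return `{:ok, _}`.
  """

  # Each step is `{:click, selector}` or `{:type, selector, text}`.
  # `browser` defaults to JidoBrowser and is a parameter so a stub can be used.
  def run(session, steps, browser \\ JidoBrowser) do
    Enum.reduce_while(steps, :ok, fn step, :ok ->
      case apply_step(browser, session, step) do
        {:ok, _result} -> {:cont, :ok}
        # Report which step failed together with the raw result.
        other -> {:halt, {:error, {step, other}}}
      end
    end)
  end

  defp apply_step(browser, session, {:click, selector}),
    do: browser.click(session, selector)

  defp apply_step(browser, session, {:type, selector, text}),
    do: browser.type(session, selector, text)
end
```

With a live session, `BrowserSteps.run(session, [{:type, "input#search", "hello world"}, {:click, "button#submit"}])` would perform the two interactions from the quick start in sequence.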

| Action | Description |
|---|---|
| Wait | Wait for specified milliseconds |
| WaitForSelector | Wait for element (visible/hidden/attached/detached) |
| WaitForNavigation | Wait for page navigation |
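
Conceptually, waiting for a condition means checking it repeatedly until it holds or a deadline passes. The sketch below illustrates that idea in plain Elixir. It is not JidoBrowser's implementation, which this README does not describe; the `PollUntil` module, its default timeout, and its polling interval are all hypothetical.

```elixir
defmodule PollUntil do
  @moduledoc """
  Illustrative polling loop (not JidoBrowser's implementation): calls `check`
  until it returns a truthy value or `timeout_ms` elapses.
  """

  def wait(check, timeout_ms \\ 5_000, interval_ms \\ 100) when is_function(check, 0) do
    # Monotonic time is unaffected by wall-clock adjustments.
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    loop(check, deadline, interval_ms)
  end

  defp loop(check, deadline, interval_ms) do
    case check.() do
      result when result in [nil, false] ->
        if System.monotonic_time(:millisecond) >= deadline do
          {:error, :timeout}
        else
          Process.sleep(interval_ms)
          loop(check, deadline, interval_ms)
        end

      value ->
        {:ok, value}
    end
  end
end
```

Here `check` could wrap any query against the session, such as a visibility check. The check always runs at least once, so a condition that already holds returns immediately even with a zero timeout.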

| Action | Description |
|---|---|
| Query | Query elements matching selector |
| GetText | Get text content of element |
| GetAttribute | Get attribute value from element |
| IsVisible | Check if element is visible |

| Action | Description |
|---|---|
| Snapshot | Get comprehensive page snapshot (LLM-optimized) |
| Screenshot | Capture page screenshot |
| ExtractContent | Extract page content as markdown/HTML |
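
A screenshot result carries raw bytes, so persisting it is a matter of writing them to a file. A minimal sketch, assuming the `{:ok, %{bytes: ...}}` result shape from the quick start and that the bytes are PNG data (inferred from the `png_data` variable name there, not confirmed elsewhere in this README). The `ScreenshotSaver` module is hypothetical.

```elixir
defmodule ScreenshotSaver do
  @moduledoc """
  Illustrative helper (not part of JidoBrowser): writes screenshot bytes to
  disk after checking that they start with the PNG signature.
  """

  # The pattern matches the 8-byte PNG signature: 89 50 4E 47 0D 0A 1A 0A.
  def save({:ok, %{bytes: <<0x89, "PNG\r\n", 0x1A, "\n", _rest::binary>> = bytes}}, path) do
    with :ok <- File.write(path, bytes), do: {:ok, path}
  end

  # Bytes were returned but do not look like a PNG file.
  def save({:ok, %{bytes: _other}}, _path), do: {:error, :not_png}

  # Anything else is passed through as a failed screenshot.
  def save(other, _path), do: {:error, {:screenshot_failed, other}}
end
```

With a live session this could be called as `ScreenshotSaver.save(JidoBrowser.screenshot(session), "page.png")`.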

| Action | Description |
|---|---|
| Evaluate | Execute arbitrary JavaScript |

The recommended way to use JidoBrowser with Jido agents is via the Skill:

defmodule MyBrowsingAgent do
  use Jido.Agent,
    name: "web_browser",
    description: "An agent that can browse the web",
    skills: [{JidoBrowser.Skill, [headless: true]}]
end

The Skill provides:

- Session lifecycle management
- 26 browser automation actions
- Signal routing (`browser.*` patterns)
- Error diagnostics with page context

Apache-2.0 - See LICENSE for details.