agentjido/jido_browser

Jido Browser

Browser automation actions for Jido AI agents.

Overview

JidoBrowser provides a set of Jido Actions for web browsing, enabling AI agents to navigate, interact with, and extract content from web pages. It uses an adapter pattern to support multiple browser automation backends.

Installation

Add jido_browser to the deps list in your mix.exs:

def deps do
  [
    {:jido_browser, "~> 0.1.0"}
  ]
end

Browser Backend

JidoBrowser supports multiple browser backends via adapters:

Vibium (Recommended)

npm install -g vibium

chrismccord/web

# Download from https://github.com/chrismccord/web
# Or build from source
git clone https://github.com/chrismccord/web
cd web && make && sudo cp web /usr/local/bin/

Quick Start

# Start a browser session
{:ok, session} = JidoBrowser.start_session()

# Navigate to a page
{:ok, _} = JidoBrowser.navigate(session, "https://example.com")

# Click an element
{:ok, _} = JidoBrowser.click(session, "button#submit")

# Type into an input
{:ok, _} = JidoBrowser.type(session, "input#search", "hello world")

# Take a screenshot
{:ok, %{bytes: png_data}} = JidoBrowser.screenshot(session)

# Extract page content as markdown (great for LLMs)
{:ok, %{content: markdown}} = JidoBrowser.extract_content(session)

# End session
:ok = JidoBrowser.end_session(session)
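
The calls above pattern-match on {:ok, _}, so a failed step raises a MatchError and can leave the session open. The following is a minimal sketch of a wrapper that returns errors instead and always closes the session. It uses only the functions shown above and assumes failures come back as {:error, reason}, which this README does not confirm. PageFetcher and the browser argument are hypothetical names; the argument defaults to JidoBrowser and exists so the flow can be exercised with a stub module.

```elixir
defmodule PageFetcher do
  @moduledoc "Hypothetical helper: fetch a page as markdown and always close the session."

  # `browser` is any module exposing the functions used in the Quick Start.
  def fetch_markdown(url, browser \\ JidoBrowser) do
    with {:ok, session} <- browser.start_session() do
      try do
        with {:ok, _} <- browser.navigate(session, url),
             {:ok, %{content: markdown}} <- browser.extract_content(session) do
          {:ok, markdown}
        end
      after
        # Runs whether the steps above succeeded, returned an error, or raised.
        browser.end_session(session)
      end
    end
  end
end
```

Any value that does not match {:ok, ...} is returned to the caller unchanged, so the caller can decide how to handle it.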

Using with Jido Agents

JidoBrowser actions can be registered as tools on a Jido agent:

defmodule MyBrowsingAgent do
  use Jido.Agent,
    name: "web_browser",
    description: "An agent that can browse the web",
    tools: [
      JidoBrowser.Actions.Navigate,
      JidoBrowser.Actions.Click,
      JidoBrowser.Actions.Type,
      JidoBrowser.Actions.Screenshot,
      JidoBrowser.Actions.ExtractContent
    ]

  # Inject a browser session via the on_before_cmd hook.
  # Note: as written, this starts a new session each time the hook runs.
  def on_before_cmd(_agent, _cmd, context) do
    {:ok, session} = JidoBrowser.start_session()
    {:ok, Map.put(context, :tool_context, %{session: session})}
  end
end
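
The hook in this example calls JidoBrowser.start_session() every time it runs. If the context handed to the hook can already carry a session (an assumption; this README does not say whether context persists between commands), a small guard avoids opening extra browsers. SessionContext and ensure_session/2 are hypothetical names, not part of JidoBrowser.

```elixir
defmodule SessionContext do
  @moduledoc "Hypothetical helper, not part of JidoBrowser."

  # A session is already present under :tool_context, so reuse it.
  def ensure_session(%{tool_context: %{session: _}} = context, _start_fun) do
    {:ok, context}
  end

  # No session yet: call `start_fun` and store the result,
  # preserving any other keys already in :tool_context.
  def ensure_session(context, start_fun) do
    with {:ok, session} <- start_fun.() do
      tool_context =
        context
        |> Map.get(:tool_context, %{})
        |> Map.put(:session, session)

      {:ok, Map.put(context, :tool_context, tool_context)}
    end
  end
end
```

Inside the hook this could be called as SessionContext.ensure_session(context, &JidoBrowser.start_session/0).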

Configuration

config :jido_browser,
  adapter: JidoBrowser.Adapters.Vibium,
  timeout: 30_000

# Vibium-specific options
config :jido_browser, :vibium,
  binary_path: "/usr/local/bin/vibium",
  port: 9515

# Web adapter options
config :jido_browser, :web,
  binary_path: "/usr/local/bin/web",
  profile: "default"
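
Binary paths often differ between machines, so it can help to read them from the environment at boot. The fragment below is a sketch for config/runtime.exs that reuses the Vibium keys shown above. The environment variable names VIBIUM_PATH and VIBIUM_PORT are hypothetical, and whether JidoBrowser reads these keys at runtime rather than at compile time is an assumption.

```elixir
# config/runtime.exs
import Config

config :jido_browser, :vibium,
  # Fall back to the defaults from the example above when the variables are unset.
  binary_path: System.get_env("VIBIUM_PATH", "/usr/local/bin/vibium"),
  port: String.to_integer(System.get_env("VIBIUM_PORT", "9515"))
```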

Adapters

Vibium (Default)

  • WebDriver BiDi protocol (standards-based)
  • Automatic Chrome download
  • ~10MB Go binary
  • Built-in MCP server

chrismccord/web

  • Firefox-based via Selenium
  • Built-in HTML to Markdown conversion
  • Phoenix LiveView-aware
  • Session persistence with profiles
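
To illustrate what an adapter pattern like this generally looks like in Elixir, here is a self-contained sketch: a behaviour that declares the contract, a stub implementation, and a facade that picks the adapter from application config. Every name below (BrowserAdapterSketch, InMemoryAdapter, BrowserFacade, the :jido_browser_sketch config key) is hypothetical and is not JidoBrowser's actual adapter contract.

```elixir
defmodule BrowserAdapterSketch do
  @moduledoc "Hypothetical behaviour; not JidoBrowser's actual adapter contract."

  @callback start_session(opts :: keyword()) :: {:ok, term()} | {:error, term()}
  @callback navigate(session :: term(), url :: String.t()) :: {:ok, map()} | {:error, term()}
  @callback end_session(session :: term()) :: :ok
end

defmodule InMemoryAdapter do
  @moduledoc "Stub adapter that records the current URL in a map instead of driving a browser."
  @behaviour BrowserAdapterSketch

  @impl true
  def start_session(_opts), do: {:ok, %{url: nil}}

  @impl true
  def navigate(session, url), do: {:ok, %{session | url: url}}

  @impl true
  def end_session(_session), do: :ok
end

defmodule BrowserFacade do
  # Resolve the adapter from application config, falling back to the stub.
  def adapter, do: Application.get_env(:jido_browser_sketch, :adapter, InMemoryAdapter)

  def start_session(opts \\ []), do: adapter().start_session(opts)
  def navigate(session, url), do: adapter().navigate(session, url)
  def end_session(session), do: adapter().end_session(session)
end
```

Callers depend only on the facade, so swapping backends becomes a configuration change rather than a code change.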

Available Actions

Session Lifecycle

Action        Description
------------  --------------------------------------
StartSession  Start a new browser session
EndSession    End the current session
GetStatus     Get session status (url, title, alive)

Navigation

Action    Description
--------  ----------------------------
Navigate  Navigate to a URL
Back      Go back in browser history
Forward   Go forward in browser history
Reload    Reload current page
GetUrl    Get current page URL
GetTitle  Get current page title

Interaction

Action        Description
------------  --------------------------------
Click         Click an element by CSS selector
Type          Type text into an input element
Hover         Hover over an element
Focus         Focus on an element
Scroll        Scroll page or element
SelectOption  Select option from dropdown

Waiting/Synchronization

Action             Description
-----------------  ---------------------------------------------------
Wait               Wait for specified milliseconds
WaitForSelector    Wait for element (visible/hidden/attached/detached)
WaitForNavigation  Wait for page navigation
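
A fixed Wait always pauses for the full duration, while a condition-based wait such as WaitForSelector can return as soon as the condition holds. The sketch below shows the general polling idea in plain Elixir. It is not how JidoBrowser or its adapters implement waiting; WaitSketch, wait_until/3, and the default timings are hypothetical.

```elixir
defmodule WaitSketch do
  @moduledoc "Hypothetical polling helper; not JidoBrowser's implementation."

  # Poll `check_fun` until it returns a truthy value or `timeout_ms` elapses.
  def wait_until(check_fun, timeout_ms \\ 5_000, interval_ms \\ 100) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    poll(check_fun, deadline, interval_ms)
  end

  defp poll(check_fun, deadline, interval_ms) do
    cond do
      check_fun.() ->
        :ok

      System.monotonic_time(:millisecond) >= deadline ->
        {:error, :timeout}

      true ->
        # Condition not met yet and time remains: sleep, then check again.
        Process.sleep(interval_ms)
        poll(check_fun, deadline, interval_ms)
    end
  end
end
```

The check function could wrap any query that reports whether an element is present or visible.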

Element Queries

Action        Description
------------  --------------------------------
Query         Query elements matching selector
GetText       Get text content of element
GetAttribute  Get attribute value from element
IsVisible     Check if element is visible

Content Extraction

Action          Description
--------------  ----------------------------------------------
Snapshot        Get comprehensive page snapshot (LLM-optimized)
Screenshot      Capture page screenshot
ExtractContent  Extract page content as markdown/HTML

Advanced

Action    Description
--------  ----------------------------
Evaluate  Execute arbitrary JavaScript

Using JidoBrowser.Skill

The recommended way to use JidoBrowser with Jido agents is via the Skill:

defmodule MyBrowsingAgent do
  use Jido.Agent,
    name: "web_browser",
    description: "An agent that can browse the web",
    skills: [{JidoBrowser.Skill, [headless: true]}]
end

The Skill provides:

  • Session lifecycle management
  • 26 browser automation actions
  • Signal routing (browser.* patterns)
  • Error diagnostics with page context

License

Apache-2.0 - See LICENSE for details.
