Commit ea5a751

Merge pull request #33 from dbhurley/feat/ai-discoverability
LGTM. cargo check clean. Tool descriptions updated with accurate 17x stat and action-oriented guidance. AGENTS.md is comprehensive — covers build/test, directory map, how to add tools, description guidelines, selector syntax, and anti-patterns.
2 parents a19a501 + 9acca9a commit ea5a751

2 files changed: +134 -3 lines changed

AGENTS.md

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
# AGENTS.md: Plasmate Codebase Guide

This file is for AI coding agents (Cursor, Devin, Claude Code, Copilot, etc.). It explains what the codebase does, how it is structured, and how to make changes safely.

## What Plasmate is

Plasmate is a headless browser engine that compiles web pages into a **Semantic Object Model (SOM)**, structured JSON optimised for LLM consumption, instead of returning raw HTML. On average the SOM uses 17x fewer tokens than the raw HTML. It needs no API key and no cloud service.

It runs as a CLI, a persistent daemon, an MCP server, and a CDP server.

## Build

```bash
~/.cargo/bin/cargo build            # debug
~/.cargo/bin/cargo build --release  # release
```

Requires Rust stable (1.77+). No system dependencies beyond a C linker.

## Test

```bash
~/.cargo/bin/cargo test          # all tests
~/.cargo/bin/cargo test som::    # SOM tests only
~/.cargo/bin/cargo test mcp::    # MCP tests only
RUST_LOG=debug ~/.cargo/bin/cargo test -- --nocapture  # with logging
```

There are 224+ tests. All must pass before you open a PR.

## Key directories

```
src/
  main.rs           CLI entry point (fetch, compile, diff, mcp, serve, daemon, screenshot)
  mcp/
    mod.rs          MCP server, JSON-RPC router, session manager
    tools.rs        ALL MCP tool definitions + handlers (add new tools here)
    sessions.rs     Persistent browser session state
  som/
    mod.rs          SOM data structures and serialisation
    filter.rs       apply_selector(), shared between CLI and MCP
    compiler.rs     HTML → SOM compiler (the core algorithm)
  js/
    runtime.rs      V8-backed JS execution
    pipeline.rs     Full fetch+JS+compile pipeline
  network/
    fetch.rs        HTTP client (reqwest)
sdk/python/         Python SDK (MCP client)
sdk/node/           Node.js SDK (MCP client)
integrations/       LangChain, LlamaIndex, Browser Use, etc.
packages/           som-parser-python, som-parser-node
```

## How to add an MCP tool

1. Add a `struct YourToolParams` with `#[derive(Deserialize)]` in `src/mcp/tools.rs`.
2. Write `pub fn your_tool_definition() -> ToolDefinition` with a name, description, and `input_schema`.
3. Write the handler `pub async fn handle_your_tool(arguments: &Value, ...) -> Value`.
4. Register both in `src/mcp/mod.rs`: add the definition to `list_tools()` and the handler to the match in `call_tool()`.
5. Add tests in `src/mcp/tools.rs` under `#[cfg(test)]`.

Look at `extract_links_definition()` and `handle_extract_links()` for a clean minimal example.

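The five steps can be sketched with std-only stand-ins. Everything in this block is hypothetical: `ToolDefinition` is reduced to plain strings, arguments arrive as a `&str` rather than a `serde_json::Value`, the handler is synchronous, and `word_count` is an invented tool. The real types and signatures live in `src/mcp/tools.rs` and `src/mcp/mod.rs`, so treat `extract_links_definition()` as the authoritative template.

```rust
// Simplified, std-only stand-ins for illustration; not the real Plasmate types.

/// Stand-in for the real `ToolDefinition` (which carries a JSON input schema).
#[derive(Debug, Clone, PartialEq)]
pub struct ToolDefinition {
    pub name: String,
    pub description: String,
    pub input_schema: String,
}

/// Step 1: parameters for the hypothetical `word_count` tool.
/// The real struct would use `#[derive(Deserialize)]`.
pub struct WordCountParams {
    pub text: String,
}

/// Step 2: the definition that MCP clients see when they list tools.
pub fn word_count_definition() -> ToolDefinition {
    ToolDefinition {
        name: "word_count".to_string(),
        description: "Count whitespace-separated words in `text`. Returns the count as a decimal string."
            .to_string(),
        input_schema: r#"{"type":"object","properties":{"text":{"type":"string"}},"required":["text"]}"#
            .to_string(),
    }
}

/// Step 3: the handler. Failures are returned as values, never as panics.
pub fn handle_word_count(params: &WordCountParams) -> Result<String, String> {
    if params.text.trim().is_empty() {
        return Err("text must not be empty".to_string());
    }
    Ok(params.text.split_whitespace().count().to_string())
}

/// Step 4a: register the definition so clients can discover the tool.
pub fn list_tools() -> Vec<ToolDefinition> {
    vec![word_count_definition()]
}

/// Step 4b: route a call to its handler by tool name.
pub fn call_tool(name: &str, text: &str) -> Result<String, String> {
    match name {
        "word_count" => handle_word_count(&WordCountParams { text: text.to_string() }),
        other => Err(format!("unknown tool: {other}")),
    }
}
```

Step 5 would then be a `#[cfg(test)]` module that calls `call_tool("word_count", ...)` with valid, empty, and unknown-tool inputs.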
## MCP tool description guidelines

Tool descriptions are read by LLMs (Claude, GPT-4, etc.) to decide which tool to call. Write them as action-oriented instructions, not feature lists:

- State WHAT the tool returns, concretely
- State WHEN to use it instead of the alternatives
- Include any token-saving tips (`selector='main'`)
- Avoid vague phrases like "token-efficient" without numbers

## SOM selector syntax

`apply_selector(som, sel)` in `src/som/filter.rs` accepts the following selectors:

| Selector | Matches |
|----------|---------|
| `main` | `<main>` and `role=main` regions |
| `nav` | Navigation regions |
| `header` / `footer` | Header / footer regions |
| `aside` | Sidebar regions |
| `content` | Article / content regions |
| `form` | Form regions |
| `dialog` | Dialog/modal regions |
| `#foo` | Region with id `foo` |

If the selector matches nothing, `apply_selector()` returns the full SOM (graceful fallback).

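The no-match fallback is the part of this contract that is easiest to break, so here is a minimal sketch of it. The `Region` and `Som` types are hypothetical simplifications (the real structures live in `src/som/mod.rs`), and matching is reduced to a role comparison plus `#id` lookup; the real filter also maps selectors such as `content` onto several region kinds.

```rust
// Hypothetical, simplified SOM types for illustration only.

#[derive(Debug, Clone, PartialEq)]
pub struct Region {
    pub role: String,       // e.g. "main", "nav", "footer"
    pub id: Option<String>, // HTML id attribute, if any
    pub text: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Som {
    pub regions: Vec<Region>,
}

/// `#foo` matches a region by id; any other selector matches by role.
fn region_matches(region: &Region, sel: &str) -> bool {
    match sel.strip_prefix('#') {
        Some(id) => region.id.as_deref() == Some(id),
        None => region.role == sel,
    }
}

/// Keep only the regions matching `sel`. If nothing matches, return the
/// full SOM unchanged instead of an empty result or a panic.
pub fn apply_selector(som: &Som, sel: &str) -> Som {
    let kept: Vec<Region> = som
        .regions
        .iter()
        .filter(|r| region_matches(r, sel))
        .cloned()
        .collect();
    if kept.is_empty() {
        som.clone() // graceful fallback
    } else {
        Som { regions: kept }
    }
}
```

Returning the full SOM on a miss means a caller that passes a selector the page does not support still gets usable content, at the cost of more tokens.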
## Python SDK

Located in `sdk/python/`. Run tests with:

```bash
cd sdk/python && PYTHONPATH=src python3 -m pytest tests/ -v
```

The key helper to know is `_extract_last_json(text)` in `client.py`, a hardened JSON parser used by both the sync and async `_call_tool`. It handles mixed output (progress lines before the JSON, JSON embedded in log messages).

## Common patterns

**Error responses (Rust MCP handlers):**
```rust
return error_response("descriptive message here");
```

**Returning SOM as MCP content:**
```rust
return tool_response(serde_json::to_string(&result).unwrap_or_default());
```

**Applying selector before responding:**
```rust
let effective_som = if let Some(ref sel) = params.selector {
    crate::som::filter::apply_selector(&page_result.som, sel)
} else {
    page_result.som.clone()
};
```

## What NOT to do

- Do not call `reqwest::blocking` from inside a V8 callback or a Tokio async context. Use `std::thread::spawn` + `mpsc::channel` to escape (see PR #27 for the pattern).
- Do not add `unwrap()` on network operations. Always handle errors and return `error_response()`.
- Do not break the `apply_selector()` contract: it must return the full SOM on no-match and never panic.
- Do not change the `--format` or `--selector` CLI flags without updating both `main.rs` and `src/mcp/tools.rs`.

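The `std::thread::spawn` + `mpsc::channel` escape named in the first rule can be sketched as follows. This is not the code from PR #27, only a generic std-only illustration; `run_on_plain_thread` is a hypothetical helper name, and in the real codebase the closure would wrap the `reqwest::blocking` call.

```rust
use std::sync::mpsc;
use std::thread;

/// Run `work` on a fresh OS thread and wait for its result over a channel.
/// Returns Err if the worker thread ended without sending a value
/// (for example because `work` panicked).
pub fn run_on_plain_thread<T, F>(work: F) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the receiver is already gone there is nobody to report to,
        // so a failed send is deliberately ignored.
        let _ = tx.send(work());
    });
    // recv() fails only when the sender is dropped without sending.
    rx.recv().map_err(|e| format!("worker thread failed: {e}"))
}
```

The caller still waits on `recv()`, so this moves the blocking work onto a thread that is outside the async runtime rather than making it non-blocking. Mapping the failure to an `Err` also keeps the handler in line with the rule against `unwrap()` on network operations.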
## CI

GitHub Actions runs `cargo test` and `cargo clippy` on every PR. Both must pass.

src/mcp/tools.rs

Lines changed: 3 additions & 3 deletions
@@ -67,7 +67,7 @@ struct ExtractTextParams {
 pub fn fetch_page_definition() -> ToolDefinition {
     ToolDefinition {
         name: "fetch_page".to_string(),
-        description: "Fetch a web page and return its Semantic Object Model (SOM) - a structured, token-efficient representation of the page content. Use this instead of raw HTML fetching for 10x token savings.".to_string(),
+        description: "Fetch a web page and return its Semantic Object Model (SOM) - structured JSON with typed regions, interactive elements with stable IDs, and clean text content. Averages 17x fewer tokens than raw HTML (up to 117x on complex pages). Prefer this over raw HTTP fetches for any web content in agent pipelines. Add selector='main' to strip nav/footer and reduce tokens further.".to_string(),
         input_schema: json!({
             "type": "object",
             "properties": {
@@ -97,7 +97,7 @@ pub fn fetch_page_definition() -> ToolDefinition {
 pub fn extract_text_definition() -> ToolDefinition {
     ToolDefinition {
         name: "extract_text".to_string(),
-        description: "Fetch a web page and return only the clean, readable text content. No HTML, no structure - just the text a human would read.".to_string(),
+        description: "Fetch a web page and return only the clean, readable text - no markup, no structure, no element IDs. Use this (instead of fetch_page) when you only need the written content and do not need to interact with the page or reference specific elements.".to_string(),
         input_schema: json!({
             "type": "object",
             "properties": {
@@ -682,7 +682,7 @@ struct ClosePageParams {
 pub fn open_page_definition() -> ToolDefinition {
     ToolDefinition {
         name: "open_page".to_string(),
-        description: "Open a web page in a persistent browser session. Returns a session ID and the initial SOM. Use with click, type, and evaluate for multi-step interactions.".to_string(),
+        description: "Open a URL in a persistent browser session. Returns a session_id and the initial SOM. Use this (instead of fetch_page) when you need to interact with the page - click buttons, fill forms, navigate, or run JavaScript. Pair with click, type_text, navigate_to, and evaluate.".to_string(),
         input_schema: json!({
             "type": "object",
             "properties": {
