Commit f3b96eb

feat(walkthrough): dedicated Step 5 for semantic search
User wanted full vs search trade-off as its own onboarding step, not a subsection inside Step 4.

New Step 5 'Semantic search — opt-in for large knowledge bases' spells out:
- Full mode = default, zero setup
- Search mode = catalog-only + axme_search_kb + ~770 MB one-time
- Recommendation table by KB size
- Enable now button OR decide later (just skip)

Completion: onCommand:axme.enableSemanticSearch.

firstChat.md's old semantic-search subsection collapsed to a one-liner pointing to Step 5.
1 parent a0f21ac commit f3b96eb

3 files changed

Lines changed: 72 additions & 17 deletions

extension/package.json

Lines changed: 9 additions & 0 deletions
@@ -128,6 +128,15 @@
       "completionEvents": [
         "onView:axme.monitor"
       ]
+    },
+    {
+      "id": "axme.step.semanticSearch",
+      "title": "Semantic search — opt-in for large knowledge bases",
+      "description": "AXME runs in **full mode** by default — every memory + decision body is loaded into the agent at session start. Zero setup, simple.\n\nFor larger KBs (>50 entries, or decisions with long rationale): **semantic search mode** loads only the catalog at startup, agent fetches bodies on demand via smart similarity search. Saves significant tokens.\n\nThe trade-off: a one-time ~770 MB download (`@huggingface/transformers` runtime). It's reversible any time.\n\n[Enable semantic search now](command:axme.enableSemanticSearch)\n\nLeave it for later? Just skip this step — full mode keeps working. You can enable from the sidebar (Knowledge base → Search mode) or `AXME: Enable semantic search` whenever.",
+      "media": { "markdown": "walkthroughs/semanticSearch.md" },
+      "completionEvents": [
+        "onCommand:axme.enableSemanticSearch"
+      ]
     }
   ]
 }
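
The new step relies on its `command:` link and its `onCommand:` completion event naming the same command, otherwise clicking the button would never mark the step complete. A minimal sketch of a check for that pairing follows. The `WalkthroughStep` interface and both helper functions are hypothetical names introduced for illustration, not part of the extension, and the description string is shortened.

```typescript
// Hypothetical shape covering only the step fields that appear in the diff above.
interface WalkthroughStep {
  id: string;
  title: string;
  description: string;
  media?: { markdown: string };
  completionEvents?: string[];
}

// Collect command IDs referenced as markdown links of the form [label](command:some.id).
function commandLinks(description: string): string[] {
  const pattern = /\]\(command:([A-Za-z0-9_.-]+)/g;
  const ids: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(description)) !== null) {
    ids.push(m[1]);
  }
  return ids;
}

// Return linked commands that have no matching "onCommand:<id>" completion event.
function uncoveredCommands(step: WalkthroughStep): string[] {
  const events = new Set(step.completionEvents ?? []);
  return commandLinks(step.description).filter(
    (id) => !events.has(`onCommand:${id}`)
  );
}

// The step from this commit, with the description abbreviated.
const step: WalkthroughStep = {
  id: "axme.step.semanticSearch",
  title: "Semantic search — opt-in for large knowledge bases",
  description:
    "[Enable semantic search now](command:axme.enableSemanticSearch)",
  media: { markdown: "walkthroughs/semanticSearch.md" },
  completionEvents: ["onCommand:axme.enableSemanticSearch"],
};

console.log(uncoveredCommands(step));
```

An empty result means every button in the description has a completion event behind it. A check like this could run in a unit test over `package.json`, assuming the steps are read from `contributes.walkthroughs`.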

extension/walkthroughs/firstChat.md

Lines changed: 4 additions & 17 deletions
@@ -55,24 +55,11 @@ project-specific rules or remove ones you don't need.
 background audits, last audit failed, or a recent handoff). Otherwise
 hidden.
 
-## Semantic search (opt-in)
+## Semantic search
 
-By default, AXME loads every memory + decision body into the agent's context
-at session start (**full mode**). Works great until your knowledge base grows
-past ~50 entries — then context bloat becomes a problem.
-
-**Semantic search mode** loads only the catalog (slug + title + 1-line
-description) at startup and exposes `axme_search_kb` so the agent fetches
-relevant bodies on demand. Saves significant tokens on large KBs.
-
-Enable from the sidebar's **Knowledge base** section (`Search mode: full →
-[Enable]` button) or via `AXME: Enable semantic search` command. The first
-enable downloads `@huggingface/transformers` (~770 MB) into
-`~/.local/share/axme-code/runtime/` and indexes every existing memory +
-decision. Subsequent re-enables are instant.
-
-Disable any time with the sidebar toggle or `AXME: Disable semantic search`.
-The runtime and the embeddings index stay on disk — re-enabling is fast.
+If your knowledge base grows past ~50 entries, switching from full mode to
+semantic search mode saves significant tokens. See **Step 5: Semantic
+search** in this walkthrough for the trade-offs and one-click enable.
 
 ## Power-user palette commands
 
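
The one-liner above compresses the guidance to a single ~50-entry threshold, while the new Step 5 page in this commit splits it into three size bands plus a long-rationale case. As a rough sketch, that guidance can be written as a small function. The function name, the `Recommendation` type, and the `hasLongRationales` flag are hypothetical; the commit does not say how "long rationale" would be measured.

```typescript
// Hypothetical encoding of the Step 5 recommendation table; not code from the extension.
type Recommendation = "full" | "either" | "search";

function recommendMode(
  entryCount: number,
  hasLongRationales: boolean
): Recommendation {
  // Long decision rationales bloat full-mode context fastest, so they win outright.
  if (hasLongRationales) return "search";
  // Under 30 entries: the ~770 MB install and indexing are not worth it.
  if (entryCount < 30) return "full";
  // 30 to 50 entries: either mode works.
  if (entryCount <= 50) return "either";
  // Over 50 entries: token savings become significant.
  return "search";
}

console.log(recommendMode(12, false));
console.log(recommendMode(40, false));
console.log(recommendMode(80, false));
```

Treating the boundary values 30 and 50 as "either" is one reading of the table's "Under 30" and "Over 50" wording.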

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+# Semantic search — opt-in for large knowledge bases
+
+AXME has two modes for loading the knowledge base at session start.
+
+## Full mode (default — works out of the box)
+
+Every memory + decision body is loaded into the agent's context.
+
+- **Zero setup** — works immediately after `axme-code setup`
+- **Simple** — agent sees everything at startup, no extra tool call
+- **Best for small / medium KBs** (under ~50 entries)
+- ⚠️ **Context bloat on large KBs** — long decision rationales eat tokens
+  fast on Cursor's per-turn budget
+
+## Semantic search mode (opt-in)
+
+Loads only the **catalog** (titles + 1-line descriptions) at startup. The
+agent fetches full bodies on demand via `axme_search_kb` — semantic
+similarity search across memories and decisions.
+
+- **Major token savings** on large KBs (>50 entries, especially decisions
+  with long reasoning blocks)
+- **Smart fuzzy search** — "how did we handle auth?" finds relevant
+  entries by meaning, not by keyword match. The model
+  (`@huggingface/transformers` MiniLM) embeds each entry once and
+  compares vector distance to your query.
+- ⚠️ **One-time install**: `@huggingface/transformers@^4.0.1` lands in
+  `~/.local/share/axme-code/runtime/` — about **770 MB on Linux**
+  (smaller on macOS / Windows; the bulk is `onnxruntime-node` platform
+  prebuilts).
+- ⚠️ **Initial indexing** takes a few seconds (typical KB) to a couple
+  minutes (very large KB).
+- **Live re-embedding** — once enabled, every new save via
+  `axme_save_memory` / `axme_save_decision` auto-updates the index.
+
+## When to enable
+
+| KB size | Recommendation |
+|---|---|
+| Under 30 entries | Stick with full mode. The extra ~770 MB and indexing aren't worth it. |
+| 30–50 entries | Either works. Semantic search starts saving tokens; full still convenient. |
+| Over 50 entries | **Enable.** Token savings become significant. |
+| Decisions with long rationale bodies | Enable — full mode bloats context fastest here. |
+
+## Enable now or later — it's a non-irreversible decision
+
+You can switch any time:
+
+- **Sidebar**: Knowledge base section → `Search mode: full` row → click `Enable`
+- **Command Palette**: `AXME: Enable semantic search`
+- **CLI** (if you prefer terminal): `axme-code config set context.mode search`
+
+To switch back: same surfaces, `Disable` button or `AXME: Disable semantic
+search`. The runtime and the embeddings index stay on disk — re-enabling
+is instant after the first install.
+
+**This walkthrough step auto-completes when you click Enable.** If you
+choose to stay in full mode for now, just skip the step — everything still
+works.
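
The page above describes search mode as embedding each entry once and comparing vector distance to the query. The sketch below illustrates that idea with cosine similarity over hand-written three-dimensional vectors. It is not the implementation of `axme_search_kb`: the `KbEntry` shape, the `searchKb` and `catalog` helpers, and the toy embeddings are all assumptions, and a real setup would get its vectors from the MiniLM model rather than from literals.

```typescript
// Hypothetical entry shape; the real KB schema is not shown in this commit.
interface KbEntry {
  slug: string;
  title: string;
  body: string;
  embedding: number[]; // computed once per entry in a real system
}

// Cosine similarity: 1 for same direction, 0 for orthogonal or zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// What search mode would load at startup: titles only, no bodies.
function catalog(entries: KbEntry[]): { slug: string; title: string }[] {
  return entries.map(({ slug, title }) => ({ slug, title }));
}

// Rank entries by similarity to the query vector and keep the best topK.
function searchKb(query: number[], entries: KbEntry[], topK: number): KbEntry[] {
  return entries
    .map((entry) => ({ entry, score: cosineSimilarity(query, entry.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((scored) => scored.entry);
}

// Toy data: the first axis loosely stands for "auth", the third for "database".
const entries: KbEntry[] = [
  { slug: "auth-tokens", title: "Use short-lived auth tokens", body: "Full rationale text", embedding: [0.9, 0.1, 0.0] },
  { slug: "db-migrations", title: "Run migrations in CI", body: "Full rationale text", embedding: [0.0, 0.2, 0.9] },
  { slug: "login-flow", title: "Login flow redirects", body: "Full rationale text", embedding: [0.8, 0.3, 0.1] },
];

// Stand-in for the embedding of "how did we handle auth?".
const queryEmbedding = [1, 0.2, 0];
const hits = searchKb(queryEmbedding, entries, 2);
console.log(hits.map((h) => h.slug));
```

With these toy vectors the two auth-related entries rank above the database entry, which is the behavior the "by meaning, not by keyword match" bullet describes. Only the catalog would sit in context at startup; bodies are returned when a search asks for them.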
