
feat: NouseAgent — single-model sub-agent architecture #19

Open
base76-research-lab wants to merge 1 commit into main from feature/single-model-agent-18


@base76-research-lab
Owner

What this does

Replaces the fragmented multi-model stack with a single capable local model using role-specific system prompts.

Before: deepseek-r1:1.5b (extraction, timing out) + external APIs (Groq, Cerebras, OpenRouter) + ad-hoc httpx calls in domain_bootstrap

After: gemma4:e2b handles every workload via NouseAgent — one model, six roles, no external dependencies

Changes

src/nouse/llm/agent.py — NouseAgent

from nouse.llm.agent import NouseAgent

agent = NouseAgent("gemma4:e2b")

relations = await agent.extract(text)           # extractor role
answer    = await agent.synthesize(query, ctx)  # synthesizer role
questions = await agent.curiosity(topic)        # curiosity role
facts     = await agent.bootstrap(topic)        # bootstrap role
verdict   = await agent.validate(claim)         # validator role
reply     = await agent.chat(message, context)  # chat role

Each role uses the same model with a different system prompt, so there is no routing layer to maintain and quality stays consistent across workloads. The pattern is sketched below.
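A minimal sketch of that pattern, assuming the official ollama Python client; the prompt texts, ROLE_PROMPTS, and _ask are illustrative placeholders, not the actual contents of src/nouse/llm/agent.py:

from ollama import AsyncClient

ROLE_PROMPTS = {
    "extractor":   "Extract (subject, relation, object) triples from the text.",
    "synthesizer": "Answer the query using only the supplied graph context.",
    "curiosity":   "Propose questions that would extend the knowledge graph.",
    "bootstrap":   "List core factual relations for the given topic.",
    "validator":   "Judge whether the claim is supported and explain briefly.",
    "chat":        "Converse with the user, grounded in the injected context.",
}

class NouseAgent:
    def __init__(self, model: str):
        self.model = model
        self.client = AsyncClient()

    async def _ask(self, role: str, content: str) -> str:
        # One model for every role; only the system prompt changes.
        response = await self.client.chat(
            model=self.model,
            messages=[
                {"role": "system", "content": ROLE_PROMPTS[role]},
                {"role": "user", "content": content},
            ],
        )
        return response["message"]["content"]

    async def extract(self, text: str) -> str:
        return await self._ask("extractor", text)
    # synthesize, curiosity, bootstrap, validate, chat follow the same shape.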

src/nouse/cli/commands/run.py + nouse run command

nouse run                              # interactive REPL with graph R/W
nouse run "What is epistemic memory?"  # single query
nouse run --model gemma4:26b           # upgrade model

Flow: user query → graph.query() → inject context → NouseAgent.chat() → brain.learn()
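A sketch of that loop under assumed signatures; graph.query() and brain.learn() are named in the flow above, but their exact arguments are guesses:

async def repl(agent, graph, brain):
    # Interactive REPL: read a query, ground it in the graph, learn from the reply.
    while True:
        query = input("nouse> ")
        if query in {"exit", "quit"}:
            break
        context = graph.query(query)               # read: retrieve relevant nodes
        reply = await agent.chat(query, context)   # chat role with injected context
        brain.learn(reply)                         # write: store what was said
        print(reply)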

examples/Modelfile

ollama create NoUse -f examples/Modelfile
nouse run --model NoUse

Creates a named Ollama model with NoUse's epistemic identity baked into the system prompt. Anyone with the Modelfile can reproduce the model with the same ollama create NoUse command.
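A plausible shape for examples/Modelfile; FROM and SYSTEM are standard Ollama Modelfile directives, but the prompt text here is a placeholder, not the shipped identity prompt:

# Assumed sketch, not the actual examples/Modelfile
FROM gemma4:e2b
SYSTEM """You are NoUse, an epistemic memory agent. Ground every answer in
your knowledge graph and say plainly when you do not know."""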

inject.py — domain_bootstrap refactored

domain_bootstrap() now uses NouseAgent.bootstrap() directly instead of raw httpx calls with hardcoded deepseek-r1:1.5b.
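The shape of the change, with the old httpx call reconstructed from the description above (exact signatures are assumptions):

# Before: ad-hoc HTTP call against the Ollama API with a hardcoded model
#   async with httpx.AsyncClient() as client:
#       resp = await client.post(
#           "http://localhost:11434/api/generate",
#           json={"model": "deepseek-r1:1.5b", "prompt": prompt},
#       )

# After: one call through the shared agent singleton
from nouse.llm.agent import get_agent

async def domain_bootstrap(topic: str):
    agent = get_agent()                   # module-level NouseAgent singleton
    return await agent.bootstrap(topic)   # bootstrap role seeds the graph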

Tested

NouseAgent("gemma4:e2b")
  extract:    3 relations from 2-sentence text ✓
  bootstrap:  10 relations for "Hebbian plasticity" ✓
  curiosity:  3 high-quality questions for "epistemic memory" ✓
  synthesize: grounded answer from graph nodes ✓

Minimum hardware

Component              Model               Size
Reasoning (all roles)  gemma4:e2b          7.2 GB
Embeddings             qwen3-embedding:4b  3.9 GB

Fits on a machine with 8 GB of VRAM (the two models need not be resident simultaneously). No cloud API keys needed.

Closes #18

🤖 Generated with Claude Code

Commit message:

One capable local model replaces the multi-model stack.
gemma4:e2b handles all NoUse workloads via role-specific system prompts:
extractor, synthesizer, curiosity, bootstrap, validator, chat.

Changes:
- src/nouse/llm/agent.py: NouseAgent class + module singleton get_agent()
- src/nouse/cli/commands/run.py: nouse run — interactive REPL with graph R/W
- src/nouse/cli/main.py: register 'nouse run' command
- src/nouse/inject.py: domain_bootstrap uses NouseAgent instead of raw httpx
- examples/Modelfile: build 'ollama create NoUse' from gemma4:e2b

Minimum stack: gemma4:e2b (7.2 GB) + qwen3-embedding:4b (3.9 GB).
No external API dependencies required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>