# Feature Request

## Problem
MetaClaw currently operates as a single global instance with one shared skill library, one operating mode, one RL training pipeline, and one memory scope. This does not work well for multi-agent setups where different agents have fundamentally different roles and tasks.
For example, in an OpenClaw deployment with multiple agents:
- Agent A (research assistant) should accumulate academic skills and train a LoRA tuned for research tasks
- Agent B (daily assistant) should accumulate productivity skills and may only need `skills_only` mode
- Agent C (code reviewer) should have code-review-specific skills and its own RL policy
Currently, all agents share the same `~/.metaclaw/skills/` directory, the same mode, the same RL buffer, and the same LoRA checkpoint. Skills learned from one agent's conversations pollute another's prompt injections, and RL training mixes trajectories from incompatible task distributions.

## Proposed Solution
Add per-agent isolation across five dimensions:

### 1. Agent Identity Propagation
The OpenClaw plugin (`index.ts`) should pass an `X-Agent-Id` header alongside the existing `X-Session-Id` and `X-Turn-Type`. The plugin's `before_prompt_build` callback has access to the session context, which should contain the agent identifier.

### 2. Per-Agent Skill Directories
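On the server side, the corresponding resolution step could look like the sketch below. The header name is the one proposed here; `resolve_agent_id` is a hypothetical helper, not an existing MetaClaw API, and the validation regex is an assumption to keep agent ids safe for use as directory names.

```python
import re

DEFAULT_AGENT = "_default"
# Assumption: agent ids double as directory names, so restrict them to a
# filesystem-safe alphabet (no path separators or traversal).
_SAFE_ID = re.compile(r"^[A-Za-z0-9._-]+$")

def resolve_agent_id(headers: dict[str, str]) -> str:
    """Return the id from X-Agent-Id, or "_default" if absent or invalid."""
    agent_id = headers.get("X-Agent-Id", "").strip()
    if agent_id and _SAFE_ID.match(agent_id):
        return agent_id
    return DEFAULT_AGENT
```

Rejecting malformed ids (rather than erroring) keeps the fallback path identical to today's single-agent behavior.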
```
~/.metaclaw/skills/
├── _shared/    # Skills shared across all agents (manually curated)
├── agent-a/    # Auto-evolved skills for Agent A
├── agent-b/    # Auto-evolved skills for Agent B
└── agent-c/    # Auto-evolved skills for Agent C
```

- `SkillManager` loads `_shared/` + `{agent_id}/` for each request
- `SkillEvolver` writes new skills to the `{agent_id}/` subdirectory
- Fallback: if no `X-Agent-Id` is provided, use a `_default/` directory (backward compatible)
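A minimal sketch of the loading rule, assuming skills are markdown files and that an agent-specific skill should shadow a shared skill with the same filename (that override rule, and the function names, are assumptions of this sketch, not existing `SkillManager` behavior):

```python
from pathlib import Path

def skill_dirs(root: Path, agent_id: str) -> list[Path]:
    """Directories to scan, in precedence order: shared skills first,
    then the agent's auto-evolved skills."""
    return [d for d in (root / "_shared", root / agent_id) if d.is_dir()]

def load_skills(root: Path, agent_id: str) -> dict[str, str]:
    """Read every *.md skill file; because the agent directory is scanned
    last, its skills override shared ones with the same filename."""
    skills: dict[str, str] = {}
    for d in skill_dirs(root, agent_id):
        for f in sorted(d.glob("*.md")):
            skills[f.name] = f.read_text()
    return skills
```

Scanning `_shared/` first means the per-agent directory always wins on name collisions, which gives agents a clean way to specialize a shared skill.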
### 3. Per-Agent Mode Configuration

```yaml
# ~/.metaclaw/config.yaml
agents:
  agent-a:
    mode: rl
  agent-b:
    mode: skills_only
  agent-c:
    mode: rl
  _default:
    mode: skills_only  # fallback for unknown agents
```
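The config section above could map onto a small dataclass in `config.py`. This is a sketch; the `AgentConfig` name matches the estimate table below, but its fields and the lookup helper are assumptions.

```python
from dataclasses import dataclass

@dataclass
class AgentConfig:
    mode: str = "skills_only"  # "skills_only" or "rl"

def agent_config(agents: dict[str, dict], agent_id: str) -> AgentConfig:
    """Look up an agent's config, falling back first to the _default
    entry, then to library defaults, so an absent or empty `agents:`
    section stays valid."""
    raw = agents.get(agent_id) or agents.get("_default") or {}
    return AgentConfig(mode=raw.get("mode", "skills_only"))
```

The double fallback (`_default` entry, then the dataclass default) is what makes the `agents` section optional for backward compatibility.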
The API server routes each request based on the agent's configured mode:

- `skills_only` agents: skill injection only, no RL data collection
- `rl` agents: skill injection + conversation samples added to the RL buffer

### 4. Per-Agent LoRA Training
This is critical for RL mode. If multiple agents use RL, they must train separate LoRA checkpoints, because:
- Different agents have different task distributions (research vs. chat vs. code review)
- Mixing trajectories from incompatible tasks degrades policy quality for all agents
- Each agent's PRM scoring context is different
Proposed structure:

```yaml
agents:
  agent-a:
    mode: rl
    rl:
      model: "base-model-id"  # base model for this agent
      lora_output: "~/.metaclaw/lora/agent-a/"
      resume_from_ckpt: ""
  agent-c:
    mode: rl
    rl:
      model: "base-model-id"
      lora_output: "~/.metaclaw/lora/agent-c/"
```
Each agent gets:
- Its own RL sample buffer (filtered by `agent_id`)
- Its own LoRA checkpoint directory
- Its own training schedule (`batch_size` threshold per agent)
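Once samples carry an `agent_id` tag (the `data_formatter.py` change in the table below), buffer filtering reduces to a group-by. This sketch assumes samples are dicts with an `agent_id` key; the sample shape and function name are illustrative.

```python
def ready_batches(buffer: list[dict], batch_size: int) -> dict[str, list[dict]]:
    """Group buffered RL samples by their agent_id tag and return only
    the agents whose buffers have reached the training threshold."""
    by_agent: dict[str, list[dict]] = {}
    for sample in buffer:
        by_agent.setdefault(sample.get("agent_id", "_default"), []).append(sample)
    return {aid: s for aid, s in by_agent.items() if len(s) >= batch_size}
```

Keeping one shared buffer with a tag, rather than one buffer per agent, means untagged legacy samples simply fall into `_default` and nothing breaks.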
### 5. Per-Agent Memory Scope (already partially supported)

MetaClaw's memory layer already has `user_id` and `workspace_id` support via headers. The `X-Agent-Id` can be used as an additional scope dimension to ensure memory isolation, using the existing `derive_memory_scope()` infrastructure.
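If `derive_memory_scope()` composes a scope key from `user_id` and `workspace_id` today, extending it might look like the following. This is purely illustrative: the real function's signature and key format are not known, only that the user/workspace scoping exists.

```python
def derive_memory_scope(user_id: str, workspace_id: str,
                        agent_id: str = "_default") -> str:
    """Compose a scope key; appending agent_id keeps each agent's
    memories isolated while preserving user/workspace scoping, and the
    _default value keeps old keys stable for single-agent setups."""
    return f"{user_id}/{workspace_id}/{agent_id}"
```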
## Backward Compatibility

- If no `X-Agent-Id` header is present, all behavior falls back to `_default`, identical to current single-agent behavior
- The existing `~/.metaclaw/skills/` flat structure continues to work as the `_default` agent's skill directory
- No config migration needed; the `agents` config section is optional
## Estimated Changes

| File | Change |
|------|--------|
| `index.ts` (OpenClaw plugin) | Add `X-Agent-Id` header from session context (~5 lines) |
| `config.py` | Add `AgentConfig` dataclass and `agents` config section (~30 lines) |
| `api_server.py` | Read `X-Agent-Id`, route to per-agent skill dir and mode (~40 lines) |
| `skill_manager.py` | Support `agent_id` parameter for directory routing (~30 lines) |
| `skill_evolver.py` | Write skills to per-agent subdirectory (~10 lines) |
| `data_formatter.py` | Tag samples with `agent_id` for RL buffer filtering (~10 lines) |
| **Total** | ~125 lines of production code |
## Environment
- MetaClaw version: 0.4.1
- OpenClaw with 6 configured agents
- Use case: multi-agent deployment with heterogeneous task profiles