16 changes: 15 additions & 1 deletion README.md
@@ -3,4 +3,18 @@
Plugin-only repo for OpenClaw Braintrust experimentation.

Contents:
- `extensions/braintrust/` — Braintrust plugin scaffold (config, command, quorum policy tests)
- `extensions/braintrust/` — Braintrust plugin with persistent state, quorum policy, and runtime worker orchestration

## Live Feature Validation

Run the reproducible validation script:

```bash
./scripts/validate-braintrust.sh
```

What it verifies:
- `/braintrust on` and `/braintrust status` command flow
- runtime bridge creates one durable Braintrust session and returns exactly one synthesized final output
- if quorum cannot be met, the plugin emits an explicit `Braintrust temporarily unavailable (...)` notice
- worker/session state is persisted to `.braintrust/state.json`
59 changes: 41 additions & 18 deletions extensions/braintrust/README.md
@@ -1,27 +1,50 @@
# Braintrust Plugin (MVP Control Plane)
# Braintrust Plugin

Feature-flagged plugin path for multi-agent orchestration in OpenClaw.
Feature-flagged Braintrust plugin for OpenClaw with persistent control state and orchestrator-managed worker lifecycle.

## Current behavior (implemented)
- `/braintrust on|off|status|unavailable`
- Configurable team size / strategy / model roles
- Quorum contract configuration (`minParticipatingAgents`, `minAnsweringAgents`)
- Prompt injection with explicit quorum policy + unavailable contract
- Logging hooks for `llm_input` and `llm_output`
- Deterministic quorum policy helpers + unit tests
## What it now does
- Chat/auto-reply surfaces: `/braintrust on|off|status|unavailable|sessions`
- Local CLI surface: `openclaw braintrust [on|off|status|unavailable|sessions]`
- Persists enabled/disabled state plus recent session history to `extensions/braintrust/.braintrust/state.json`
- Runs true plugin-owned fan-out/fan-in orchestration when a runtime executor is available
- Tracks per-worker lifecycle (`pending -> running -> completed/refused/timeout/error`) for each Braintrust session
- Enforces quorum before allowing a synthesized final answer through
- Falls back to policy-only prompt injection when the runtime hook cannot execute workers directly
- Logs `llm_input` / `llm_output` activity for observability
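
The per-worker lifecycle in the list above can be read as a small state machine. The sketch below is illustrative only: the status names come from this README, while `advanceWorker`, the transition table, and the worker object shape are hypothetical and are not taken from `src/orchestrator.js`.

```javascript
// Hypothetical transition table. Only the status names are from the README.
const TERMINAL = new Set(["completed", "refused", "timeout", "error"]);
const NEXT = {
  pending: new Set(["running"]),
  running: TERMINAL,
};

// Returns a copy of the worker in the new status, or throws if the
// transition is not allowed (for example, leaving a terminal status).
function advanceWorker(worker, nextStatus) {
  const allowed = NEXT[worker.status];
  if (!allowed || !allowed.has(nextStatus)) {
    throw new Error(`invalid transition ${worker.status} -> ${nextStatus}`);
  }
  return { ...worker, status: nextStatus };
}

const started = advanceWorker({ id: "w1", role: "solver", status: "pending" }, "running");
const finished = advanceWorker(started, "completed");
console.log(finished.status); // completed
```

Treating the four end states as terminal keeps a late or duplicate result from overwriting a status that has already been recorded.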

## Important limitation (still pending)
This plugin **does not yet perform true parallel fan-out/fan-in orchestration** by itself.
It sets control policy and prompt/lifecycle behavior; runtime fan-out wiring remains a core integration task.
## Runtime behavior
When enabled and a runtime executor is present, Braintrust now:
1. Creates a durable session record
2. Starts up to 4 workers with distinct roles
3. Resolves each worker model at runtime
4. Updates worker state as each candidate finishes
5. Evaluates quorum
6. Persists the final synthesized answer or explicit unavailable notice

## Quorum contract
- Minimum participating agents: default `2`
- Minimum answering agents: default `2`
- If quorum fails, return an explicit unavailable notice instead of presenting the output as a panel result.
Recent sessions can be inspected with `/braintrust sessions`.
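
The numbered runtime steps above amount to a fan-out, a per-worker status update, and a quorum check. The following is a minimal sketch under stated assumptions, not the orchestrator's actual code: `runWithTimeout`, `evaluateQuorum`, and `runPanel` are hypothetical names, and counting "participating" as completed plus refused is an assumption about the policy.

```javascript
// Hypothetical helper: rejects with Error("timeout") if the promise is too slow.
function runWithTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("timeout")), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Assumption: a worker "participates" if it answered or refused,
// and "answers" only if it completed without refusing.
function evaluateQuorum(workers, { minParticipatingAgents = 2, minAnsweringAgents = 2 } = {}) {
  const answering = workers.filter((w) => w.status === "completed").length;
  const refused = workers.filter((w) => w.status === "refused").length;
  const participating = answering + refused;
  return {
    participating,
    answering,
    meetsQuorum: participating >= minParticipatingAgents && answering >= minAnsweringAgents,
  };
}

// Fan-out: start one worker per role. Fan-in: wait for all to settle,
// map each outcome to a lifecycle status, then evaluate quorum.
async function runPanel(prompt, roles, execute, options = {}) {
  const { timeoutMs = 90_000 } = options;
  const settled = await Promise.allSettled(
    roles.map((role) =>
      runWithTimeout(Promise.resolve().then(() => execute({ role, prompt })), timeoutMs)
    )
  );
  const workers = settled.map((result, i) => {
    if (result.status === "fulfilled") {
      const status = result.value.refusal ? "refused" : "completed";
      return { role: roles[i], status, text: result.value.text };
    }
    const status = result.reason?.message === "timeout" ? "timeout" : "error";
    return { role: roles[i], status };
  });
  return { workers, quorum: evaluateQuorum(workers, options) };
}

// Usage with a stand-in executor (a real one would call the host runtime).
const mockExecute = async ({ role }) => ({ text: `answer from ${role}`, refusal: false });
runPanel("What is 2 + 2?", ["solver", "critic"], mockExecute, { timeoutMs: 1000 })
  .then(({ quorum }) => console.log(quorum.meetsQuorum));
```

`Promise.allSettled` is used so that one failing or slow worker does not abort the others, which is what allows quorum to be evaluated over partial results.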

See `src/policy.ts` and `src/policy.test.ts`.
## Model routing
- No model/provider defaults are baked in.
- Empty role settings mean "inherit the active chat model at runtime."
- `synthModel` is retained only for backward-compatible parsing; synthesis inherits the active chat model at runtime.
- If a role-specific model is set explicitly, Braintrust uses that configured value for that role.
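
The routing rules above can be expressed as one small function. This is a sketch, assuming the role-to-setting mapping shown in the plugin's fallback prompt (`solver` uses `model`, `critic` uses `criticModel`, `researcher` uses `researcherModel`); `resolveRoleModel` itself is a hypothetical name.

```javascript
// Mapping of role name to the settings key that may override its model.
const ROLE_SETTING = { solver: "model", critic: "criticModel", researcher: "researcherModel" };

// Hypothetical helper: returns the model a role should use at runtime.
function resolveRoleModel(role, settings, activeChatModel) {
  // Synthesis always inherits the active chat model; synthModel is parsed but not used.
  if (role === "synthesizer") return activeChatModel;
  const configured = (settings[ROLE_SETTING[role]] ?? "").trim();
  // An empty setting means "inherit the active chat model at runtime".
  return configured || activeChatModel;
}

console.log(resolveRoleModel("critic", { criticModel: "example-model" }, "chat-model")); // example-model
console.log(resolveRoleModel("solver", { model: "" }, "chat-model")); // chat-model
```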

## Durable state
Braintrust stores durable local state in:

```text
extensions/braintrust/.braintrust/state.json
```

That file keeps:
- whether Braintrust is currently enabled
- recent sessions and worker statuses
- final result/unavailable reason for each session
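
For orientation, one possible shape for that file is sketched below. The field names (`enabled`, `sessions`, `workers`, `final`, `unavailableReason`), the `recordSession` helper, and the cap of 20 sessions are all assumptions for illustration; the real schema is defined by the plugin source and may differ. Writing the serialized string to disk is left out.

```javascript
// Hypothetical helper: prepend the newest session and keep only the most recent ones.
function recordSession(state, session, maxSessions = 20) {
  const sessions = [session, ...state.sessions].slice(0, maxSessions);
  return { ...state, sessions };
}

const initial = { enabled: true, sessions: [] };
const next = recordSession(initial, {
  id: "session-1", // illustrative id
  workers: [
    { role: "solver", status: "completed" },
    { role: "critic", status: "timeout" },
  ],
  final: null,
  unavailableReason: "only 1/2 agents answered", // illustrative text
});

// This string is what would be written to .braintrust/state.json.
const serialized = JSON.stringify(next, null, 2);
console.log(JSON.parse(serialized).sessions.length); // 1
```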

## Test
```bash
pnpm vitest run extensions/braintrust/src/policy.test.ts extensions/braintrust/src/settings.test.ts
pnpm vitest run
./scripts/validate-braintrust.sh
```

Note: plugin id is `braintrust-plugin`; command remains `/braintrust`.
171 changes: 171 additions & 0 deletions extensions/braintrust/index.js
@@ -0,0 +1,171 @@
import { buildUnavailableNotice } from "./src/policy.js";
import { BraintrustOrchestrator } from "./src/orchestrator.js";
import { readSettings } from "./src/settings.js";

const braintrustConfigSchema = {
  type: "object",
  additionalProperties: false,
  properties: {
    enabled: { type: "boolean", default: false },
    teamSize: { type: "integer", minimum: 1, maximum: 4, default: 4 },
    strategy: { type: "string", enum: ["independent", "debate", "panel"], default: "panel" },
    model: { type: "string", default: "" },
    criticModel: { type: "string", default: "" },
    researcherModel: { type: "string", default: "" },
    synthModel: { type: "string", default: "" },
    timeoutSeconds: { type: "integer", minimum: 10, maximum: 300, default: 90 },
    minParticipatingAgents: { type: "integer", minimum: 1, maximum: 4, default: 2 },
    minAnsweringAgents: { type: "integer", minimum: 1, maximum: 4, default: 2 }
  }
};

// Normalize a raw command argument; anything unrecognized falls back to "status".
function parseBraintrustAction(raw) {
  const arg = (raw ?? "status").trim().toLowerCase();
  if (arg === "on" || arg === "off" || arg === "unavailable" || arg === "sessions") return arg;
  return "status";
}

// Return the text of the most recent user message, or "" if none is found.
function extractPromptFromMessages(messages) {
  if (!Array.isArray(messages)) return "";
  for (let i = messages.length - 1; i >= 0; i -= 1) {
    const msg = messages[i];
    if (msg?.role !== "user") continue;
    const c = msg.content;
    if (typeof c === "string") return c;
    if (Array.isArray(c)) {
      // Multi-part content: join the text of each part.
      const text = c
        .map((part) => typeof part === "string" ? part : part?.text ?? "")
        .join("\n")
        .trim();
      if (text) return text;
    }
  }
  return "";
}

// Extract assistant text from the various result shapes a runtime executor may return.
function readTextOutput(out) {
  if (typeof out === "string") return out;
  if (!out || typeof out !== "object") return "";
  const candidate = out;
  if (typeof candidate.text === "string") return candidate.text;
  if (typeof candidate.outputText === "string") return candidate.outputText;
  if (typeof candidate.assistantText === "string") return candidate.assistantText;
  if (Array.isArray(candidate.assistantTexts) && typeof candidate.assistantTexts[0] === "string") return candidate.assistantTexts[0];
  if (typeof candidate.content === "string") return candidate.content;
  if (Array.isArray(candidate.choices)) {
    const first = candidate.choices[0];
    if (typeof first?.text === "string") return first.text;
    if (typeof first?.message?.content === "string") return first.message.content;
  }
  return "";
}

// Look for a model-invocation function on the hook context first, then on the plugin API.
// Returns a worker executor wrapping the first match, or undefined if none exists.
function resolveRuntimeExecutor(api, runtimeContext) {
  const source = [runtimeContext, api];
  const names = ["runModel", "invokeModel", "runLlm", "invokeLlm", "complete"];
  for (const obj of source) {
    if (!obj) continue;
    for (const name of names) {
      const fn = obj[name];
      if (typeof fn !== "function") continue;
      return async (input) => {
        const payload = {
          role: input.role,
          timeoutSeconds: input.timeoutSeconds,
          metadata: { candidateId: input.candidateId, role: input.role, pluginId: "braintrust-plugin" },
          messages: [
            {
              role: "system",
              content: `You are Braintrust ${input.role}. Candidate id=${input.candidateId}. Respond with one concise candidate answer for the user prompt.`
            },
            { role: "user", content: input.prompt }
          ]
        };
        // Only set a model when one is configured; otherwise the runtime picks its default.
        if (input.model.trim()) payload.model = input.model;
        const out = await fn(payload);
        const text = readTextOutput(out).trim();
        if (!text) throw new Error("empty output");
        // Keyword heuristic for detecting refusals in the candidate text.
        return { text, refusal: /\b(cannot comply|i can't|i cannot|refuse|not able)\b/i.test(text) };
      };
    }
  }
  return void 0;
}

// Human-readable model label for the fallback prompt; synthesis always inherits.
function describeConfiguredModel(value, synth = false) {
  if (synth) return "inherit active chat model at runtime";
  return value.trim() || "inherit active chat model at runtime";
}

// Build the policy-only prompt prefix used when workers cannot be executed directly.
function buildFallbackPrepend(settings) {
  return [
    "BRAINTRUST MODE ACTIVE.",
    `Use a ${settings.teamSize}-agent internal panel with strategy=${settings.strategy}.`,
    [
      "Role model routing:",
      `- solver: ${describeConfiguredModel(settings.model)}`,
      `- critic: ${describeConfiguredModel(settings.criticModel)}`,
      `- researcher: ${describeConfiguredModel(settings.researcherModel)}`,
      `- synthesizer: ${describeConfiguredModel(settings.synthModel, true)}`
    ].join("\n"),
    `Quorum contract: require >=${settings.minParticipatingAgents} participating and >=${settings.minAnsweringAgents} answering agents.`,
    "If quorum cannot be satisfied, return exactly: Braintrust temporarily unavailable (...).",
    "Return only one final answer to the user."
  ].join("\n");
}

var index_default = {
  id: "braintrust-plugin",
  name: "Braintrust",
  description: "Multi-agent orchestration control plane",
  configSchema: braintrustConfigSchema,
  register(api) {
    const orchestrator = new BraintrustOrchestrator(api, readSettings(api.pluginConfig));
    // Use the last quorum evaluation if there is one; otherwise report that no agents participated.
    const renderUnavailable = () => buildUnavailableNotice(orchestrator.getLastQuorumEvaluation() ?? {
      participating: 0,
      answering: 0,
      refused: 0,
      failed: orchestrator.getSettings().teamSize,
      meetsQuorum: false,
      reason: `only 0/${orchestrator.getSettings().teamSize} agents participated`
    });

    // Chat/auto-reply command surface.
    api.registerCommand({
      name: "braintrust",
      description: "Control Braintrust mode: /braintrust on|off|status|unavailable|sessions",
      acceptsArgs: true,
      handler: async (ctx) => ({ text: orchestrator.executeAction(parseBraintrustAction(ctx.args), renderUnavailable) })
    });

    // Local CLI surface, registered only when the host exposes registerCli.
    api.registerCli?.(({ program }) => {
      program
        .command("braintrust")
        .description("Braintrust controls for local CLI surfaces")
        .argument("[action]", "on|off|status|unavailable|sessions", "status")
        .action((action) => {
          console.log(orchestrator.executeAction(parseBraintrustAction(action), renderUnavailable));
        });
    }, { commands: ["braintrust"] });

    api.on("before_prompt_build", async (payload) => {
      if (!orchestrator.isEnabled()) return;
      const event = payload?.event ?? payload ?? {};
      const context = payload?.context ?? {};
      const prompt = extractPromptFromMessages(event?.messages) || event.prompt || "";
      const execute = resolveRuntimeExecutor(api, context);
      const activeChatModel = typeof event?.model === "string" ? event.model : void 0;
      // Without an executor or a prompt, fall back to policy-only prompt injection.
      if (!execute || !prompt) {
        api.logger.info("[braintrust] runtime bridge unavailable in hook context, using policy-only prompt injection");
        return { prependContext: buildFallbackPrepend(orchestrator.getSettings()) };
      }
      const bridge = await orchestrator.runPrompt(
        prompt,
        ({ role, model, prompt: p, timeoutSeconds, candidateId }) => execute({ role, model, prompt: p, timeoutSeconds, candidateId }),
        activeChatModel
      );
      api.logger.info(`[braintrust] runtime bridge complete unavailable=${bridge.unavailable} candidates=${bridge.candidates.length}`);
      if (bridge.unavailable) {
        return {
          prependContext: ["BRAINTRUST MODE ACTIVE.", "Quorum could not be satisfied by runtime-bridge execution.", `Return exactly this notice: ${bridge.final}`].join("\n")
        };
      }
      return {
        prependContext: ["BRAINTRUST MODE ACTIVE.", "Use this runtime-bridge synthesis as your final answer.", bridge.final].join("\n\n")
      };
    });

    // Observability hooks: log model input and output activity while Braintrust is enabled.
    api.on("llm_input", async (payload) => {
      if (!orchestrator.isEnabled()) return;
      const event = payload?.event ?? payload;
      const context = payload?.context ?? {};
      api.logger.info(`[braintrust] llm_input run=${event?.runId ?? "unknown"} session=${context?.sessionKey ?? "unknown"} model=${event?.model ?? "unknown"}`);
    });
    api.on("llm_output", async (payload) => {
      if (!orchestrator.isEnabled()) return;
      const event = payload?.event ?? payload;
      api.logger.info(`[braintrust] llm_output run=${event?.runId ?? "unknown"} model=${event?.model ?? "unknown"} responses=${Array.isArray(event?.assistantTexts) ? event.assistantTexts.length : 0}`);
    });
  }
};

export { index_default as default };