Claude Code plugin for conceptualizing, designing, implementing, updating, and validating complex Mistral Vibe workflows.
VibeFlow is for workflows where a generic prompt is not enough: multi-phase automation, custom skills, tools, MCP/connectors, middleware, hooks, agent profiles, subagents, source-level changes, and validation evidence. Its job is to preserve design freedom while forcing every selected runtime surface to be grounded in what Mistral Vibe can actually do.
VibeFlow separates two questions that often get mixed together:
- What is the best workflow design for this task?
- Which Mistral Vibe runtime surfaces can honestly implement that design?
The plugin should not tell the agent to use every possible customization point. It should choose the smallest sufficient surface, explain why it fits, reject unnecessary surfaces with rationale, and record enough evidence that implementation oversight is visible later.
- init — interactive intent loop. Completes only after user sign-off and generation of
VISION.md,PLAN.md, andWORKFLOW_CONTRACT.json. - design — maps signed intent onto real Mistral Vibe runtime surfaces. It first compares grounded candidate architectures, lints the candidate set, then promotes the winning candidate into component breakdowns, diagrams, feasibility classification, design decisions, rejected alternatives, and approval-ready artifacts.
- plan — researches source/docs as needed and produces implementation targets, component contracts, tests, and validation gates.
- apply — writes or patches files according to the approved plan. Runs a pre-apply surface guard to enforce the contract before any file is touched.
- validate — runs the serial validation chain, writes evidence, classifies failures, detects drift, and reports whether the workflow is ready or needs rework.
- update — modifies or hardens an existing workflow through narrative intake, repo/artifact scan, ambiguity clarification, proposed edits, approval, implementation, record updates, and validation.
- inspect — audits an existing workflow/repo and persists an inspection report for later design/update work.
- realize — ingests existing conceptual workflows, skillsets, plugin drafts, partially implemented repos, or design documents and converts them into working, runtime-grounded implementations.
VibeFlow's references model the current Mistral Vibe runtime surfaces, including:
- Skills.
allowed_toolsis advisory only — it is never used to filterToolManager.available_tools. The model sees all available tools regardless. Use agent profile flat-TOMLenabled_toolsfor actual restriction. - Built-in tools:
ask_user_question,exit_plan_mode,todo,task,webfetch,websearch.ask_user_questionis disabled in programmatic (-p) and ACP execution contexts. - Tool contracts:
BaseToolclass variables (name,descriptionareClassVar[str]),invoke()→run()call chain (no__call__),BaseToolConfig,resolve_permission(),get_result_extra(),get_file_snapshot(),is_available(), tool state (BaseToolState), and tool prompt files. - Tool placement: Custom tools must be skill-bundled or in
config.tool_paths— never invibe/core/tools/builtins/. That directory is core infrastructure (bash, grep, read_file, etc.); placing workflow-specific tools there is a Tier D source modification. - Tool discovery: Search path order is
[builtins, config.tool_paths, project .vibe/tools/, ~/.vibe/tools/]— later entries override earlier ones. Discovery is recursive. Tool files whose filename starts with_are silently skipped. - MCP servers and Mistral Connectors as distinct remote tool surfaces. Per-server
disabledanddisabled_toolscontrols. - Agent profiles:
default,plan,accept-edits,auto-approve,chat, and opt-inlean. Thechatprofile restricts to a fixed tool list (grep,read_file,ask_user_question,task) viabypass_tool_permissions + enabled_toolsoverrides — not a generic "read-only" mode. Profile overrides use flat TOML keys, not a nested[overrides]section. Thebase_disabledkey merges with existingdisabled_toolswithout replacing it. Agent TOML files must be directly in the search path root (not subdirectories), must use.tomlextension, and the agent name is always the filename stem. - Per-phase model assignment via agent profile
overrides(active_model,providers,models) — no separate tool needed. Every model name referenced in a profile TOML must be declared as a model alias in~/.vibe/config.tomlfirst. - Tool-triggered profile switching through
ctx.switch_agent_callback.AgentProfileChangedEventis silently dropped by the ACP layer. Profile switching changes the system prompt but keeps the same context window — it is not subagent isolation. - Subagents through the
tasktool. The built-inexploresubagent is read-only by profile — this is not a hard runtime constraint on all subagents. Custom subagent profiles with write tools can write. Subagents never run hooks.TaskResult.completed = Falseon middleware stop or skipped tool call. True subagent isolation requirestaskto spawn a fresh AgentLoop with its own message history. - Hooks as
POST_AGENT_TURNonly, viahooks.toml+enable_experimental_hooks = true(Tier A — no Python required). Exit-code semantics:exit 2+ non-empty stdout = retry;exit 2+ empty stdout = warning only, no retry. 3-retry ceiling per hook per user turn. Chain breaks on first retrying hook. - Middleware as a duck-typed
ConversationMiddlewareProtocol (not an ABC). Must implementbefore_turn(context)andreset(reset_reason). Registration requires modifying_setup_middleware()in source — Tier D. Stateless middleware may usepassforreset(). - Compaction:
compact()resets messages to[system_message, summary_message]. The system prompt is regenerated from the current agent profile each turn. Agent profile state survives compaction automatically — do not usestate.jsonto persist phase across compaction. Middlewarereset()receivesResetReason.COMPACTand must preserve state (only clear onResetReason.STOP). - Config:
bypass_tool_permissions(notauto_approve— silently ignored),system_prompt_id,enable_experimental_hooks,context_warnings,auto_compact_threshold(global and per-model),api_timeout.thinkingis per-model inside[[models]].session_prefixlives under[session_logging]. All fields overridable viaVIBE_*env vars. - Session scratchpad (temp directory, non-deterministic path), AGENTS.md context injection, programmatic JSON/streaming output, and compaction model settings.
The main reference for this is references/feasibility/runtime-pattern-catalog.md.
Custom middleware is valid in Mistral Vibe. Multiple middleware can be composed.
The constraint is not whether middleware can exist; it is where it lives and when it runs. Middleware is a duck-typed ConversationMiddleware Protocol — no ABC to subclass. Registration requires modifying _setup_middleware() in AgentLoop source (Tier D). It cannot be registered through config, skills, or hooks alone.
Middleware must implement:
async def before_turn(self, context: ConversationContext) -> MiddlewareResult: ...
def reset(self, reset_reason: ResetReason = ResetReason.STOP) -> None: ...before_turn() runs before every LLM call in the tool loop — not once per user message. On a multi-tool turn it fires once per tool batch plus once for the final response. A hook retry also triggers another before_turn() on the next loop iteration.
reset() may be a no-op (pass) for stateless middleware. Stateful middleware must handle both ResetReason.STOP and ResetReason.COMPACT.
Middleware actions:
CONTINUE— proceed.STOP— halt the agent turn entirely. No compaction.COMPACT— trigger context compaction, then continue the loop. Not the same asSTOP.INJECT_MESSAGE— collect a message; all collected injections are joined with\n\nand injected before the LLM call.
Pipeline: STOP and COMPACT short-circuit and discard any previously accumulated INJECT_MESSAGE results. The default pipeline is AutoCompactMiddleware + two ReadOnlyAgentMiddleware instances (plan and chat). TurnLimitMiddleware, PriceLimitMiddleware, and ContextWarningMiddleware are conditional on their respective config flags.
Every phase produces a machine-checkable artifact set that the next phase consumes without re-interpreting intent. No phase is allowed to silently upgrade, downgrade, or reinterpret the workflow.
init produces:
VISION.md— locked intent, success criteria, scope boundaryPLAN.md— signed phased planWORKFLOW_CONTRACT.json— initial surface selections and rejections
design consumes: WORKFLOW_CONTRACT.json, references/feasibility/*
design produces: DESIGN_CANDIDATES.json, DESIGN.md, ARCHITECTURE.md, SystemName-ORIGINAL.md (mermaid diagram), selected_surfaces[], rejected_surfaces[], feasibility classification
plan consumes: WORKFLOW_CONTRACT.json, DESIGN.md, ARCHITECTURE.md
plan produces: updated PLAN.md with file targets, contracts, validation gates, expected evidence
apply consumes: PLAN.md, DESIGN.md, ARCHITECTURE.md, WORKFLOW_CONTRACT.json
apply produces: implementation files, .vibe-workflow/workflow.yaml or .vibe-workflow/workflow.json
apply must not change selected runtime surfaces unless it records a contract amendment first
validate consumes: repo state, workflow.yaml / manifest, WORKFLOW_CONTRACT.json, expected evidence
validate produces: validation report, evidence artifacts, drift classification
realize consumes: existing repo/artifacts/concepts, references/feasibility/*
realize requires: user confirmation of candidate selection → scope confirmation → fix plan approval before any file edits
realize produces: REALIZATION_CONTRACT.json (authoritative contract for this run), CONCEPT_MAP.json, WORKFLOW_CONTRACT.json stub if absent, implementation patches, REALIZATION_REPORT.md, SystemName-ORIGINAL.md (mermaid diagram), validation evidence
update consumes: existing artifacts, repo state, WORKFLOW_CONTRACT.json
update produces: SystemName-PROPOSED#.md or overwrites existing PROPOSED diagram, contract amendments, updated implementation files, CHANGE_REQUEST.md
WORKFLOW_CONTRACT.json is the durable contract between phases. It records:
- signed intent and confidence;
- selected runtime surfaces;
- rejected or not-applicable surfaces;
- rationale for each decision;
- capability each selected surface provides;
- runtime contract each selected surface must obey;
- expected implementation evidence;
- validation proof requirements;
- update/change-control records.
This is how VibeFlow keeps creative freedom without allowing implementation drift. The schema does not force every workflow to include middleware, tools, hooks, agents, or MCP. It forces the workflow to justify the surfaces it does select and explain why others are unnecessary.
Before approving a design, VibeFlow writes DESIGN_CANDIDATES.json: a small portfolio of candidate architectures. Each candidate must map requirement → runtime surface → concrete mechanism → proof → failure mode. scripts/strategy-fit-linter.py rejects candidates that skip that proof chain, use impossible runtime mechanisms, select source changes without rejecting lower tiers, or choose a winner that is already marked rejected.
This gives the model room to search for better workflows without letting conceptual but non-working designs reach implementation. The selected winner becomes the only candidate that can populate DESIGN.md, ARCHITECTURE.md, and WORKFLOW_CONTRACT.json.
These rules are enforced by apply, validate, and update. They are what separates a governed workflow compiler from a prompt pack.
applymust fail if it implements a runtime surface not selected inWORKFLOW_CONTRACT.json. Record a contract amendment first or stop.applymust fail if it omits a selected surface without a recorded contract amendment.validatemust classify drift as one of:missing_selected_surface— a contracted surface has no implementation evidenceunauthorized_surface_added— an uncontracted surface was implementedwrong_runtime_surface— the wrong Vibe extension point was used for a contracted capabilityimpossible_runtime_assumption— the implementation assumes a Vibe behavior that does not existstale_evidence— evidence artifacts exist but do not match current repo statevalidation_gap— a contracted validation target has no corresponding check
updateis the only phase allowed to amend the contract after initial sign-off.- Every amendment must record:
- old surface
- new surface
- reason
- affected files
- required revalidation targets
Before apply writes any file, it must pass the surface guard (scripts/pre-apply-guard.py). The guard reads a proposed_changes.json describing every intended file edit, then verifies:
- every proposed surface is in the contract's selected surfaces
- no rejected surface is being implemented
- no selected surface is missing without a recorded deviation
The guard exits 0 (pass), 1 (violations — retry), or 2 (bad input). Apply runs the guard in a retry loop until exit 0. Writing implementation files before the guard passes defeats the contract.
Validation runs through scripts/validation-runner.py. It executes checks serially, writes evidence even on failure, and avoids letting stale evidence masquerade as success.
The current validation chain includes:
- workflow schema/lifecycle linting (
workflow-linter.py) - failure classification (
failure-classifier.py) - dry-run simulation with declared tool checks and retry events (
dry-run-simulator.py) - state invariant checking (
state-invariant-checker.py) - gate checks (
gate-engine.py) - design contract linting (
design-contract-linter.py) - pattern-fit linting (
pattern-fit-linter.py) - grounded strategy selection linting (
strategy-fit-linter.py) whenDESIGN_CANDIDATES.jsonexists - drift detection against approved surfaces (
drift-detector.py) - convergence scoring (
convergence-scorer.py) - evidence report generation (
evidence-reporter.py) - auto-rework generation on
NEEDS_REWORKverdicts (auto-rework-generator.py)
Relative paths declared in workflow.yaml are resolved relative to the manifest location, not the shell's current working directory. This applies to evidence output, state paths, reports, and contract discovery.
Executable workflow manifests use the canonical schema in references/workflow-manifest-schema.json. Tooling accepts YAML or JSON and normalizes legacy aliases with warnings, but generators should emit canonical fields: phase.id, phase.entry, phase.exit, and phase.retryLimit.
VibeFlow includes a linter (scripts/pattern-fit-linter.py) for common design mistakes, for example:
- using middleware as
after_tool,post_tool, or tool-level interception; - assuming
before_turn()fires once per user message — it fires before every LLM call in the tool loop; - selecting custom middleware without source/runtime registration in
_setup_middleware(); - treating
allowed_toolsin a skill as a hard tool-access boundary — it is advisory only; - using
auto_approveinstead ofbypass_tool_permissionsin config; - using hooks for pre-turn or mid-turn behavior — hooks are
POST_AGENT_TURNonly; - writing
exit 2in a hook without printing stdout — empty stdout is a warning, not a retry; - assuming subagents inherit parent hooks — subagents never run hooks;
- selecting plan-mode behavior without
exit_plan_mode; - using
ask_user_questionin a workflow that runs headless or via ACP — it is disabled in both contexts; - using generic user prompts where
ask_user_questionshould provide structured choices; - treating
todoas durable audit state; - treating scratchpad as canonical record storage or a project-relative path;
- enabling MCP
sampling_enabledwithout justification; - requiring reasoning visibility without checking model/backend support;
- designing interactive workflows without asking whether headless execution is required;
- placing custom workflow tools in
vibe/core/tools/builtins/— that is core infrastructure, not a skill extension point; - using
state.jsonto persist phase across compaction — agent profile survives compaction natively; - claiming subagent isolation while only switching profiles — profile switching keeps the same context window;
- referencing model names in agent profile TOMLs that are not declared in
~/.vibe/config.toml.
VibeFlow produces mermaid diagrams as a sanity check that workflows actually work end-to-end, and to track what was tried across iterations.
Naming:
SystemName-ORIGINAL.md— produced after initial workflow completion (design, apply, or realize)SystemName-PROPOSED#.md— produced before update work begins, where#is the iteration number
When to overwrite vs. create new PROPOSED:
- Overwrite the existing PROPOSED diagram when the update is an architectural fix within the same intent direction (e.g., fixing tool placement, correcting compaction behavior).
- Create a new PROPOSED# when the update represents a genuine change in intent direction (e.g., adding phases, changing fundamental topology).
Validation: Every diagram must be double-checked to ensure it truthfully represents the actual workflow end-to-end. If it doesn't match, fix the diagram or the implementation before proceeding.
VibeFlow can ingest existing conceptual workflows, skillsets, plugin drafts, partially implemented repos, or design documents and convert them into working, runtime-grounded implementations.
This is not init. It starts from existing artifacts, not a blank intake. It is interactive: the user confirms what to work on, confirms the scope and intent, reviews the feasibility reflection, approves the fix plan, then the system executes.
The realization flow:
- Scan — silently scan the repo, group findings into named candidates, display them, ask the user to choose one
- Scope confirmation — present interpretation of the selected candidate's intent, ask targeted clarifying questions, wait for explicit user confirmation
- Feasibility reflection — classify each component against real Vibe runtime surfaces using
references/feasibility/; surface what cannot be implemented as stated and the nearest alternative - Proposed fix plan — present a numbered, file-specific list of changes; wait for user approval or adjustments
- Contract bootstrap — write
REALIZATION_CONTRACT.jsonas the authoritative contract for this run; create or stubWORKFLOW_CONTRACT.jsonif absent (partial implementations will not have one) - Apply — execute only the approved plan; record any deviations before making them
- Validate — run validation against the realization contract; classify failures with the drift taxonomy; report honestly if validation cannot run
Because partial implementations will not have a complete WORKFLOW_CONTRACT.json, realize bootstraps one from what it finds. Validation uses REALIZATION_CONTRACT.json as the authoritative contract source rather than failing on a missing or incomplete prior contract.
Classification buckets:
| Bucket | Example |
|---|---|
implemented |
Valid skill prompt, valid command file already wired |
maps_to_existing_surface |
Approval loop via plan profile + exit_plan_mode |
requires_custom_tool |
Evidence writer, manifest generator |
requires_mcp_connector |
Remote knowledge server, external API bridge |
requires_source_modification |
Middleware registration, custom AgentLoop hook |
not_implementable_as_stated |
"middleware after every tool call" — no Vibe hook exists for this |
Load the plugin directly:
claude --plugin-dir /path/to/VibeFlowClaude Code installs plugins per config root. If you use alternate homes such as ~/.claude-openrouter-1, sync VibeFlow explicitly:
python3 /path/to/VibeFlow/scripts/install-to-claude-profiles.py
python3 /path/to/VibeFlow/scripts/install-to-claude-profiles.py --apply --install-cacheThe first command is a dry run. The second syncs ~/.claude and detected ~/.claude-openrouter* profiles, enables vibe-flow@local, and refreshes Claude's plugin cache where possible.
| User command | Backing skill | Purpose |
|---|---|---|
/vibe-workflow-init |
skills/vibe-workflow-init/SKILL.md |
Lock intent and emit VISION.md, PLAN.md, WORKFLOW_CONTRACT.json |
/vibe-workflow-design |
skills/vibe-workflow-design/SKILL.md |
Map intent to feasible runtime topology and diagrams |
/vibe-workflow-plan |
skills/vibe-workflow-plan/SKILL.md |
Research source/docs and produce implementation file targets |
/vibe-workflow-apply |
skills/vibe-workflow-apply/SKILL.md |
Patch or write files from the approved plan (with pre-apply guard) |
/vibe-workflow-validate |
skills/vibe-workflow-validate/SKILL.md |
Prove the result works or explain exactly why not |
/vibe-workflow-update |
skills/vibe-workflow-update/SKILL.md |
Change-control loop for modifying or hardening existing workflows |
/vibe-workflow-inspect |
skills/vibe-workflow-inspect/SKILL.md |
Inspect an existing repo/workflow state |
/vibe-workflow-realize |
skills/vibe-workflow-realize/SKILL.md |
Realize a conceptual workflow, skillset, or partial implementation |
Example sequence:
/vibe-workflow-init
/vibe-workflow-design
/vibe-workflow-plan
/vibe-workflow-apply
/vibe-workflow-validate
For existing workflows:
/vibe-workflow-inspect
/vibe-workflow-update
/vibe-workflow-validate
For conceptual or partially implemented work:
/vibe-workflow-realize
/vibe-workflow-validate
PYTHONPYCACHEPREFIX=/tmp/vibeflow-pycache python3 -m py_compile scripts/*.py
python3 -m json.tool references/manifest-schema.json
python3 -m json.tool references/diagrams/diagram-index.json
PYTHONPYCACHEPREFIX=/tmp/vibeflow-pycache python3 -m unittest discover -s tests -vPYTHONPYCACHEPREFIX is useful in restricted environments where the repo cannot write __pycache__.
commands/— user-facing slash command definitions.skills/— command backing instructions loaded by Claude Code.references/— shared Mistral Vibe runtime and feasibility knowledge.references/feasibility/— runtime contracts, pattern catalog, configuration keys, composition rules, implementation patterns, and event system reference.references/diagrams/— Mermaid diagrams plus machine-readable diagram index.scripts/— validators, linters, simulators, drift detection, contract enforcement, install/sync tooling.tests/— regression tests for workflow tooling and plugin guarantees.
VibeFlow is for advanced Mistral Vibe users building:
- multi-phase workflow commands;
- quality gates and approval loops;
- middleware-backed runtime behavior;
- custom tools and tool orchestration;
- MCP or connector integrations;
- CI validation flows and programmatic automation;
- source-level Mistral Vibe customizations.
It is not intended as a beginner Mistral AI guide.
VibeFlow is still hardening. The current focus is validation integrity, runtime-surface accuracy, update/inspect loops, prevention of plausible but non-functional designs, and pattern-fit enforcement.