Operating Loop

One-page spine: the operational discipline every AgentOps process skill executes. Companion to Component Map (product/component routing), Ports and Adapters (the runtime seams), Intent-to-Loop Hexagon (the process-level ports), and CDLC (the context lifecycle inside the SDLC control plane). For RPI naming (the /rpi skill vs. the ao rpi CLI vs. this loop), see "RPI terminology" in codebase-overview.

AgentOps' execution discipline is one repeatable loop inside the SDLC control plane, not a phased waterfall of documents. Every process skill is one move within it. No artifact exists unless it advances the loop.

BDD-shaped intent issue
  → vertical slices (each one a behavior, not a layer)
  → TDD per slice (first failing test, then implementation)
  → conflict-free parallel wave (only if write scopes do not collide)
  → integrated bead completion (acceptance examples pass)
  → evidence + learning capture (under the promotion ratchet)

The unit of value is the proof, not the artifact. A slice is done only when the membrane has written an independent verdict on it (no verdict means not done); this is the move every skill feeds. The corpus and ratchet beneath it form the compounding layer, which is still unproven (ADR-0004) and is not the headline. The compounding that has a deterministic gradient is the membrane's own self-improvement: escape → new check → re-measure.

The doctrine source for this spine is .agents/research/2026-05-16-agentops-3-cdlc-context-validation.md. Promote changes there first, then update this doc.

Governing principles

  1. The loop is the primitive, not the documents. If an artifact does not advance behavior toward acceptance, enable parallel work, preserve human authority, or become a reusable gate, it is token drag.
  2. Behavior is the unit of work, not a layer. A slice cuts vertically through whatever layers are needed to demonstrate one Given/When/Then.
  3. The first failing test is the slice's contract. Code without a failing test has no acceptance surface; an agent has no way to know when it is done.
  4. Parallelism is explicit ownership. Waves are valid only when the conflict-free check below passes. Default to sequential.
  5. Less process, more executable shared language. The promotion ratchet kills artifacts that do not change future behavior.
  6. Context crosses boundaries as artifacts. RPI keeps orchestration visible, but phase execution should cross through bounded packets and summaries, not raw accumulated chat context.
  7. The map is fixed; the route is re-routed. This loop is a deterministic role-topology: its stages, legal transitions, and gates do not change per goal (the map). The path a given goal takes through it is dynamic and recalculated on failure (the route). Because the worker is stochastic, you trust the map and the gates, not the agent. The gate at move 6 is the windshield: deterministic ground truth that catches a confident hallucination (a road that was never there), which re-routing alone cannot catch. See 3.0 → the navigator model. The Control-Loop Model (two timescales + the governor) specifies why the re-routing terminates and how the map itself improves between runs without oscillating: the map is fixed within a run, and the slow loop tunes it across runs, governed so it does not thrash.
  8. Single-agent-first; orchestration is opt-in escalation. The default execution shape is one capable agent working in-session with good bookkeeping. Multi-agent orchestration (parallel waves, ATM swarms, Agent Mail coordination) is an escalation you reach for, never a substrate you start from.
     • Escalation trigger (observable): escalate only when you are creating two or more active lanes, meaning independent read/review lanes whose outputs a lead will merge, or independent implementation slices with disjoint write scopes. When ≥2 lanes/panes share the repo, Agent Mail registration and file reservations are mandatory before writes. With only one active writer, stay single-agent and use normal bookkeeping.
     • ATM and AM are separate escalations on different axes, never a package. ATM (the out-of-session substrate) answers a durability/wall-clock need: work must outlive your session or run unattended. AM (coordination) answers a contention need: ≥2 writers can touch the same path. You reach for either alone: AM-without-ATM is the common case (two in-session lanes sharing a repo); ATM-without-AM is an unattended file-disjoint queue.
     • Asymmetry guardrail: the de-mandate removes the single-writer session-start tax, not the collision guard. The ≥2-writers → reserve reflex stays non-negotiable (an unneeded AM call costs one command; a missing one silently clobbers a shared file).
     • Full 4-case matrix: using-atm.
     • Shape routing detail: automation-shape-routing. "Shape 0" is the default front door; AGENTOPS_ORCHESTRATION=off pins the beads floor.
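The two escalation axes in principle 8 can be sketched as two independent booleans. This is a minimal illustration, not AgentOps code: `Escalation`, `choose_escalation`, and both parameter names are hypothetical, and the authoritative 4-case matrix remains the one in using-atm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Escalation:
    use_atm: bool  # out-of-session substrate: durability / wall-clock need
    use_am: bool   # Agent Mail coordination: contention need


def choose_escalation(must_outlive_session: bool, writers_that_can_collide: int) -> Escalation:
    """Hypothetical helper: decide each escalation on its own axis; neither implies the other."""
    return Escalation(
        use_atm=must_outlive_session,
        use_am=writers_that_can_collide >= 2,
    )


# Walk the four combinations of the two independent needs.
for durable in (False, True):
    for writers in (1, 2):
        print(durable, writers, choose_escalation(durable, writers))
```

Keeping the two decisions in separate fields mirrors the "never a package" rule: a single in-session writer yields no escalation at all, and adding a second colliding writer flips only the AM field.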

The seven moves

1. Shape intent as BDD

The intent issue is not ready until the acceptance examples are testable. Required surface:

  • Feature / capability name
  • Given / When / Then examples (one happy path + at least one edge)
  • Domain terms used (anchored to the repo's ubiquitous-language register; for AgentOps that is skills/domain/references/ and skills/standards/references/architecture-terms.md)
  • Component and bounded-context route per the Component Map; generated skill-role context per the context map
  • Non-goals
  • Rollback / containment path
  • Evidence needed for completion (test names, snapshot keys, eval suites, council verdicts)

Template: docs/templates/intent-issue.md. Skills that produce this artifact: /discovery, /product, /plan.
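As a rough illustration of "not ready until the acceptance examples are testable", the required surface can be modeled as a record plus a gap check. Everything here is an assumption made for the sketch (the class names, the field names, and the "happy"/"edge" labels); the real shape is whatever docs/templates/intent-issue.md defines.

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    given: str
    when: str
    then: str
    kind: str  # "happy" or "edge"; labels assumed for this sketch


@dataclass
class IntentIssue:
    feature: str = ""
    examples: list = field(default_factory=list)
    domain_terms: list = field(default_factory=list)
    context_route: str = ""
    non_goals: list = field(default_factory=list)
    rollback_path: str = ""
    evidence_needed: list = field(default_factory=list)


def missing_surface(issue: IntentIssue) -> list:
    """Return the required items that are still empty; [] means ready."""
    kinds = {e.kind for e in issue.examples}
    checks = {
        "feature name": bool(issue.feature),
        "happy-path example": "happy" in kinds,
        "edge example": "edge" in kinds,
        "domain terms": bool(issue.domain_terms),
        "component / bounded-context route": bool(issue.context_route),
        "non-goals": bool(issue.non_goals),
        "rollback / containment path": bool(issue.rollback_path),
        "evidence needed": bool(issue.evidence_needed),
    }
    return [name for name, ok in checks.items() if not ok]


draft = IntentIssue(
    feature="export report",
    examples=[Example("a saved report", "the user exports it", "a CSV is written", "happy")],
)
print(missing_surface(draft))
```

A draft with only a happy path reports the edge example and every other empty item as gaps, which is the signal that the issue should stay in shaping.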

2. Track as a bead when it leaves the head

A bead is the linked-intent packet for one BDD-shaped behavior change. It carries the acceptance examples, the bounded-context tag, the slice list, the wave plan, accumulating evidence, and residual gaps at close. One-shot work that stays inside a single prompt does not need a bead. Skill: beads-br (via br; while legacy .beads/ retirement is in progress, invoke as BEADS_DIR="$(ao beads dir)" br ...).

3. Slice vertically through behavior

A good slice maps to one Given/When/Then row, has a nameable first failing test, has a review-in-one-pass write scope, and touches one bounded context. "Refactor then feature" is two slices. Skill: /plan produces the slice list.

4. TDD per slice

Per slice, in order:

  1. First failing test — must fail for the right reason (missing behavior, not syntax).
  2. Smallest change that flips it to green.
  3. Refactor under green. Refactor is its own commit.
  4. Record evidence into the bead.

Skill: /implement operates on one slice at a time.

5. Group into a wave only when write scopes do not collide

Wave validity is a hard gate, applied row by row:

| Check | Pass means |
| --- | --- |
| Distinct write scopes | Each slice's modified-files set is disjoint |
| Distinct test targets | Tests run independently; no shared fixture mutation |
| No shared migration | At most one slice per migration / schema / generated file |
| No shared CLI surface | At most one slice per command's flags or arguments |
| Integration order declared | Merge order is named if it matters |
| Owner per slice | One agent or one human per slice; no joint ownership |
| Discard path per slice | Every slice has a rollback or drop-and-re-plan exit |

Any failed row → slices run sequential. Skill: /plan declares the wave; /crank, /swarm, /autodev execute it.
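The wave check reduces to set intersections plus a few presence checks. The sketch below assumes each slice declares its scopes up front; the `Slice` fields and the function names `wave_failures` and `execution_mode` are hypothetical and do not describe how /plan stores a wave.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Slice:
    name: str
    owner: str                             # one agent or one human
    write_scope: frozenset = frozenset()   # files the slice modifies
    fixtures: frozenset = frozenset()      # fixtures its tests mutate
    migrations: frozenset = frozenset()    # migration / schema / generated files
    cli_commands: frozenset = frozenset()  # commands whose flags or arguments change
    discard_path: str = ""                 # rollback or drop-and-re-plan exit


def wave_failures(slices, order_matters=False, merge_order=()):
    """Return the failed rows of the wave check; an empty list means valid."""
    failures = []
    shared_rows = [
        ("write_scope", "write scopes overlap"),
        ("fixtures", "shared fixture mutation"),
        ("migrations", "shared migration"),
        ("cli_commands", "shared CLI surface"),
    ]
    for a, b in combinations(slices, 2):
        for attr, label in shared_rows:
            if getattr(a, attr) & getattr(b, attr):
                failures.append(f"{a.name}+{b.name}: {label}")
    if order_matters and not merge_order:
        failures.append("integration order not declared")
    for s in slices:
        if not s.owner:
            failures.append(f"{s.name}: no single owner")
        if not s.discard_path:
            failures.append(f"{s.name}: no discard path")
    return failures


def execution_mode(slices, **kwargs):
    """Any failed row means the slices run sequentially."""
    return "sequential" if wave_failures(slices, **kwargs) else "parallel"


a = Slice("a", "agent-1", frozenset({"cli/x.go"}), discard_path="revert")
b = Slice("b", "agent-2", frozenset({"cli/y.go"}), discard_path="revert")
c = Slice("c", "agent-3", frozenset({"cli/x.go"}), discard_path="drop")
print(execution_mode([a, b]), wave_failures([a, c]))
```

Returning the list of failed rows, not just a boolean, keeps the reason for falling back to sequential visible to whoever reviews the wave plan.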

6. Close the bead by proving its acceptance

Every Given/When/Then maps to a passing test. Every non-goal is still untouched. Every rollback path is reachable. Evidence is recorded. Activity logs do not close beads. Skills: /validate, /council, /pre-land-refuters.

When a cycle is logged, the CycleTrace can carry the closeout join explicitly: bead_id, acceptance_examples, validation_commands, and closeout_verdict. That join is the reviewer path from a bead's Gherkin example to the test, gate, or eval that proved it.
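A minimal sketch of that closeout join follows. The four field names come from the text above; their types, the example-to-proof mapping, and the `unproven_examples` helper are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class CycleTrace:
    bead_id: str
    # Assumed shape: acceptance example -> name of the test, gate, or eval that proves it.
    acceptance_examples: dict = field(default_factory=dict)
    validation_commands: list = field(default_factory=list)
    closeout_verdict: str = ""  # verdict vocabulary is assumed, e.g. "PASS" / "FAIL"


def unproven_examples(trace: CycleTrace, passing_proofs: set) -> list:
    """Examples whose proof is missing or did not pass; [] means the join is complete."""
    return [
        example
        for example, proof in trace.acceptance_examples.items()
        if proof not in passing_proofs
    ]


trace = CycleTrace(
    bead_id="bead-123",  # hypothetical id
    acceptance_examples={
        "Given a saved report When exported Then a CSV is written": "test_export_csv",
        "Given an empty report When exported Then an error is shown": "test_export_empty",
    },
    validation_commands=["go test ./..."],
)
print(unproven_examples(trace, passing_proofs={"test_export_csv"}))
```

A reviewer following this join can start at any Gherkin example and land on the named proof, or see immediately that one is missing.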

7. Capture evidence and learning, then ratchet

Two outputs per loop turn — evidence into .agents/rpi/, the bead, and the relevant council/validation artifacts; learnings only if they cleared the promotion bar (next section). Skills: /post-mortem, /forge, /flywheel, /compile.

The loop closes here: re-plan on evidence, not just on failure

Move 7 feeds back into move 1; this is where the route gets re-routed (principle 7). What principle 7 leaves implicit is that re-routing is triggered by evidence, not only by failure. A wave that succeeds still teaches something the plan didn't know, and that evidence may refactor, insert, drop, reorder, or re-scope the remaining waves before the next one runs. The wave plan is a hypothesis; each wave is the experiment that tests it. Under --auto the orchestrator executes those pivots itself: it is not gated on a wave failing first, and it does not run the initial wave list to the letter. This kills two failure modes: retry-not-replan (re-cranking a failed wave forever instead of asking whether the remaining plan should change) and waterfall (executing the pre-written wave list because "that was the plan"). The pivots are bounded by the run's circuit breakers (budget / attempt cap / oscillation detection) and the ≥5-arc post-mortem checkpoint; the operator is surfaced only at the terminal objective or a breaker trip, never just to approve a pivot. The orchestrator that owns this across a turn is /rpi; full mechanics: Agile Re-Plan Loop.
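The difference between re-planning on evidence and retrying on failure can be shown with a toy loop. This sketch models only an attempt cap (budget and oscillation detection are left out), and `run_waves`, `execute`, `replan`, and the evidence keys are hypothetical names, not the /rpi implementation.

```python
def run_waves(plan, execute, replan, attempt_cap):
    """Execute waves one at a time, re-planning the remainder after each one."""
    remaining = list(plan)
    completed = []
    attempts = 0
    while remaining:
        if attempts >= attempt_cap:
            return completed, "breaker: attempt cap"
        wave = remaining.pop(0)
        evidence = execute(wave)
        attempts += 1
        if evidence["passed"]:
            completed.append(wave)
        # Evidence from a success or a failure may reshape what is left.
        remaining = replan(wave, remaining, evidence)
    return completed, "objective reached"


def execute(wave):
    # Wave "a" succeeds and also shows that wave "c" is no longer needed.
    return {"passed": True, "obsoletes": ["c"] if wave == "a" else []}


def replan(wave, remaining, evidence):
    return [w for w in remaining if w not in evidence["obsoletes"]]


print(run_waves(["a", "b", "c"], execute, replan, attempt_cap=5))
```

Here the plan changes after a successful wave, so "c" never runs. A `replan` that only re-queues a failing wave would instead loop until the attempt cap trips, which is the retry-not-replan failure mode the breaker exists to bound.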

The promotion ratchet

Do not run full ceremony for every observation. Promote progressively:

| Trigger | Goes to |
| --- | --- |
| Noticed once | Stays in the handoff. Dies when the handoff ages out. |
| Repeats twice across sessions or beads | .agents/learnings/<slug>.md |
| Changes future agent behavior | Update a SKILL.md or a template under docs/templates/ |
| Must never regress | Add a validation gate (warn-only first, then blocking) |
| Becomes core doctrine | Promote into PRODUCT.md / GOALS.md / docs/cdlc.md |

The ratchet is what keeps .agents/ from becoming a landfill. Compounding only happens when capture meets pruning.
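Read as a decision function, the ratchet maps an observation to exactly one destination. The sketch below assumes the highest matching rung wins, which the ratchet's trigger list implies but does not state; `promotion_target` and its parameters are hypothetical.

```python
def promotion_target(times_seen, changes_behavior=False,
                     must_never_regress=False, core_doctrine=False):
    """Return where an observation goes; the highest matching rung wins (assumed)."""
    if core_doctrine:
        return "PRODUCT.md / GOALS.md / docs/cdlc.md"
    if must_never_regress:
        return "validation gate (warn-only first, then blocking)"
    if changes_behavior:
        return "SKILL.md or a template under docs/templates/"
    if times_seen >= 2:
        return ".agents/learnings/<slug>.md"
    return "handoff (dies when the handoff ages out)"


print(promotion_target(1))
print(promotion_target(2))
print(promotion_target(3, must_never_regress=True))
```

The default branch is the pruning half of the ratchet: an observation that matches no trigger is never written anywhere durable.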

R3 self-enforcement (no learning without a constraint). The "Must never regress → add a validation gate" rung used to be prose only: a learning could be promoted to a durable maturity tier without ever compiling into a gate, test, or rule. scripts/check-ratchet-r3-constraint.sh enforces it against the live (gitignored) .agents/learnings/ corpus. It flags any durable-tier learning (candidate/established/canonical/stable/promoted) that cites no constraint, where a constraint is a scripts/ or .github/workflows/ gate, a _test.go or tests/ reference, a skills/**/SKILL.md step, or a constraint: / enforced_by: frontmatter field. The check is warn-only by default; --strict (or RATCHET_R3_BLOCKING=true) makes it blocking, mirroring the ratchet's warn-only-then-blocking ladder. A CI path-filter gate is intentionally not used because the learnings corpus is gitignored (such a gate would be dead by design, like the retired learning-coherence job); the script's own correctness is gated by tests/scripts/check-ratchet-r3-constraint.bats.
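The R3 rule itself is small enough to sketch. This is an illustration of the rule as described in the paragraph above, not a port of scripts/check-ratchet-r3-constraint.sh: the regular expressions, the `r3_flagged` and `exit_code` names, the "draft" tier used in the usage lines, and the exit-code convention are all assumptions.

```python
import re

DURABLE_TIERS = {"candidate", "established", "canonical", "stable", "promoted"}

# Assumed citation patterns; the real script's matching rules may differ.
CONSTRAINT_PATTERNS = [
    r"scripts/\S+",
    r"\.github/workflows/\S+",
    r"\S+_test\.go",
    r"tests/\S+",
    r"skills/\S+/SKILL\.md",
    r"^(constraint|enforced_by):\s*\S+",
]


def r3_flagged(tier: str, text: str) -> bool:
    """True when a durable-tier learning cites no constraint."""
    if tier not in DURABLE_TIERS:
        return False
    return not any(re.search(p, text, re.MULTILINE) for p in CONSTRAINT_PATTERNS)


def exit_code(flagged_count: int, strict: bool = False) -> int:
    """Warn-only by default; strict mode fails when anything is flagged (assumed codes)."""
    return 1 if (strict and flagged_count) else 0


print(r3_flagged("established", "Always reserve files before writing."))
print(r3_flagged("established", "enforced_by: scripts/check-reservations.sh"))
print(r3_flagged("draft", "Always reserve files before writing."))
```

Separating the flagging rule from the exit code keeps the warn-then-block ladder a one-line policy switch.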

Skill → loop-move map

| Loop move | Primary skills | Produces |
| --- | --- | --- |
| Shape intent | discovery, product, plan | BDD intent issue with acceptance examples |
| Track as bead | beads-br | Bead with slice list + acceptance contract |
| Slice + wave plan | plan | Slice list + wave grouping + ownership map |
| Pre-flight check | pre-mortem, council | Verdict on plan + wave validity |
| TDD per slice | implement | First failing test → green → refactor |
| Wave execution | crank, swarm, autodev | Parallel slices with explicit ownership |
| Slice validation | validate, council, pre-land-refuters | Per-slice acceptance proof |
| Bead acceptance | validate, council | Roll-up acceptance verdict |
| Capture | post-mortem, forge | Evidence + promoted learnings |
| Compound | flywheel, compile, operationalize | Learnings → patterns → rules → gates |

How the loop composes with the architectural seams

The loop is operational discipline. The architectural seams are structural. They are orthogonal and they compose:

  • Bounded contexts (Component Map, generated context map) — every slice declares which bounded context it touches. A slice that crosses contexts is two slices.
  • Ports (cli/internal/ports/) — the first failing test for a slice that touches a port can be written against the port interface before any adapter exists.
  • Adapters (cli/internal/adapters/) — adapter changes are slices like any other. The first failing test calls the adapter through the port; the port stays stable.
  • Domain purity (ADR-0001) — slices that change cli/internal/domain/ must keep the no-import-from-internal/* invariant. The wave check treats domain-purity as a shared concern: at most one slice per wave touches domain types.
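The "test against the port before any adapter exists" idea from the Ports bullet can be sketched as a contract function that accepts any implementation. The AgentOps ports live in Go under cli/internal/ports/; this Python version is only an analogy, and `EvidenceStore`, its methods, and `InMemoryEvidenceStore` are hypothetical.

```python
from typing import Protocol


class EvidenceStore(Protocol):
    """Hypothetical port: where a slice records and reads back evidence."""

    def record(self, bead_id: str, note: str) -> None: ...

    def notes(self, bead_id: str) -> list: ...


def check_evidence_store_contract(store: EvidenceStore) -> None:
    """Contract written against the port alone; it can exist before any adapter does."""
    store.record("bead-1", "test_login passes")
    assert store.notes("bead-1") == ["test_login passes"]
    assert store.notes("bead-2") == []


class InMemoryEvidenceStore:
    """Smallest adapter that turns the contract green."""

    def __init__(self) -> None:
        self._notes = {}

    def record(self, bead_id: str, note: str) -> None:
        self._notes.setdefault(bead_id, []).append(note)

    def notes(self, bead_id: str) -> list:
        return list(self._notes.get(bead_id, []))


check_evidence_store_contract(InMemoryEvidenceStore())
print("contract holds for the in-memory adapter")
```

Because the contract depends only on the port, a later adapter (file-backed, remote) reuses the same check unchanged, which is what lets the port stay stable while adapters change.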

Failure modes the loop prevents

| Failure mode | Loop move that prevents it |
| --- | --- |
| Agent writes code with no contract | Move 4: first failing test before implementation |
| Two agents stomp on the same file in parallel | Move 5: wave-validity write-scope check |
| Bead closes with "looks good" instead of evidence | Move 6: every Given/When/Then maps to a passing test |
| .agents/ accumulates one-off observations forever | Move 7 + promotion ratchet: most observations die at handoff |
| A "refactor + feature" PR mixes contracts | Move 3: refactor and feature are two slices |
| Layer-by-layer waterfall reappears under "phases" | Move 3 + move 1: slices are vertical and BDD-shaped |

What this doctrine deliberately does NOT do

  • Does not introduce a new skills/cdlc/ skill — the spine is doc-shaped, referenced by every process skill.
  • Does not introduce new practice slugs — the loop is a composition of bdd-gherkin + tdd + ddd-bounded-context + hexagonal-architecture + agile-manifesto + pragmatic-programmer + continuous-delivery.
  • Does not couple AgentOps to any consumer's domain vocabulary — bounded contexts are named by the consuming repo.
  • Does not require new tooling — br and existing validation gates carry the load.
  • Does not enforce parallelism — parallel waves are an optimization unlocked by the conflict-free check, not a default.

See also