Close PMF blockers: prove lift, shrink time-to-aha, narrow ICP #556

@boshu2

Problem

AgentOps (328 stars) has shipped the four-layer architecture across 73 skills, a Go CLI, and 4 runtime adapters. But the project's own workbench A/B shows Δ=+0.0000: no measurable lift at v1 difficulty. The README is 416 lines, the quickstart has 9 conditional branches, and the activation moment requires multiple sessions. The vendor-eclipse thesis is honest but unfocused across three personas.

Seven specific blockers stand between the current state and PMF.


PMF Blockers (ranked by leverage)

PMF-1: No measurable lift to point at (P0)

PRODUCT.md reports workbench A/B at Δ=+0.0000 across 12 cases. Both legs scored 12/12. A prospective user reads: the maintainer's own benchmark cannot distinguish AgentOps-on from AgentOps-off.

Fix: Ship workbench v2 with 8-12 realistic tasks where the hook layer differentiates (multi-file refactors needing prior learnings, security reviews benefiting from prevention rules, tasks where pre-mortem catches known pitfalls). Run skill-on vs skill-off A/B. Publish Δ ≥ +0.15 in README above the fold.

Files: evals/workbench/tasks/, evals/workbench/suite-workbench-behavioral-v2.json, PRODUCT.md, README.md
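For concreteness, here is a minimal sketch of how the Δ could be computed, assuming Δ means the pass-rate difference between the skill-on and skill-off legs. The `Leg` type and its fields are hypothetical and are not the workbench's actual schema.

```go
package main

import "fmt"

// Leg holds pass/fail totals for one arm of the A/B run.
// Hypothetical type: the real workbench result format may differ.
type Leg struct {
	Name   string
	Passed int
	Total  int
}

// PassRate returns the fraction of cases passed, or 0 for an empty leg.
func (l Leg) PassRate() float64 {
	if l.Total == 0 {
		return 0
	}
	return float64(l.Passed) / float64(l.Total)
}

// Delta is the lift of the skill-on leg over the skill-off leg
// (assumed definition: difference in pass rates).
func Delta(on, off Leg) float64 {
	return on.PassRate() - off.PassRate()
}

func main() {
	// The v1 result reported in PRODUCT.md: both legs scored 12/12.
	on := Leg{Name: "skill-on", Passed: 12, Total: 12}
	off := Leg{Name: "skill-off", Passed: 12, Total: 12}
	fmt.Printf("Δ=%+.4f\n", Delta(on, off)) // prints Δ=+0.0000
}
```

Under this definition, a 10-task v2 suite would need the skill-on leg to pass at least two more tasks than the skill-off leg (for example 9/10 vs 7/10) to clear the +0.15 bar.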

PMF-2: No 60-second aha moment (P0)

GOALS.md North Star: "install to first validated flow in under 5 minutes." No gate measures it. Reality: 7 install permutations, 9 quickstart branches, and the value prop ("repo remembers what worked") only pays off by session 3.

Fix: Build a 90-second ao demo that clones a scratch repo with seeded .agents/ history, runs a task that triggers a verdict the user wouldn't have produced themselves, and shows "this is what AgentOps added" in the terminal. Zero setup beyond ao + git.

Files: cli/cmd/ao/demo.go, examples/demo/, README.md

PMF-3: ICP messaging split across three personas (P1)

README pitches solo dev, orchestrator, and quality-first maintainer equally. The sovereignty thesis (cross-runtime corpus, local-first, operator-owned scheduling) is the only wedge Anthropic cannot absorb — but it's buried.

Fix: Lead README with sovereignty value prop for teams needing constrained-network discipline, audit trails, and cross-runtime portability. Move "what if Anthropic ships native X" from defensive to assertive. Secondary personas acknowledged but not equal-weighted.

Files: README.md, PRODUCT.md, docs/index.md

PMF-4: Quickstart is a decision tree, not a fast path (P1)

9 conditional branches in skills/quickstart/SKILL.md before showing "next action." State detection belongs in ao doctor, not the first thing a user runs.

Fix: Simplify /quickstart to 1 happy path + 1 fallback. Move elaborate state detection into ao doctor.

Files: skills/quickstart/SKILL.md, cli/cmd/ao/doctor.go

PMF-5: 73 skills with no discoverability hierarchy (P1)

A new user doesn't know if they need /rpi vs /crank vs /swarm vs /evolve vs /dream. The skill router's existence is itself the symptom.

Fix: Three tiers: Start Here (5-7 skills), Power (10-15), Specialist (50+). Surface tiers in README, docs/SKILLS.md, and quickstart output.

Files: skills/SKILL-TIERS.md, docs/SKILLS.md, README.md
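The tier sizes above could be enforced with a small check so the catalog does not drift back toward a flat list. This is a sketch under assumptions: the `Tier` type and `CheckTiers` function are hypothetical, and only the bounds come from the proposal.

```go
package main

import "fmt"

// Tier holds the proposed size bounds for one tier.
// Max == 0 means "no upper bound" (the Specialist tier is "50+").
type Tier struct {
	Name     string
	Min, Max int
}

// tiers encodes the three tiers proposed in this issue.
var tiers = []Tier{
	{"Start Here", 5, 7},
	{"Power", 10, 15},
	{"Specialist", 50, 0},
}

// CheckTiers returns one message per tier whose skill count is out of bounds.
func CheckTiers(counts map[string]int) []string {
	var problems []string
	for _, t := range tiers {
		n := counts[t.Name]
		if n < t.Min || (t.Max > 0 && n > t.Max) {
			problems = append(problems, fmt.Sprintf("%s: %d skills is outside the proposed bounds", t.Name, n))
		}
	}
	return problems
}

func main() {
	// One split of the 73 skills that satisfies all three bounds.
	counts := map[string]int{"Start Here": 6, "Power": 14, "Specialist": 53}
	fmt.Println("problems:", len(CheckTiers(counts)))
}
```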

PMF-6: Internal vocabulary crowds user-facing surfaces (P2)

CDLC, RPI, Brownian Ratchet, σρ > δ, Meadows leverage points, Knowledge OS → Olympus → Mt. Olympus — all on the front page. Signals deep thinking to insiders; unintelligible to everyone else.

Fix: Remove jargon from README and mkdocs landing. Keep in dedicated docs. Progressive disclosure, not front-page taxonomy.

Files: README.md, docs/index.md
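To keep the front page at zero jargon terms after the cleanup, a simple lint could scan the landing text for the terms listed above. The sketch below uses naive, case-sensitive substring matching; `FindJargon` and the exact term list are illustrative assumptions, not existing project tooling.

```go
package main

import (
	"fmt"
	"strings"
)

// jargon lists internal terms this issue names as front-page noise.
// "Olympus" also covers "Mt. Olympus" via substring match.
var jargon = []string{"CDLC", "RPI", "Brownian Ratchet", "σρ > δ", "Meadows", "Knowledge OS", "Olympus"}

// FindJargon returns every listed term that appears in text.
// Matching is case-sensitive, so a skill name like "/rpi" is not flagged.
func FindJargon(text string) []string {
	var hits []string
	for _, term := range jargon {
		if strings.Contains(text, term) {
			hits = append(hits, term)
		}
	}
	return hits
}

func main() {
	landing := "The RPI loop acts as a Brownian Ratchet for your repo."
	fmt.Println(FindJargon(landing)) // prints [RPI Brownian Ratchet]
}
```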

PMF-7: Distribution is power-user channels only (P2)

Install via brew tap, raw GitHub scripts, Claude plugin marketplace. No demo video, no VS Code listing, no "try without installing" surface.

Fix: Add demo GIF/asciicast to README. Check in example verdict output. Consider VS Code marketplace listing.

Files: README.md, examples/demo-output/


Execution Waves

Wave 1 (parallel, no dependencies)

  • PMF-1 — Workbench v2 (generates the Δ number)
  • PMF-2 — ao demo (generates the activation path + demo output)
  • PMF-5 — Skill tiers (independent catalog restructuring)

Wave 2 (depends on Wave 1)

  • PMF-3 — ICP messaging (needs Δ from PMF-1)
  • PMF-4 — Quickstart simplification (demo replaces quickstart as first action)
  • PMF-7 — Distribution surface (records demo output from PMF-2)

Wave 3 (depends on Wave 2)

  • PMF-6 — Vocabulary cleanup (messaging rewrite sets the tone first)
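The wave assignment follows from the dependencies, which can be sketched as a small graph. The edges below are one reading of the parentheticals in the lists above (in particular, PMF-6 waiting on PMF-3 is inferred from "messaging rewrite sets the tone first"); the `deps` map and `Wave` function are illustrative.

```go
package main

import "fmt"

// deps encodes the dependencies as read from the wave lists (an assumption).
var deps = map[string][]string{
	"PMF-1": nil,
	"PMF-2": nil,
	"PMF-5": nil,
	"PMF-3": {"PMF-1"},
	"PMF-4": {"PMF-2"},
	"PMF-7": {"PMF-2"},
	"PMF-6": {"PMF-3"},
}

// Wave returns 1 for an item with no dependencies, otherwise one more than
// the latest wave among its dependencies. Assumes the graph has no cycles.
func Wave(item string) int {
	w := 1
	for _, d := range deps[item] {
		if dw := Wave(d) + 1; dw > w {
			w = dw
		}
	}
	return w
}

func main() {
	for _, item := range []string{"PMF-1", "PMF-2", "PMF-3", "PMF-4", "PMF-5", "PMF-6", "PMF-7"} {
		fmt.Printf("%s: wave %d\n", item, Wave(item))
	}
}
```

With these edges, PMF-1, PMF-2, and PMF-5 land in wave 1; PMF-3, PMF-4, and PMF-7 in wave 2; and PMF-6 in wave 3, matching the lists above.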

Success Criteria

| Metric | Current | Target |
| --- | --- | --- |
| Workbench A/B Δ | +0.0000 | ≥ +0.15 on v2 tasks |
| README above-the-fold | 416 lines | ≤ 150 lines |
| Quickstart branches | 9 | ≤ 3 |
| Time install → first verdict | ~15-20 min (est.) | < 5 min measured |
| Internal jargon on front page | 5+ terms | 0 |
| Skill tiers | flat list | 3 tiers |

The Compressed Read

PMF is gated on proving lift, shrinking time-to-aha, and narrowing the ICP, in that order. The experiment most worth running first: ship workbench v2, publish a credible Δ, and replace the 9-branch quickstart with a single 90-second ao demo that ends with a screenshot-able verdict.

Labels: enhancement
