taniwhaai/skills

Taniwha Skills

A discipline for AI-generated codebases. Imposes structure that holds as projects grow, with verifier-checked contracts and a re-raise loop that surfaces under-specification before code is written.

This is the skills suite — the prompt-and-subagent layer. For the runtime backbone that makes the skills cheaper to operate (state management, atomic writes, schema validation), see Kupu (in development).

What this is

LLMs given an open-ended brief tend to produce sprawling, inconsistent, hard-to-review code that drifts further from the brief as it grows. Taniwha is a set of skills and subagents that constrain how the agent works: it must agree on a structural shape before writing any code, contracts must be complete enough to implement in isolation, every implementation gets independently verified, and ambiguity gets surfaced as questions rather than silently filled in.

The cost is more ceremony than a human would bother with when writing by hand at small scale. The payoff is structure that holds as the codebase grows, plus an audit trail of decisions, contracts, and verifications that survives the conversations that produced them.

Taniwha matches structure to the brief. A small project gets a single contract, a single implementation, a single verifier. A large project gets a tree of modules with composition agents wiring them together. The design-doc agent decides which tier the brief calls for, and the user approves; the rest of the system follows.

The bigger picture

Taniwha Skills is one piece of a wider toolset for AI-centric codebases. The thesis: AI design should be a primitive of the codebase, not an afterthought. Repositories should carry their own structural discipline, decision history, and contract definitions in a form that survives any individual conversation, agent, or developer. Source code is one artefact among several; design docs, contracts, vocabulary, and decision records are first-class citizens of the repo, not chat ephemera.

The Taniwha suite currently spans:

  • Taniwha Skills (this repo) — the discipline and prompts that constrain AI build behaviour
  • Kupu (in development) — the runtime state backbone for AI-centric repos; manages contracts, decisions, events, re-raises in .taniwha/kupu/
  • Arai — local guardrails that enforce coding rules during AI-driven development

Eventually all of these feed into Kete, Taniwha's broader platform for managing AI work across teams. None of that is required to use the skills today; this repo stands alone.

Install

As a Claude Code plugin

/plugin install taniwha

Manual install

Drop the .claude/ directory at the root of your project, or:

git clone <repo-url> taniwha
mkdir -p .claude
cp -r taniwha/skills .claude/skills
cp -r taniwha/agents .claude/agents
cp taniwha/CLAUDE.md ./CLAUDE.md

Then in Claude Code, ask it to start a Taniwha build:

Build me a [whatever you want] as a Taniwha project.

The dispatcher takes over from there.

Optional: install Kupu

The skills run on bash utility scripts and direct file writes by default. For a leaner agent context and atomic state operations, install Kupu, the MCP server that backs Taniwha state. The skills detect Kupu's tools at runtime, one operation at a time: whichever tools are available are used, and the rest fall through to the bash path.

Both modes produce the same .taniwha/kupu/ layout, so you can install Kupu later (or upgrade across Kupu phases) without re-running existing builds.

Kupu ships in phases. Each phase adds tools without breaking previously-shipped surfaces. Skills work with any Kupu phase installed, falling back to bash for tools not yet shipped:

  • Phase 1 — primitives (new_id, now) and project lifecycle (init, get_project)
  • Phase 2 — durable writes (record_event, record_decision, register_re_raise, resolve_re_raise)
  • Phase 3+ — read tools, artefact CRUD, tree operations, toolchain detection, build metrics (planned)

A Phase 1-only Kupu installation produces a build trace that uses Kupu for IDs and timestamps but bash for state writes. A Phase 2 installation uses Kupu for both. The skills don't need configuration to handle this — detection is per-operation.
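
The per-operation fallback described above can be sketched roughly as follows. This is a minimal illustration, not the skills' actual detection logic: the tool names come from the phase list, while `KUPU_PHASES`, `tools_for_phase`, `choose_backend`, and `build_trace` are hypothetical names introduced here.

```python
from typing import Dict, Iterable, Set

# Tool names are taken from the phase list above.
# Grouping them in a dict is an assumption made for this sketch.
KUPU_PHASES: Dict[int, Set[str]] = {
    1: {"new_id", "now", "init", "get_project"},
    2: {"record_event", "record_decision", "register_re_raise", "resolve_re_raise"},
}


def tools_for_phase(phase: int) -> Set[str]:
    """Return every tool shipped up to and including `phase` (0 means no Kupu)."""
    available: Set[str] = set()
    for shipped_phase, tools in KUPU_PHASES.items():
        if shipped_phase <= phase:
            available |= tools
    return available


def choose_backend(operation: str, available_tools: Set[str]) -> str:
    """Decide per operation: Kupu when the tool exists, otherwise the bash path."""
    return "kupu" if operation in available_tools else "bash"


def build_trace(operations: Iterable[str], available_tools: Set[str]) -> Dict[str, str]:
    """Map each operation in a build to the backend that would handle it."""
    return {op: choose_backend(op, available_tools) for op in operations}
```

Under these assumptions, `build_trace(["new_id", "now", "record_event"], tools_for_phase(1))` routes the first two operations to Kupu and the state write to bash, which matches the Phase 1-only trace described above.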

For the full Kupu tool surface and roadmap, see the kupu repository's docs/kupu-tool-surface.md.

How it works

A Taniwha build is a loop with three moving parts:

  1. Dispatcher (the main Claude Code session) holds the Task tool and acts as a mechanical executor.
  2. Orchestrator subagents are spawned one at a time, read project state from disk, decide the single next action, write that decision to disk, and exit. They have no memory between invocations.
  3. Role subagents (design-doc, contract-derivation, leaf-implementation, composition, verifier) carry out the orchestrator's decisions. Each role sees a strictly whitelisted set of inputs to enforce compartmentalisation.

The project's memory lives in a .taniwha/ directory at the project root: design docs, contracts, vocabulary, implementation manifests, decision records, event log, re-raise log. Source code lives at the repo root in whatever layout the project context names; .taniwha/ only holds agent state.
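
To make the loop concrete, here is a small sketch of a stateless orchestrator driven by a dispatcher, with all memory held in a file on disk. It is an illustration only: the file name `state.json`, its fields, the fixed `PIPELINE` order, and the function names are assumptions, not the real `.taniwha/` layout or the real agents.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical state file and role order, chosen for this sketch only.
STATE_FILE = "state.json"
PIPELINE = ["design-doc", "contract-derivation", "leaf-implementation", "verifier"]


def orchestrate(state_dir: Path) -> str:
    """One orchestrator invocation: read state, decide one action, write it, exit."""
    state_path = state_dir / STATE_FILE
    state = json.loads(state_path.read_text()) if state_path.exists() else {"done": []}
    remaining = [role for role in PIPELINE if role not in state["done"]]
    decision = remaining[0] if remaining else "complete"
    state["next"] = decision  # the decision is persisted, not remembered
    state_path.write_text(json.dumps(state))
    return decision


def run_role(state_dir: Path, role: str) -> None:
    """Stand-in for a role subagent: records on disk that its step finished."""
    state_path = state_dir / STATE_FILE
    state = json.loads(state_path.read_text())
    state["done"].append(role)
    state_path.write_text(json.dumps(state))


def dispatch(state_dir: Path) -> list:
    """Dispatcher loop: spawn the orchestrator, execute its decision, repeat."""
    executed = []
    while True:
        decision = orchestrate(state_dir)
        if decision == "complete":
            return executed
        run_role(state_dir, decision)
        executed.append(decision)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(dispatch(Path(tmp)))
```

Because `orchestrate` keeps nothing between calls, the loop can be interrupted after any step and resumed later from the same directory, which is the property the on-disk state is meant to provide.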

A single user-facing run looks like this:

  1. User gives a brief.
  2. Dispatcher captures the brief, spawns the orchestrator.
  3. Orchestrator surfaces structured questions to capture project context (language, repo layout, test framework).
  4. Toolchain detection runs.
  5. Design-doc subagent produces a design including the structural tier (single_module / small_multi_module / full_decomposition). Any under-specified parts of the brief surface as open questions.
  6. User answers open questions. User approves the design and tier.
  7. Contract-derivation subagent produces per-module manifests and shared vocabulary.
  8. Leaf-implementation subagent(s) build modules from contracts.
  9. Verifier subagent(s) independently check each implementation against its acceptance criteria.
  10. Composition subagent(s) wire modules together (multi-module builds only).
  11. Build complete; user reviews summary.

Every step is a fresh subagent reading filesystem state. Decisions, events, and re-raises are durable and human-readable.
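
The way the tier shapes the run can be sketched as a small lookup. The tier names come from step 5 and the composition rule from step 10; the function `planned_roles` and the flat role list are hypothetical simplifications, since a real build may run several leaf and verifier subagents.

```python
# Tier names as listed in step 5 above.
TIERS = ("single_module", "small_multi_module", "full_decomposition")


def planned_roles(tier: str) -> list:
    """Return the role subagents a build of this tier would involve, in order."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    roles = ["design-doc", "contract-derivation", "leaf-implementation", "verifier"]
    # Composition only applies when there is more than one module to wire together.
    if tier != "single_module":
        roles.append("composition")
    return roles
```

In this sketch a `single_module` build never schedules composition, while both multi-module tiers end with it.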

Skills

| Skill | Purpose |
| --- | --- |
| design-doc | Produces a structural design from a brief, including the project's tier. Audits the brief for under-specification; surfaces silent decisions. |
| contract-derivation | Derives per-module manifests and shared vocabulary from an approved design. Manifests are language-neutral. |
| leaf-implementation | Implements one module from one manifest, working only from the manifest + vocabulary + project context. |
| composition | Wires two completed children under a parent contract. Produces canonical shared-types packages. |
| verifier | Independently verifies an implementation against its contract's acceptance criteria. Writes its own tests. |
| orchestrator | Ephemeral; decides the single next action by reading state. |
| dispatcher | Main-session loop; spawns subagents per orchestrator decisions. |
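
The input restriction on leaf-implementation (manifest, vocabulary, and project context only) can be pictured as a filter applied before a subagent starts. This is a sketch under assumptions: the `ROLE_WHITELIST` table and `visible_inputs` are hypothetical, and the verifier entry in particular is a guess made for illustration, not something this README specifies.

```python
from typing import Any, Dict, FrozenSet

ROLE_WHITELIST: Dict[str, FrozenSet[str]] = {
    # From the table above: manifest + vocabulary + project context.
    "leaf-implementation": frozenset({"manifest", "vocabulary", "project_context"}),
    # Assumed for illustration; the README does not list the verifier's inputs.
    "verifier": frozenset({"contract", "implementation"}),
}


def visible_inputs(role: str, available: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the inputs this role is allowed to see.

    Unknown roles raise KeyError, so nothing is exposed by default.
    """
    allowed = ROLE_WHITELIST[role]
    return {name: value for name, value in available.items() if name in allowed}
```

Anything not on the list, such as the design doc or chat history, is dropped before the role sees it, so a leaf implementation cannot quietly depend on context its contract does not carry.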

Subagents

Defined in agents/; installed alongside the skills.

| Agent | Skill |
| --- | --- |
| taniwha-orchestrator | orchestrator |
| taniwha-design-doc | design-doc |
| taniwha-contract-derivation | contract-derivation |
| taniwha-leaf-implementation | leaf-implementation |
| taniwha-composition | composition |
| taniwha-verifier | verifier |

Design principles

These are the load-bearing rules across the system:

  • Compartmentalisation. Every role sees only a whitelisted set of inputs. Cross-role context leakage destroys the discipline.
  • Re-raise over guess. Any under-specified clause is surfaced as a question, not silently filled in.
  • Tier match. The structural shape matches the brief. A URL shortener gets one module; a multi-subsystem service gets a tree.
  • Verification is mandatory. Implementor self-reports do not count. A verifier reads the contract independently and checks per-AC.
  • State on disk, not in context. Every decision survives in .taniwha/. Subagents read state, decide one thing, exit.
  • Cold-readable. A returning agent six months later, with no chat history, can pick up where it left off by reading the directory.
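
As one example of these rules in practice, the "verification is mandatory" principle can be sketched as a verdict function that looks only at the verifier's per-AC results. The function name `verdict`, the AC identifiers, and the result shape are hypothetical; the real verifier output format is not described here.

```python
from typing import Dict, Iterable, Optional


def verdict(
    acceptance_criteria: Iterable[str],
    verifier_results: Dict[str, bool],
    implementor_claims: Optional[Dict[str, bool]] = None,
) -> dict:
    """Pass only if the verifier independently confirmed every acceptance criterion.

    `implementor_claims` is accepted but deliberately ignored: self-reports
    do not count towards the verdict.
    """
    failing = [ac for ac in acceptance_criteria if verifier_results.get(ac) is not True]
    return {"passed": not failing, "unverified_or_failing": failing}
```

A criterion the verifier never checked is treated the same as one that failed, so an implementor claiming success on `AC-2` changes nothing until the verifier has a passing result for it.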

Use with general-purpose skills

Taniwha focuses on project architecture. For general-purpose engineering and productivity skills (TDD, debugging, grilling sessions, codebase improvement), mattpocock/skills is excellent and pairs well.

License

MIT — see LICENSE.
