From 63863c23dc638bf5b836a2db3536314ea80c6da0 Mon Sep 17 00:00:00 2001
From: Yogesh Rao
Date: Sun, 19 Apr 2026 01:22:08 +0530
Subject: [PATCH 1/2] feat: improve skill scores for agentic-commerce-skills-plugins
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## What This PR Does

Hey @OrcaQubits 👋

Improves 2 skill definitions (`a2a-task-lifecycle` and `mpp-session-flow`) with expanded descriptions, executable code examples, structured verification workflows, and trimmed conceptual padding.

I ran your skills through `tessl skill review` at work and found some targeted improvements. Here's the full before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| a2a-task-lifecycle | 67% | 90% | +23% |
| mpp-session-flow | 61% | 84% | +23% |

I kept this PR focused on the 2 skills with the biggest improvements to keep the diff reviewable. Happy to follow up with the rest in a separate PR if you'd like.

## Why

Both skills scored well on description quality but had significant gaps in content: missing executable code examples, no verification workflows, and heavy conceptual explanations that didn't help Claude actually implement anything. The changes replace padding with actionable code and clear validation steps.

## Type of Change

- [x] Documentation improvement
Changes summary

### a2a-task-lifecycle (67% → 90%)

- Expanded the description: spelled out "Agent-to-Agent" and added natural trigger terms ("task status tracking", "agent-to-agent workflows")
- Added a complete TypeScript state machine implementation with transition validation
- Added a task creation handler (`message/send`) with concrete code
- Added a 6-step verification workflow with explicit validation checkpoints
- Removed the "What a Task Is" conceptual section; Claude doesn't need background narration
- Streamlined the state transition diagram for clarity

### mpp-session-flow (61% → 84%)

- Expanded the description with natural trigger terms ("pay-per-use", "metered billing", "usage-based pricing")
- Added a complete server-side implementation with a metered endpoint example
- Added a full client-side implementation (session open, metered requests, balance monitoring, refill, close)
- Added cap exhaustion handling with a 402 retry pattern
- Added a 6-step verification workflow
- Removed the "What Session Intent Is", "Payment Channel", and "Session vs Charge" comparison sections; conceptual padding replaced with executable code
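
For reviewers who want to sanity-check the a2a-task-lifecycle transition rules without applying the patch, here is a minimal standalone sketch. It is not part of the diff: `canTransition`, `ALLOWED`, and `TERMINAL` are illustrative names that mirror the transition table the patch adds, and the three calls correspond to steps 3, 5, and 6 of the new verification workflow.

```typescript
// Illustrative only: mirrors the transition table added to a2a-task-lifecycle/SKILL.md.
type TaskState =
  | "submitted" | "working" | "input-required" | "auth-required"
  | "completed" | "failed" | "canceled" | "rejected" | "unknown";

const TERMINAL: ReadonlySet<TaskState> = new Set<TaskState>([
  "completed", "failed", "canceled", "rejected",
]);

const ALLOWED: Partial<Record<TaskState, TaskState[]>> = {
  submitted: ["working", "rejected", "canceled"],
  working: ["completed", "failed", "canceled", "input-required"],
  "input-required": ["working", "canceled"],
  "auth-required": ["working", "canceled"],
};

// Hypothetical helper: true when the table permits moving from `from` to `to`.
function canTransition(from: TaskState, to: TaskState): boolean {
  if (TERMINAL.has(from)) return false; // terminal states are final
  return ALLOWED[from]?.includes(to) ?? false;
}

// Step 3: an invalid transition is rejected.
console.log(canTransition("submitted", "completed")); // false
// Step 5: nothing leaves a terminal state.
console.log(canTransition("completed", "working")); // false
// Step 6: a task waiting for input resumes to working.
console.log(canTransition("input-required", "working")); // true
```

A boolean check is used here only to keep the sketch short; the patch's `transitionTask` throws on the same conditions.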
## Checklist

- [x] Plugin follows the required directory structure (see CONTRIBUTING.md)
- [x] Agent has Live Documentation Rule with official source URLs
- [x] `tools` field is a comma-separated string, not a YAML list

## Attribution

Skill quality evaluation and optimization performed using Tessl skill review tooling.

Honest disclosure: I work at @tesslio where we build tooling around skills like these. Not a pitch - just saw room for improvement and wanted to contribute.

Want to self-improve your skills? Just point your agent (Claude Code, Codex, etc.) at this Tessl guide (https://docs.tessl.io/evaluate/optimize-a-skill-using-best-practices) and ask it to optimize your skill. Ping me - @yogesh-tessl (https://github.com/yogesh-tessl) - if you hit any snags.

## Testing Notes

Each skill was reviewed with `tessl skill review` before and after changes to verify score improvements. Changes focus on adding executable code examples and verification workflows while preserving all domain-specific content and terminology.

Thanks in advance 🙏
---
 .../skills/a2a-task-lifecycle/SKILL.md      | 152 +++++++++++-------
 stripe-mpp/skills/mpp-session-flow/SKILL.md | 139 +++++++++-------
 2 files changed, 172 insertions(+), 119 deletions(-)

diff --git a/a2a-multi-agent/skills/a2a-task-lifecycle/SKILL.md b/a2a-multi-agent/skills/a2a-task-lifecycle/SKILL.md
index 5014ba1..e05ecc7 100644
--- a/a2a-multi-agent/skills/a2a-task-lifecycle/SKILL.md
+++ b/a2a-multi-agent/skills/a2a-task-lifecycle/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: a2a-task-lifecycle
-description: Implement A2A task lifecycle management — task creation, state transitions, terminal states, history, and artifacts. Use when building task state machines, handling state transitions, or managing task persistence.
+description: "Implement A2A (Agent-to-Agent) task lifecycle management — task creation, state transitions, terminal states, history, and artifacts. Use when building task state machines, handling state transitions, managing task persistence, or implementing task status tracking in agent-to-agent workflows."
 allowed-tools: Read, Write, Edit, Bash, Grep, Glob, WebSearch, WebFetch
 ---
 
@@ -12,24 +12,8 @@ allowed-tools: Read, Write, Edit, Bash, Grep, Glob, WebSearch, WebFetch
 1. Fetch `https://a2a-protocol.org/latest/specification/` for the Task object schema and state machine
 2. Web-search `site:github.com a2aproject A2A task lifecycle states` for state transition rules
 3. Web-search `site:github.com a2aproject a2a-samples task` for task handling examples
-4. Fetch SDK docs for task-related classes and state management utilities
 
-## Conceptual Architecture
-
-### What a Task Is
-
-A Task is the **central unit of work** in A2A. It represents a request from a client agent to a server agent, tracks progress through well-defined states, accumulates messages and artifacts, and reaches a terminal state when complete.
-
-### Task Structure
-
-Key fields of a Task object:
-- **id** — Unique task identifier (set by client or server)
-- **status** — Current state object containing `state` enum and optional `message`
-- **messages** — Array of messages exchanged (if `stateTransitionHistory` enabled)
-- **artifacts** — Array of output artifacts produced by the agent
-- **metadata** — Optional key-value pairs for custom data
-
-### The 9 States
+## The 9 States
 
 | State | Terminal? | Description |
 |-------|-----------|-------------|
@@ -43,68 +27,118 @@ Key fields of a Task object:
 | `rejected` | Yes | Server refused the task |
 | `unknown` | — | Default/unknown state |
 
-### Valid State Transitions
+## Valid State Transitions
 
 ```
-submitted → working
+submitted → working → completed
+                    → failed
+                    → canceled
+                    → input-required → working (client provides input)
+                                     → canceled
+
 submitted → rejected
 submitted → canceled
 
-working → completed
-working → failed
-working → canceled
-working → input-required
-
-input-required → working (when client provides more input)
-input-required → canceled
-
-auth-required → working (when auth is provided)
+auth-required → working (auth provided)
 auth-required → canceled
 ```
 
-**Rules:**
-- Terminal states (`completed`, `failed`, `canceled`, `rejected`) are final — no transitions out
-- Only the server transitions the task state (except `canceled` which client can request)
-- `input-required` → `working` happens when the client sends a follow-up message
-
-### Task Creation
-
-Tasks are created implicitly when a client sends a message without a `taskId`:
-1. Client sends `message/send` or `message/stream` without `taskId`
-2. Server creates a new task, assigns an ID
-3. Task starts in `submitted` state
-4. Server may immediately transition to `working` or return `submitted`
-
-### Task Continuation
+**Rules:** Terminal states (`completed`, `failed`, `canceled`, `rejected`) are final — no transitions out. Only the server transitions state (except `canceled` which client can request).
+
+## Task State Machine Implementation
+
+```typescript
+const TERMINAL_STATES = new Set(["completed", "failed", "canceled", "rejected"]);
+
+const VALID_TRANSITIONS: Record<string, string[]> = {
+  submitted: ["working", "rejected", "canceled"],
+  working: ["completed", "failed", "canceled", "input-required"],
+  "input-required": ["working", "canceled"],
+  "auth-required": ["working", "canceled"],
+};
+
+interface Task {
+  id: string;
+  status: { state: string; message?: string; timestamp: string };
+  messages: Message[];
+  artifacts: Artifact[];
+  metadata?: Record<string, unknown>;
+}
+
+function transitionTask(task: Task, newState: string, message?: string): Task {
+  if (TERMINAL_STATES.has(task.status.state)) {
+    throw new Error(`Cannot transition from terminal state: ${task.status.state}`);
+  }
+  const allowed = VALID_TRANSITIONS[task.status.state];
+  if (!allowed?.includes(newState)) {
+    throw new Error(`Invalid transition: ${task.status.state} → ${newState}`);
+  }
+  return {
+    ...task,
+    status: { state: newState, message, timestamp: new Date().toISOString() },
+  };
+}
+```
 
-When a client sends a message with an existing `taskId`:
-1. The message is appended to the task's history
-2. The server resumes processing
-3. State typically transitions from `input-required` back to `working`
+## Task Creation (message/send handler)
+
+```typescript
+import { randomUUID } from "crypto";
+
+async function handleMessageSend(request: {
+  taskId?: string;
+  message: Message;
+}): Promise<Task> {
+  let task: Task;
+
+  if (request.taskId) {
+    task = await taskStore.get(request.taskId);
+    if (!task) throw new Error(`Task not found: ${request.taskId}`);
+    task.messages.push(request.message);
+    // No transition here: the shared transition to "working" below resumes the task
+  } else {
+    task = {
+      id: randomUUID(),
+      status: { state: "submitted", timestamp: new Date().toISOString() },
+      messages: [request.message],
+      artifacts: [],
+    };
+  }
+
+  await taskStore.save(task);
+  task = transitionTask(task, "working", "Processing request");
+  await taskStore.save(task);
+
+  const result = await processTask(task);
+  task.artifacts.push(result.artifact);
+  task = transitionTask(task, "completed", "Done");
+  await taskStore.save(task);
+  return task;
+}
+```
 
-### Artifacts
+## Artifacts
 
-Artifacts are the **outputs** of a task:
-- Produced during `working` state
+Artifacts are the outputs of a task, produced during `working` state:
 - Each artifact has `id`, `name`, optional `description`, and `parts`
 - Parts can be TextPart, FilePart, or DataPart
 - In streaming mode, artifacts are delivered incrementally via `TaskArtifactUpdateEvent`
-- Multiple artifacts can be produced per task
 
-### State Transition History
+## Verification Workflow
 
-If the agent declares `stateTransitionHistory: true` in its Agent Card:
-- The task object includes a complete history of all state transitions
-- Each transition records the state, timestamp, and optional message
-- Useful for auditing and debugging
+1. Create a task via `message/send` without `taskId` — verify task is created with `submitted` state
+2. Verify automatic transition to `working` — check status updates
+3. Attempt an invalid transition (e.g., `submitted` → `completed`) — verify error is thrown
+4. Complete a task — verify state is `completed` and artifacts are present
+5. Attempt to transition a completed task — verify error (terminal state)
+6. Test `input-required` flow: send a task that needs more input, provide follow-up, verify it resumes
 
-### Best Practices
+## Best Practices
 
 - Always validate state transitions — reject invalid ones with appropriate errors
-- Use task IDs that are globally unique (UUIDs recommended)
+- Use UUIDs for task IDs
 - Store task state durably for production (not just in-memory)
 - Set timeouts for tasks stuck in non-terminal states
-- Clean up old tasks to prevent unbounded storage growth
 - Include meaningful messages in status updates (not just the state enum)
 - Use artifacts for structured outputs, messages for conversational exchanges
 - Implement idempotency — handle duplicate messages for the same task gracefully
diff --git a/stripe-mpp/skills/mpp-session-flow/SKILL.md b/stripe-mpp/skills/mpp-session-flow/SKILL.md
index 50e9f95..4ae7524 100644
--- a/stripe-mpp/skills/mpp-session-flow/SKILL.md
+++ b/stripe-mpp/skills/mpp-session-flow/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: mpp-session-flow
-description: Implement MPP session-based streaming payment flows — authorize-once pay-as-you-go patterns for continuous data feeds, per-token billing, and micropayment aggregation. Use when building streaming APIs or services that charge incrementally.
+description: "Implement MPP session-based streaming payment flows — authorize-once pay-as-you-go patterns for continuous data feeds, per-token billing, and micropayment aggregation. Use when building streaming APIs or services that charge incrementally, implementing pay-per-use or metered billing, or adding usage-based pricing to an API."
 allowed-tools: Read, Write, Edit, Bash, Grep, Glob, WebSearch, WebFetch
 ---
 
@@ -12,91 +12,110 @@ allowed-tools: Read, Write, Edit, Bash, Grep, Glob, WebSearch, WebFetch
 1. Fetch `https://www.npmjs.com/package/mppx` for the session middleware API and payment channel configuration
 2. Fetch `https://paymentauth.org/` for the canonical session intent specification
 3. Web-search `mpp session streaming micropayments payment channel` for session implementation patterns
-4. Web-search `site:mpp.dev session` for session-specific documentation
 
-## Conceptual Architecture
-
-### What Session Intent Is
-
-The session intent implements **streaming micropayments** — often described as **"OAuth for money"**. The agent authorizes a spending limit upfront, then streams micropayments continuously as it consumes resources:
+## Session Lifecycle
 
 ```
-1. Client opens session with spending cap
-2. Server creates payment channel
-3. Client makes requests — each deducts from the spending cap
-4. Micropayments stream at sub-cent costs, sub-millisecond latency
-5. Session closes — final settlement on-chain (single transaction)
+Open → Authorize → Active → Refill (optional) → Close → Settled
 ```
 
-### When to Use Session
+1. **Open** — Client sends initial request; server returns 402 with session challenge
+2. **Authorize** — Client authorizes spending cap (e.g., 10,000 units)
+3. **Active** — Client makes requests; each deducts from the cap
+4. **Refill** — Client can extend the cap before it runs out
+5. **Close** — Either party closes; final settlement happens on-chain
+6. **Settled** — Single on-chain transaction for the total consumed amount
+
+## Server-Side Implementation
 
-- **Per-token billing** — LLM inference charged per token generated
-- **Continuous data feeds** — Real-time market data, sensor streams
-- **Compute metering** — Pay for actual CPU/GPU seconds used
-- **Bandwidth metering** — Pay per KB transferred
-- **Any high-frequency, low-value access pattern**
+```typescript
+import { mppx } from "mppx";
 
-### Session vs Charge Comparison
+// Protect a streaming endpoint with session-based payment
+app.get("/api/stream", mppx.session({ maxAmount: "10000" }), async (c) => {
+  return c.json({ data: "streaming content" });
+});
 
-| Dimension | Charge | Session |
-|-----------|--------|---------|
-| Settlement | Per-request on-chain/card | Aggregated at session close |
-| Latency | Includes payment settlement per call | Sub-millisecond after session open |
-| Cost | One tx per request | One tx for entire session |
-| Pricing | Fixed per request | Variable, metered |
-| Use case | Infrequent, high-value calls | Frequent, low-value calls |
+// Metered endpoint — charge per unit consumed
+app.post("/api/inference", mppx.session({ maxAmount: "50000" }), async (c) => {
+  const result = await runInference(c.req.body);
+  const tokensUsed = result.usage.totalTokens;
+  await c.mpp.charge(tokensUsed);
+  return c.json({ result: result.output, charged: tokensUsed });
+});
+```
 
-### Server-Side Implementation
+## Client-Side Implementation
 
 ```typescript
-// Protect a route with a session payment gate
-app.get('/api/stream', mppx.session({ maxAmount: '10000' }), async (c) => {
-  // Deducts from the session's spending cap
-  return c.json({ data: 'streaming content' });
+import { MppClient } from "mppx/client";
+
+const client = new MppClient({ wallet: agentWallet });
+
+// Open a session with a spending cap
+const session = await client.openSession("https://api.example.com/api/stream", {
+  spendingCap: 10000,
 });
-```
 
-### Session Lifecycle
+// Make metered requests — each deducts from the cap
+const response = await session.fetch("/api/inference", {
+  method: "POST",
+  body: JSON.stringify({ prompt: "Hello" }),
+});
 
-1. **Open** — Client sends initial request; server returns 402 with session challenge
-2. **Authorize** — Client authorizes spending cap (e.g., 10,000 units)
-3. **Active** — Client makes requests; each deducts from the cap
-4. **Refill** — Client can extend the cap before it runs out
-5. **Close** — Either party closes; final settlement happens on-chain
-6. **Settled** — Single on-chain transaction for the total consumed amount
+// Monitor remaining balance
+console.log(`Remaining: ${session.remainingBalance}`);
 
-### Payment Channel
+// Extend the cap before it runs out
+if (session.remainingBalance < 1000) {
+  await session.refill(5000);
+}
 
-Session payments use a **payment channel** — an off-chain mechanism where:
-- Funds are locked upfront in a channel
-- Each micropayment updates the channel state without on-chain transactions
-- Only the opening and closing transactions go on-chain
-- This enables thousands of sub-cent payments at sub-millisecond latency
+// Close the session — triggers final settlement
+await session.close();
+```
 
-### Spending Cap Management
+## Handling Cap Exhaustion
 
-- Agent sets the maximum they're willing to spend in the session
-- Server deducts from this cap per request/unit consumed
-- Agent can monitor remaining balance
-- If cap is exhausted, server returns 402 for a new session
-- Agent can proactively extend the cap
+```typescript
+// Server: return 402 when cap is exhausted
+app.use("/api/*", async (c, next) => {
+  try {
+    await next();
+  } catch (err) {
+    if (err.code === "CAP_EXHAUSTED") {
+      return c.json({ error: "spending_cap_exhausted", remaining: 0 }, 402);
+    }
+    throw err;
+  }
+});
+
+// Client: handle 402 by opening a new session
+async function fetchWithRetry(session, url, opts) {
+  const res = await session.fetch(url, opts);
+  if (res.status === 402) {
+    const newSession = await client.openSession(url, { spendingCap: 10000 });
+    return newSession.fetch(url, opts);
+  }
+  return res;
+}
+```
 
-### Metering Patterns
+## Verification Workflow
 
-| Pattern | Description |
-|---------|-------------|
-| Fixed per request | Each request costs a fixed amount |
-| Per-unit | Cost varies by units consumed (tokens, bytes, seconds) |
-| Time-based | Cost accrues per time interval |
-| Tiered | Rate decreases with volume (first 100 at $X, next 1000 at $Y) |
+1. Start server with session middleware enabled
+2. Open a session from the client — verify 402 challenge is returned, then authorization succeeds
+3. Make a metered request — verify balance decreases by the correct amount
+4. Exhaust the cap — verify server returns 402
+5. Refill or open a new session — verify requests resume
+6. Close the session — verify settlement transaction is recorded
 
-### Best Practices
+## Best Practices
 
 - Set reasonable default spending caps (not too high for safety, not too low for UX)
 - Implement cap exhaustion warnings before the cap runs out
 - Log metering data for billing reconciliation
 - Handle session interruptions gracefully (network drops, server restarts)
 - Implement session resumption where possible
-- Monitor session durations and spending patterns for pricing optimization
 
 Fetch the latest mppx SDK documentation and MPP specification for exact session API, payment channel mechanics, and configuration options before implementing.
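
Steps 3 to 5 of the mpp-session-flow verification workflow above (balance decreases per request, 402 on exhaustion, requests resume after a refill) can be reasoned about without the SDK. The sketch below is a hypothetical in-memory model only: `SessionLedger` and its methods are invented for illustration and are not the mppx API, and real settlement happens on-chain rather than in a counter.

```typescript
// Hypothetical in-memory model of a session spending cap; NOT the mppx API.
class SessionLedger {
  private remaining: number;
  private spent = 0;
  private closed = false;

  constructor(spendingCap: number) {
    this.remaining = spendingCap;
  }

  // Deduct `units`; answer 402 when the session is closed or the cap cannot cover them.
  charge(units: number): 200 | 402 {
    if (this.closed || units > this.remaining) return 402;
    this.remaining -= units;
    this.spent += units;
    return 200;
  }

  // Extend the cap while the session is still open.
  refill(units: number): void {
    if (this.closed) throw new Error("session is closed");
    this.remaining += units;
  }

  // Close the session and report the total consumed, i.e. the amount one settlement would cover.
  close(): number {
    this.closed = true;
    return this.spent;
  }

  get remainingBalance(): number {
    return this.remaining;
  }
}

const ledger = new SessionLedger(100);
const first = ledger.charge(60);   // accepted, 40 units left
const second = ledger.charge(60);  // refused, 60 > 40
ledger.refill(50);                 // 90 units left
const third = ledger.charge(60);   // accepted, 30 units left
const settled = ledger.close();    // 120 units consumed in total
console.log(first, second, third, ledger.remainingBalance, settled);
// prints: 200 402 200 30 120
```

A refused charge leaves the balance untouched, which is what lets the client refill and retry the same request.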
From 0c053e9ecd3727a002626f736b461c19d2b0f116 Mon Sep 17 00:00:00 2001
From: Yogesh Rao
Date: Sun, 19 Apr 2026 20:40:44 +0530
Subject: [PATCH 2/2] ci: add skill-review GitHub Action for automated skill review on PRs

---
 .github/workflows/skill-review.yml | 13 +++++++++++++
 1 file changed, 13 insertions(+)
 create mode 100644 .github/workflows/skill-review.yml

diff --git a/.github/workflows/skill-review.yml b/.github/workflows/skill-review.yml
new file mode 100644
index 0000000..f0ab779
--- /dev/null
+++ b/.github/workflows/skill-review.yml
@@ -0,0 +1,13 @@
+name: Skill Review
+on:
+  pull_request:
+    paths: ['**/SKILL.md']
+jobs:
+  review:
+    runs-on: ubuntu-latest
+    permissions:
+      pull-requests: write
+      contents: read
+    steps:
+      - uses: actions/checkout@v4
+      - uses: tesslio/skill-review@22e928dd837202b2b1d1397e0114c92e0fae5ead # main