This document defines the mandatory output format for all audit reports.
Every audit — full, targeted, or single-instruction — MUST follow these rules.
No exceptions.
Before auditing project code, the agent MUST recursively read every markdown file in the AUDITOR folder.
Required coverage:
- Root docs: `README.md`, `SKILL.md`, `OUTPUT-RULES.md`, `FULL-AUDIT.md`, `QUESTIONS.md`, `COSTS.md`, `TOP-100-HACKS.md` (if present)
- Discovery: all files under `discovery/`
- Templates: all files under `templates/`
- Checklists: all files under `checklists/`
- Known vectors: `known-vectors/INDEX.md` plus all `known-vectors/001...100`

If ANY AUDITOR file is not loaded, report `[INCOMPLETE — missing auditor corpus file load]` and stop. Do not produce a "complete" audit.
Every full report MUST include a Corpus Coverage table listing each AUDITOR file and load status.
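
The coverage requirement above can be sketched as a small helper. This is a minimal illustration, not part of the AUDITOR tooling: the function names, the `loaded` set of relative paths, and the `File`/`Loaded` column headers are all assumptions made for the example.

```python
from pathlib import Path


def corpus_coverage(auditor_root: Path, loaded: set[str]) -> list[tuple[str, bool]]:
    """Return (relative path, was it loaded?) for every markdown file under auditor_root."""
    rels = [p.relative_to(auditor_root).as_posix() for p in auditor_root.rglob("*.md")]
    # Sort on the relative string so the table order is stable across platforms.
    return [(rel, rel in loaded) for rel in sorted(rels)]


def coverage_table(rows: list[tuple[str, bool]]) -> str:
    """Render the rows as a markdown table (column names are hypothetical)."""
    lines = ["| File | Loaded |", "|---|---|"]
    lines += [f"| {rel} | {'yes' if ok else 'NO'} |" for rel, ok in rows]
    return "\n".join(lines)


def is_complete(rows: list[tuple[str, bool]]) -> bool:
    """True only if every discovered corpus file was loaded."""
    return all(ok for _, ok in rows)
```

If `is_complete` returns `False`, the audit should stop with the incomplete marker rather than continue.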
All findings use a numeric severity from 1 to 10. The old letter scale (C/H/M/L/I) is replaced.
| Score | Label | Color | Meaning | Action Required |
|---|---|---|---|---|
| 10 | 🔴 CRITICAL | Red | Permissionless drain, total fund loss, instant exploit | Block deploy. Fix NOW. |
| 9 | 🔴 CRITICAL | Red | Fund loss with minimal preconditions, privilege escalation to admin | Block deploy. Fix NOW. |
| 8 | 🟠 HIGH | Orange | Fund loss with specific preconditions, partial drain | Fix before any release. |
| 7 | 🟠 HIGH | Orange | Significant economic damage, privilege escalation, data breach | Fix before any release. |
| 6 | 🟡 MEDIUM | Yellow | State corruption, griefing, DoS, economic manipulation (limited) | Fix within 2 weeks. |
| 5 | 🟡 MEDIUM | Yellow | Logic bugs, inconsistent state, moderate information leak | Fix within 2 weeks. |
| 4 | 🔵 LOW | Blue | Minor information leak, code quality issue with security implications | Next sprint. |
| 3 | 🔵 LOW | Blue | Missing best practice with theoretical risk | Next sprint. |
| 2 | ⚪ INFO | Gray | Style, optimization, hardening suggestion | Backlog. |
| 1 | ⚪ INFO | Gray | Cosmetic, documentation, no security impact | Optional. |

- Can an attacker steal funds without any preconditions?
  - YES → 10 (or 9 if amount is bounded)
- Can an attacker steal funds WITH preconditions?
  - YES, large amount → 8
  - YES, small/bounded amount → 7
- Can an attacker corrupt state, cause DoS, or manipulate economics?
  - YES, significant impact → 6
  - YES, limited impact → 5
- Is there a security gap but no clear exploit path?
  - YES, data exposed → 4
  - YES, theoretical → 3
- Is it a hardening suggestion with no vulnerability?
  - YES, some value → 2
  - YES, cosmetic → 1

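
The decision tree above can be read as a first-match cascade. The sketch below encodes it that way; the `Triage` dataclass and its field names are hypothetical and exist only to make the questions explicit.

```python
from dataclasses import dataclass


@dataclass
class Triage:
    """Answers to the decision-tree questions (field names are illustrative)."""
    steals_funds: bool = False
    needs_preconditions: bool = False
    amount_bounded: bool = False      # "bounded" or "small" amount
    disrupts_state: bool = False      # state corruption, DoS, economic manipulation
    significant_impact: bool = False
    security_gap: bool = False        # gap with no clear exploit path
    data_exposed: bool = False
    has_value: bool = False           # hardening suggestion with some value


def severity(t: Triage) -> int:
    """Walk the questions top to bottom and return the first matching score."""
    if t.steals_funds and not t.needs_preconditions:
        return 9 if t.amount_bounded else 10
    if t.steals_funds:
        return 7 if t.amount_bounded else 8
    if t.disrupts_state:
        return 6 if t.significant_impact else 5
    if t.security_gap:
        return 4 if t.data_exposed else 3
    return 2 if t.has_value else 1
```

For example, `severity(Triage(steals_funds=True, needs_preconditions=True))` yields 8, matching the "large amount" branch.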
After all items are checked, compute the Repository Risk Score:

- If ANY finding ≥ 9: REPO SCORE = 10 (CRITICAL: do not deploy)
- If ANY finding ≥ 7: REPO SCORE = max(finding) (HIGH: fix first)
- If highest finding ≥ 5: REPO SCORE = max(finding) (MEDIUM: fix soon)
- If highest finding ≥ 3: REPO SCORE = max(finding) (LOW: acceptable with tracking)
- If highest finding ≤ 2: REPO SCORE = max(finding) (MINIMAL: clean)

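
A minimal sketch of the Repository Risk Score rules above. The function name is hypothetical, and the behavior for an audit with zero findings is an assumption, since the rules only cover the case where at least one finding exists.

```python
def repository_risk_score(severities: list[int]) -> tuple[int, str]:
    """Map the list of finding severities (1-10) to (score, label)."""
    if not severities:
        # Assumption: no findings at all is reported as the lowest score.
        return 1, "MINIMAL"
    top = max(severities)
    if top >= 9:
        return 10, "CRITICAL"  # any 9 or 10 pins the repo score to 10
    if top >= 7:
        return top, "HIGH"
    if top >= 5:
        return top, "MEDIUM"
    if top >= 3:
        return top, "LOW"
    return top, "MINIMAL"
```

Note the one asymmetry in the rules: a single severity-9 finding raises the repository score to 10, while every other band simply reports the highest severity.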
Every audit output — regardless of size — MUST start with an Executive Summary block. This goes at the very top of the report.
## Executive Summary
**Repository:** {org/repo}
**Commit:** {short SHA}
**Date:** {YYYY-MM-DD}
**Scope:** {FULL / PROGRAM / BACKEND / FRONTEND / DEVOPS}
**Repository Risk Score:** {1-10} — {CRITICAL/HIGH/MEDIUM/LOW/MINIMAL}
### What We Found
{2-4 sentences in plain language. What was audited. What is the overall security posture.
Highlight the most important finding(s) if any critical/high exist.
State whether the code is safe to deploy or not.}
### Severity Distribution
| Score | Label | Count |
|-------|-------|-------|
| 10 | 🔴 CRITICAL | 0 |
| 9 | 🔴 CRITICAL | 0 |
| 8 | 🟠 HIGH | 0 |
| 7 | 🟠 HIGH | 0 |
| 6 | 🟡 MEDIUM | 0 |
| 5 | 🟡 MEDIUM | 0 |
| 4 | 🔵 LOW | 0 |
| 3 | 🔵 LOW | 0 |
| 2 | ⚪ INFO | 0 |
| 1 | ⚪ INFO | 0 |
| **Total Findings** | | **0** |
### Items Verified
| Metric | Count |
|--------|-------|
| Total checklist items | {N} |
| PASS | {N} |
| FAIL | {N} |
| PARTIAL | {N} |
| N/A | {N} |
| Completion | {%} |

The auditor MUST walk through code files one at a time. Repositories can be any size, from 10 files to 10,000 files. The auditor does NOT guess. It reads every relevant file.

1. DISCOVER: List all files that need auditing (use discovery/file-map.md patterns)
2. CHUNK: Group files into manageable batches:
   - On-chain: 1 instruction file per chunk
   - Backend: 1 route file per chunk (+ its service dependency)
   - Frontend: 2-3 components per chunk
   - Config: all config files in 1 chunk
3. PROCESS: For each chunk:
   - a. Read the file(s) completely
   - b. Run applicable checklist items against the actual code
   - c. Record verdicts inline
   - d. Save findings to session memory
4. CHECKPOINT: After each chunk, save progress:
   - Which files have been reviewed
   - Which checklist items have been evaluated
   - Any findings so far
5. RESUME: If context is lost, read the checkpoint and continue from where stopped
6. AGGREGATE: After all chunks processed, compile the full report

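
The CHUNK step can be sketched as follows. This is an illustration under assumptions: the category keys (`onchain`, `backend`, `frontend`, `config`) and the function name are hypothetical, files are assumed to be classified already, and resolving a route file's service dependency is left out.

```python
def chunk_files(files_by_kind: dict[str, list[str]], frontend_batch: int = 3) -> list[list[str]]:
    """Group pre-classified files into audit chunks following the batching rules."""
    chunks: list[list[str]] = []
    # On-chain: one instruction file per chunk.
    for path in files_by_kind.get("onchain", []):
        chunks.append([path])
    # Backend: one route file per chunk (its service dependency is not resolved here).
    for path in files_by_kind.get("backend", []):
        chunks.append([path])
    # Frontend: up to `frontend_batch` components per chunk; the last chunk may be smaller.
    frontend = files_by_kind.get("frontend", [])
    for start in range(0, len(frontend), frontend_batch):
        chunks.append(frontend[start:start + frontend_batch])
    # Config: all config files together in a single chunk.
    config = files_by_kind.get("config", [])
    if config:
        chunks.append(list(config))
    return chunks
```

Processing then iterates over the returned list in order, writing a checkpoint after each chunk.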
After each chunk, save to session memory:
## Audit Checkpoint — {timestamp}
### Progress
- Phase: {0/1/2/3/4}
- Files reviewed: {list}
- Files remaining: {list}
- Current checklist: {number}
### Findings So Far
- F-001: [severity 8] {brief description} @ {file:line}
- F-002: [severity 5] {brief description} @ {file:line}
### Next Action
- Read {next file} and continue checklist {XX} from item {YYY}

- NEVER assume what a file contains; always read it
- NEVER mark a checklist item PASS/FAIL without having read the relevant code
- NEVER try to process all files at once — respect the chunked protocol
- ALWAYS read the full file, not just the first N lines
- ALWAYS cross-reference: if file A calls file B, read file B too
- Large files (>300 lines): read in two passes (first half, second half)
Every single checklist item and every known attack vector MUST appear in the report output with an explicit verdict. No items are skipped silently. This is the proof that the auditor actually checked everything.
[PASS] AV-001: Account ownership validated via Account<'info, T> typed accounts
[FAIL-8] AV-015: Token account missing token::authority constraint
File: programs/<program>/src/instructions/deposit.rs:42
Impact: Attacker can substitute token account, draining vault
Fix: Add `token::authority = fund` to account constraint
[PARTIAL] AV-021: has_one present but missing runtime require_keys_eq! backup
File: programs/<program>/src/instructions/swap.rs:18
Impact: Defense-in-depth gap — single point of failure
Fix: Add require_keys_eq! after has_one check
[N/A] AV-023: init_if_needed — not used in this program

| Verdict | Rules |
|---|---|
| `[PASS]` | Must cite which file/code proves it's secure. One line. |
| `[FAIL-N]` | Must include: severity (1-10), file:line, impact, fix. Minimum 3 lines. |
| `[PARTIAL]` | Must include: what's missing, file:line, recommended improvement. |
| `[N/A]` | Must include: WHY it doesn't apply (e.g., "feature not used"). Never bare N/A. |

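
As a rough check of the verdict format, the first line of each entry can be validated with a regular expression. The sketch below is an assumption about how such a check might look: the item ID pattern (`LETTERS-digits`) is inferred from the examples, the optional `*` covers the confidence marker described later in this document, and the `File:`/`Impact:`/`Fix:` continuation lines are not validated.

```python
import re
from typing import NamedTuple, Optional

# Verdict tag, optional "-N" severity, optional "*" marker, item ID, then a non-empty note.
VERDICT_RE = re.compile(
    r"^\[(?P<verdict>PASS|FAIL|PARTIAL|N/A)(?:-(?P<sev>\d{1,2}))?(?P<star>\*)?\]\s+"
    r"(?P<item>[A-Z]+-\d+):\s+(?P<note>.+)$"
)


class Verdict(NamedTuple):
    verdict: str
    severity: Optional[int]
    flagged: bool  # True when the "*" confidence marker is present
    item: str
    note: str


def parse_verdict(line: str) -> Verdict:
    """Parse one verdict line; raise ValueError if it breaks the format rules."""
    m = VERDICT_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a valid verdict line: {line!r}")
    verdict, sev = m["verdict"], m["sev"]
    if (verdict == "FAIL") != (sev is not None):
        raise ValueError("a severity is required for FAIL and only for FAIL")
    severity = int(sev) if sev is not None else None
    if severity is not None and not 1 <= severity <= 10:
        raise ValueError("severity must be between 1 and 10")
    return Verdict(verdict, severity, m["star"] is not None, m["item"], m["note"])
```

Because the note must be non-empty, a bare `[N/A] AV-023:` with no reason is rejected.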
Items listed in checklist order (AV-001, AV-002, ..., DEP-077, PY-001, ..., PC-060), followed by known vectors in order (KV-001 through KV-100). Never reorder or group by verdict.
Every finding (any item that is [FAIL-N] with severity ≥ 4) gets a full finding block in the Findings section of the report.
#### [F-{number}] {Title}
| Field | Value |
|---|---|
| **Severity** | {1-10} — {🔴/🟠/🟡/🔵/⚪} {CRITICAL/HIGH/MEDIUM/LOW/INFO} |
| **Checklist Item** | {XX-YYY} |
| **Category** | {e.g., Arithmetic, Access Control, Injection} |
| **Language** | {Rust / TypeScript / Python / Go / etc.} |
| **File** | {path/to/file:line} |
| **Status** | Open |
**Description:**
{What is the vulnerability? Be specific to the code found.}
**Impact:**
{What can an attacker do? What is the worst-case scenario? Quantify if possible.}
**Proof of Concept:**
```{language}
// Minimal code/steps that demonstrate the issue
```

**Recommendation:**

```{language}
// Specific code fix, not just "fix this"
```

Findings with severity 1-3 are **not required** to have a full block; the inline verdict line is sufficient. But they CAN have one if the auditor judges it useful.

---
## Rule 6: Report Sections Order
Every full audit report follows this exact section order:

1. Executive Summary (Rule 2, always first)
2. Corpus Coverage (proof every AUDITOR file was loaded)
3. Scope & Methodology (languages, files, LOC, checklists applied)
4. Findings (severity ≥ 4, full blocks, grouped by severity descending)
5. Detailed Item Results (ALL checklist items, item-by-item verdicts)
6. Known Vector Results (KV-001 through KV-100, each with verdict)
7. Instruction Matrix (on-chain only, if applicable)
8. State Model Verification (on-chain only, if applicable)
9. Remediation Roadmap (prioritized by severity)
10. Appendices (tool versions, environment, disclaimer)

---
## Rule 7: Language Detection Is Automatic
The auditor MUST auto-detect all languages present in the repository and apply the correct checklists.
### Detection Method
Step 1: Scan file extensions in the repo:
.rs → Checklists 01-07 (Solana/Rust)
.ts / .tsx → Checklist 08 (TypeScript Safety)
.ts (backend)→ Checklist 09 (Backend Security)
.tsx (web) → Checklist 10 (Frontend Security)
.py → Checklist 14 (Python Safety)
.go → Checklist 15, section 15.9 Go
.java / .kt → Checklist 15, section 15.9 Java/Kotlin
.rb → Checklist 15, section 15.9 Ruby
.php → Checklist 15, section 15.9 PHP
Other → Checklist 15, sections 15.1-15.8 (universal)
Step 2: Always apply: Checklist 11 (Supply Chain), Checklist 12 (Secrets), Checklist 13 (Deployment)
Step 3: If Anchor.toml exists: Apply Checklists 01-07 (Solana program)
Step 4: Record all detected languages in the report header
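
The detection steps above can be sketched as a lookup on file extensions. Several things here are assumptions for illustration: the function and constant names, the string identifiers used for checklists, and the simplification that every `.ts` file also gets Checklist 09 and every `.tsx` file also gets Checklist 10, because the text does not say how backend and web files are told apart. The caller is expected to pass source files only.

```python
from pathlib import PurePosixPath

SOLANA = [f"{n:02d}" for n in range(1, 8)]   # Checklists 01-07
ALWAYS = ["11", "12", "13"]                  # Supply Chain, Secrets, Deployment
UNIVERSAL = ["15.1-15.8"]                    # fallback for languages with no mapping

BY_EXTENSION = {
    ".rs": SOLANA,
    ".ts": ["08", "09"],    # assumption: treat every .ts file as possibly backend
    ".tsx": ["08", "10"],   # assumption: treat every .tsx file as web frontend
    ".py": ["14"],
    ".go": ["15.9 Go"],
    ".java": ["15.9 Java/Kotlin"],
    ".kt": ["15.9 Java/Kotlin"],
    ".rb": ["15.9 Ruby"],
    ".php": ["15.9 PHP"],
}


def select_checklists(paths: list[str]) -> list[str]:
    """Return the sorted checklist identifiers that apply to the given source files."""
    selected = set(ALWAYS)
    for raw in paths:
        path = PurePosixPath(raw)
        if path.name == "Anchor.toml":
            selected.update(SOLANA)  # Step 3: Anchor project marker
            continue
        selected.update(BY_EXTENSION.get(path.suffix.lower(), UNIVERSAL))
    return sorted(selected)
```

For a repository containing only Swift sources, this returns the three always-on checklists plus the universal sections of Checklist 15.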
### Unsupported Language Handling
If a language has no dedicated checklist (e.g., Swift, Dart, Elixir):
1. Apply Checklist 15 (General Language Safety) — sections 15.1-15.8
2. Note in the report: "Language {X} audited using general safety checklist — no language-specific checklist available"
3. The audit is still valid — fundamentals (injection, auth, crypto, errors) are universal
---
## Rule 8: Metric Computation
The report MUST include computed metrics at the end of the Item Results section.
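
One way to compute these metrics from a flat list of verdict labels is sketched here. The function name and the returned keys are hypothetical, and the pass-rate formula (PASS divided by all items that are not N/A, so FAIL and PARTIAL both count against it) is an assumption about what "excl. N/A" means.

```python
from collections import Counter


def audit_metrics(verdicts: list[str]) -> dict:
    """Count verdict labels and compute the pass rate excluding N/A items."""
    counts = Counter(verdicts)
    total = len(verdicts)
    applicable = total - counts["N/A"]
    # Guard against division by zero when every item is N/A (or the list is empty).
    pass_rate = round(100.0 * counts["PASS"] / applicable, 1) if applicable else 0.0
    return {
        "total": total,
        "PASS": counts["PASS"],
        "FAIL": counts["FAIL"],
        "PARTIAL": counts["PARTIAL"],
        "N/A": counts["N/A"],
        "pass_rate": pass_rate,
    }
```

The same function can be applied per checklist to fill in the per-checklist summary rows.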
```markdown
### Audit Metrics
| Metric | Value |
|--------|-------|
| Total items evaluated | {N} |
| PASS | {N} ({%}) |
| FAIL | {N} ({%}) |
| PARTIAL | {N} ({%}) |
| N/A | {N} ({%}) |
| **Pass rate** (excl. N/A) | **{%}** |
| Highest severity found | {1-10} |
| Repository Risk Score | **{1-10}** |
### Known Vector Metrics
| Metric | Value |
|--------|-------|
| Total known vectors | 100 |
| PASS | {N} |
| FAIL | {N} |
| PARTIAL | {N} |
| N/A | {N} |
| Completion | {%} |
### Per-Checklist Summary
| # | Checklist | Items | Pass | Fail | Partial | N/A | Pass Rate |
|---|-----------|-------|------|------|---------|-----|-----------|
| 01 | Account Validation | 57 | | | | | % |
| ... | ... | ... | ... | ... | ... | ... | ... |
| **Total** | | **{N}** | | | | | **{%}** |
```

| Output | Filename |
|---|---|
| Full audit report | audit_{N}/REPORT.md |
| Remediation roadmap | audit_{N}/roadmap.md |
| Per-instruction worksheets | audit_{N}/worksheets/{instruction_name}.md |
| Checkpoint (session) | session memory: audit-checkpoint.md |
Where {N} is the next audit number (count existing audit_*/ directories + 1).
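
The numbering rule can be sketched in a few lines. The function name is hypothetical; the logic follows the rule as stated (count existing `audit_*/` directories, then add 1).

```python
from pathlib import Path


def next_audit_dir(root: Path) -> Path:
    """Return the path for the next audit directory under root."""
    # Only directories count; a stray file such as audit_notes.txt is ignored.
    existing = [p for p in root.glob("audit_*") if p.is_dir()]
    return root / f"audit_{len(existing) + 1}"
```

Because the rule counts directories rather than reading the highest number, deleting an old audit directory can make the next number collide with an existing one; checking that the returned path does not exist before writing is a cheap safeguard.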

- NEVER mark an item `[PASS]` if you did not read the code
- NEVER mark an item `[N/A]` just because you ran out of context
- If context was lost mid-audit, say so: `[INCOMPLETE — context lost, needs re-review]`
- If a file was too large to fully read, say so: `[PARTIAL — file truncated at line X, remaining unreviewed]`
- If the repo uses a pattern the auditor doesn't recognize, say so: `[UNKNOWN — unfamiliar pattern, manual review recommended]`
- The Executive Summary must honestly state if the audit is partial

For any item where the auditor is less than fully certain:
[PASS*] AV-001: Ownership checked via Account<T> — *confidence: medium, complex macro usage
[FAIL-6*] AR-015: Division without zero check — *confidence: high, clear in code
The * suffix + confidence note signals that manual double-checking may be warranted.