Your AI agent is only as good as your repo.
33 checks. 5 dimensions. Evidence-backed.
Docs · Checks · Scoring · Evidence · Contributing
AgentLint finds what's broken — file structure, instruction quality, build setup, session continuity, security posture — and fixes it.
We analyzed 265 versions of Anthropic's Claude Code system prompt, documented the hard limits, audited thousands of real repos, and reviewed the academic research. The result: a single command that tells you exactly what your AI agent is struggling with and why.
```
npm install -g @0xmariowu/agent-lint
```

Then start a new Claude Code session:

```
/al
```
That's it. AgentLint scans your projects, scores them, shows what's wrong, and fixes what it can.
```
$ /al

AgentLint — Score: 68/100

Findability   ██████████████░░░░░░  7/10
Instructions  ████████████████░░░░  8/10
Workability   ████████████░░░░░░░░  6/10
Safety        ██████████░░░░░░░░░░  5/10
Continuity    ██████████████░░░░░░  7/10

Fix Plan (7 items):
  [guided]   Pin 8 GitHub Actions to SHA (supply chain risk)
  [guided]   Add .env to .gitignore (AI exposes secrets)
  [assisted] Generate HANDOFF.md
  [guided]   Reduce IMPORTANT keywords (7 found, Anthropic uses 4)
```
Select items → AgentLint fixes → re-scores → saves HTML report
The HTML report shows a segmented gauge, expandable dimension breakdowns with per-check detail, and a prioritized issues list. When fixes are applied, it adds a before/after comparison.
AI coding agents read your repo structure, docs, CI config, and handoff notes. They git push, trigger pipelines, and write files. A well-structured repo gets dramatically better AI output. A poorly structured one wastes tokens, ignores rules, repeats mistakes, and may expose secrets.
AgentLint is built on data most developers never see:
- 265 versions of Anthropic's Claude Code system prompt — every word added, deleted, and rewritten
- Claude Code internals — hard limits (40K char max, 256KB file read limit, pre-commit hook behavior) that silently break your setup
- Production security audits across open-source codebases — the gaps AI agents walk into
- 6 academic papers on instruction-following, context files, and documentation decay
### Findability

| Check | What | Why |
|---|---|---|
| F1 | Entry file exists | No CLAUDE.md = AI starts blind |
| F2 | Project description in first 10 lines | AI needs context before rules |
| F3 | Conditional loading guidance | "If working on X, read Y" prevents context bloat |
| F4 | Large directories have INDEX | >10 files without index = AI reads everything |
| F5 | All references resolve | Broken links waste tokens on dead-end reads |
| F6 | Standard file naming | README.md, CLAUDE.md are auto-discovered |
| F7 | @include directives resolve | Missing targets are silently ignored — you think it's loaded, it isn't |
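For F3, conditional loading guidance in a CLAUDE.md might look like this (a hypothetical fragment; the project name and file paths are illustrative, not AgentLint's own):

```markdown
# my-project

CLI that scores repos for AI-agent readiness.

If working on the scoring engine, read docs/scoring.md first.
If editing CI workflows, read docs/ci.md first.
```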
### Instructions

| Check | What | Why |
|---|---|---|
| I1 | Emphasis keyword count | Anthropic cut IMPORTANT from 12 to 4 across 265 versions |
| I2 | Keyword density | More emphasis = less compliance. Anthropic: 7.5 → 1.4 per 1K words |
| I3 | Rule specificity | "Don't X. Instead Y. Because Z." — Anthropic's golden formula |
| I4 | Action-oriented headings | Anthropic deleted all "You are a..." identity sections |
| I5 | No identity language | "Follow conventions" removed — model already does this |
| I6 | Entry file length | 60-120 lines is the sweet spot. Longer dilutes priority |
| I7 | Under 40,000 characters | Claude Code hard limit. Above this, your file is truncated |
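A rule written to the I3 formula might read like this (hypothetical content, for illustration only):

```markdown
Don't edit files in dist/. Instead, change the source in src/ and rerun
the build, because anything in dist/ is overwritten on the next build.
```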
### Workability

| Check | What | Why |
|---|---|---|
| W1 | Build/test commands documented | AI can't guess your test runner |
| W2 | CI exists | Rules without enforcement are suggestions |
| W3 | Tests exist (not empty shell) | A CI that runs pytest with 0 test files always "passes" |
| W4 | Linter configured | Mechanical formatting frees AI from guessing style |
| W5 | No files over 256 KB | Claude Code cannot read them — hard error |
| W6 | Pre-commit hooks are fast | Claude Code never uses --no-verify. Slow hooks = stuck commits |
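For W1, a documented-commands section in CLAUDE.md could look like this (illustrative; the command names assume a typical npm project):

```markdown
## Commands
- Build: npm run build
- Test:  npm test
- Lint:  npm run lint
```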
### Continuity

| Check | What | Why |
|---|---|---|
| C1 | Document freshness | Stale instructions are worse than no instructions |
| C2 | Handoff file exists | Without it, every session starts from zero |
| C3 | Changelog has "why" | "Updated INDEX" says nothing. "Fixed broken path" says everything |
| C4 | Plans in repo | Plans in Jira don't exist for AI |
| C5 | CLAUDE.local.md not in git | Private per-user file. Claude Code requires .gitignore |
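The C3 contrast in practice (hypothetical changelog lines):

```markdown
- Updated INDEX.md                                       (says nothing)
- Updated INDEX.md: fixed path broken by the src/ rename (says why)
```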
### Safety

| Check | What | Why |
|---|---|---|
| S1 | .env in .gitignore | AI's Glob tool ignores .gitignore by default — secrets are visible |
| S2 | Actions SHA pinned | AI push triggers CI. Floating tags = supply chain attack vector |
| S3 | Secret scanning configured | AI won't self-check for accidentally written API keys |
| S4 | SECURITY.md exists | AI needs security context for sensitive code decisions |
| S5 | Workflow permissions minimized | AI-triggered workflows shouldn't have write access by default |
| S6 | No hardcoded secrets | Detects sk-, ghp_, AKIA, private key patterns in source |
| S7 | No personal paths | /Users/xxx/ in source = AI copies and spreads the leak |
| S8 | No pull_request_target | AI pushes trigger CI. Elevated permissions = attack vector |
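S2 and S5 combined in a workflow might look like this sketch (replace the placeholder with the full 40-character commit SHA of a release you have vetted; the job itself is illustrative):

```yaml
permissions:
  contents: read  # S5: read-only default token

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<full-commit-sha>  # S2: pinned; note the tag in a comment
      - run: npm test
```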
Spawns AI subagents to find what mechanical checks can't:
- Contradictory rules that confuse the model
- Dead-weight rules the model would follow without being told
- Vague rules without decision boundaries
Reads your Claude Code session logs to find:
- Instructions you repeat across sessions (should be in CLAUDE.md)
- Rules AI keeps ignoring (need rewriting)
- Friction hotspots by project
Each check produces a 0-1 score, weighted by dimension, scaled to 100.
| Dimension | Weight | Why? |
|---|---|---|
| Instructions | 30% | Unique value. No other tool checks CLAUDE.md quality |
| Findability | 20% | AI can't follow rules it can't find |
| Workability | 20% | Can AI actually run your code? |
| Safety | 15% | Is AI working without exposing secrets or triggering vulnerabilities? |
| Continuity | 15% | Does knowledge survive across sessions? |
Scores are measurements, not judgments. Reference values come from Anthropic's own data. You decide what to fix.
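The weighting above can be sketched as a weighted sum (a minimal sketch; rounding to the nearest integer is an assumption, and the per-check aggregation inside each dimension is not shown):

```javascript
// Dimension weights from the table above.
const WEIGHTS = {
  instructions: 0.30,
  findability: 0.20,
  workability: 0.20,
  safety: 0.15,
  continuity: 0.15,
};

// Each dimension score is a 0-1 value (e.g. 7/10 -> 0.7).
function overallScore(dimensions) {
  let total = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    total += weight * dimensions[name];
  }
  return Math.round(total * 100); // scaled to 100
}

// The sample report's dimension scores (8, 7, 6, 5, 7 out of 10)
// work out to its overall 68/100:
console.log(overallScore({
  instructions: 0.8,
  findability: 0.7,
  workability: 0.6,
  safety: 0.5,
  continuity: 0.7,
})); // 68
```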
Update with:

```
claude plugin update agent-lint@agent-lint
```

Every check cites its source. Full citations in `standards/evidence.json`.
| Source | Type |
|---|---|
| Anthropic 265 versions | Primary dataset |
| Claude Code internals | Hard limits and observed behavior |
| IFScale (NeurIPS) | Instruction compliance at scale |
| ETH Zurich | Do context files help coding agents? |
| Codified Context | Stale content as #1 failure mode |
| Agent READMEs | Concrete vs abstract effectiveness |
- Claude Code
- jq and Node 20+
MIT
