This document details how Agent Validator invokes supported AI CLI tools to ensure:
- Non-interactive execution (no hanging on prompts)
- Read-only access (no file modifications)
- Repo-scoped visibility (limited to the project root)
All adapters write the prompt (including diff) to a temporary file and pipe it to the CLI.
- Dynamic Context: Agents are invoked in a non-interactive, read-only mode where they can use their own file-reading and search tools to pull additional context from your repository as needed.
- Security: By using standard CLI tools with strict flags (like --sandbox or --allowed-tools), Agent Validator ensures that agents can read your code to review it without being able to modify your files or escape the repository scope.
- Output Parsing: All agents are instructed to output strict JSON. The ReviewGateExecutor parses this JSON to determine pass/fail status.
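The output-parsing step can be illustrated with a minimal sketch. The JSON schema is not shown in this document, so the status field and the ReviewVerdict type below are assumptions made purely for illustration, not the actual ReviewGateExecutor implementation.

```typescript
// Hypothetical verdict shape; the real schema is not shown in this document.
type ReviewVerdict =
  | { kind: "parsed"; passed: boolean }
  | { kind: "unparseable"; raw: string };

// Assumes the agent emits a JSON object with a "status" field
// whose value is "pass" or "fail" (an illustrative assumption).
function parseReviewOutput(raw: string): ReviewVerdict {
  try {
    const data: unknown = JSON.parse(raw.trim());
    if (typeof data === "object" && data !== null && "status" in data) {
      const status = (data as { status: unknown }).status;
      if (status === "pass" || status === "fail") {
        return { kind: "parsed", passed: status === "pass" };
      }
    }
  } catch {
    // Not valid JSON: fall through to the unparseable result.
  }
  return { kind: "unparseable", raw };
}
```

Returning the raw text in the unparseable case keeps it available for later inspection, for example to check whether the output was actually an error message rather than a review.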
Adapter: src/cli-adapters/gemini.ts
cat "<tmpFile>" | gemini \
--sandbox \
--allowed-tools read_file list_directory glob search_file_content \
--output-format text

- --sandbox: Enables the execution sandbox for safety.
- --allowed-tools ...: Explicitly whitelists read-only tools. Any attempt to use other tools (like write_file) will fail or prompt (which fails in non-interactive mode), ensuring read-only safety.
- --output-format text: Ensures the output is plain text suitable for parsing.
- Repo Scoping: Implicitly scoped to the Current Working Directory (CWD) because no --include-directories are provided.
Adapter: src/cli-adapters/codex.ts
cat "<tmpFile>" | codex exec \
--cd "<repoRoot>" \
--sandbox read-only \
-c 'ask_for_approval="never"' \
-

- exec: Subcommand for non-interactive execution.
- --cd "<repoRoot>": Sets the working directory to the repository root.
- --sandbox read-only: Enforces a strict read-only sandbox policy for any shell commands the agent generates.
- -c 'ask_for_approval="never"': Config override to prevent the CLI from asking for user confirmation before running commands. This is critical for preventing hangs in CI/automated environments.
- -: Tells Codex to read the prompt from stdin.
Adapter: src/cli-adapters/claude.ts
cat "<tmpFile>" | claude -p \
--cwd "<repoRoot>" \
--allowedTools "Read,Glob,Grep" \
--max-turns 10

- -p (or --print): Runs Claude in non-interactive print mode. Output is printed to stdout.
- --cwd "<repoRoot>": Sets the working directory to the repository root.
- --allowedTools "Read,Glob,Grep": Restricts the agent to a specific set of read-only tools.
  - Read: Read file contents.
  - Glob: List files matching a pattern.
  - Grep: Search file contents.
- --max-turns 10: Limits the number of agentic turns (tool use loops) to prevent infinite loops or excessive costs.
Adapter: src/cli-adapters/github-copilot.ts
cat "<tmpFile>" | copilot -s \
--allow-tool 'shell(cat)' --allow-tool 'shell(grep)' \
--allow-tool 'shell(ls)' --allow-tool 'shell(find)' \
--allow-tool 'shell(head)' --allow-tool 'shell(tail)' \
--model "<model>" --effort <level>

- copilot: Invokes the standalone Copilot CLI directly.
- -s (silent): Suppresses UI output and stats, returning only the agent response for clean output parsing.
- --allow-tool 'shell(cat)' ...: Explicitly whitelists read-only shell tools. Tool names must use the shell(command) format. Any attempt to use other tools will fail, ensuring read-only safety. When allow_tool_use is false in the adapter config, no --allow-tool flags are passed.
- --model "<model>": Passes the configured model name directly (free-form, no resolution). If omitted, Copilot uses its default model. Invalid model names produce a clear error.
- --effort <level>: Maps from the thinking_budget adapter config (low→low, medium→medium, high→high). Omitted when thinking_budget is off.
- Repo Scoping: Implicitly scoped to the Current Working Directory (CWD) where the command is executed (repository root).
- Availability: Checked via copilot --help with a 10-second timeout.
- Detection: Reads ~/.copilot/config.json to check the installed_plugins array
- Installation: copilot plugin install Codagent-AI/agent-validator
- Skill directories: .github/skills/ (project), ~/.copilot/skills/ (user)
- Hooks: Supported via the Copilot CLI plugin system
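The way the Copilot flags depend on adapter config can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in src/cli-adapters/github-copilot.ts: the CopilotAdapterConfig interface, the model key, and the buildCopilotArgs name are hypothetical. Only allow_tool_use, thinking_budget, and the flags themselves come from the description above.

```typescript
// Hypothetical config shape. Only allow_tool_use and thinking_budget are
// named in the text; the "model" key is an assumption.
interface CopilotAdapterConfig {
  allow_tool_use: boolean;
  thinking_budget: "off" | "low" | "medium" | "high";
  model?: string;
}

// The read-only shell commands whitelisted in the invocation above.
const READ_ONLY_COMMANDS = ["cat", "grep", "ls", "find", "head", "tail"];

function buildCopilotArgs(config: CopilotAdapterConfig): string[] {
  const args: string[] = ["-s"]; // silent mode: agent response only

  // No --allow-tool flags at all when tool use is disabled.
  if (config.allow_tool_use) {
    for (const cmd of READ_ONLY_COMMANDS) {
      args.push("--allow-tool", `shell(${cmd})`);
    }
  }

  // The model name is passed through as-is; omitted means Copilot's default.
  if (config.model) {
    args.push("--model", config.model);
  }

  // thinking_budget maps one-to-one onto --effort, except "off".
  if (config.thinking_budget !== "off") {
    args.push("--effort", config.thinking_budget);
  }
  return args;
}
```

Building the arguments as an array rather than a single shell string avoids quoting problems with values like shell(cat) when the process is spawned.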
Adapter: src/cli-adapters/cursor.ts
cat "<tmpFile>" | agent

- No flags: The agent command reads the prompt from stdin and processes it using Cursor's AI capabilities.
- Repo Scoping: Implicitly scoped to the Current Working Directory (CWD) where the command is executed (repository root).
- Model: Uses the default model configured by the user in Cursor.
- Cursor does not support custom commands
- The agent command is the CLI interface provided by Cursor for AI-assisted development
Review gates dispatch work to CLI adapters via round-robin. If an adapter hits a usage limit or quota error during a review, it is marked unhealthy for a 1-hour cooldown period. This prevents wasting time retrying adapters that are temporarily unavailable.
- Detection: When an adapter process exits with an error, the system checks the error output for usage-limit phrases (e.g., "usage limit", "quota exceeded", "credit balance is too low").
- Marking: If a usage limit is detected, the adapter is written to the unhealthy_adapters map in validator_logs/.execution_state with a marked_at timestamp and reason.
- Skipping: On each subsequent run, before dispatching reviews, the system checks the unhealthy map. Adapters within the 1-hour cooldown are skipped.
- Recovery: After the cooldown expires, the adapter's binary is probed via checkHealth(). If healthy, the flag is cleared and the adapter rejoins the pool.
- Round-robin fallback: The num_reviews round-robin assignment uses only healthy adapters. If num_reviews: 2 but only one adapter is healthy, both review slots are assigned to that adapter.
- No mid-execution failover: If an adapter fails during a run, that review slot is lost for the current iteration. The adapter is marked unhealthy and skipped on the next rerun.
- No healthy adapters: If all configured adapters are unhealthy or unavailable, the review gate returns an error immediately.
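The skipping rule can be sketched as follows. This is a minimal illustration, not the project's implementation: the UnhealthyEntry shape is inferred from the marked_at and reason fields named above, the ISO-8601 timestamp format is an assumption, and isCoolingDown and dispatchable are hypothetical names.

```typescript
const COOLDOWN_MS = 60 * 60 * 1000; // the 1-hour cooldown described above

// Hypothetical entry shape based on the marked_at and reason fields.
interface UnhealthyEntry {
  marked_at: string; // assumed to be an ISO-8601 timestamp
  reason: string;
}

type UnhealthyMap = Record<string, UnhealthyEntry>;

// True while the adapter is still inside its cooldown window.
function isCoolingDown(entry: UnhealthyEntry, nowMs: number): boolean {
  return nowMs - Date.parse(entry.marked_at) < COOLDOWN_MS;
}

// Keeps adapters that were never marked or whose cooldown has expired.
// The real flow additionally probes expired adapters via checkHealth()
// before clearing the flag; that step is omitted here.
function dispatchable(
  preference: string[],
  unhealthy: UnhealthyMap,
  nowMs: number,
): string[] {
  return preference.filter((name) => {
    const entry = unhealthy[name];
    return entry === undefined || !isCoolingDown(entry, nowMs);
  });
}
```

Passing the current time in as a parameter, rather than calling Date.now() inside, keeps the check deterministic and easy to test.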
With cli_preference: [codex, gemini] and num_reviews: 2, if codex hits a rate limit:
- Current run: codex@1 errors, gemini@2 passes → gate fails (incomplete reviews)
- Next run: codex is cooling down and skipped → gemini@1 and gemini@2 both assigned → gate can pass
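The slot assignment in this example can be reproduced with a short sketch. The assignReviews function is a hypothetical name, and the adapter@slot labels simply mirror the notation used above. The actual dispatcher may differ.

```typescript
// Assigns numReviews slots across the healthy adapters in order,
// wrapping around when there are fewer adapters than slots.
function assignReviews(healthy: string[], numReviews: number): string[] {
  if (healthy.length === 0) {
    // Mirrors the "no healthy adapters" case: fail immediately.
    throw new Error("No healthy adapters available");
  }
  const slots: string[] = [];
  for (let i = 0; i < numReviews; i++) {
    slots.push(`${healthy[i % healthy.length]}@${i + 1}`);
  }
  return slots;
}

// First run: both adapters healthy.
const firstRun = assignReviews(["codex", "gemini"], 2); // ["codex@1", "gemini@2"]
// Next run: codex is cooling down, so only gemini remains.
const nextRun = assignReviews(["gemini"], 2); // ["gemini@1", "gemini@2"]
```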
The isUsageLimit() function checks error output for these phrases (case-insensitive):
- "usage limit"
- "quota exceeded"
- "quota will reset"
- "credit balance is too low"
- "out of extra usage"
- "out of usage"
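A case-insensitive phrase check of this kind could look like the sketch below. Only the function name and the phrase list come from this document. The body is an assumed implementation using plain substring matching.

```typescript
// Phrases listed above, stored in lowercase for comparison.
const USAGE_LIMIT_PHRASES = [
  "usage limit",
  "quota exceeded",
  "quota will reset",
  "credit balance is too low",
  "out of extra usage",
  "out of usage",
];

// Lowercases the error output once, then looks for any known phrase.
function isUsageLimit(errorOutput: string): boolean {
  const text = errorOutput.toLowerCase();
  return USAGE_LIMIT_PHRASES.some((phrase) => text.includes(phrase));
}
```

Substring matching is deliberately loose: CLI tools word their limit messages differently, so matching a short phrase anywhere in the output is more forgiving than comparing whole messages.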
Detection happens at two points:
- When review output fails to parse as valid JSON (the output itself contains the limit message)
- When the adapter process exits with a non-zero code (the stderr is included in the error message)