randomparity · Aug 28, 2025
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -9,14 +9,28 @@ on:
       - main
 
 jobs:
-  ci:
+  dev-ci:
+    name: Dev CI (Python 3.12)
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
       - uses: actions/setup-python@v5
         with:
           python-version: '3.12'
-      - name: Install dependencies
+      - name: Set up venv and install
         run: make setup
-      - name: Run CI checks
+      - name: Run full CI suite
         run: make ci
+
+  runtime-compat:
+    name: Runtime compatibility (Python 3.7 install/import)
+    runs-on: ubuntu-22.04
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.7'
+      - name: Install package (runtime deps only)
+        run: pip install .
+      - name: Import smoke test
+        run: python -c "import ai_review_hook; print(ai_review_hook.__version__)"
diff --git a/.gitignore b/.gitignore
@@ -75,9 +75,6 @@ target/
 profile_default/
 ipython_config.py
 
-# pyenv
-.python-version
-
 # poetry
 poetry.lock
 

diff --git a/.python-version b/.python-version
@@ -0,0 +1 @@
+3.12.11
diff --git a/Makefile b/Makefile
@@ -69,6 +69,10 @@ lint: ## Run ruff linting
 	$(REQUIRE_VENV)
 	$(RUFF) check $(SRC)/ $(TESTS)/
 
+lint-fix: ## Auto-fix lint issues with ruff
+	$(REQUIRE_VENV)
+	$(RUFF) check --fix $(SRC)/ $(TESTS)/
+
 format: ## Run black formatting
 	$(REQUIRE_VENV)
 	$(BLACK) $(SRC)/ $(TESTS)/

diff --git a/README.md b/README.md
@@ -1,8 +1,10 @@
 # AI Review Hook
 
+[![Python](https://img.shields.io/badge/python-3.7%2B%20runtime%20%7C%203.12%20dev%2FCI-blue)](#)
+
 This project grew out of my frustration with existing AI coding frameworks. I would follow the general guidance to add best practices requirements to CLAUDE.md, WARP.md, or other framework specific system prompts, but the AI tends to forget about them over time and moves towards the quickest method to push code on its way out the door.
 
-After a few atttemtps to ***vibe code*** my way to success I quickly recognized the need to setup adequate guard rails to keep an AI headed in the right direction.  Git hooks work as an excellent gate and [pre-commit](https://github.com/pre-commit/pre-commit) was a flexible way to add custom controls.  
+After a few attempts to ***vibe code*** my way to success I quickly recognized the need to set up adequate guard rails to keep an AI headed in the right direction.  Git hooks work as an excellent gate and [pre-commit](https://github.com/pre-commit/pre-commit) was a flexible way to add custom controls.
 
 The result is [AI Hook Review](https://github.com/randomparity/ai-review-hook), a python application that uses `pre-commit` to setup `pre-commit`/`pre-push` git hooks and add the missing ***vibe coding*** guard rails.
 
@@ -13,7 +15,7 @@ The result is [AI Hook Review](https://github.com/randomparity/ai-review-hook),
     ```yaml
     repos:
     -   repo: https://github.com/randomparity/ai-review-hook
-        rev: v0.2.0  # Replace with the desired tag or commit SHA
+        rev: v0.2.3
         hooks:
         -   id: ai-review
     ```
@@ -22,7 +24,7 @@ The result is [AI Hook Review](https://github.com/randomparity/ai-review-hook),
 
     ```yaml
     - repo: https://github.com/randomparity/ai-review-hook
-      rev: v0.2.0
+      rev: v0.2.3
       hooks:
         - id: ai-review
           name: AI Code Review
@@ -34,7 +36,9 @@ The result is [AI Hook Review](https://github.com/randomparity/ai-review-hook),
             - "--context-lines"
             - "5"
             - "--output-file"
-            - "ai-review.log"
+            - "ai-review.jsonl"
+            - "--format"
+            - "jsonl"
             - "--allow-unsafe-base-url"
             - "--base-url"
             - "https://openrouter.ai/api/v1"
@@ -89,18 +93,20 @@ The result is [AI Hook Review](https://github.com/randomparity/ai-review-hook),
 *   `--jobs`, `-j`: Number of parallel jobs for reviewing multiple files (default: 1)
 *   `--allow-unsafe-base-url`: Allow custom base URLs other than official OpenAI endpoints
 *   `--output-file`: File to save the complete review output
-*   `--format`: Output format: `text` (default), `json`, or `codeclimate`. `codeclimate` produces Code Climate-compatible JSON for GitLab/GitHub code-quality reports; `json` is machine-readable.
+*   `--format`: Output format: `text` (default), `json`, `jsonl`, or `codeclimate`. `codeclimate` produces Code Climate-compatible JSON for GitLab/GitHub code-quality reports; `json`/`jsonl` are machine-readable.
 *   `--include-files`: File patterns to include for review (e.g., '*.py' or '*.py,*.js'). Can be specified multiple times. If not specified, all files are included by default.
 *   `--exclude-files`: File patterns to exclude from review (e.g., '*.test.py' or '*.test.*,*.spec.*'). Can be specified multiple times. Exclude patterns take precedence over include patterns.
 *   `--no-default-excludes`: Disable the default exclude patterns for common non-reviewable files (e.g., lockfiles, vendored dependencies, minified assets).
 *   `--filetype-prompts`: Path to JSON file containing filetype-specific prompts. File should map glob patterns to custom prompt templates (e.g., `{"*.py": "Review this Python code...", "*.md": "Review this documentation...", "test_*.py": "Review this test file...", "src/**/*.js": "Review this JavaScript source..."}`)
+*   `--embed-json-in-log`: When using `--format text`, also embed a per-file JSON block between `=== AI_REVIEW_JSON_START ===` and `=== AI_REVIEW_JSON_END ===`.
 *   `-v`, `--verbose`: Enable verbose logging
 
 
 ### Output Formats
 
 - text (default): human-readable review summary suitable for local runs.
 - json: machine-readable array for scripting or tooling.
+- jsonl: machine-readable one-JSON-object-per-line; ideal for agents to stream/parse.
 - codeclimate: Code Climate-compatible JSON for GitHub/GitLab code-quality reports.
 
 Examples:
@@ -109,13 +115,16 @@ Examples:
 # Save JSON output to a file
 pre-commit run ai-review --all-files -- --format json --output-file ai-review.json
 
+# Save JSONL output to a file (agent-friendly)
+pre-commit run ai-review --all-files -- --format jsonl --output-file ai-review.jsonl
+
 # Generate a Code Climate report for CI
 pre-commit run ai-review --all-files -- --format codeclimate --output-file gl-code-quality-report.json
 
 # Example .pre-commit-config.yaml
 ```yaml
 - repo: https://github.com/randomparity/ai-review-hook
-  rev: v0.2.0
+  rev: v0.2.3
   hooks:
     - id: ai-review
       name: AI Code Review (Code Quality)
@@ -126,6 +135,16 @@ pre-commit run ai-review --all-files -- --format codeclimate --output-file gl-co
         - "gl-code-quality-report.json"
 ```
 
+### Embedded JSON in text logs (optional)
+
+Append a compact per-file JSON object into the text log, bracketed by sentinels:
+
+```bash
+pre-commit run ai-review --all-files -- \
+  --format text \
+  --output-file ai-review.log \
+  --embed-json-in-log
+```
 
 ## Security Features
 
@@ -244,7 +263,7 @@ pre-commit run ai-review --all-files -- \
 **Configuration in .pre-commit-config.yaml:**
 ```yaml
 - repo: https://github.com/randomparity/ai-review-hook
-  rev: v1.0.0
+  rev: v0.2.3
   hooks:
     - id: ai-review
       name: AI Code Review - Python Only
@@ -372,7 +391,7 @@ pre-commit run ai-review --all-files -- \
 **Pre-commit Configuration:**
 ```yaml
 - repo: https://github.com/randomparity/ai-review-hook
-  rev: v1.0.0
+  rev: v0.2.3
   hooks:
     - id: ai-review
       name: AI Code Review with Custom Prompts

diff --git a/WARP.md b/WARP.md
@@ -0,0 +1,98 @@
+# WARP.md
+
+This file provides guidance to WARP (warp.dev) when working with code in this repository.
+
+
+Project overview
+- Python package that provides a pre-commit hook and CLI for AI-assisted code review using the OpenAI API.
+- Entry point: ai_review_hook.main:main exposed as the ai-review console script.
+- Key capabilities: secret redaction, diff-only mode, file filtering via glob patterns, optional filetype-specific prompts, retry/backoff, and parallel review.
+
+Common commands
+Environment setup
+- Install dev deps (pytest, pre-commit, etc.):
+  - pip install -r requirements-dev.txt
+- Optional (recommended for local CLI testing): install the package in editable mode:
+  - pip install -e .
+
+Build, linting, tests (Makefile)
+- One-time setup (creates .venv, installs dev deps):
+  - make setup
+- Lint:
+  - make lint
+- Format code:
+  - make format
+- Typecheck and security scan:
+  - make typecheck
+  - make security
+- Run tests:
+  - make test
+- Full CI suite (what CI runs):
+  - make ci
+- Run all pre-commit hooks locally:
+  - pre-commit run -a
+
+Tests (single-file or single-test examples)
+- Run a specific test file:
+  - .venv/bin/pytest tests/test_main.py -q
+- Run a single test:
+  - .venv/bin/pytest tests/test_main.py::test_review_file_pass -q
+- Filter by keyword expression:
+  - .venv/bin/pytest -k "redact and not jwt"
+
+CLI and hook usage (local development)
+- Ensure an API key env var is set (defaults to OPENAI_API_KEY):
+  - export OPENAI_API_KEY={{OPENAI_API_KEY}}
+- After editable install, view CLI help:
+  - ai-review --help
+- Try the hook as pre-commit would execute it, using this repo as the source of the hook definition:
+  - pre-commit try-repo . ai-review --all-files --verbose --hook-stage commit -- --diff-only -v
+  Notes:
+  - The hook id is defined in .pre-commit-hooks.yaml (ai-review).
+  - Arguments after -- are passed to the hook (e.g., --model, --base-url, --filetype-prompts, etc.).
+- Typical consumer configuration (from README) to add to another repo’s .pre-commit-config.yaml:
+  - repos:
+    - repo: https://github.com/randomparity/ai-review-hook
+      rev: v0.2.3
+      hooks:
+        - id: ai-review
+
+Important repo-specific behavior and conventions
+- PASS/FAIL contract: The model must begin its first line with AI-REVIEW:[PASS] or AI-REVIEW:[FAIL]. The hook fails closed if markers are missing or a FAIL marker appears anywhere in the response.
+- Secret redaction: Before sending any content to the AI API, the tool redacts common secrets (AWS keys, GitHub tokens, JWTs, bearer tokens, DB URLs, private keys, generic API keys). Redaction happens for both diff and file content; binary files are detected and replaced with a placeholder to avoid exfiltration.
+- Diff handling: The tool pulls git diffs (staged first, falls back to unstaged) with configurable context lines. For large diffs, it extracts hunks and truncates with explicit markers.
+- File filtering: Include/exclude glob patterns are supported; exclude has precedence. Patterns apply to both full paths and basenames. Helper: parse_file_patterns([...]) normalizes comma-separated inputs.
+- Filetype-specific prompts: Optional JSON mapping of glob patterns to prompt templates. Matching priority: exact filename, then full-path globs, then extension globs, then basename globs (first match wins). Placeholders {filename}, {diff}, {content}, {diff_only_note} are supported. If no custom match, a comprehensive default prompt is used.
+- Parallelism: When --jobs > 1, files are reviewed concurrently with ThreadPoolExecutor; results are re-ordered to match input.
+- Retry/backoff: API errors considered retryable (rate limit, timeout, connection, some 5xx/422) trigger exponential backoff with jitter; capped by --max-retries and delay settings.
+- Output: Optionally writes a combined review log with per-file sections when --output-file is provided. Process exit code is nonzero if any file fails review.
+
+Structure and key files
+- src/ai_review_hook/main.py: CLI, argument parsing, AIReviewer class, redaction, diff/content handling, pattern parsing, prompts selection, retry/parallel orchestration, and program exit control.
+- src/ai_review_hook/__init__.py: version metadata.
+- .pre-commit-hooks.yaml: defines the ai-review hook for consumers.
+- .pre-commit-config.yaml: local dev hooks (ruff, ruff-format, and a local pytest hook which runs on commit).
+- tests/: unit tests covering redaction, prompt selection and glob priority, truncation, retries/backoff, and parallel execution.
+- pyproject.toml: project metadata; pytest and ruff configuration; console script entry point (ai-review).
+- .github/workflows/ci.yml: runs `make ci` on Python 3.12 and a Python 3.7 install/import smoke test on pushes/PRs to main.
+
+Key options to know (see README for full list)
+- --api-key-env: environment variable name for API key (default OPENAI_API_KEY)
+- --base-url: custom API base for compatible providers; requires --allow-unsafe-base-url if not an official OpenAI endpoint
+- --model: model identifier (default gpt-4o-mini)
+- --diff-only: send only the diff to the model (useful for sensitive repos)
+- --jobs/-j: parallel reviews
+- --filetype-prompts: JSON file mapping glob patterns to prompt templates
+- --max-diff-bytes / --max-content-bytes: size limits with truncation markers
+- --context-lines: git diff context size
+
+CI
+- GitHub Actions runs `make ci` on Python 3.12. Prefer the same Makefile targets locally before commit/push.
+
+Prohibited commands
+- Never run: git push --no-verify (do not bypass pre-commit or CI gates)
+- Never run: git commit --no-verify (do not bypass pre-commit or CI gates)
+
+Notes from README
+- Quick Start and usage examples for consumers are in README.md, including how to add this hook to a project, configure models/base URLs, filter files, enable parallelism, and use filetype-specific prompts with glob patterns.
+- Development setup in this repo: pip install -r requirements-dev.txt and pre-commit install.
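Review note: the PASS/FAIL contract described in WARP.md above (first line must carry a marker, and a FAIL marker anywhere fails the review) can be illustrated with a minimal sketch. The function name `review_passed` and the constants are hypothetical and are not taken from the repository; the real check in `AIReviewer` may differ in detail.

```python
PASS_MARKER = "AI-REVIEW:[PASS]"
FAIL_MARKER = "AI-REVIEW:[FAIL]"


def review_passed(response: str) -> bool:
    """Return True only for an unambiguous PASS; everything else fails closed.

    Hypothetical helper for illustration, not the project's actual code.
    """
    # A FAIL marker anywhere in the response overrides any PASS marker.
    if FAIL_MARKER in response:
        return False
    lines = response.strip().splitlines()
    first_line = lines[0].strip() if lines else ""
    # A missing or misplaced PASS marker is treated as a failure.
    return first_line.startswith(PASS_MARKER)
```

Failing closed here means an empty or malformed model response blocks the commit rather than silently passing it.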
diff --git a/src/ai_review_hook/formatters.py b/src/ai_review_hook/formatters.py
@@ -60,3 +60,18 @@ def format_as_codeclimate(
             codeclimate_issues.append(issue)
 
     return json.dumps(codeclimate_issues, indent=2)
+
+
+def format_as_jsonl(
+    all_reviews: List[Tuple[str, bool, str, Optional[List[Dict[str, Any]]]]],
+) -> str:
+    """Formats the review results as JSON Lines (one object per file)."""
+    lines = []
+    for filename, passed, _, findings in all_reviews:
+        record = {
+            "filename": filename,
+            "passed": passed,
+            "findings": findings if findings else [],
+        }
+        lines.append(json.dumps(record, ensure_ascii=False))
+    return "\n".join(lines)
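Review note: a consumer of the new `jsonl` format might read the output roughly as follows. This is a sketch under assumptions: the helper names `parse_review_jsonl` and `failed_files` are hypothetical, and the `message` key inside `findings` is made up for the sample, since the diff does not show the shape of a finding. Only the top-level keys `filename`, `passed`, and `findings` come from `format_as_jsonl`.

```python
import json
from typing import Any, Dict, Iterable, List


def parse_review_jsonl(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSON Lines records, one object per non-blank line."""
    records = []
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines and a missing trailing newline
            records.append(json.loads(line))
    return records


def failed_files(records: List[Dict[str, Any]]) -> List[str]:
    """Return the filenames whose review did not pass."""
    return [r["filename"] for r in records if not r["passed"]]


# Sample data in the same shape format_as_jsonl emits; the finding body is hypothetical.
sample = "\n".join(
    [
        json.dumps({"filename": "a.py", "passed": True, "findings": []}),
        json.dumps({"filename": "b.py", "passed": False, "findings": [{"message": "example"}]}),
    ]
)
records = parse_review_jsonl(sample.splitlines())
print(failed_files(records))  # prints ['b.py']
```

To read a real report, pass an open file handle instead, for example `parse_review_jsonl(open("ai-review.jsonl", encoding="utf-8"))`. Because `format_as_jsonl` joins records with `"\n"` and adds no trailing newline, the parser above accepts input with or without one.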
diff --git a/src/ai_review_hook/main.py b/src/ai_review_hook/main.py
@@ -8,13 +8,19 @@
 
 import argparse
 import concurrent.futures
+import json
 import logging
 import os
 import sys
 from typing import Dict, List, Optional, Tuple, Any
 
 from .reviewer import AIReviewer, DEFAULT_MODEL, DEFAULT_MAX_TOKENS, DEFAULT_TEMPERATURE
-from .formatters import format_as_text, format_as_json, format_as_codeclimate
+from .formatters import (
+    format_as_text,
+    format_as_json,
+    format_as_codeclimate,
+    format_as_jsonl,
+)
 from .utils import (
     should_review_file,
     parse_file_patterns,
@@ -99,10 +105,15 @@ def main() -> int:
     )
     parser.add_argument(
         "--format",
-        choices=["text", "codeclimate", "json"],
+        choices=["text", "codeclimate", "json", "jsonl"],
         default="text",
         help="Output format. 'text' is human-readable, 'codeclimate' is for GitLab/GitHub integration.",
     )
+    parser.add_argument(
+        "--embed-json-in-log",
+        action="store_true",
+        help="When writing text logs, also embed a per-file JSON object between sentinels.",
+    )
     parser.add_argument(
         "--max-retries",
         type=int,
@@ -304,6 +315,17 @@ def review_single_file(
 
 """
             review_log_entry += review
+            if args.embed_json_in_log:
+                per_file_json = {
+                    "filename": filename,
+                    "passed": passed,
+                    "findings": findings if findings else [],
+                }
+                review_log_entry += (
+                    "\n=== AI_REVIEW_JSON_START ===\n"
+                    + json.dumps(per_file_json, ensure_ascii=False)
+                    + "\n=== AI_REVIEW_JSON_END ===\n"
+                )
             all_reviews.append((filename, passed, review_log_entry, findings))
     else:
         # Parallel processing
@@ -364,6 +386,17 @@ def review_single_file(
 
 """
                 review_log_entry += review
+                if args.embed_json_in_log:
+                    per_file_json = {
+                        "filename": filename,
+                        "passed": passed,
+                        "findings": findings if findings else [],
+                    }
+                    review_log_entry += (
+                        "\n=== AI_REVIEW_JSON_START ===\n"
+                        + json.dumps(per_file_json, ensure_ascii=False)
+                        + "\n=== AI_REVIEW_JSON_END ===\n"
+                    )
                 all_reviews.append((filename, passed, review_log_entry, findings))
 
     # Generate output based on format
@@ -373,6 +406,8 @@ def review_single_file(
         output_content = format_as_json(all_reviews)
     elif args.format == "codeclimate":
         output_content = format_as_codeclimate(all_reviews)
+    elif args.format == "jsonl":
+        output_content = format_as_jsonl(all_reviews)
     else:
         # Should not happen due to argparse choices
         logging.error(f"Unknown format: {args.format}")
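Review note: the blocks written by `--embed-json-in-log` could be recovered from a text log with a small parser like the one below. This is a sketch, not part of the PR: `extract_embedded_json` is a hypothetical name, and it assumes each payload is the single-line `json.dumps` output placed between the two sentinels shown in the diff.

```python
import json
import re
from typing import Any, Dict, List

START = "=== AI_REVIEW_JSON_START ==="
END = "=== AI_REVIEW_JSON_END ==="

# Non-greedy match so that each START/END pair yields its own payload.
_BLOCK = re.compile(re.escape(START) + r"\n(.*?)\n" + re.escape(END), re.DOTALL)


def extract_embedded_json(log_text: str) -> List[Dict[str, Any]]:
    """Return every JSON object embedded between the sentinels, in log order."""
    return [json.loads(payload) for payload in _BLOCK.findall(log_text)]


# Build a sample log entry the same way the diff appends the block.
sample_log = (
    "Review text for a.py\n"
    + "\n" + START + "\n"
    + json.dumps({"filename": "a.py", "passed": True, "findings": []})
    + "\n" + END + "\n"
)
print(extract_embedded_json(sample_log))
```

One caveat with sentinel-based framing: if the model's review text itself happens to contain a sentinel string, a parser like this can be confused, so the `jsonl` format is likely the safer choice for fully automated consumers.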