Context
parakeet-stt-daemon/check_model.py has grown into a large mixed-responsibility file. We should defer structural refactor until current worktree efforts are complete, but capture this as planned technical debt.
Problem
check_model.py currently mixes CLI parsing, benchmark case loading, runtime execution (offline + stream-seal), metrics, gating, baseline IO, and reporting.
- The mixed responsibilities raise maintenance risk and make safe changes slower to land.
- There is currently no repo-wide check to catch other files crossing practical size/complexity thresholds.
Goal
Refactor the benchmark harness into smaller modules with no behavior regressions, and add repo-wide checks to surface oversized files early.
Scope (later, not now)
- Split check_model.py into focused modules (suggested):
  - benchmark/io.py (manifest/transcripts loading)
  - benchmark/metrics.py (WER/token/punctuation/thresholds)
  - benchmark/runtime.py (offline + stream-seal execution)
  - benchmark/reporting.py (JSON payload/baseline IO)
- Keep check_model.py as a thin CLI entrypoint.
- Add compatibility coverage so existing just eval and check_model.py CLI usage remain unchanged.
- Add broader checks to detect large/complex files across repo:
- Size threshold check (lines per file), initially warn-only.
- Optional complexity indicators (function length/cyclomatic) where practical.
- Integrate into the local quality flow (prek) and/or CI as non-blocking first.
- Document policy and thresholds in harness docs.
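To make the "thin CLI entrypoint" target concrete, here is a minimal sketch of what check_model.py could shrink to after the split. The benchmark module names follow the suggested layout above; the flags, function names, and delegation targets named in the comments are illustrative assumptions, not the current CLI surface.

```python
"""Hypothetical thin entrypoint: parsing stays here, everything else is delegated."""
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flags shown are placeholders; the real CLI must stay unchanged per the
    # compatibility requirement above.
    parser = argparse.ArgumentParser(description="Benchmark harness entrypoint")
    parser.add_argument("--mode", choices=["offline", "stream-seal"], default="offline")
    parser.add_argument("--baseline", help="Path to baseline JSON for gating")
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    # Delegation targets would follow the suggested split (names illustrative):
    #   benchmark.io.load_cases, benchmark.runtime.run_offline / run_stream_seal,
    #   benchmark.metrics.score, benchmark.reporting.write_report
    print(f"mode={args.mode}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Keeping argument parsing as the entrypoint's only responsibility makes each benchmark module independently testable.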
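The warn-only size threshold check could start as small as the sketch below. The 500-line threshold and the .py-only filter are placeholder assumptions; the agreed policy values belong in the harness docs per the last bullet above.

```python
"""Warn-only repo size check: report files exceeding a line threshold."""
from pathlib import Path

MAX_LINES = 500  # illustrative threshold, not an agreed policy value


def oversized_files(root: str, max_lines: int = MAX_LINES) -> list[tuple[str, int]]:
    """Return (path, line_count) for files over the threshold, largest first."""
    offenders = []
    for path in Path(root).rglob("*.py"):  # .py-only filter is an assumption
        try:
            count = sum(1 for _ in path.open("rb"))
        except OSError:
            continue  # unreadable files are skipped, never fatal
        if count > max_lines:
            offenders.append((str(path), count))
    return sorted(offenders, key=lambda item: -item[1])


if __name__ == "__main__":
    for name, count in oversized_files("."):
        print(f"WARN {name}: {count} lines (> {MAX_LINES})")
    # Always exits 0: non-blocking by design until the policy is promoted.
```

Because the script never exits non-zero, wiring it into prek or CI cannot break anyone's flow while thresholds are being calibrated.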
Acceptance Criteria
- Behavior parity verified by existing benchmark harness tests plus added regression tests for refactor boundaries.
- just eval flows remain stable (offline/stream compare + baseline calibrations).
- Repo-wide size check exists and reports offenders beyond configured thresholds.
- Documentation updated with rationale and maintenance path.
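One way to verify behavior parity across the refactor boundary is a golden-report comparison: capture a benchmark report before the split, then assert the post-split harness reproduces it. The helper below is a sketch under the assumption that the harness can emit its report as JSON; the tolerance value is illustrative.

```python
"""Parity check: compare a fresh report against a pre-refactor golden copy."""
import json
from pathlib import Path


def assert_report_parity(new_report: dict, golden_path: str, float_tol: float = 1e-9) -> None:
    """Keys and nesting must match exactly; floats within float_tol pass,
    so harmless serialization differences do not mask real metric drift."""
    golden = json.loads(Path(golden_path).read_text())
    _compare(new_report, golden, path="$", tol=float_tol)


def _compare(new, old, path, tol):
    if isinstance(old, dict):
        assert isinstance(new, dict) and new.keys() == old.keys(), f"{path}: keys differ"
        for key in old:
            _compare(new[key], old[key], f"{path}.{key}", tol)
    elif isinstance(old, list):
        assert isinstance(new, list) and len(new) == len(old), f"{path}: length differs"
        for i, (a, b) in enumerate(zip(new, old)):
            _compare(a, b, f"{path}[{i}]", tol)
    elif isinstance(old, float):
        assert abs(new - old) <= tol, f"{path}: {new} != {old}"
    else:
        assert new == old, f"{path}: {new} != {old}"
```

Running this once per refactor boundary (io, metrics, runtime, reporting) gives a regression test that fails loudly on any change in report shape or metric values.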
Notes
This issue is intentionally deferred until active worktree changes settle.