Commit 0f901c6

Phase 6: add run artifact validation and verifier CLI
1 parent db852a0 · commit 0f901c6

9 files changed

Lines changed: 397 additions & 2 deletions

CONTRIBUTING.md

Lines changed: 5 additions & 1 deletion
@@ -42,4 +42,8 @@ python -m unittest discover -s tests -p 'test_*.py'
    `--run-mode production` and verify preflight/metadata fields.
 4. For shared-checkout policy changes, run one manual task with
    `--run-mode production --require-release-tag` when feasible.
-5. Update docs/runbook when workflow or guardrails change.
+5. Validate one produced run artifact directory:
+   ```bash
+   python tools/verify_run_artifacts.py .task_runs/<run_id>
+   ```
+6. Update docs/runbook when workflow or guardrails change.

MIGRATION_PLAN.md

Lines changed: 6 additions & 0 deletions
@@ -10,6 +10,7 @@ This plan migrates behavioral task code out of the original monorepo into `RPi4_
 4. Phase 3: completed (experimental staging with opt-in execution).
 5. Phase 4: completed (production guardrails, runbook, and contributor boundaries).
 6. Phase 5: completed (optional tag-strict production lock and CI automation).
+7. Phase 6: completed (artifact integrity validation and verifier CLI).
 
 Inputs agreed during planning:
 - Priority: architectural cleanup over direct code copy.
@@ -185,6 +186,11 @@ Validation:
 2. Add CI automation for smoke/parity test coverage.
 3. Document stricter shared-Pi production commands.
 
+### Phase 6: Artifact Integrity and Operator Verification
+1. Add runtime validation of run artifacts (metadata, events, results).
+2. Add a CLI verification tool for existing run directories.
+3. Add tests covering valid and invalid artifact structures.
+
 ## Parity and Test Strategy
 ### Deterministic parity
 Use fixed seeds and fixed presets to compare:

README.md

Lines changed: 9 additions & 1 deletion
@@ -4,7 +4,7 @@ Behavioral task protocols for RPi4 behavior boxes, separated from hardware suppo
 
 ## Current status
 Phase 0 scaffolding plus Phase 1/2 baselines are in place, and Phase 3
-experimental staging is now wired. Phase 4/5 release controls are now available:
+experimental staging is now wired. Phase 4/5/6 release controls are now available:
 - Shared protocol contract and runtime modules.
 - Preflight branch/commit checks.
 - User/project namespace under `users/`.
@@ -20,6 +20,8 @@ experimental staging is now wired. Phase 4/5/6 release controls are now available:
 - Run metadata now records branch, tag, commit, dirty state, and run mode.
 - Shared-checkout operator runbook and contributor ownership guidance.
 - CI workflow runs smoke/parity tests on push and pull requests.
+- Automatic run-artifact structural validation after each task run.
+- Debug-only escape hatch: `--no-validate-artifacts`.
 
 ## Layout
 - `protocols/`: maintained shared protocol implementations.
@@ -89,3 +91,9 @@ Run tests:
 ```bash
 python -m unittest discover -s tests -p 'test_*.py'
 ```
+
+Validate an existing run directory:
+
+```bash
+python tools/verify_run_artifacts.py .task_runs/<run_id>
+```
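
The validator added in this commit expects each `events.jsonl` line to carry a `timestamp`, `event_type`, and object-valued `payload` (per `validate_event_record` below). A minimal sketch of a conforming record; the field names come from the validator, while the specific values (`trial_start`, `trial_index`) are illustrative, not taken from a real run:

```python
import json
from datetime import datetime, timezone

# One events.jsonl record in the shape the artifact validator checks:
# ISO-8601 timestamp string, non-empty event_type, object payload.
record = {
    "timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc).isoformat(),
    "event_type": "trial_start",       # illustrative event name
    "payload": {"trial_index": 0},     # illustrative payload
}

# Round-trip through JSON, as the validator does when reading the file.
line = json.dumps(record)
parsed = json.loads(line)

# The timestamp must parse with datetime.fromisoformat.
datetime.fromisoformat(parsed["timestamp"])
print(line)
```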

RUNBOOK_SHARED_CHECKOUT.md

Lines changed: 7 additions & 0 deletions
@@ -53,6 +53,13 @@ If production preflight fails:
 3. Move to an allowed release branch or tag.
 4. Re-run preflight in production mode.
 
+## Artifact integrity checks
+Runs are validated automatically after completion. To re-check an existing run:
+```bash
+python tools/verify_run_artifacts.py .task_runs/<run_id>
+```
+Note: production mode does not allow disabling artifact validation.
+
 ## Release cadence recommendation
 1. Validate on a staging Pi in debug mode.
 2. Tag the tested commit (`vX.Y.Z`).

run_task.py

Lines changed: 18 additions & 0 deletions
@@ -5,6 +5,7 @@
 from dataclasses import replace
 from pathlib import Path
 
+from runtime.artifact_validation import validate_run_directory
 from runtime.compatibility_layer import parse_cli_overrides, resolve_runtime_parameters
 from runtime.logging_schema import (
     RunMetadata,
@@ -82,6 +83,11 @@ def parse_args() -> argparse.Namespace:
             "(recommended on shared Pis)."
         ),
     )
+    parser.add_argument(
+        "--no-validate-artifacts",
+        action="store_true",
+        help="Skip run artifact validation (debug-only escape hatch).",
+    )
     return parser.parse_args()
 
 
@@ -114,6 +120,11 @@ def resolve_release_policy(require_release_tag: bool) -> ReleasePolicy:
     return replace(DEFAULT_RELEASE_POLICY, require_release_tag_in_production=True)
 
 
+def validate_runtime_options(run_mode: str, no_validate_artifacts: bool) -> None:
+    if run_mode == "production" and no_validate_artifacts:
+        raise ValueError("Artifact validation cannot be disabled in production mode.")
+
+
 def main() -> int:
     args = parse_args()
     repo_root = Path(__file__).resolve().parent
@@ -145,6 +156,7 @@ def main() -> int:
 
     require_confirmation = True if args.run_mode == "production" else (not args.yes)
     release_policy = resolve_release_policy(require_release_tag=args.require_release_tag)
+    validate_runtime_options(run_mode=args.run_mode, no_validate_artifacts=args.no_validate_artifacts)
 
     git_state = run_preflight(
         repo_root=repo_root,
@@ -181,6 +193,12 @@ def emit_event(event_type: str, payload: dict[str, object]) -> None:
     result = run_protocol(session=session, emit_event=emit_event)
     write_result(run_paths.result_path, result)
 
+    if not args.no_validate_artifacts:
+        validation_errors = validate_run_directory(run_paths.run_dir)
+        if validation_errors:
+            joined = "\n".join(f"- {error}" for error in validation_errors)
+            raise RuntimeError(f"Run artifact validation failed:\n{joined}")
+
     print(f"Run complete: {session.run_id}")
     print(f"Protocol: {session.protocol}")
     print(f"Output directory: {run_paths.run_dir}")
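
The production-mode guard above is small enough to exercise standalone. A sketch, with `validate_runtime_options` reproduced verbatim from the diff so its behavior can be tried outside the repo:

```python
# Reproduction of the guard added to run_task.py: production runs may
# never skip artifact validation; debug runs may opt out.
def validate_runtime_options(run_mode: str, no_validate_artifacts: bool) -> None:
    if run_mode == "production" and no_validate_artifacts:
        raise ValueError("Artifact validation cannot be disabled in production mode.")

# Debug mode may skip validation without error.
validate_runtime_options(run_mode="debug", no_validate_artifacts=True)

# Production mode with --no-validate-artifacts raises before any run starts.
try:
    validate_runtime_options(run_mode="production", no_validate_artifacts=True)
except ValueError as exc:
    print(exc)  # Artifact validation cannot be disabled in production mode.
```

Because the check runs in `main()` before preflight, a misconfigured production invocation fails fast rather than after hardware setup.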

runtime/artifact_validation.py

Lines changed: 207 additions & 0 deletions
@@ -0,0 +1,207 @@
+from __future__ import annotations
+
+from datetime import datetime
+import json
+from pathlib import Path
+from typing import Any
+
+
+def _is_iso_timestamp(value: str) -> bool:
+    try:
+        datetime.fromisoformat(value)
+    except ValueError:
+        return False
+    return True
+
+
+def _ensure_dict(value: Any, context: str) -> tuple[dict[str, Any] | None, list[str]]:
+    if isinstance(value, dict):
+        return value, []
+    return None, [f"{context} must be a JSON object."]
+
+
+def _load_json_file(path: Path) -> tuple[dict[str, Any] | None, list[str]]:
+    try:
+        with path.open("r", encoding="utf-8") as handle:
+            data = json.load(handle)
+    except FileNotFoundError:
+        return None, [f"Missing file: {path}"]
+    except json.JSONDecodeError as exc:
+        return None, [f"Invalid JSON in {path}: {exc}"]
+    return _ensure_dict(data, str(path))
+
+
+def validate_run_metadata(metadata: dict[str, Any]) -> list[str]:
+    errors: list[str] = []
+
+    required_string_fields = (
+        "run_id",
+        "protocol",
+        "preset",
+        "mouse_id",
+        "project",
+        "started_at",
+        "git_branch",
+        "git_commit",
+        "run_mode",
+    )
+    for field in required_string_fields:
+        value = metadata.get(field)
+        if not isinstance(value, str) or not value:
+            errors.append(f"run_metadata.{field} must be a non-empty string.")
+
+    if "git_tag" not in metadata:
+        errors.append("run_metadata.git_tag is required (string or null).")
+    else:
+        git_tag = metadata.get("git_tag")
+        if git_tag is not None and not isinstance(git_tag, str):
+            errors.append("run_metadata.git_tag must be a string or null.")
+
+    git_dirty = metadata.get("git_dirty")
+    if not isinstance(git_dirty, bool):
+        errors.append("run_metadata.git_dirty must be boolean.")
+
+    run_mode = metadata.get("run_mode")
+    if run_mode not in {"debug", "production"}:
+        errors.append("run_metadata.run_mode must be one of: debug, production.")
+
+    schema_version = metadata.get("schema_version")
+    if not isinstance(schema_version, int) or schema_version < 1:
+        errors.append("run_metadata.schema_version must be an integer >= 1.")
+
+    started_at = metadata.get("started_at")
+    if isinstance(started_at, str) and not _is_iso_timestamp(started_at):
+        errors.append("run_metadata.started_at must be an ISO-8601 timestamp.")
+
+    return errors
+
+
+def validate_event_record(record: dict[str, Any], line_number: int) -> list[str]:
+    errors: list[str] = []
+
+    timestamp = record.get("timestamp")
+    if not isinstance(timestamp, str) or not timestamp:
+        errors.append(f"events.jsonl line {line_number}: timestamp must be a non-empty string.")
+    elif not _is_iso_timestamp(timestamp):
+        errors.append(f"events.jsonl line {line_number}: timestamp must be ISO-8601.")
+
+    event_type = record.get("event_type")
+    if not isinstance(event_type, str) or not event_type:
+        errors.append(f"events.jsonl line {line_number}: event_type must be a non-empty string.")
+
+    payload = record.get("payload")
+    if not isinstance(payload, dict):
+        errors.append(f"events.jsonl line {line_number}: payload must be a JSON object.")
+
+    return errors
+
+
+def validate_result_payload(result: dict[str, Any]) -> list[str]:
+    errors: list[str] = []
+
+    protocol = result.get("protocol")
+    if not isinstance(protocol, str) or not protocol:
+        errors.append("result.protocol must be a non-empty string.")
+
+    preset = result.get("preset")
+    if not isinstance(preset, str) or not preset:
+        errors.append("result.preset must be a non-empty string.")
+
+    total_trials = result.get("total_trials")
+    if not isinstance(total_trials, int) or total_trials < 0:
+        errors.append("result.total_trials must be an integer >= 0.")
+
+    outcomes = result.get("outcomes")
+    if not isinstance(outcomes, list) or any(not isinstance(item, str) for item in outcomes):
+        errors.append("result.outcomes must be a list of strings.")
+
+    outcome_counts = result.get("outcome_counts")
+    if not isinstance(outcome_counts, dict):
+        errors.append("result.outcome_counts must be a JSON object.")
+    else:
+        for key, value in outcome_counts.items():
+            if not isinstance(key, str):
+                errors.append("result.outcome_counts keys must be strings.")
+            if not isinstance(value, int) or value < 0:
+                errors.append("result.outcome_counts values must be integers >= 0.")
+
+    if isinstance(total_trials, int) and isinstance(outcomes, list):
+        if len(outcomes) != total_trials:
+            errors.append(
+                f"result.total_trials ({total_trials}) must equal len(result.outcomes) ({len(outcomes)})."
+            )
+
+    if isinstance(total_trials, int) and isinstance(outcome_counts, dict):
+        count_sum = sum(value for value in outcome_counts.values() if isinstance(value, int))
+        if count_sum != total_trials:
+            errors.append(
+                f"result.total_trials ({total_trials}) must equal sum(result.outcome_counts.values()) "
+                f"({count_sum})."
+            )
+
+    return errors
+
+
+def validate_run_directory(run_dir: Path) -> list[str]:
+    errors: list[str] = []
+    if not run_dir.exists():
+        return [f"Run directory does not exist: {run_dir}"]
+    if not run_dir.is_dir():
+        return [f"Run path is not a directory: {run_dir}"]
+
+    metadata_path = run_dir / "run_metadata.json"
+    events_path = run_dir / "events.jsonl"
+    result_path = run_dir / "result.json"
+
+    metadata, metadata_errors = _load_json_file(metadata_path)
+    errors.extend(metadata_errors)
+    result, result_errors = _load_json_file(result_path)
+    errors.extend(result_errors)
+
+    if metadata is not None:
+        errors.extend(validate_run_metadata(metadata))
+        run_id = metadata.get("run_id")
+        if isinstance(run_id, str) and run_id and run_id != run_dir.name:
+            errors.append(f"run_metadata.run_id ({run_id}) must match run directory name ({run_dir.name}).")
+
+    if result is not None:
+        errors.extend(validate_result_payload(result))
+
+    if metadata is not None and result is not None:
+        metadata_protocol = metadata.get("protocol")
+        result_protocol = result.get("protocol")
+        if metadata_protocol != result_protocol:
+            errors.append("run_metadata.protocol must match result.protocol.")
+
+        metadata_preset = metadata.get("preset")
+        result_preset = result.get("preset")
+        if metadata_preset != result_preset:
+            errors.append("run_metadata.preset must match result.preset.")
+
+    if not events_path.exists():
+        errors.append(f"Missing file: {events_path}")
+    else:
+        line_count = 0
+        parsed_event_count = 0
+        with events_path.open("r", encoding="utf-8") as handle:
+            for line_count, raw_line in enumerate(handle, start=1):
+                line = raw_line.strip()
+                if not line:
+                    continue
+                try:
+                    record = json.loads(line)
+                except json.JSONDecodeError as exc:
+                    errors.append(f"events.jsonl line {line_count}: invalid JSON ({exc}).")
+                    continue
+                record_dict, record_errors = _ensure_dict(record, f"events.jsonl line {line_count}")
+                if record_dict is None:
+                    errors.extend(record_errors)
+                    continue
+                errors.extend(validate_event_record(record_dict, line_count))
+                parsed_event_count += 1
+        if line_count == 0:
+            errors.append("events.jsonl must contain at least one event record.")
+        elif parsed_event_count == 0:
+            errors.append("events.jsonl must contain at least one non-empty event record.")
+
+    return errors
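
The most task-specific part of `validate_result_payload` is its cross-field consistency check: `total_trials` must equal both `len(outcomes)` and `sum(outcome_counts.values())`. A standalone sketch of just those two checks; `check_result_consistency` is a hypothetical helper name for illustration, not a function in the module:

```python
# Sketch of the cross-field checks from validate_result_payload above.
def check_result_consistency(result: dict) -> list[str]:
    errors: list[str] = []
    total = result.get("total_trials")
    outcomes = result.get("outcomes", [])
    counts = result.get("outcome_counts", {})
    # total_trials must equal the number of recorded outcomes.
    if len(outcomes) != total:
        errors.append(f"total_trials ({total}) != len(outcomes) ({len(outcomes)})")
    # total_trials must also equal the sum of per-outcome counts.
    if sum(counts.values()) != total:
        errors.append(f"total_trials ({total}) != sum(outcome_counts) ({sum(counts.values())})")
    return errors

good = {"total_trials": 3, "outcomes": ["hit", "miss", "hit"],
        "outcome_counts": {"hit": 2, "miss": 1}}
bad = dict(good, outcomes=["hit", "miss"])  # one outcome dropped

print(check_result_consistency(good))  # []
print(check_result_consistency(bad))   # one length-mismatch error
```

Catching a dropped event at write time, rather than during later analysis, is the point of running these checks immediately after each task run.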
