feat: v0.1.94 #1113

Merged
Henry-811 merged 4 commits into main from dev/v0.1.94
Jun 5, 2026
Collaborator

@ncrispino ncrispino commented Jun 5, 2026

PR Title Format

Your PR title must follow the format: <type>: <brief description>

Valid types:

  • fix: - Bug fixes
  • feat: - New features
  • breaking: - Breaking changes
  • docs: - Documentation updates
  • refactor: - Code refactoring
  • test: - Test additions/modifications
  • chore: - Maintenance tasks
  • perf: - Performance improvements
  • style: - Code style changes
  • ci: - CI/CD configuration changes

Examples:

  • fix: resolve memory leak in data processing
  • feat: add export to CSV functionality
  • breaking: change API response format
  • docs: update installation guide

Description

Brief description of the changes in this PR

Type of change

  • Bug fix (fix:) - Non-breaking change which fixes an issue
  • New feature (feat:) - Non-breaking change which adds functionality
  • Breaking change (breaking:) - Fix or feature that would cause existing functionality to not work as expected
  • Documentation (docs:) - Documentation updates
  • Code refactoring (refactor:) - Code changes that neither fix a bug nor add a feature
  • Tests (test:) - Adding missing tests or correcting existing tests
  • Chore (chore:) - Maintenance tasks, dependency updates, etc.
  • Performance improvement (perf:) - Code changes that improve performance
  • Code style (style:) - Changes that do not affect the meaning of the code (formatting, missing semi-colons, etc.)
  • CI/CD (ci:) - Changes to CI/CD configuration files and scripts

Checklist

  • I have run pre-commit on my changed files and all checks pass
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Pre-commit status

# Paste the output of running pre-commit on your changed files:
# uv run pre-commit install
# git diff --name-only HEAD~1 | xargs uv run pre-commit run --files # for last commit
# git diff --name-only origin/<base branch>...HEAD | xargs uv run pre-commit run --files # for all commits in PR
# git add <your file> # if any fixes were applied
# git commit -m "chore: apply pre-commit fixes"
# git push origin <branch-name>

How to Test

Add test method for this PR.

Test CLI Command

Write down the bash command used for testing. If there are any prerequisites, call them out.

Expected Results

Description/screenshots of expected results.

Additional context

Add any other context about the PR here.

Summary by CodeRabbit

Release Notes — v0.1.94

  • Bug Fixes

    • Fixed multiple concurrency race conditions in orchestrator event handling and snapshot management.
    • Improved robustness of mid-stream injection execution across runtime scenarios.
    • Enhanced snapshot publication to prevent read-during-write races via immutable versioned storage.
  • Performance

    • Moved snapshot copying off the orchestrator event loop to preserve concurrent streaming responsiveness.
  • Tests

    • Added comprehensive regression test suite covering concurrency correctness, snapshot versioning, and parallel execution safety.

ncrispino and others added 4 commits June 4, 2026 14:56
…fload, injection de-dup)

Implements the actionable scope of docs/dev_notes/next_version_eng_health_plan.md
(from two adversarially-verified audit workflows). All under TDD with cost-free
simulation (mock backends + real collaborator code, no LLM calls). Zero regressions:
the orchestrator characterization safety net + injection/restart/hooks suites stay green.

Correctness (concurrency races, all lock-free):
- R1: capture peer revision counts at injection-selection time and thread them
  through register/mark_seen (seen_counts=) so a peer revision published during the
  snapshot-copy await is not silently marked "seen" and stays injectable.
- R2/R3: consume only delivered subagent ids instead of a blind whole-key pop, so a
  background result appended during the await window survives.
- R4: cancel+await detached background trace tasks in ActiveCoordinationCleanup
  before flush, so they don't outlive the hard timeout.
- R5: gather cancelled background tasks before clearing the registry in
  cancel_all_subagents.
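The R2/R3 idea (consume only what was delivered) can be illustrated with a minimal sketch. The names `pending_results` and `consume_delivered` are hypothetical and are not taken from the MassGen codebase; the sketch only shows why removing delivered ids is safer than popping the whole key after an await.

```python
from collections import defaultdict

# Hypothetical registry: parent agent id -> list of (subagent_id, result) pairs.
pending_results: dict[str, list[tuple[str, str]]] = defaultdict(list)


def consume_delivered(parent_id: str, delivered_ids: set[str]) -> None:
    """Remove only the results that were actually delivered.

    A blind pending_results.pop(parent_id) would also discard any result
    appended while the caller was suspended at an await.
    """
    remaining = [
        (sub_id, result)
        for sub_id, result in pending_results.get(parent_id, [])
        if sub_id not in delivered_ids
    ]
    if remaining:
        pending_results[parent_id] = remaining
    else:
        pending_results.pop(parent_id, None)


# Ids selected for delivery before the await window.
pending_results["agent_a"].append(("sub_1", "done"))
delivered = {sub_id for sub_id, _ in pending_results["agent_a"]}

# A background result lands during the await window.
pending_results["agent_a"].append(("sub_2", "late result"))

# Only sub_1 is consumed; sub_2 survives for the next delivery.
consume_delivered("agent_a", delivered)
```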

Latency:
- B1: offload the blocking snapshot copy (rmtree/copytree/scrub) to asyncio.to_thread
  so it no longer stalls the event loop for other agents. Landed after R1/R2 per the
  critical sequencing (the now-yielding copy would otherwise expose those races).
- C2: loguru brace-style deferred formatting for the per-injection debug logs
  (no eager multi-KB f-string when no DEBUG sink). NOTE: logger is loguru, not stdlib.
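As a rough illustration of the B1 offload, the sketch below moves a blocking `rmtree`/`copytree` into `asyncio.to_thread` and runs a small heartbeat coroutine alongside it. The function names and directory layout are assumptions made for the example, not the project's actual `copy_snapshots_to_temp_workspace` implementation.

```python
import asyncio
import shutil
import tempfile
import threading
from pathlib import Path


def copy_snapshot_blocking(src: Path, dst: Path) -> str:
    """Blocking rmtree + copytree; returns the name of the thread it ran on."""
    if dst.exists():
        shutil.rmtree(dst)
    shutil.copytree(src, dst)
    return threading.current_thread().name


async def copy_snapshot(src: Path, dst: Path) -> str:
    # asyncio.to_thread runs the blocking call in the default executor,
    # so other coroutines can keep running while the copy is in progress.
    return await asyncio.to_thread(copy_snapshot_blocking, src, dst)


async def main() -> tuple[str, int]:
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "snapshot"
        src.mkdir()
        (src / "answer.txt").write_text("hello")
        dst = Path(tmp) / "workspace"

        ticks = 0

        async def heartbeat() -> None:
            # Stands in for another agent's streaming work on the same loop.
            nonlocal ticks
            for _ in range(3):
                ticks += 1
                await asyncio.sleep(0)

        worker_name, _ = await asyncio.gather(copy_snapshot(src, dst), heartbeat())
        assert (dst / "answer.txt").read_text() == "hello"
        return worker_name, ticks


worker_name, ticks = asyncio.run(main())
```

Because the copy now yields to the event loop, any state read before the `await` can change before it returns, which is why the commit message sequences this change after R1/R2.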

Refactor:
- A1: unify the two ~150-line near-duplicate get_injection_content closures into
  MidStreamInjectionHookInstaller.build_midstream_injection(..., native=); both setup
  paths delegate. Closes a backend-parity hazard. orchestrator.py 8,561 -> 8,422.

Reliability:
- D2: record + surface per-round worktree isolation degradation on AgentState
  instead of swallowing it into one log line.
- D3 (scoped): guard the changedoc enrichment so a filesystem error can't kill a
  valid-answer agent.

Hygiene:
- E1/E2: rewrite assertion-free test files (test_message_context_building,
  test_grok_backend) with real assertions verified against actual output.
- A3: correct stale orchestrator_refactor_roadmap status.
- B5: correct the now-false "workspace clears remove .massgen/" comment.

New tests: test_concurrency_race_fixes, test_snapshot_copy_offload,
test_answer_normalizer_debug_guard, test_midstream_injection_unified (60+ cases).

Deferred (documented in the plan): B2 incremental snapshot copy (needs B1 measured),
C3 save_agent_snapshot offload (would add a race on shared counters), B4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move the 5 remaining hook-installation methods out of orchestrator.py into the
MidStreamInjectionHookInstaller collaborator, completing the roadmap's deferred
Step 29. A1 (prior commit) was the callback-unification prerequisite; with the
two injection closures already unified, these now extract cleanly:

- _setup_hook_manager_for_agent (dispatcher)
- _setup_native_hooks_for_agent
- _setup_codex_mcp_hooks
- _setup_codex_hybrid_hooks
- _register_round_timeout_hooks

Each becomes a thin orchestrator delegator. Cross-method calls between the moved
methods route through orch._<delegator> to preserve test monkeypatch-safety.
_codex_mcp_hook_agents is still written on the orchestrator (a test reads it there).
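The delegator pattern described above might look roughly like the following. Both classes are hypothetical stand-ins (the real collaborator is `MidStreamInjectionHookInstaller`, whose internals are not shown in this PR text); the point is only that calls between moved methods go back through the orchestrator's method, so a test that patches the orchestrator still intercepts them.

```python
class HookInstallerSketch:
    """Hypothetical collaborator holding the extracted logic."""

    def __init__(self, orch: "OrchestratorSketch") -> None:
        self.orch = orch

    def setup_hook_manager_for_agent(self, agent_id: str) -> str:
        # Route through the orchestrator's delegator rather than calling
        # self.setup_native_hooks_for_agent directly, so a patch applied to
        # the orchestrator method still takes effect.
        return self.orch._setup_native_hooks_for_agent(agent_id)

    def setup_native_hooks_for_agent(self, agent_id: str) -> str:
        return f"native hooks for {agent_id}"


class OrchestratorSketch:
    def __init__(self) -> None:
        self._installer = HookInstallerSketch(self)

    # Thin delegators: the orchestrator keeps its method names, the logic moves out.
    def _setup_hook_manager_for_agent(self, agent_id: str) -> str:
        return self._installer.setup_hook_manager_for_agent(agent_id)

    def _setup_native_hooks_for_agent(self, agent_id: str) -> str:
        return self._installer.setup_native_hooks_for_agent(agent_id)


orch = OrchestratorSketch()
normal = orch._setup_hook_manager_for_agent("agent_a")

# A test-style monkeypatch on the orchestrator is still honored by the dispatcher.
orch._setup_native_hooks_for_agent = lambda agent_id: f"patched for {agent_id}"
patched = orch._setup_hook_manager_for_agent("agent_a")
```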

orchestrator.py 8,422 -> 7,910 lines. Removed the now-unused hook imports.

Validated: hooks/restart-and-external-tools/broadcast-subagents integration suites,
the 37-test characterization safety net, and the injection/decomposition/novelty
suites all green. Only pre-existing test_subagent_round_timeouts FileNotFound
failures remain (unrelated, confirmed identical on baseline).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…v0.1.94 release docs

Fixes the read-during-write race that the B1 event-loop offload exposed: the
offloaded peer-context copytree could overlap an owner's in-place rmtree+rebuild
of the same snapshot dir. Snapshots are now immutable and versioned --
save_snapshot (and the interrupted-turn save) publish <base>/.versions/<id>/v<N>
and atomically repoint the <base>/<id> symlink; readers acquire/refcount the
current version for their copy's duration. GC never deletes a pinned or in-flight
version. New SnapshotVersionStore coordinates publish/acquire/release/GC.
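A minimal sketch of the publish/acquire/release/GC cycle, assuming a POSIX filesystem where `os.replace` atomically renames a symlink and a single publisher per agent. `VersionStoreSketch` and its lock-based bookkeeping are illustrative assumptions and do not reproduce the actual `SnapshotVersionStore`.

```python
import os
import shutil
import tempfile
import threading
from collections import Counter
from pathlib import Path


class VersionStoreSketch:
    """Illustrative only; not the project's SnapshotVersionStore."""

    def __init__(self, base: Path) -> None:
        self.base = base
        self._lock = threading.Lock()  # assumption: a plain lock guards bookkeeping
        self._next = Counter()         # agent id -> last version number
        self._pins = Counter()         # resolved version path -> reader refcount

    def publish(self, agent_id: str, src: Path) -> Path:
        with self._lock:
            self._next[agent_id] += 1
            n = self._next[agent_id]
        version = self.base / ".versions" / agent_id / f"v{n}"
        shutil.copytree(src, version)  # a published version is never modified again
        link = self.base / agent_id
        tmp_link = self.base / f".{agent_id}.tmp"
        if tmp_link.is_symlink():
            tmp_link.unlink()
        os.symlink(version, tmp_link)
        os.replace(tmp_link, link)     # atomic repoint of <base>/<id> on POSIX
        self.gc(agent_id)
        return version

    def acquire(self, agent_id: str) -> Path:
        with self._lock:
            version = (self.base / agent_id).resolve()
            self._pins[version] += 1
            return version

    def release(self, version: Path) -> None:
        with self._lock:
            self._pins[version] -= 1

    def gc(self, agent_id: str) -> None:
        with self._lock:
            current = (self.base / agent_id).resolve()
            for version in (self.base / ".versions" / agent_id).iterdir():
                resolved = version.resolve()
                # Keep the current version and anything a reader still pins.
                if resolved != current and self._pins[resolved] <= 0:
                    shutil.rmtree(version)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "snapshots"
    base.mkdir()
    src = Path(tmp) / "work"
    src.mkdir()
    (src / "a.txt").write_text("one")

    store = VersionStoreSketch(base)
    store.publish("agent_a", src)
    pinned = store.acquire("agent_a")      # reader pins v1 for its copy
    (src / "a.txt").write_text("two")
    store.publish("agent_a", src)          # v2 published while v1 is pinned
    reader_saw = (pinned / "a.txt").read_text()
    current = (base / "agent_a" / "a.txt").read_text()
    v1_kept_while_pinned = pinned.exists()
    store.release(pinned)
    store.gc("agent_a")
    v1_deleted_after_release = not pinned.exists()
```

The reader keeps a stable view of the version it pinned while the symlink already points at the newer one, which is the property that removes the read-during-write overlap.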

Also includes the earlier review cleanups on this branch:
- D2: emit_status was called with an invalid status= kwarg whose TypeError was
  swallowed, so worktree-isolation degradation never surfaced -- now fixed.
- A1: the triplicated _wait_interrupt_provider closure consolidated into one
  _install_wait_interrupt_provider helper (backend-parity drift removed).
- Interrupted-turn save no longer rmtree's the (now symlinked) snapshot path.

All under TDD with red-verified regression tests:
- test_snapshot_version_store.py (concurrent-publish-during-read, concurrent-
  publisher GC, refcount-protects-from-GC, symlink fallback)
- test_snapshot_versioned_save.py (versioned save, interrupted-over-symlink,
  orchestrator-reader pin wiring)
- test_wait_interrupt_provider.py (consolidated provider contract)
- test_concurrency_race_fixes.py (D2 emit signature)

Release documentation for v0.1.94 "Parallelism Hardening" (engineering health):
CHANGELOG, README (Latest Features / Recent Achievements / TOC), ROADMAP,
docs/source/index.rst, architecture module doc, and the docs/announcements/
rotation (archive v0.1.93, rewrite current-release.md, swap github-release file).
Version bumped to 0.1.94. Updated the release-documenter SKILL.md to document the
announcements rotation + version bump (previously undocumented).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lease-documenter skill

The fresh-release-branch bootstrap (the `feat: v0.1.X` commit that starts each
dev branch) was partially skipped for dev/v0.1.94: the forward-looking roadmap
file was never rolled forward. Apply it now and document the step so it isn't
missed again.

- Rename ROADMAP_v0.1.94.md -> ROADMAP_v0.1.95.md and repoint its content at the
  next release (image/video edit #959 now planned for v0.1.95, deferred range
  bumped to v0.1.86-v0.1.94, v0.1.93/v0.1.94 added to Related Tracks).
  (The __version__ bump, the other half of the bootstrap, landed in the prior
  release-docs commit.)
- Document the previously-undocumented "Phase 0: Fresh Release Branch Bootstrap"
  (version bump + ROADMAP_v0.1.X.md -> ROADMAP_v0.1.X+1.md rename) in the
  release-documenter SKILL.md, including order list and validation checklist
  entries, derived from how dev/v0.1.93 started (commit 2deb48b).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 5, 2026 16:32
@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Caution

Review failed

An error occurred during the review process. Please try again later.

Walkthrough

This PR releases v0.1.94 "Parallelism Hardening," which hardens the orchestrator's concurrent execution. It introduces immutable snapshot versioning with atomic symlink repointing and refcount-based pinning, offloads snapshot-copy work from the event loop to worker threads, consolidates mid-stream injection hook setup across backends into a unified installer, and ships six targeted concurrency race fixes (peer-answer revision capture, subagent-result consumption filtering, background-task cleanup, isolation tracking, and changedoc enrichment).

Changes

v0.1.94 Parallelism Hardening Engineering Health Release

| Layer / File(s) | Summary |
| --- | --- |
| Release documentation and version bump: CHANGELOG.md, README.md, README_PYPI.md, ROADMAP.md, ROADMAP_v0.1.95.md, docs/announcements/archive/v0.1.93.md, docs/announcements/current-release.md, docs/announcements/github-release-v0.1.94.md, docs/dev_notes/next_version_eng_health_plan.md, docs/dev_notes/orchestrator_refactor_roadmap.md, docs/modules/architecture.md, docs/source/index.rst, massgen/__init__.py, massgen/skills/massgen-release-documenter/SKILL.md | Documentation updates reflecting v0.1.94 release: CHANGELOG entry detailing snapshot versioning, off-loop offloading, and race fixes; README/README_PYPI content refresh with v0.1.94 features and v0.1.95 roadmap; version constant bump; engineering health plan documenting implementation status, sequencing rationale, and correctness race details. |
| Immutable snapshot versioning infrastructure: massgen/filesystem_manager/_snapshot_version_store.py | New SnapshotVersionStore class providing per-base registry for immutable snapshot versions under .versions/<agent_id>/v<N> with atomic symlink repointing, refcount-based reader pinning, and GC protecting in-flight and pinned versions from deletion. |
| Snapshot version store test coverage: massgen/tests/test_snapshot_version_store.py | Comprehensive tests covering version publication/republishing, acquisition semantics, GC behavior, concurrent reader/publisher interleavings, and symlink unsupported fallback. |
| Snapshot publishing refactor and offloading: massgen/filesystem_manager/_filesystem_manager.py, massgen/orchestrator_collaborators/snapshot_manager.py | Refactored save_snapshot() to use SnapshotVersionStore.publish_version() for immutable publication; updated copy_snapshots_to_temp_workspace() to offload blocking filesystem work to asyncio.to_thread() worker thread; snapshot manager pins versions during peer-snapshot copying and rewrites stale paths in destination workspace. |
| Snapshot copy offloading validation: massgen/tests/test_snapshot_copy_offload.py | Regression tests validating snapshot copy operations run on worker threads (off event loop), event loop remains responsive during copy, functional semantics preserved including metadata exclusion, and stale workspace cleanup. |
| Snapshot versioned save integration tests: massgen/tests/test_snapshot_versioned_save.py | End-to-end integration tests for snapshot version publishing, GC cleanup, reader pinning/unpinning, interrupted-turn partial saves over published symlinks, and concurrent republish race handling. |
| Mid-stream injection hook consolidation: massgen/orchestrator_collaborators/midstream_injection_hook_installer.py | Major expansion: unified hook setup routing (Codex hybrid/native/MCP/fallback), backend-specific Codex MCP/hybrid/native wiring methods, round timeout hook registration with soft/hard coordinated hooks, and unified build_midstream_injection() callback consolidating per-backend injection construction, gating (disabled, cap, vote-only, deferred policy), snapshot pinning, revision-count capture, and checklist/context ordering enforcement. |
| Mid-stream injection unified behavior and wait interrupt tests: massgen/tests/test_midstream_injection_unified.py, massgen/tests/test_wait_interrupt_provider.py | Regression tests validating unified mid-stream injection behavior across native/non-native paths, load-bearing ordering invariants (context update before checklist refresh), early-exit conditions (disabled, vote-only, cap reached), and consolidated wait-interrupt provider contract (cancellation vs. runtime injection fallback). |
| Concurrency race fixes, peer answer visibility and result consumption: massgen/orchestrator.py, massgen/orchestrator_collaborators/peer_answer_visibility_tracker.py, massgen/orchestrator_collaborators/subagent_lifecycle_coordinator.py | R1 fix: peer "mark seen" now uses revision counts captured before async await (clamped to live counts, never regresses). R2/R3 fix: pending-result consumption removes only delivered subagent ids (not whole keys), preserving concurrent appends. Corresponding API extensions with optional seen_counts parameter. |
| Concurrency race fixes, task cleanup, isolation tracking, and changedoc enrichment: massgen/orchestrator.py, massgen/orchestrator_collaborators/active_coordination_cleanup.py, massgen/subagent/manager.py, massgen/orchestrator_collaborators/answer_text_normalizer.py | R4 fix: pre-flush cancellation of background trace tasks. R5 fix: cancel_all_subagents() awaits background task cancellation completion before clearing registry. D2 fix: AgentState records round isolation degradation with error text and emits status. D3 fix: centralized _attach_changedoc_to_latest_answer() helper swallows read failures. C2 fix: debug logging uses deferred Loguru formatting to avoid eager string construction. |
| Comprehensive concurrency race fixes regression test suite: massgen/tests/test_concurrency_race_fixes.py | Deterministic regression tests covering R1 (peer revision capture, fallback, clamping), R2/R3 (pending result consumption filtering and concurrent append simulation), R4 (trace task cleanup skipping already-done), R5 (cancel-all task await semantics), D2 (isolation degradation recording and status emission), D3 (changedoc attachment happy path and read-failure resilience). |
| Test infrastructure cleanup, debug guard and script conversion: massgen/tests/test_answer_normalizer_debug_guard.py, massgen/tests/test_grok_backend.py, massgen/tests/test_message_context_building.py | New debug-logging guard test validating deferred formatting behavior and answer path rewriting. Converted manual test scripts to proper pytest test cases with appropriate markers (@pytest.mark.live_api) and offline/live test separation. |
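The R1 behavior summarized in the table (captured counts, clamped to live counts, never regressing) can be sketched as follows. The module-level dictionaries and this `mark_seen` signature are assumptions for illustration; only the `seen_counts` parameter name comes from the PR text.

```python
from typing import Optional

# Hypothetical tracker state: live revision count per peer, and the
# revision count each agent has been marked as having seen.
live_counts: dict[str, int] = {"peer_b": 1}
seen: dict[str, dict[str, int]] = {"agent_a": {}}


def mark_seen(
    agent_id: str,
    peer_id: str,
    seen_counts: Optional[dict[str, int]] = None,
) -> None:
    live = live_counts.get(peer_id, 0)
    # Fall back to the live count when no captured counts were supplied.
    captured = live if seen_counts is None else seen_counts.get(peer_id, live)
    clamped = min(captured, live)                      # never claim more than exists
    previous = seen[agent_id].get(peer_id, 0)
    seen[agent_id][peer_id] = max(previous, clamped)   # never regress


# Capture at injection-selection time, before the snapshot-copy await.
captured = dict(live_counts)

# peer_b publishes revision 2 while the copy is awaited.
live_counts["peer_b"] = 2

# Marking with the captured counts records revision 1 only,
# so revision 2 stays injectable.
mark_seen("agent_a", "peer_b", seen_counts=captured)
has_unseen = seen["agent_a"]["peer_b"] < live_counts["peer_b"]
```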

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • massgen/MassGen#1053: Directly connected via refactored mid-stream injection/Codex hook wiring and the unified wait-interrupt provider installation.
  • massgen/MassGen#896: Related via changedoc attachment workflow refactoring in orchestrator (this PR centralizes attachment; the related PR implements broader changedoc integration).
  • massgen/MassGen#964: Related via snapshot lifecycle changes in save_snapshot() code path (the related PR changes publication behavior; this PR hardens versioning and offloading).

Suggested reviewers

  • a5507203
Contributor

Copilot AI left a comment

Pull request overview

This PR cuts the v0.1.94 release (“Parallelism Hardening / Engineering Health”) by updating release documentation and shipping a set of orchestrator correctness + concurrency improvements (offloading blocking snapshot-copy work, hardening snapshot storage against read/write races, fixing several yield-window races, and unifying mid-stream injection behavior across backends).

Changes:

  • Bumps version + updates release/roadmap/readme/announcement docs for v0.1.94 (and rolls the forward-looking roadmap to v0.1.95).
  • Moves snapshot copy work off the event loop and introduces immutable, versioned snapshots to eliminate read-during-write corruption/races.
  • Fixes multiple concurrency/teardown races and adds substantial regression test coverage for the new behavior.
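One of the teardown fixes (R5, awaiting cancelled background tasks before clearing the registry) follows a common asyncio pattern, sketched below. The registry and function names are hypothetical and do not mirror `cancel_all_subagents` itself.

```python
import asyncio

# Hypothetical registry of detached background tasks.
background_tasks: dict[str, asyncio.Task] = {}
cleanup_log: list[str] = []


async def worker(name: str) -> None:
    try:
        await asyncio.sleep(3600)
    finally:
        cleanup_log.append(name)  # runs while the cancelled task unwinds


async def cancel_all() -> None:
    tasks = list(background_tasks.values())
    for task in tasks:
        task.cancel()
    # Wait for every task to finish unwinding. return_exceptions=True collects
    # each task's CancelledError instead of raising it out of gather.
    await asyncio.gather(*tasks, return_exceptions=True)
    # Only now is it safe to drop the references.
    background_tasks.clear()


async def main() -> None:
    for name in ("sub_1", "sub_2"):
        background_tasks[name] = asyncio.create_task(worker(name))
    await asyncio.sleep(0)  # let the workers start before cancelling
    await cancel_all()


asyncio.run(main())
```

Clearing the registry without the `gather` would drop the only references to tasks that are still running their cleanup code.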

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| ROADMAP.md | Bump current version/date; move deferred item to v0.1.95; add v0.1.94 completed section. |
| ROADMAP_v0.1.95.md | Rename/update roadmap content for v0.1.95; add related track notes. |
| README.md | Update “Latest Features”, install version, achievements, and roadmap anchors for v0.1.94/v0.1.95. |
| README_PYPI.md | Sync README changes for PyPI presentation (v0.1.94 updates). |
| CHANGELOG.md | Add v0.1.94 release entry describing concurrency hardening + tests. |
| massgen/__init__.py | Bump __version__ to 0.1.94. |
| docs/source/index.rst | Add v0.1.94 entry to “Recent Releases”. |
| docs/modules/architecture.md | Document immutable/versioned snapshot design + caveats. |
| docs/dev_notes/orchestrator_refactor_roadmap.md | Update refactor roadmap status for hook installer extraction/unification. |
| docs/dev_notes/next_version_eng_health_plan.md | Add engineering health plan writeup (new doc). |
| docs/announcements/github-release-v0.1.94.md | Add GitHub release highlights for v0.1.94 (new). |
| docs/announcements/github-release-v0.1.93.md | Remove prior release highlights file. |
| docs/announcements/current-release.md | Update current release announcement content to v0.1.94. |
| docs/announcements/archive/v0.1.93.md | Archive v0.1.93 announcement (new). |
| massgen/subagent/manager.py | Await cancelled background tasks before clearing registry (R5). |
| massgen/orchestrator.py | Delegate hook setup to collaborator; add revision-count capture plumbing; add D2/D3 helpers; integrate subagent consume + worktree degradation surfacing. |
| massgen/orchestrator_collaborators/subagent_lifecycle_coordinator.py | Add consumed-id based pending-result consumption (R2/R3). |
| massgen/orchestrator_collaborators/snapshot_manager.py | Acquire/release pinned snapshot versions for safe copying; publish interrupted-turn snapshots via version store. |
| massgen/orchestrator_collaborators/peer_answer_visibility_tracker.py | Support captured revision counts to prevent “seen” drift across await (R1). |
| massgen/orchestrator_collaborators/midstream_injection_hook_installer.py | Centralize hook setup, unify midstream injection callback, dedupe background-wait interrupt provider. |
| massgen/orchestrator_collaborators/answer_text_normalizer.py | Switch debug logging to deferred formatting (avoid eager formatting cost). |
| massgen/orchestrator_collaborators/active_coordination_cleanup.py | Cancel/await detached trace tasks during cleanup (R4). |
| massgen/filesystem_manager/_snapshot_version_store.py | New immutable/versioned snapshot publisher + pinning + GC. |
| massgen/filesystem_manager/_filesystem_manager.py | Publish snapshots via version store; offload snapshot copy to worker thread via asyncio.to_thread. |
| massgen/tests/test_wait_interrupt_provider.py | Tests for consolidated background-wait interrupt provider. |
| massgen/tests/test_snapshot_versioned_save.py | Integration tests for versioned snapshot publish + pinning behavior. |
| massgen/tests/test_snapshot_version_store.py | Unit tests for version store versioning/pinning/GC + concurrency scenarios. |
| massgen/tests/test_snapshot_copy_offload.py | Regression tests ensuring snapshot copy is offloaded and loop remains responsive. |
| massgen/tests/test_midstream_injection_unified.py | Regression tests asserting unified injection behavior/order parity across paths. |
| massgen/tests/test_message_context_building.py | Convert print-based “tests” into real assertions. |
| massgen/tests/test_grok_backend.py | Convert print-based “tests” into offline assertions + gated live-api tests. |
| massgen/tests/test_concurrency_race_fixes.py | Regression suite for multiple concurrency races + D2/D3 behavior. |
| massgen/tests/test_answer_normalizer_debug_guard.py | Tests ensuring debug messages are not eagerly formatted without a DEBUG sink. |

Comment thread massgen/filesystem_manager/_snapshot_version_store.py
@Henry-811 Henry-811 merged commit cb2aef7 into main Jun 5, 2026
22 checks passed
4 participants