fix: prevent action data loss from partial line reads in _read_action_log by voidborne-d · Pull Request #460 · 666ghj/MiroFish

voidborne-d · 2026-04-03T17:10:34Z

Problem

During active simulations, the monitor thread can permanently lose agent actions from the JSONL action log (twitter/actions.jsonl, reddit/actions.jsonl).

Root cause

_read_action_log() uses for line in f to iterate the log file while the simulation subprocess is actively writing to it. When the writer hasn't flushed a complete line yet (no trailing \n), Python's file iterator returns the partial data as a line. The sequence:

Writer appends: {"agent_id": 1, "action_type": "CRE (partial, no \n yet)
Reader iterates, gets partial line → json.loads fails → silently skipped (except json.JSONDecodeError: pass)
Reader returns f.tell() which is past the partial data
Writer completes: ATE_POST", "round": 1}\n
Next poll starts from the new position (mid-line), reads: ATE_POST", "round": 1}\n → json.loads fails again
❌ The action is permanently lost

This affects any simulation with concurrent write/read — which is every running simulation, since the monitor thread polls every 2 seconds while the subprocess writes actions continuously.

Additional issue

The original code also called f.tell() after a for line in f loop. In CPython, tell() is disabled inside a file iterator and raises OSError: telling position disabled by next() call. The current code only avoids this because f.tell() is called after the loop exits — but it reports the position at EOF, including any partial data.

Fix

Replace for line in f with f.readline() loop (which supports interleaved f.tell() calls)
Track safe_position: only advance after reading a complete line (ends with \n)
Partial lines (no trailing \n) cause the reader to stop and retry on the next poll — no data is skipped

# Before (buggy):
for line in f:
    # ... process line ...
return f.tell()  # may be past partial data!

# After (safe):
safe_position = position
while True:
    line = f.readline()
    if not line:
        break  # EOF
    if not line.endswith("\n"):
        break  # partial line — retry next poll
    safe_position = f.tell()
    # ... process line ...
return safe_position  # never past incomplete data

Tests

Added 16 tests covering:

Category	Tests
Partial line regression	`test_partial_line_not_consumed`, `test_partial_line_recovered_on_next_poll`, `test_multiple_partial_writes_no_loss`
Basic functionality	`test_empty_file`, `test_single_complete_action`, `test_multiple_actions`, `test_event_lines_parsed`, `test_simulation_end_event`, `test_incremental_reading`
Edge cases	`test_malformed_json_skipped`, `test_blank_lines_skipped`, `test_unicode_content`, `test_nonexistent_file`, `test_reddit_platform`, `test_only_partial_line_no_advance`
Concurrent simulation	`test_writer_mid_line_then_completes`

All 16 tests pass.

Changed files

backend/app/services/simulation_runner.py — fix _read_action_log() to use readline() + safe_position tracking
backend/tests/test_read_action_log.py — 16 new tests (with lightweight dependency stubs)
backend/tests/__init__.py — new test package

…_log When the simulation subprocess writes a JSONL action line, the monitor thread may read it before the write is flushed (no trailing newline). The old code used `for line in f` + `f.tell()` which advanced past the partial data. On the next poll the reader started mid-line, causing both halves to fail json.loads — the action was permanently lost. Root cause: `for line in f` returns partial lines at EOF (without \n) and then `f.tell()` reports a position past the incomplete data. Fix: - Switch from `for line in f` to `f.readline()` (which supports interleaved `f.tell()` calls) - Track `safe_position`: only advance after reading a complete line (one that ends with \n) - Partial lines (no trailing \n) cause the reader to stop and retry on the next poll cycle — no data is skipped Added 16 tests covering: - Partial line not consumed (core regression) - Partial line recovered on next poll - Multiple partial writes with no loss - Concurrent writer simulation - Basic action/event reading - Edge cases (malformed JSON, blank lines, Unicode, nonexistent file)

dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 3, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: prevent action data loss from partial line reads in _read_action_log#460

fix: prevent action data loss from partial line reads in _read_action_log#460
voidborne-d wants to merge 1 commit into666ghj:mainfrom
voidborne-d:fix/action-log-partial-line-data-loss

voidborne-d commented Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

voidborne-d commented Apr 3, 2026

Problem

Root cause

Additional issue

Fix

Tests

Changed files

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant