UI stuck at 10% 'Analyzing' while backend completes successfully - task_logs.json parsing failures

## Summary

The Electron UI shows a task stuck at 10% progress on "Analyzing project structure..." for 24+ hours, while the Python backend actually completed all work successfully (8/8 subtasks, QA approved, ready to merge). The UI never updates because `TaskLogService` hits JSON parsing errors when it reads `task_logs.json` while the backend is still writing it. Each failure is logged to the terminal but never surfaced in the UI, so the progress display stays frozen.

This leads users to believe tasks are stuck and delete completed work.

## Environment

- **Auto-Claude UI Version**: v2.3.0
- **Latest Commit**: `a20b8cf` (Enhance task status handling...)
- **OS**: macOS 14.6.0 (Darwin 24.6.0)
- **Node.js**: v24.11
- **Python**: 3.14 (in venv)
- **Project**: Medium-sized React/TypeScript project

## Problem Description

### What the User Saw

1. Started a new task via Auto-Claude UI
2. Progress bar showed "Analyzing project structure... 10%"
3. Progress remained at 10% for ~24 hours despite high CPU activity
4. Terminal showed repeated JSON parsing errors in `task_logs.json`
5. User assumed task was stuck and deleted it
6. **Actual state**: All work was complete, QA approved, ready to merge

### What Actually Happened (Backend)

The Python backend (`spec_runner.py` → `run.py` → agents) completed successfully:

- ✅ All 8 subtasks completed
- ✅ 90/90 unit tests passing
- ✅ TypeScript build: SUCCESS
- ✅ Type check: SUCCESS  
- ✅ Lint: 0 errors
- ✅ QA validation: APPROVED at 2025-12-18T16:17:00Z
- ✅ 11 commits on feature branch
- ✅ Status in `implementation_plan.json`: `"qa_signoff": { "status": "approved" }`

### Root Cause

The Electron UI's `TaskLogService` (`src/main/task-log-service.ts`) watches `task_logs.json` and attempts to parse it every few seconds. During active agent sessions, the file is being written to concurrently, resulting in **transient malformed JSON**.

**Terminal errors observed:**
```
[TaskLogService] Failed to load logs from .../task_logs.json: 
  SyntaxError: Expected ',' or '}' after property value in JSON at position 128794 (line 584 column 7712)

[TaskLogService] Failed to load logs from .../task_logs.json:
  SyntaxError: Expected ',' or '}' after property value in JSON at position 346013 (line 1984 column 8331)

[TaskLogService] Failed to load logs from .../task_logs.json:
  SyntaxError: Unexpected end of JSON input
```

When parsing fails, `TaskLogService.loadLogsFromPath()` catches the error, logs it, and **returns `null`**. The UI receives no updates, and the progress bar freezes at its last known state (10%).
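The failure mode can be reproduced in isolation. The sketch below assumes the writer overwrites the file in place, so a reader can observe a prefix of the final document. The payload shape and the `tryParse` helper are hypothetical stand-ins; the real `TaskLogs` schema is not shown in this issue.

```typescript
// Hypothetical payload; the real task_logs.json schema is not shown in this issue.
const complete: string = JSON.stringify({
  entries: [
    { phase: "planning", message: "Analyzing project structure..." },
    { phase: "coding", message: "subtask-1-1 completed" },
  ],
});

// Simulate a reader that sees the file before the writer has finished.
const midWrite: string = complete.slice(0, complete.length - 20);

// Mirrors the try/catch shape of loadLogsFromPath: null on any parse error.
function tryParse(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return null;
  }
}

const partialResult = tryParse(midWrite);
const completeResult = tryParse(complete);
console.log(partialResult === null, completeResult !== null); // true true
```

Any strict prefix of a JSON object is invalid because the closing `}` is the last character, so every mid-write read of a single-object log file is expected to fail this way.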

## Expected Behavior

1. **UI should show task progress** even if `task_logs.json` has transient parsing errors
2. **Progress should advance** as phases complete (10% → 25% → 50% → 75% → 100%)
3. **Final status should reflect QA approval**, not show "stuck at Analyze 10%"
4. **User should see "Ready to Merge"** notification when QA approves

## Actual Behavior

1. ✗ UI shows "Analyzing... 10%" indefinitely
2. ✗ Progress bar never advances past 10%
3. ✗ No indication that backend completed successfully
4. ✗ User deletes what they believe is a stuck task (losing completed work)

## Technical Details

### File Sizes at Completion

```
task_logs.json: 884,531 bytes (864 KB)
Lines: 6,432
```

### Progress Tracking Logic

From `auto-claude-ui/src/main/agent/agent-process.ts:216-221`:

```typescript
// Reset phase progress on phase change, otherwise increment
if (phaseChanged) {
  phaseProgress = 10; // Start new phase at 10%
} else {
  phaseProgress = Math.min(90, phaseProgress + 5); // Increment within phase
}
```

**Problem**: When the first phase ("Analyze") starts, progress is set to 10%. If every later update is dropped because `task_logs.json` fails to parse, the displayed value never moves past that initial 10%.

### JSON Parsing Logic

From `auto-claude-ui/src/main/task-log-service.ts:43-50`:

```typescript
try {
  const content = readFileSync(logFile, 'utf-8');
  const logs = JSON.parse(content) as TaskLogs;
  return logs;
} catch (error) {
  console.error(`[TaskLogService] Failed to load logs from ${logFile}:`, error);
  return null; // ← Failure is only logged; the UI gets no signal
}
```

**Problem**: No retry logic, no incremental parsing, no fallback to last known good state.

## Steps to Reproduce

1. Create a new task on a medium-to-large project (triggers long-running spec creation)
2. Observe terminal output for `[TaskLogService]` JSON parsing errors
3. Note that UI progress remains at "Analyzing... 10%" despite errors
4. Let the task run to completion (this can take hours for complex projects)
5. Check `implementation_plan.json`: it will show `"qa_signoff": { "status": "approved" }`
6. The UI still shows "Analyzing... 10%", so what the UI reports no longer matches the actual task state

## Impact

- **Critical**: Users delete completed work thinking tasks are stuck
- **Wasted compute**: Hours of Claude API usage discarded
- **Loss of trust**: Users lose confidence in Auto-Claude's reliability
- **Poor UX**: No visibility into actual backend progress

## Suggested Fixes

### 1. Graceful JSON Parsing (Short-term)

```typescript
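// task-log-service.ts
// Assumes a new field on the class, keyed by file so tasks do not share a cache:
//   private lastKnownGood = new Map<string, TaskLogs>();
private loadLogsFromPath(logFile: string): TaskLogs | null {
  try {
    const content = readFileSync(logFile, 'utf-8');
    const logs = JSON.parse(content) as TaskLogs;
    this.lastKnownGood.set(logFile, logs); // Cache last good parse for this file
    return logs;
  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow before reading `.message`
    const reason = error instanceof Error ? error.message : String(error);
    console.warn(`[TaskLogService] JSON parse failed, using cached state:`, reason);
    return this.lastKnownGood.get(logFile) ?? null; // Fall back to cache
  }
}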
```

### 2. File Locking or Atomic Writes (Medium-term)

Ensure Python agents write to `task_logs.json.tmp`, then atomically rename to `task_logs.json` to prevent reading mid-write.
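A minimal sketch of the write-then-rename pattern follows. The backend in this issue is Python, where `os.replace` provides the same rename step; TypeScript is used here only to stay consistent with the other snippets. `writeJsonAtomically` is a hypothetical name, and the single-step replacement is assumed to hold on POSIX filesystems when both paths live on the same volume.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Write the full payload to a sibling temp file, then rename it over the target.
// On POSIX, rename within one filesystem swaps the target in a single step, so a
// reader sees either the old file or the new one, never a partial write.
function writeJsonAtomically(targetPath: string, data: unknown): void {
  const tmpPath = `${targetPath}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(data, null, 2), "utf-8");
  fs.renameSync(tmpPath, targetPath);
}

// Usage: write twice into a scratch directory and read the result back.
const scratchDir = fs.mkdtempSync(path.join(os.tmpdir(), "task-logs-"));
const logFile = path.join(scratchDir, "task_logs.json");
writeJsonAtomically(logFile, { entries: [] });
writeJsonAtomically(logFile, { entries: [{ message: "Analyzing project structure..." }] });

const reloaded = JSON.parse(fs.readFileSync(logFile, "utf-8")) as { entries: unknown[] };
console.log(reloaded.entries.length); // 1
```

The temp file sits next to the target on purpose: a rename across filesystems is not a single-step operation.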

### 3. Streaming JSON or NDJSON Format (Long-term)

Replace single JSON object with newline-delimited JSON (NDJSON) where each log entry is a separate line. Allows reading partial files without parsing errors.
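To illustrate why NDJSON tolerates partial reads, here is a sketch of a reader that keeps every complete line and skips whatever does not parse. The `LogEntry` fields and the `parseNdjson` name are assumptions for illustration only.

```typescript
// Hypothetical entry shape; the real log entry fields are not shown in this issue.
interface LogEntry {
  phase: string;
  message: string;
}

// Parse newline-delimited JSON. Lines that fail to parse are skipped; in practice
// that is usually the last line, which the writer has not finished yet.
function parseNdjson(text: string): LogEntry[] {
  const entries: LogEntry[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    try {
      entries.push(JSON.parse(line) as LogEntry);
    } catch {
      // Incomplete line: ignore it now, the next poll will see it completed.
    }
  }
  return entries;
}

const snapshot =
  '{"phase":"planning","message":"Analyzing project structure..."}\n' +
  '{"phase":"coding","message":"subtask-1-1 completed"}\n' +
  '{"phase":"coding","mess'; // writer interrupted mid-line

const parsed = parseNdjson(snapshot);
console.log(parsed.length); // 2
```

With this format a mid-write read costs at most the newest entry, instead of the whole file as with a single JSON object.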

### 4. Direct IPC from Python Backend (Long-term)

Instead of polling file changes, have Python agents send progress updates via IPC/WebSocket directly to Electron main process.
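One lightweight form of this, sketched under the assumption that the backend prints one JSON progress message per line to stdout, is a decoder on the Electron side that reassembles lines from arbitrary chunk boundaries. The `BackendProgress` shape and `ProgressLineDecoder` class are hypothetical.

```typescript
// Hypothetical message shape for progress events emitted by the backend.
interface BackendProgress {
  phase: string;
  percent: number;
}

function isBackendProgress(value: unknown): value is BackendProgress {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record["phase"] === "string" && typeof record["percent"] === "number";
}

// Accumulates raw output chunks and returns one event per complete, valid line.
class ProgressLineDecoder {
  private buffer = "";

  push(chunk: string): BackendProgress[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is "" or an unfinished line; keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    const events: BackendProgress[] = [];
    for (const line of lines) {
      if (line.trim() === "") continue;
      try {
        const candidate: unknown = JSON.parse(line);
        if (isBackendProgress(candidate)) events.push(candidate);
      } catch {
        // Ordinary log output rather than a progress message; ignore it.
      }
    }
    return events;
  }
}

const decoder = new ProgressLineDecoder();
const first = decoder.push('{"phase":"planning","percent":10}\n{"phase":"cod');
const second = decoder.push('ing","percent":25}\n');
console.log(first.length, second.length); // 1 1
```

In the main process this could be fed from the child process's stdout `data` events, which removes the shared file from the progress path entirely.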

### 5. UI Fallback to `implementation_plan.json` (Short-term)

If `task_logs.json` fails to parse, read `implementation_plan.json` to determine actual phase/subtask status:

```typescript
// Check subtask completion percentage (assumes `phases` was read from implementation_plan.json)
const subtasks = phases.flatMap(p => p.subtasks);
const completed = subtasks.filter(s => s.status === 'completed').length;
const total = subtasks.length;
const progress = total === 0 ? 0 : Math.round((completed / total) * 100); // avoid NaN before any subtasks exist
```
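The same calculation can be extended to cover the QA sign-off, so that an approved task reports as finished. This is a sketch, not the project's implementation: `phases`, `subtasks`, `status`, and `qa_signoff` follow the snippets in this issue, while the interfaces, the `progressFromPlan` helper, and the `"pending"` status value are assumptions.

```typescript
// Field names follow the snippets in this issue; the rest of the schema is assumed.
interface Subtask {
  status: string;
}
interface Phase {
  subtasks: Subtask[];
}
interface ImplementationPlan {
  phases: Phase[];
  qa_signoff?: { status: string };
}
interface PlanProgress {
  percent: number;
  readyToMerge: boolean;
}

// Hypothetical helper: derive UI progress from the parsed plan file.
function progressFromPlan(plan: ImplementationPlan): PlanProgress {
  const subtasks = plan.phases.flatMap((p) => p.subtasks);
  const completed = subtasks.filter((s) => s.status === "completed").length;
  const percent = subtasks.length === 0 ? 0 : Math.round((completed / subtasks.length) * 100);
  const readyToMerge = plan.qa_signoff?.status === "approved";
  // An approved sign-off overrides the subtask count and reports 100%.
  return { percent: readyToMerge ? 100 : percent, readyToMerge };
}

const samplePlan: ImplementationPlan = {
  phases: [
    { subtasks: [{ status: "completed" }, { status: "completed" }] },
    { subtasks: [{ status: "completed" }, { status: "pending" }] },
  ],
};

const inProgress = progressFromPlan(samplePlan);
const approved = progressFromPlan({ ...samplePlan, qa_signoff: { status: "approved" } });
console.log(inProgress.percent, approved.readyToMerge); // 75 true
```

Because `implementation_plan.json` is much smaller than the log file, it is a plausible secondary source, though it would need the same mid-write protection if it is also rewritten in place.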

## Evidence

### Terminal Output Showing Repeated Errors

```
[TaskLogService] Failed to load logs from .../task_logs.json: SyntaxError: Expected ',' or '}' after property value in JSON at position 128794 (line 584 column 7712)
[TaskLogService] Failed to load logs from .../task_logs.json: SyntaxError: Expected ',' or '}' after property value in JSON at position 346013 (line 1984 column 8331)
[TaskLogService] Failed to load logs from .../task_logs.json: SyntaxError: Unexpected end of JSON input
```

*(Repeated dozens of times during task execution)*

### Worktree Contents Prove Completion

```bash
$ git log --oneline auto-claude/001-example-feature --not main | head -5
abc1234 qa: Sign off - all verification passed
def5678 auto-claude: subtask-4-2 - Export new utility functions
ghi9012 fix: Restore implementation plan status after accidental reset
jkl3456 auto-claude: subtask-4-1 - Run full test suite and fix any failures
mno7890 auto-claude: subtask-3-2 - Visual verification and edge case testing
```

### QA Report Confirms Approval

From `.auto-claude/specs/.../qa_report.md`:

```markdown
## Verdict

**SIGN-OFF**: APPROVED ✓

**Reason**:
All automated verification steps pass:
- All 8 implementation subtasks completed
- 90/90 unit tests passing for new functions
- TypeScript compilation succeeds with no errors
- Lint passes with 0 errors
- Production build succeeds
```

## Related Issues

I searched existing issues and found no duplicate of this specific problem (UI stuck at 10% with JSON parsing failures).

Possibly related:
- #38 - Planning phase failures (validation mismatch, but different symptom)
- #11 - Tasks skip to human review (missing auth, but different symptom)

## Workaround (For Users Who Hit This)

If your UI shows a task stuck at "Analyzing... 10%" for hours:

1. **Check the actual status** in the worktree:
   ```bash
   cat .worktrees/{task-name}/.auto-claude/specs/{task-name}/implementation_plan.json | grep "qa_signoff" -A 5
   ```

2. **If QA approved**, the work is done! Merge it:
   ```bash
   git checkout main
   git merge auto-claude/{task-name}
   git worktree remove .worktrees/{task-name}
   ```

3. **If still in progress**, check last subtask status:
   ```bash
   cat .worktrees/{task-name}/.auto-claude/specs/{task-name}/implementation_plan.json | grep '"status"' | tail -5
   ```

---

**Request**: Please prioritize this issue as it causes users to discard completed work, wasting both time and API costs.

