Multi-agent collaboration causing content duplication and API failures

[Notion version of this write-up](https://www.notion.so/descript/Proof-Issue-322abe2e1a5081af9cfce8b4936acf9e?source=copy_link)

# Proof Editor Technical Issues Report

**Date:** March 12, 2026

**Reporter:** Aaron Makelky (OpenClaw user)

**Contact:** [aaron@aaronmakelky.com](mailto:aaron@aaronmakelky.com) or https://x.com/theaaron

**Severity:** Critical - Platform unusable for multi-agent collaboration

---

## Executive Summary

Over the past 48 hours, our team encountered **severe data corruption and service instability** while using Proof for collaborative AI agent workflows. Three separate documents became unusable due to content duplication, API failures, and concurrent write conflicts. This undermines Proof's core value proposition as a collaborative editor.

**Key Issues:**

1. Content duplication on concurrent agent edits
2. API returning null values (revision, updatedAt, markdown)
3. Service-wide 502 errors (Application failed to respond)
4. No automatic detection/prevention of duplicate content
5. Broken documents cannot be recovered via API

**Impact:** Lost work, blocked workflows, forced migration to alternative tools

---

## Timeline of Issues

### Incident 1: Document c7vnchuu (March 11-12, 2026)

**Document:** https://www.proofeditor.ai/d/c7vnchuu?token=fed4aa35-d983-4bd2-98ef-99bca0f27cd3

**What happened:**

- Created document for API integration planning
- Squire Bot (AI agent) attempted to add introduction via API
- Content was duplicated **8+ times** throughout the document
- Each API write seemed to append rather than replace, despite using correct endpoints

**API evidence:**

```bash
# State API returned null values
GET /api/agent/c7vnchuu/state
Response: {"revision": null, "updatedAt": null, "markdown": null}

# Snapshot API returned 502 errors
GET /api/agent/c7vnchuu/snapshot
Response: {"status":"error","code":502,"message":"Application failed to respond","request_id":"faDGsWp5RrGs2DHz-_9nXA"}
```

**Result:** Document completely broken, unrecoverable via API. Had to create new document.

---

### Incident 2: Document ppjd1v32 (March 12, 2026, ~08:22 MDT)

**Document:** https://www.proofeditor.ai/d/ppjd1v32?token=49d34836-4e23-4b0e-989c-633facd60a68

**What happened:**

- Created fresh document via `/share/markdown` API
- Document creation succeeded with valid slug and tokens
- Attempted to read state and join document
- API returned null values and 502 errors
- User reported document appeared blank in browser UI

**API evidence:**

```bash
# Creation succeeded
POST /share/markdown
Response: {"success":true,"slug":"ppjd1v32","accessToken":"49d34836-..."}

# But state API failed
GET /api/agent/ppjd1v32/state
Response: {"status":"error","code":502,"message":"Application failed to respond"}
```

**Bug report filed:** Request ID `dkoJqCHeTTi9QcHm-_9nXA`

**Result:** Platform-wide outage. All API endpoints we tried returned 502 errors.

---

### Incident 3: Document gb98e9g4 (March 12, 2026, ~19:25 MDT)

**Document:** https://www.proofeditor.ai/d/gb98e9g4?token=f54fd0af-93fa-42c0-b449-7f2c622552f7

**What happened:**

- Created document for Codex agent project planning
- Two AI agents (Squire Bot and Codex) joined document simultaneously
- Codex agent attempted to write project plan
- Content was duplicated multiple times
- Headers repeated 4+ times with fragments scattered throughout

**Current state (via API):**

```markdown
# Vicki-Recipe coding project doc

This space is for&#x20;

# Vicki-Recipe coding project doc

# Vicki-Recipe coding project doc

# Vicki-Recipe coding project doc

This space is for Codex agent, Aaron, and Openclaw to collaborate...
[Full Codex plan appears once correctly]
...
# Vicki-Recipe coding pro

# Vick

# Vicki-Recipe coding projec

# Vicki-Recipe coding project doc

This space is for @

# Vicki-Reci
```

**Visual evidence:** Screenshot shows overlapping text fragments and repeated headers (available upon request).

**Result:** Document requires manual cleanup, trust in concurrent editing broken.

---

## Root Cause Analysis

### Problem 1: No Concurrency Control for Agent Writes

**Issue:** When multiple AI agents write to a document simultaneously, Proof's API accepts all writes but applies them incorrectly, leading to content duplication.

**Evidence:**

- Documents c7vnchuu and gb98e9g4 show the same pattern: repeated headers and partial edits (ppjd1v32 rendered blank instead)
- Happens specifically when agents use edit APIs concurrently
- Not observed with single-user edits

**Hypothesis:**

- API lacks proper optimistic concurrency control
- `baseRevision` and `baseUpdatedAt` parameters are not enforced
- Writes are applied asynchronously without locking
- No deduplication or conflict resolution
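
The hypothesis above can be illustrated with a minimal sketch of an enforced revision check. This is not Proof's implementation: `Document`, `apply_write`, and `StaleRevisionError` are hypothetical names, and the integer revision counter is an assumption. It only shows how rejecting a stale base would stop the second of two identical concurrent writes from landing.

```python
from dataclasses import dataclass
from typing import Tuple


class StaleRevisionError(Exception):
    """Raised when a write is based on an outdated revision (would map to HTTP 409)."""


@dataclass
class Document:
    markdown: str = ""
    revision: int = 0  # assumption: revisions are monotonically increasing integers


def apply_write(doc: Document, base_revision: int, new_markdown: str) -> int:
    """Replace the body only if the caller based its write on the latest revision."""
    if base_revision != doc.revision:
        raise StaleRevisionError(
            f"base revision {base_revision} is stale; current is {doc.revision}"
        )
    doc.markdown = new_markdown
    doc.revision += 1
    return doc.revision


def demo() -> Tuple[str, int, bool]:
    doc = Document()
    # Both agents read revision 0 before either one writes.
    seen_by_a = doc.revision
    seen_by_b = doc.revision
    apply_write(doc, seen_by_a, "# Plan\n")
    rejected = False
    try:
        # The second agent's base is now stale, so its write is refused.
        apply_write(doc, seen_by_b, "# Plan\n")
    except StaleRevisionError:
        rejected = True
    return doc.markdown, doc.revision, rejected
```

With this check in place the rejected agent has to re-read the document and decide whether its write is still needed, rather than having it silently applied a second time.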

### Problem 2: API Instability During Load

**Issue:** Service returns 502 errors and null values during periods of API activity.

**Evidence:**

- Multiple 502 errors across different endpoints (state, snapshot, ops)
- Null revision/updatedAt values suggest backend state corruption
- Happened across 3 different documents over 2 days

**Hypothesis:**

- Backend services (Y.js projection, database layer) overwhelmed
- No proper fallback when real-time sync fails
- State corruption propagates to API layer

### Problem 3: No Automatic Corruption Detection

**Issue:** Proof allows documents to become severely corrupted without any warning or automatic recovery.

**Evidence:**

- Documents with 8+ duplicate sections accepted without error
- No API validation for duplicate content
- No automatic cleanup or repair mechanism
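
As a rough illustration of what corruption detection could check for, the sketch below flags repeated headings and headings that are truncated prefixes of other headings, the two patterns seen in document gb98e9g4. The function names are hypothetical and the check is a heuristic: it treats any line starting with `#` as a heading, so `#` lines inside fenced code blocks or legitimately repeated headings would produce false positives.

```python
from collections import Counter
from typing import Dict, List


def heading_lines(markdown: str) -> List[str]:
    """Return every line that starts with '#', stripped of surrounding whitespace."""
    return [line.strip() for line in markdown.splitlines() if line.lstrip().startswith("#")]


def find_duplicate_headings(markdown: str) -> Dict[str, int]:
    """Map each heading that appears more than once to its number of occurrences."""
    counts = Counter(heading_lines(markdown))
    return {heading: n for heading, n in counts.items() if n > 1}


def find_truncated_headings(markdown: str) -> List[str]:
    """Return headings that are strict prefixes of another heading in the document."""
    unique = set(heading_lines(markdown))
    return sorted(
        h for h in unique if any(other != h and other.startswith(h) for other in unique)
    )


# Excerpt modelled on the corrupted state reported for Incident 3.
SAMPLE = """# Vicki-Recipe coding project doc

# Vicki-Recipe coding project doc

# Vicki-Recipe coding pro

# Vick
"""

duplicates = find_duplicate_headings(SAMPLE)
fragments = find_truncated_headings(SAMPLE)
```

A non-empty result from either function would be a reasonable trigger for a warning or for refusing further automated writes until a human has looked at the document.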

---

## Technical Evidence Summary

### Example API Failures

| Endpoint | Document | HTTP status | Result |
| --- | --- | --- | --- |
| GET /api/agent/c7vnchuu/state | c7vnchuu | 200 | `{"revision": null, "updatedAt": null}` |
| GET /api/agent/c7vnchuu/snapshot | c7vnchuu | 502 | Application failed to respond |
| GET /api/agent/ppjd1v32/state | ppjd1v32 | 502 | Application failed to respond |
| POST /share/markdown | ppjd1v32 | 200 | Success, but document blank |
| POST /api/agent/gb98e9g4/ops | gb98e9g4 | 200 | Success, but content duplicated |

### Request IDs for Investigation

- `faDGsWp5RrGs2DHz-_9nXA` (c7vnchuu snapshot 502)
- `V6I4hY4fRFSlv5NFLPU1MQ` (ppjd1v32 state 502)
- `dkoJqCHeTTi9QcHm-_9nXA` (ppjd1v32 creation, subsequent 502)
- `Ji05m8KBS7KDFlIk9I3ezw` (bug report 502)

---

## Suggested Fixes

### Critical (P0)

1. **Enforce Concurrency Control**
    - Strictly validate `baseRevision` or `baseUpdatedAt` on all write operations
    - Reject writes with HTTP 409 Conflict if base is stale
    - Do not apply writes asynchronously without validation
    - Reference: Your own API contract specifies this, but it's not enforced
2. **Add Content Deduplication**
    - Detect identical consecutive blocks (e.g., same header repeated 4+ times)
    - Auto-reject or auto-collapse duplicates
    - Add API warning header when duplication detected
3. **Fix State Projection Stability**
    - Investigate why `revision` and `updatedAt` return null
    - Add fallback to last known good state
    - Implement state recovery from Y.js document
4. **Improve Error Handling**
    - Return structured error responses instead of 502
    - Include actionable error codes and retry guidance
    - Add circuit breaker for cascading failures
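
To make the circuit-breaker suggestion concrete, here is a minimal sketch of the pattern. It is a generic illustration, not a description of Proof's backend: `CircuitBreaker`, `CircuitOpenError`, the thresholds, and the injectable `clock` are all assumptions chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is short-circuited instead of reaching the backend."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, operation: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Still cooling down: fail fast with a clear, structured error.
                raise CircuitOpenError("backend marked unhealthy; retry later")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open if the trial call failed.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The point of the pattern is that once a dependency is known to be failing, callers get an immediate, explainable error instead of piling more requests onto it and receiving opaque 502s.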

### High Priority (P1)

1. **Add Agent Write Coordination**
    - Implement presence-based write locking (optional)
    - Queue concurrent writes and apply sequentially
    - Add `X-Request-Id` to all responses for debugging
2. **Document Recovery Tools**
    - Add `/api/agent/<slug>/recover` endpoint
    - Allow rollback to specific revision
    - Provide diff view for corrupted documents
3. **Monitoring and Alerting**
    - Add anomaly detection for duplicate content
    - Alert on elevated 502 rates
    - Dashboard for API health by document
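
The "queue concurrent writes and apply sequentially" idea can be sketched as a single worker draining a FIFO queue. This is an in-process illustration under simplifying assumptions (one process, hypothetical `WriteQueue` class, writes modelled as appended text blocks); a real service would need the same ordering guarantee across servers.

```python
import queue
import threading
from typing import List, Optional, Tuple


class WriteQueue:
    """Accepts writes from many threads and applies them one at a time, in arrival order."""

    def __init__(self) -> None:
        self.applied: List[Tuple[int, str, str]] = []  # (revision, agent, text)
        self._pending: "queue.Queue[Optional[Tuple[str, str]]]" = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, agent: str, text: str) -> None:
        self._pending.put((agent, text))

    def close(self) -> None:
        """Stop the worker after everything already queued has been applied."""
        self._pending.put(None)
        self._worker.join()

    def _run(self) -> None:
        revision = 0
        while True:
            item = self._pending.get()
            if item is None:  # sentinel from close()
                return
            agent, text = item
            # Only this thread mutates the document, so each write gets its own revision.
            revision += 1
            self.applied.append((revision, agent, text))


def demo() -> List[Tuple[int, str, str]]:
    writer = WriteQueue()

    def agent(name: str) -> None:
        for i in range(3):
            writer.submit(name, f"{name} block {i}")

    threads = [threading.Thread(target=agent, args=(n,)) for n in ("squire", "codex")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    writer.close()
    return writer.applied
```

How the two agents' blocks interleave depends on thread scheduling, but every write is applied exactly once and each agent's own blocks stay in the order it submitted them.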

### Medium Priority (P2)

1. **Better Documentation**
    - Document concurrency semantics clearly
    - Provide best practices for multi-agent workflows
    - Add examples with proper error handling
2. **Client-Side Validation**
    - JavaScript SDK should check for duplicates before sending
    - Add retry logic with exponential backoff
    - Implement local conflict resolution
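
A sketch of the retry logic suggested above, using capped exponential backoff with full jitter. `TransientError` is a hypothetical stand-in for a retryable failure such as a 502, and the delay values are arbitrary. Note that blindly retrying a write that is not idempotent could itself cause duplication, so this would need to be paired with something like the `Idempotency-Key` header mentioned under Workarounds.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a retryable failure such as an HTTP 502."""


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount up to the capped delay.
            sleep(random.uniform(0, delay))
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the behaviour can be exercised in tests without real waiting.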

---

## Workarounds for Users (Until Fixed)

1. **Serialize Agent Writes**
    - Only one agent should write at a time
    - Use `events/pending` API to wait for previous writes to complete
    - Always read current state before writing
2. **Use `edit/v2` with Block Refs**
    - Prefer precise block operations over full rewrites
    - Include `Idempotency-Key` header
    - Use `baseRevision` from latest snapshot
3. **Monitor for Corruption**
    - Periodically read document state via API
    - Check for duplicate headers or fragments
    - Create new document if corruption detected
4. **Have a Backup Plan**
    - Don't rely on Proof as sole source of truth
    - Keep critical content in local files or other tools
    - Consider self-hosting proof-sdk for reliability
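
The "read current state before writing" workaround can be sketched as a small guard. Everything here is an assumption for illustration: `guarded_write`, `UnsafeToWrite`, and the `get_state`/`write` methods on the client are hypothetical wrappers, not functions from the Proof SDK, and `FakeClient` is an in-memory stand-in so the sketch runs without network access. The only detail taken from this report is that an unhealthy document returned null `revision` and `markdown` fields.

```python
from typing import Any, Dict, List, Tuple


class UnsafeToWrite(Exception):
    """Raised when the document state looks unhealthy, so the write is skipped."""


def guarded_write(client: Any, slug: str, markdown: str, idempotency_key: str) -> Dict[str, Any]:
    """Read state first; write only if it is non-null, passing the revision we just read."""
    state = client.get_state(slug)
    revision = state.get("revision")
    if revision is None or state.get("markdown") is None:
        raise UnsafeToWrite(f"{slug}: state has null fields; refusing to write")
    return client.write(
        slug, markdown, base_revision=revision, idempotency_key=idempotency_key
    )


class FakeClient:
    """In-memory stand-in for an HTTP client, used only to exercise the sketch."""

    def __init__(self, state: Dict[str, Any]) -> None:
        self.state = state
        self.writes: List[Tuple[str, str, Any, str]] = []

    def get_state(self, slug: str) -> Dict[str, Any]:
        return dict(self.state)

    def write(
        self, slug: str, markdown: str, base_revision: Any, idempotency_key: str
    ) -> Dict[str, Any]:
        self.writes.append((slug, markdown, base_revision, idempotency_key))
        return {"success": True}
```

Refusing to write on null state would not have prevented the duplication itself, but it would have kept an agent from continuing to write into a document whose state was already unreadable.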

---

## Example Documents for Investigation

**Broken Document 1:**

https://www.proofeditor.ai/d/c7vnchuu?token=fed4aa35-d983-4bd2-98ef-99bca0f27cd3

- Status: Completely corrupted (null API values, 502 errors)
- Issue: Content duplicated 8+ times, API broken

**Broken Document 2:**

https://www.proofeditor.ai/d/ppjd1v32?token=49d34836-4e23-4b0e-989c-633facd60a68

- Status: Created successfully but API returns 502
- Issue: Service instability, blank in UI

**Broken Document 3 (Active):**

https://www.proofeditor.ai/d/gb98e9g4?token=f54fd0af-93fa-42c0-b449-7f2c622552f7

- Status: Partially corrupted, still accessible
- Issue: Headers duplicated 4+ times, fragments scattered

---

## Our Use Case

We are using Proof for **multi-agent collaborative workflows** where:

- Human (Aaron) creates documents
- AI agents (Squire Bot, Codex) read and write via HTTP API
- Real-time presence and comments are valuable
- Data integrity is critical

This is exactly the use case Proof advertises ("collaborative document editor with presence, comments, suggestions, and edit APIs"), but the current implementation cannot support it reliably.

---

## Next Steps

1. **Immediate:** Please investigate the three example documents and request IDs provided
2. **Short-term:** Implement P0 fixes (concurrency control, deduplication, state stability)
3. **Medium-term:** Add recovery tools and better monitoring
4. **Ongoing:** Keep us informed of progress and estimated fix timelines

We want Proof to succeed: the concept is excellent and the API design is solid. But the current reliability issues make it unusable for production workflows. We are happy to provide additional debugging data or test fixes.

---

## Related Resources

- Proof SDK: https://github.com/EveryInc/proof-sdk
- Proof Skill Guide: https://www.proofeditor.ai/proof.SKILL.md
- Agent Docs: https://www.proofeditor.ai/agent-docs

---

**Report prepared by:** Squire Bot (OpenClaw AI assistant) on behalf of Aaron Makelky

**Date:** March 12, 2026, 19:35 MDT

**Version:** 1.0
