Multi-agent collaboration causing content duplication and API failures #21
Description
Notion Doc Version of write up here
Proof Editor Technical Issues Report
Date: March 12, 2026
Reporter: Aaron Makelky (OpenClaw user)
Contact: aaron@aaronmakelky.com or https://x.com/theaaron
Severity: Critical - Platform unusable for multi-agent collaboration
Executive Summary
Over the past 48 hours, our team encountered severe data corruption and service instability while using Proof for collaborative AI agent workflows. Three separate documents became unusable due to content duplication, API failures, and concurrent write conflicts. This undermines Proof's core value proposition as a collaborative editor.
Key Issues:
- Content duplication on concurrent agent edits
- API returning null values (revision, updatedAt, markdown)
- Service-wide 502 errors (Application failed to respond)
- No automatic detection/prevention of duplicate content
- Broken documents cannot be recovered via API
Impact: Lost work, blocked workflows, forced migration to alternative tools
Timeline of Issues
Incident 1: Document c7vnchuu (March 11-12, 2026)
Document: https://www.proofeditor.ai/d/c7vnchuu?token=fed4aa35-d983-4bd2-98ef-99bca0f27cd3
What happened:
- Created document for API integration planning
- Squire Bot (AI agent) attempted to add introduction via API
- Content was duplicated 8+ times throughout the document
- Each API write seemed to append rather than replace, despite using correct endpoints
API evidence:
# State API returned null values
GET /api/agent/c7vnchuu/state
Response: {"revision": null, "updatedAt": null, "markdown": null}
# Snapshot API returned 502 errors
GET /api/agent/c7vnchuu/snapshot
Response: {"status":"error","code":502,"message":"Application failed to respond","request_id":"faDGsWp5RrGs2DHz-_9nXA"}
Result: Document completely broken, unrecoverable via API. Had to create new document.
Incident 2: Document ppjd1v32 (March 12, 2026, ~08:22 MDT)
Document: https://www.proofeditor.ai/d/ppjd1v32?token=49d34836-4e23-4b0e-989c-633facd60a68
What happened:
- Created fresh document via /share/markdown API
- Document creation succeeded with valid slug and tokens
- Attempted to read state and join document
- API returned null values and 502 errors
- User reported document appeared blank in browser UI
API evidence:
# Creation succeeded
POST /share/markdown
Response: {"success":true,"slug":"ppjd1v32","accessToken":"49d34836-..."}
# But state API failed
GET /api/agent/ppjd1v32/state
Response: {"status":"error","code":502,"message":"Application failed to respond"}
Bug report filed: Request ID dkoJqCHeTTi9QcHm-_9nXA
Result: Platform-wide outage, all API endpoints returning 502 errors.
Incident 3: Document gb98e9g4 (March 12, 2026, ~19:25 MDT)
Document: https://www.proofeditor.ai/d/gb98e9g4?token=f54fd0af-93fa-42c0-b449-7f2c622552f7
What happened:
- Created document for Codex agent project planning
- Two AI agents (Squire Bot and Codex) joined document simultaneously
- Codex agent attempted to write project plan
- Content was duplicated multiple times
- Headers repeated 4+ times with fragments scattered throughout
Current state (via API):
# Vicki-Recipe coding project doc
This space is for 
# Vicki-Recipe coding project doc
# Vicki-Recipe coding project doc
# Vicki-Recipe coding project doc
This space is for Codex agent, Aaron, and Openclaw to collaborate...
[Full Codex plan appears once correctly]
...
# Vicki-Recipe coding pro
# Vick
# Vicki-Recipe coding projec
# Vicki-Recipe coding project doc
This space is for @
# Vicki-Reci
Visual evidence: Screenshot shows overlapping text fragments and repeated headers (available upon request).
Result: Document requires manual cleanup, trust in concurrent editing broken.
Root Cause Analysis
Problem 1: No Concurrency Control for Agent Writes
Issue: When multiple AI agents write to a document simultaneously, Proof's API accepts all writes but applies them incorrectly, leading to content duplication.
Evidence:
- All three documents show the exact same pattern: repeated headers, partial edits
- Happens specifically when agents use edit APIs concurrently
- Does not happen with single-user edits
Hypothesis:
- API lacks proper optimistic concurrency control
- baseRevision and baseUpdatedAt parameters are not enforced
- Writes are applied asynchronously without locking
- No deduplication or conflict resolution
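To make the hypothesis concrete, here is a minimal in-memory sketch of optimistic concurrency control. It is not Proof's backend: `DocStore`, `StaleRevisionError`, and `simulate` are hypothetical names, and only the idea of a base revision (the `baseRevision` parameter) comes from the API described in this report. It shows how two agents that read the same revision produce a duplicate when the base is not checked, and how a check turns the second write into a conflict instead.

```python
from dataclasses import dataclass


class StaleRevisionError(Exception):
    """Raised when a write is based on an outdated revision (would map to HTTP 409)."""


@dataclass
class DocStore:
    """Hypothetical in-memory document; an illustration, not Proof's backend."""
    markdown: str = ""
    revision: int = 0

    def append(self, text: str, base_revision: int, enforce: bool = True) -> int:
        # With enforcement, accept the write only if the caller saw the latest revision.
        if enforce and base_revision != self.revision:
            raise StaleRevisionError(
                f"base revision {base_revision} != current {self.revision}"
            )
        self.markdown += text
        self.revision += 1
        return self.revision


def simulate(enforce: bool) -> str:
    """Two agents read revision 0, then both try to write the same header."""
    doc = DocStore()
    seen_by_a = doc.revision
    seen_by_b = doc.revision
    header = "# Project plan\n"
    doc.append(header, seen_by_a, enforce=enforce)
    try:
        doc.append(header, seen_by_b, enforce=enforce)
    except StaleRevisionError:
        pass  # agent B would have to re-read the document before retrying
    return doc.markdown
```

Calling `simulate(False)` yields the header twice, which matches the pattern we observed; `simulate(True)` yields it once because the second write is rejected.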
Problem 2: API Instability During Load
Issue: Service returns 502 errors and null values during periods of API activity.
Evidence:
- Multiple 502 errors across different endpoints (state, snapshot, ops)
- Null revision/updatedAt values suggest backend state corruption
- Happened across 3 different documents over 2 days
Hypothesis:
- Backend services (Y.js projection, database layer) overwhelmed
- No proper fallback when real-time sync fails
- State corruption propagates to API layer
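Until the root cause is known, a client can at least refuse to act on a broken state payload. The sketch below assumes the state response is a JSON object with the three fields seen in our evidence (`revision`, `updatedAt`, `markdown`); the helper names are our own and not part of any Proof SDK.

```python
from typing import Any, Mapping

# Field names taken from the state responses quoted in this report.
REQUIRED_STATE_FIELDS = ("revision", "updatedAt", "markdown")


def missing_state_fields(state: Mapping[str, Any]) -> list[str]:
    """Return the required fields that are absent or null in a state payload."""
    return [name for name in REQUIRED_STATE_FIELDS if state.get(name) is None]


def is_usable_state(state: Mapping[str, Any]) -> bool:
    """True only when every required field is present and non-null."""
    return not missing_state_fields(state)
```

Both the all-null response from c7vnchuu and a 502 error body fail this check, so an agent using it would stop before writing on top of an unknown base.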
Problem 3: No Automatic Corruption Detection
Issue: Proof allows documents to become severely corrupted without any warning or automatic recovery.
Evidence:
- Documents with 8+ duplicate sections accepted without error
- No API validation for duplicate content
- No automatic cleanup or repair mechanism
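The corruption pattern we saw (the same header repeated, plus truncated copies of it) is mechanically detectable. The following is a heuristic sketch with function names of our own choosing; it treats any line starting with `#` as a header, so it would also match `#` lines inside code blocks and would flag legitimate pairs such as "# Intro" and "# Introduction".

```python
from collections import Counter


def _header_lines(markdown: str) -> list[str]:
    """Collect lines that look like markdown headers (start with '#')."""
    return [line.strip() for line in markdown.splitlines() if line.lstrip().startswith("#")]


def find_duplicate_headers(markdown: str, min_repeats: int = 2) -> dict[str, int]:
    """Map each header line to its count if it occurs at least min_repeats times."""
    counts = Counter(_header_lines(markdown))
    return {header: n for header, n in counts.items() if n >= min_repeats}


def find_header_fragments(markdown: str) -> list[str]:
    """Return headers that are strict prefixes of a longer header in the same document."""
    headers = set(_header_lines(markdown))
    return sorted(
        h for h in headers
        if any(other != h and other.startswith(h) for other in headers)
    )
```

Run against the current state of gb98e9g4, both functions return non-empty results, which is the kind of signal that could drive a warning header or an alert.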
Technical Evidence Summary
Example API Failures
| Endpoint | Document | Status | Error |
|---|---|---|---|
| GET /api/agent/c7vnchuu/state | c7vnchuu | 200 | {"revision": null, "updatedAt": null} |
| GET /api/agent/c7vnchuu/snapshot | c7vnchuu | 502 | Application failed to respond |
| GET /api/agent/ppjd1v32/state | ppjd1v32 | 502 | Application failed to respond |
| POST /share/markdown | ppjd1v32 | 200 | Success, but document blank |
| POST /api/agent/gb98e9g4/ops | gb98e9g4 | 200 | Success, but content duplicated |
Request IDs for Investigation
- faDGsWp5RrGs2DHz-_9nXA (c7vnchuu snapshot 502)
- V6I4hY4fRFSlv5NFLPU1MQ (ppjd1v32 state 502)
- dkoJqCHeTTi9QcHm-_9nXA (ppjd1v32 creation, subsequent 502)
- Ji05m8KBS7KDFlIk9I3ezw (bug report 502)
Suggested Fixes
Critical (P0)
- Enforce Concurrency Control
  - Strictly validate baseRevision or baseUpdatedAt on all write operations
  - Reject writes with HTTP 409 CONFLICT if base is stale
  - Do not apply writes asynchronously without validation
  - Reference: Your own API contract specifies this, but it's not enforced
- Add Content Deduplication
  - Detect identical consecutive blocks (e.g., same header repeated 4+ times)
  - Auto-reject or auto-collapse duplicates
  - Add API warning header when duplication detected
- Fix State Projection Stability
  - Investigate why revision and updatedAt return null
  - Add fallback to last known good state
  - Implement state recovery from Y.js document
- Improve Error Handling
  - Return structured error responses instead of 502
  - Include actionable error codes and retry guidance
  - Add circuit breaker for cascading failures
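As an illustration of the circuit breaker suggestion, here is a small generic sketch. It says nothing about how Proof's services are built; `CircuitBreaker`, `CircuitOpenError`, and the thresholds are hypothetical. After a run of consecutive failures it fails fast for a cool-down period, then lets a single trial call through.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when calls are short-circuited because the breaker is open."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[[], object]) -> object:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("upstream marked unhealthy; failing fast")
            # Cool-down elapsed: allow one trial call; one more failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the breaker fully
        return result
```

The clock is injectable so the behavior can be tested without waiting for real time to pass.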
High Priority (P1)
- Add Agent Write Coordination
  - Implement presence-based write locking (optional)
  - Queue concurrent writes and apply sequentially
  - Add X-Request-Id to all responses for debugging
- Document Recovery Tools
  - Add /api/agent/<slug>/recover endpoint
  - Allow rollback to specific revision
  - Provide diff view for corrupted documents
- Monitoring and Alerting
  - Add anomaly detection for duplicate content
  - Alert on elevated 502 rates
  - Dashboard for API health by document
Medium Priority (P2)
- Better Documentation
  - Document concurrency semantics clearly
  - Provide best practices for multi-agent workflows
  - Add examples with proper error handling
- Client-Side Validation
  - JavaScript SDK should check for duplicates before sending
  - Add retry logic with exponential backoff
  - Implement local conflict resolution
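For the retry suggestion, a minimal sketch of exponential backoff with full jitter is shown here in Python rather than JavaScript. `TransientError` is a hypothetical marker for retryable failures such as the 502 responses above; the caller decides which failures to map to it. Retrying a write is only safe if that write is idempotent.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures such as an HTTP 502."""


def retry_with_backoff(func: Callable[[], T], attempts: int = 5,
                       base_delay: float = 0.5, max_delay: float = 8.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func until it succeeds or the attempts are exhausted."""
    for attempt in range(attempts):
        try:
            return func()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists so the function can be exercised in tests without real delays.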
Workarounds for Users (Until Fixed)
- Serialize Agent Writes
  - Only one agent should write at a time
  - Use events/pending API to wait for previous writes to complete
  - Always read current state before writing
- Use edit/v2 with Block Refs
  - Prefer precise block operations over full rewrites
  - Include Idempotency-Key header
  - Use baseRevision from latest snapshot
- Monitor for Corruption
  - Periodically read document state via API
  - Check for duplicate headers or fragments
  - Create new document if corruption detected
- Have Backup Plan
  - Don't rely on Proof as sole source of truth
  - Keep critical content in local files or other tools
  - Consider self-hosting proof-sdk for reliability
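The first workaround (serialize writes, read before writing) can be sketched as follows for agents that run in the same process. `SerializedWriter` and the two injected callables are hypothetical wrappers around the state and write endpoints; the real request and response shapes are not modeled here. Agents running on separate machines would need an external lock or queue instead of `threading.Lock`.

```python
import threading
from typing import Any, Callable, Mapping


class SerializedWriter:
    """Lets only one agent in this process write at a time, reading state first."""

    def __init__(self, read_state: Callable[[], Mapping[str, Any]],
                 write: Callable[[str, Any], None]) -> None:
        self._read_state = read_state  # hypothetical wrapper around the state endpoint
        self._write = write            # hypothetical wrapper around a write endpoint
        self._lock = threading.Lock()

    def append(self, text: str) -> bool:
        """Append text unless the state is broken or the text is already present."""
        with self._lock:
            state = self._read_state()
            if state.get("revision") is None or state.get("markdown") is None:
                return False  # state looks broken; do not write on top of it
            if text in state["markdown"]:
                return False  # crude duplicate check: content already present
            # Pass the revision just read so the server can reject a stale base.
            self._write(text, state["revision"])
            return True
```

The substring check is deliberately crude and only guards against re-sending identical content; it is a stopgap, not a substitute for server-side conflict detection.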
Example Documents for Investigation
Broken Document 1:
https://www.proofeditor.ai/d/c7vnchuu?token=fed4aa35-d983-4bd2-98ef-99bca0f27cd3
- Status: Completely corrupted (null API values, 502 errors)
- Issue: Content duplicated 8+ times, API broken
Broken Document 2:
https://www.proofeditor.ai/d/ppjd1v32?token=49d34836-4e23-4b0e-989c-633facd60a68
- Status: Created successfully but API returns 502
- Issue: Service instability, blank in UI
Broken Document 3 (Active):
https://www.proofeditor.ai/d/gb98e9g4?token=f54fd0af-93fa-42c0-b449-7f2c622552f7
- Status: Partially corrupted, still accessible
- Issue: Headers duplicated 4+ times, fragments scattered
Our Use Case
We are using Proof for multi-agent collaborative workflows where:
- Human (Aaron) creates documents
- AI agents (Squire Bot, Codex) read and write via HTTP API
- Real-time presence and comments are valuable
- Data integrity is critical
This is exactly the use case Proof advertises ("collaborative document editor with presence, comments, suggestions, and edit APIs"), but the current implementation cannot support it reliably.
Next Steps
- Immediate: Please investigate the three example documents and request IDs provided
- Short-term: Implement P0 fixes (concurrency control, deduplication, state stability)
- Medium-term: Add recovery tools and better monitoring
- Ongoing: Keep us informed of progress and estimated fix timelines
We want Proof to succeed - the concept is excellent and the API design is solid. But the current reliability issues make it unusable for production workflows. Happy to provide additional debugging data or test fixes.
Related Resources
- Proof SDK: https://github.com/EveryInc/proof-sdk
- Proof Skill Guide: https://www.proofeditor.ai/proof.SKILL.md
- Agent Docs: https://www.proofeditor.ai/agent-docs
Report prepared by: Squire Bot (OpenClaw AI assistant) on behalf of Aaron Makelky
Date: March 12, 2026, 19:35 MDT
Version: 1.0