2. Add integration tests for resilience patterns
3. Document monitoring/alerting setup
4. Optional: Implement retry logic with exponential backoff

---

## Update 2025-12-30T09:52:00Z
agent: backend-developer
status: completed
task: Fix streaming chat endpoint - token-level streaming

### Progress
- Fixed `/api/chat/stream` endpoint to properly stream tokens from LLM
- Changed from LangGraph's `streamEvents()` to direct model `.stream()` call
- Added support for Claude's content block format (array of {type, text})
- Streaming now emits proper `text_delta` events in real-time
- Verified all tests pass (29/29 in chat.test.ts)

### Root Cause
Three problems combined:
1. LangGraph's `streamEvents()` doesn't reliably emit token-level events
2. The Claude API returns content as an array of content blocks, not a plain string
3. The code checked `typeof content === 'string'`, but Claude's content was `Array<{type: 'text', text: string}>`, so every token was silently dropped (illustrated below)
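
As a concrete illustration of point 3, a minimal sketch (the sample chunk is hypothetical, shaped per Anthropic's content-block format):

```typescript
// Hypothetical Claude streaming chunk: content is an array of blocks, not a string.
const chunk = { content: [{ type: 'text', text: 'Hello' }] };

if (typeof chunk.content === 'string') {
  // Never reached for Claude chunks; this is where tokens were being dropped.
}
```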

### Solution
Bypass LangGraph agent loop and stream directly from the model:
```typescript
const stream = await modelWithTools.stream(messages, { callbacks });
for await (const chunk of stream) {
  const content = chunk.content;
  // Handle both plain string content and Claude's array of content blocks
  if (typeof content === 'string') {
    if (content) yield { type: 'text_delta', content };
  } else if (Array.isArray(content)) {
    for (const block of content) {
      if (block.type === 'text' && block.text) {
        yield { type: 'text_delta', content: block.text };
      }
    }
  }
}
```

### API Contract
**Endpoint**: GET /api/chat/stream
**Status**: WORKING (previously broken)

**Query Parameters**:
- `message` (required): string, 1-10000 chars
- `threadId` (optional): UUID
- `persona` (optional): string, max 100 chars

**SSE Events**:
```text
// Text token streaming (now working!)
event: text_delta
data: {"type":"text_delta","content":"Hello","traceId":"..."}

// Tool calls (emitted but not executed)
event: tool_call
data: {"type":"tool_call","toolCallId":"...","toolName":"calculator","toolInput":{...}}

// Completion
event: done
data: {"type":"done","traceId":"..."}
```
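
A minimal client sketch for consuming this contract (hypothetical client code; assumes a runtime with `EventSource`: browsers natively, Node via the `eventsource` package):

```typescript
// Hypothetical SSE client for GET /api/chat/stream.
const params = new URLSearchParams({ message: 'Count to 5' });
const source = new EventSource(`http://localhost:4000/api/chat/stream?${params}`);

let answer = '';
source.addEventListener('text_delta', (e) => {
  const { content } = JSON.parse((e as MessageEvent).data) as { content: string };
  answer += content; // tokens arrive incrementally
});

source.addEventListener('tool_call', (e) => {
  // Emitted but not executed in this version; see Limitations below.
  console.log('tool call:', JSON.parse((e as MessageEvent).data));
});

source.addEventListener('done', () => source.close());
```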

### Limitations (Documented in Code)
The current streaming implementation has intentional trade-offs:
- **No tool execution**: Tool calls are emitted but NOT executed (use POST /api/chat for tool execution)
- **No memory**: Single-turn only, no conversation history
- **Reason**: Direct model streaming bypasses LangGraph agent loop

For multi-turn conversations with tools, clients should use POST /api/chat (non-streaming).

### Files Modified
- `packages/backend/src/agents/chat-agent.ts`
- Rewrote `chatStream()` function (lines 365-471)
- Added content block parsing for Claude format
- Removed unused ToolMessage import

### Testing
```bash
# Works now - streams tokens in real-time
curl -N 'http://localhost:4000/api/chat/stream?message=Count%20to%205'

# Output:
event: text_delta
data: {"type":"text_delta","content":"I'll count"}
event: text_delta
data: {"type":"text_delta","content":" to 5 for you:"}
# ... streams continue in real-time
event: done
data: {"type":"done","traceId":"..."}
```

### Blockers
None

### Next Steps
- Consider implementing full streaming with tool execution using LangGraph's callback system
- Monitor Langfuse traces to verify streaming sessions are captured correctly

---

## Update 2025-12-30T10:00:00Z
agent: backend-developer
status: completed
task: Implement proper LangGraph streamEvents() for chat streaming with tool execution

### Progress
- Replaced `model.stream()` with `agent.streamEvents()` to use full LangGraph agent loop
- Added `extractTextContent()` helper to handle both OpenAI (string) and Anthropic (content blocks array) formats
- Implemented proper event handling:
- `on_chat_model_stream` for token-level streaming
- `on_tool_start` for tool call events
- `on_tool_end` for tool result events
- Streaming now includes full agent capabilities: tool execution, memory, and checkpointer
- All tests pass (29/29 in chat.test.ts)
- Manual testing confirms streaming works correctly

### Key Insight
Anthropic models return content as `Array<{type: 'text', text: string}>`, NOT a string. The previous implementation broke because it only checked `typeof content === 'string'`. The new `extractTextContent()` helper handles both formats universally.

### Solution
```typescript
// Helper to extract text from both OpenAI and Anthropic formats
function extractTextContent(content: unknown): string {
if (typeof content === 'string') return content;
if (Array.isArray(content)) {
let text = '';
for (const block of content) {
if (block?.type === 'text' && typeof block.text === 'string') {
text += block.text;
}
}
return text;
}
return '';
}

// Use streamEvents() with version: 'v2'
const eventStream = agent.streamEvents(streamParams, {
configurable: { thread_id: input.threadId },
callbacks,
version: 'v2' as const,
});

for await (const event of eventStream) {
if (event.event === 'on_chat_model_stream') {
const text = extractTextContent(event.data?.chunk?.content);
if (text) yield { type: 'text_delta', content: text };
}
// Handle tool_start and tool_end events...
}
```
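
For illustration, the helper normalizes both provider shapes (sample payloads are hypothetical):

```typescript
// OpenAI-style chunk: content is a plain string.
extractTextContent('Hello');                           // => 'Hello'

// Anthropic-style chunk: content is an array of content blocks.
extractTextContent([{ type: 'text', text: 'Hello' }]); // => 'Hello'

// Anything else (tool-use blocks, undefined) yields an empty string.
extractTextContent(undefined);                         // => ''
```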

### API Contract
**Endpoint**: GET /api/chat/stream
**Status**: FULLY FUNCTIONAL, now running through the full LangGraph agent loop

**Capabilities (Enhanced)**:
- Token-level streaming via `on_chat_model_stream` events
- Tool execution with `on_tool_start` and `on_tool_end` events
- Conversation history/memory via PostgresSaver checkpointer (see the sketch after this list)
- Support for both OpenAI and Anthropic models
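
A hedged sketch of exercising the memory capability (the `ask()` helper and threadId reuse are illustrative, not part of the implementation):

```typescript
// Hypothetical two-turn exchange: reusing the same threadId lets the
// PostgresSaver checkpointer restore conversation history between calls.
const threadId = crypto.randomUUID();

async function ask(message: string): Promise<string> {
  const params = new URLSearchParams({ message, threadId });
  const res = await fetch(`http://localhost:4000/api/chat/stream?${params}`, {
    headers: { Accept: 'text/event-stream' },
  });
  return res.text(); // for brevity; a real client would parse the SSE events
}

await ask('My name is Ada.');
await ask('What is my name?'); // answered from checkpointed history
```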

**SSE Events**:
```text
// Text streaming (works with both OpenAI and Anthropic)
event: text_delta
data: {"type":"text_delta","content":"Hello","traceId":"..."}

// Tool execution (now actually executes!)
event: tool_call
data: {"type":"tool_call","toolCallId":"run_xyz","toolName":"calculator","toolInput":{"expression":"2+2"}}

event: tool_result
data: {"type":"tool_result","toolCallId":"run_xyz","result":"4"}

// Completion
event: done
data: {"type":"done","traceId":"..."}
```
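
Relative to the earlier client sketch, the newly executed tools only require one more listener (hypothetical client code):

```typescript
// Handle executed tool results alongside the existing text_delta/tool_call listeners.
source.addEventListener('tool_result', (e) => {
  const { toolCallId, result } = JSON.parse((e as MessageEvent).data) as {
    toolCallId: string;
    result: string;
  };
  console.log(`tool ${toolCallId} returned:`, result);
});
```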

### Files Modified
- `packages/backend/src/agents/chat-agent.ts` (lines 365-517)
- Added `extractTextContent()` helper function
- Rewrote `chatStream()` to use `agent.streamEvents()` with version: 'v2'
- Added event handlers for `on_chat_model_stream`, `on_tool_start`, `on_tool_end`
- Updated documentation to reflect full agent capabilities

### Testing
```bash
# TypeScript check - passes
cd packages/backend && pnpm run typecheck

# Tests - all pass (29/29)
cd packages/backend && pnpm test:run

# Manual test - streams correctly
curl -N 'http://localhost:4000/api/chat/stream?message=Hi' | head -20
# Output shows proper streaming with traceId
```

### Blockers
None

### Next Steps
- Monitor production logs to verify both OpenAI and Anthropic models stream correctly
- Consider adding metrics for streaming event counts and latency