
feat(telegram): stream LLM responses via sendMessageDraft#1101

Merged
alexhoshina merged 23 commits into sipeed:main from amirmamaghani:feat/telegram-streaming
Mar 20, 2026

Conversation

@amirmamaghani
Contributor

@amirmamaghani amirmamaghani commented Mar 4, 2026

Summary

Implements real-time LLM response streaming to Telegram using the sendMessageDraft API, replacing the current "Thinking..." placeholder pattern with live token-by-token output as it's generated.

Closes #1098

What changed

New interfaces and capabilities

  • StreamingProvider (providers/types.go): opt-in ChatStream() method with onChunk callback receiving accumulated text
  • StreamingCapable + Streamer (channels/interfaces.go): opt-in channel capability (like TypingCapable/PlaceholderCapable)
  • StreamDelegate (bus/bus.go): decouples agent loop from channel manager

Provider streaming (works with any OpenAI-compatible API)

  • openai_compat/provider.go: SSE streaming parser (stream: true) — handles text deltas, tool call assembly, and usage tracking. Uses a separate HTTP client without timeout for long streams (context cancellation provides safety).
  • http_provider.go: delegates ChatStream to openai_compat
  • anthropic/provider.go: native SDK streaming via Messages.NewStreaming() for direct Anthropic API connections

Telegram streaming

  • telegram/telegram.go: BeginStream() returns a telegramStreamer that calls sendMessageDraft with throttling (3s / 200 chars minimum) to stay within Telegram rate limits
  • Graceful degradation: first sendMessageDraft error (e.g., no forum/topics mode) sets failed=true — subsequent Update() calls become no-ops while Finalize() still delivers via SendMessage
  • Config opt-out: streaming.enabled (default: true) in telegram channel config

Agent loop + manager integration

  • agent/loop.go: when both provider and channel support streaming, uses ChatStream with streaming callback. Cancels stream on tool calls. Skips PublishOutbound when Finalize already delivered.
  • channels/manager.go: implements StreamDelegate, tracks streamActive state per channel+chatID, coordinates with placeholder editing in preSend
  • channels/base.go: always shows typing + placeholder on inbound (streaming coordination happens on the output side via streamActive map)

Fallback behavior

| Scenario | Behavior |
| --- | --- |
| Bot without forum/topics mode | First sendMessageDraft fails → streamer degrades silently → Finalize sends via SendMessage |
| Non-streaming provider | Type assertion fails → falls back to Chat() → normal placeholder flow |
| streaming.enabled: false | BeginStream returns error → GetStreamer returns nil → normal placeholder flow |
| Tool calls mid-stream | Stream cancelled → tools execute → response sent via normal outbound path |
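The non-streaming provider fallback relies on Go's comma-ok type assertion. A minimal sketch (interface shapes are illustrative; the real interfaces take a context and richer request types):

```go
package main

import "fmt"

// Provider is the baseline capability every provider implements.
type Provider interface {
	Chat(prompt string) string
}

// StreamingProvider is the opt-in extension.
type StreamingProvider interface {
	Provider
	ChatStream(prompt string, onChunk func(accumulated string)) string
}

type basic struct{}

func (basic) Chat(p string) string { return "full: " + p }

type streamy struct{ basic }

func (streamy) ChatStream(p string, onChunk func(string)) string {
	onChunk("partial")
	return "streamed: " + p
}

// respond prefers ChatStream when the provider opts in, else falls back.
func respond(p Provider, prompt string) string {
	if sp, ok := p.(StreamingProvider); ok {
		return sp.ChatStream(prompt, func(string) {})
	}
	return p.Chat(prompt) // normal placeholder flow
}

func main() {
	fmt.Println(respond(basic{}, "hi"))   // basic lacks ChatStream
	fmt.Println(respond(streamy{}, "hi")) // streams
}
```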

Test plan

  • Verified streamed=true in logs with OpenRouter provider
  • Verified graceful degradation on 429 rate limit (streamer disables, Finalize delivers)
  • Verified go build ./pkg/... and go vet ./pkg/... pass
  • Verified full binary builds cleanly
  • Manual test with forum-mode-enabled bot for full draft streaming UX
  • Verify tool-call scenarios cancel stream correctly

Implements real-time token streaming to Telegram using the sendMessageDraft
API (telego v1.6.0). Instead of showing only a "Thinking..." placeholder
until the full response arrives, users now see partial LLM output appear
in the chat as it's generated.

The streaming pipeline threads through all layers:

- StreamingProvider interface (providers/types.go): opt-in ChatStream()
  method that receives an onChunk callback with accumulated text
- OpenAI-compatible SSE streaming (openai_compat/provider.go): parses
  SSE events with stream:true, handles text deltas and tool call assembly
- Anthropic native streaming (anthropic/provider.go): uses SDK's
  NewStreaming() for direct Anthropic API connections
- HTTPProvider delegation (http_provider.go): delegates ChatStream to
  the underlying openai_compat provider
- StreamingCapable + Streamer interfaces (channels/interfaces.go):
  opt-in channel capability like TypingCapable/PlaceholderCapable
- Telegram streamer (telegram/telegram.go): BeginStream returns a
  telegramStreamer that throttles sendMessageDraft calls (3s/200 chars)
  with graceful degradation on API errors
- StreamDelegate bridge (bus/bus.go): decouples agent loop from channel
  manager without tight imports
- Manager integration (manager.go): implements StreamDelegate, tracks
  streamActive state, coordinates with placeholder editing
- Agent loop (loop.go): uses ChatStream when both provider and channel
  support streaming, cancels stream on tool calls, skips PublishOutbound
  when Finalize already delivered the message

Graceful degradation:
- Bots without forum/topics mode: first sendMessageDraft error sets
  failed=true, subsequent Updates become no-ops, Finalize still delivers
  via SendMessage. User sees normal non-streaming behavior.
- Non-streaming providers: type assertion fails, falls back to Chat()
- Config opt-out: streaming.enabled (default true) in telegram config

Closes sipeed#1098
…ponse

When streaming was active, the "Thinking..." placeholder message stayed
in the chat because preSend only deleted the tracking entry without
removing the actual Telegram message. Now preSend deletes the placeholder
via the new MessageDeleter interface when streamActive is set.
- Delete unused Anthropic ChatStream/parseStream (-131 lines) — factory
  creates HTTPProvider for all OpenAI-compat providers including OpenRouter
- Simplify runLLMIteration from 4 to 3 return values (remove unused
  streamed bool)
- Replace managerStreamer struct with finalizeHookStreamer using embedding
  (Update/Cancel promoted, only Finalize overridden)
Heartbeat messages set SendResponse=false but the streaming path
was unconditionally acquiring a streamer, causing HEARTBEAT_OK to
leak to Telegram via streamer.Finalize().
…ng config

Skip streamer acquisition for heartbeat (NoHistory=true), preventing
HEARTBEAT_OK from leaking to Telegram via streamer.Finalize().

Add streaming.enabled to Telegram defaults and example config.
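The heartbeat guard reduces to a single check before streamer acquisition. A sketch (the NoHistory field comes from the PR; the surrounding types are hypothetical):

```go
package main

import "fmt"

// inbound models the minimal message fields relevant to the guard.
type inbound struct {
	NoHistory bool // set for internal heartbeat messages
	Text      string
}

// acquireStreamer reports whether a streamer should be obtained; heartbeat
// messages must never stream, or their reply leaks to the chat via Finalize.
func acquireStreamer(msg inbound) bool {
	if msg.NoHistory {
		return false
	}
	return true
}

func main() {
	fmt.Println(acquireStreamer(inbound{NoHistory: true, Text: "HEARTBEAT"}))
	fmt.Println(acquireStreamer(inbound{Text: "user question"}))
}
```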
amirmamaghani and others added 7 commits March 5, 2026 10:33
…coclaw into feat/telegram-streaming

# Conflicts:
#	pkg/agent/loop.go
#	pkg/providers/types.go
@CLAassistant

CLAassistant commented Mar 5, 2026

CLA assistant check
All committers have signed the CLA.

amirmamaghani and others added 2 commits March 5, 2026 10:50
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix gci import ordering in telegram and anthropic provider, and break
long function signature in openai_compat provider to satisfy golines.
@amirmamaghani
Contributor Author

ready for merge guys @yinwm @alexhoshina @lxowalle

Collaborator

@yinwm yinwm left a comment


Code Review: Telegram Streaming Implementation

This PR implements real-time LLM response streaming to Telegram using sendMessageDraft API. The overall architecture is well-designed with clean separation of concerns.


🔴 Critical Issues

1. Duplicate Streamer Interface Definition - Potential Type Mismatch

pkg/channels/interfaces.go and pkg/bus/bus.go both define Streamer interface:

```go
// channels/interfaces.go
type Streamer interface {
    Update(ctx context.Context, content string) error
    Finalize(ctx context.Context, content string) error
    Cancel(ctx context.Context)
}

// bus/bus.go
type Streamer interface {
    Update(ctx context.Context, content string) error
    Finalize(ctx context.Context, content string) error
    Cancel(ctx context.Context)
}
```

Problem: These are two different types in Go. manager.go returns channels.Streamer, but bus.GetStreamer declares return type as bus.Streamer. This could cause subtle bugs or compilation issues if the interfaces drift apart.

Recommendation: Keep only one definition. Define Streamer in bus/bus.go and have channels package import and use bus.Streamer. Or use type aliasing.


2. Scanner Buffer Size Limitation in SSE Parsing

pkg/providers/openai_compat/provider.go:247:

```go
scanner := bufio.NewScanner(reader)
for scanner.Scan() {
```

bufio.Scanner has a default max token size of 64KB. If an LLM response contains a single line (without newlines) exceeding 64KB, it will trigger bufio.ErrTooLong.

Recommendation: Increase buffer size:

```go
scanner := bufio.NewScanner(reader)
buf := make([]byte, 0, 1024*1024) // 1MB initial capacity
scanner.Buffer(buf, 10*1024*1024) // max 10MB
```

🟡 Medium Issues

3. DraftID Potential Collision Risk

pkg/channels/telegram/telegram.go:815:

```go
draftID: rand.Intn(1<<31-1) + 1, // non-zero random draft ID
```

math/rand is not cryptographically secure, and without proper seeding could produce predictable/colliding values.

Recommendation: Use crypto/rand:

```go
import (
    "crypto/rand"
    "encoding/binary"
)

func randomDraftID() int {
    var b [4]byte
    if _, err := rand.Read(b[:]); err != nil {
        panic(err) // crypto/rand read failure is not recoverable here
    }
    return int(binary.BigEndian.Uint32(b[:])) | 1 // low bit set guarantees non-zero
}
```

4. Missing Context Cancellation Check in Stream Parsing

In parseStreamResponse, if the context is cancelled mid-stream, the function continues processing until the stream ends naturally.

Recommendation: Add context check in the parsing loop:

```go
select {
case <-ctx.Done():
    return nil, ctx.Err()
default:
    // process chunk
}
```

5. Silent Failure in Finalize

When Update fails (sets failed=true), Finalize is still called and sends the message. However, if Finalize also fails after the fallback, the user might not see any message at all.

Recommendation: Consider retry mechanism or logging to persistent storage for recovery.


📋 Suggestions

6. Add Observability/Metrics

Streaming success rate, fallback rate, and latency are important for operations.

Recommendation: Add Prometheus metrics or structured logging.


7. Configuration Could Be More Flexible

The throttle constants are hardcoded:

```go
const (
    streamThrottleInterval = 3 * time.Second
    streamMinGrowth        = 200
)
```

Recommendation: Consider making these configurable for different deployment scenarios.


✅ Well Done

  1. Graceful Degradation: The failed flag pattern provides smooth fallback
  2. Interface Design: StreamingCapable as optional interface maintains backward compatibility
  3. Throttling Strategy: Prevents Telegram API rate limiting
  4. Tool Call Handling: Correctly cancels stream when tool calls are detected
  5. Heartbeat Fix: Properly guards internal messages from triggering streaming

Summary

| Severity | Count |
| --- | --- |
| 🔴 Critical | 2 |
| 🟡 Medium | 3 |
| 🟢 Suggestion | 2 |

Recommendation: Address at least the two critical issues before merging.

🤖 Generated with Claude Code

- Deduplicate Streamer interface: alias channels.Streamer to bus.Streamer
  to prevent type drift across packages
- Increase SSE scanner buffer to 10MB max to handle large single-line
  responses that exceed bufio.Scanner's 64KB default
- Switch draftID generation from math/rand to crypto/rand for
  collision-resistant random IDs
- Add context cancellation check in SSE parsing loop so cancelled
  streams stop processing immediately
- Log Finalize failures with chat_id and content length for debugging
  silent message delivery failures
@amirmamaghani amirmamaghani force-pushed the feat/telegram-streaming branch from 22bba53 to fa3c00a on March 5, 2026 17:46
@amirmamaghani
Contributor Author

amirmamaghani commented Mar 5, 2026

Thanks for the thorough review @yinwm! Pushed fixes for all actionable items:

Critical:

  1. Duplicate Streamer interface — channels.Streamer is now a type alias for bus.Streamer, single source of truth
  2. Scanner buffer size — Increased to 1MB initial / 10MB max to handle large single-line SSE payloads

Medium:
3. DraftID collision — Switched from math/rand to crypto/rand
4. Context cancellation in stream parsing — Added ctx.Err() check between chunks in the SSE loop
5. Finalize logging — Added structured error logging with chat_id and content length when both HTML and plain-text delivery fail

Skipped #6 (observability/metrics) and #7 (configurable throttle) for now — can follow up in separate PRs if needed.

EDIT: you know what, let me do the throttle config too 😁

Move hardcoded streamThrottleInterval (3s) and streamMinGrowth (200)
into StreamingConfig so they can be tuned per deployment via config
or environment variables.
Resolve conflicts in agent/loop.go (streaming + candidate routing)
and channels/interfaces.go (bus + commands imports).
@amirmamaghani
Contributor Author

forgot to write you.. ready to merge! 🙌

Resolve conflicts in pkg/agent/loop.go (keep both reasoning
fallback and stream finalize) and pkg/providers/openai_compat/provider.go
(keep ChatStream + HTML error helpers + io.Reader parseResponse).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@amirmamaghani
Contributor Author

guys, just a friendly reminder that ready to merge @yinwm @alexhoshina

@alexhoshina
Collaborator

guys, just a friendly reminder that ready to merge @yinwm @alexhoshina

I've briefly reviewed the changes related to the channel, and they should be fine. The other parts might need to be reviewed by @yinwm.

@amirmamaghani
Contributor Author

hey bro, just a reminder @yinwm

@alexhoshina
Collaborator

Perhaps the CI issues need to be resolved, as all three checks have failed

These two functions called undefined parseChatID. Use
parseTelegramChatID with _ for the unused threadID instead of adding
a wrapper function. Fixes all three CI checks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@amirmamaghani amirmamaghani force-pushed the feat/telegram-streaming branch from 8acb6a1 to 4669e5a on March 11, 2026 16:56
@amirmamaghani
Contributor Author

Updated — removed the wrapper function, now just uses parseTelegramChatID directly with _ for the unused threadID. Same fix, no new functions.

Collaborator

@yinwm yinwm left a comment


Review: Ready to Merge ✅

This PR is well-designed with a solid streaming pipeline implementation and robust graceful degradation.

Before merging, please address:

  1. 🗑️ Remove the pico-echo-server binary file that was accidentally committed
  2. 🔀 Resolve any conflicts with the main branch

Once these are addressed, this is good to merge. The remaining suggestions I had (tool call index handling, unit tests) are incremental improvements that can be addressed in follow-up PRs.

… binary

Resolves merge conflicts in config, agent loop, bus, defaults, and
openai_compat provider. Updates streaming code to use refactored
common package helpers (ReadAndParseResponse, HandleErrorResponse,
AsInt, AsFloat). Removes accidentally committed pico-echo-server binary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@amirmamaghani
Contributor Author

amirmamaghani commented Mar 18, 2026

Resolved the two items from the latest review:

  1. Removed pico-echo-server binary — accidentally committed binary has been deleted
  2. Resolved merge conflicts with main — updated 5 files:
    • config/config.example.json — merged use_markdown_v2 field + streaming config
    • pkg/agent/loop.go — merged activeRequests tracking with streaming support
    • pkg/bus/bus.go — merged closeOnce/wg refactoring with streamDelegate
    • pkg/config/defaults.go — merged UseMarkdownV2 default with Streaming default
    • pkg/providers/openai_compat/provider.go — adapted ChatStream to use refactored common package (HandleErrorResponse, AsInt, AsFloat, supportsPromptCacheKey)

All checks pass locally: go build, go vet, golangci-lint, and go test ./pkg/... (all green).

@amirmamaghani
Contributor Author

ready for merge

```go
}

func (s *finalizeHookStreamer) Finalize(ctx context.Context, content string) error {
	s.onFinalize()
```

Setting streamActive to true too early may prevent message sending from falling back to the regular path after Streamer.Finalize fails.

amirmamaghani and others added 2 commits March 20, 2026 13:48
Keep both streamActive map from streaming feature and channelHashes
map + initChannels signature change from main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move onFinalize hook to run after Streamer.Finalize succeeds, so that
if Finalize fails the streamActive flag stays false and the regular
placeholder fallback path remains available.

Addresses review feedback from @alexhoshina.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alexhoshina alexhoshina merged commit 71134ba into sipeed:main Mar 20, 2026
4 checks passed


Successfully merging this pull request may close these issues.

[Feature] Telegram realtime stream response
