Description
Executive Summary
This feature request proposes implementing session persistence across tool calls and session pooling to address performance bottlenecks and state-continuity issues in Context Forge. The approach prioritizes the existing stateful session support (`USE_STATEFUL_SESSIONS`) and extends it to other transports.
Key Goals:
- Session persistence: reuse one MCP session across multiple tool calls for the same client workflow
- Session pooling: cache sessions for reuse across separate HTTP requests where a persistent channel is not used
- Enable state continuity without weakening isolation or multi-tenant boundaries
Problem Statement
Context Forge currently implements a per-request session strategy for SSE transport, where every tool invocation creates a new MCP session. While this prioritizes isolation and fault tolerance, it introduces significant performance penalties and breaks stateful tool workflows.
Current Behavior Issues:
- Network Overhead: Each tool call requires 4-6 network round trips before execution
- State Loss: Conversational context is lost between sequential tool calls
- Resource Consumption: Memory and TCP connections grow linearly with tool invocation frequency
- Protocol Compliance: Violates MCP specification expectations for stateful sessions
Performance Impact:
- Resource Consumption Per Session: 2-4KB memory (asyncio.Queue + Event + registry entry)
- Network Overhead: 4-6 HTTP round trips (initialize + initialized + 3x notifications)
- Scaling Issues: At 100 tools/second = 400KB/sec memory + 400-600 network calls/sec
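The scaling figures above follow directly from the per-session estimates; a quick back-of-envelope check (using the upper bounds quoted above):

```python
# Sanity check of the per-session cost figures quoted above (upper estimates).
SESSION_MEMORY_KB = 4        # asyncio.Queue + Event + registry entry
ROUND_TRIPS_PER_CALL = 6     # initialize + initialized + ~3 notifications + the call

calls_per_second = 100

memory_kb_per_sec = calls_per_second * SESSION_MEMORY_KB         # new session state per second
network_calls_per_sec = calls_per_second * ROUND_TRIPS_PER_CALL  # extra round trips per second

print(memory_kb_per_sec, network_calls_per_sec)  # 400 600
```

At the lower bounds (2 KB, 4 round trips) the same arithmetic gives 200 KB/s and 400 calls/s, matching the ranges above.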
Root Cause Analysis:
SSE Endpoint (`mcpgateway/main.py:1313-1347`):

```python
@server_router.get("/{server_id}/sse")
async def sse_endpoint(...):
    transport = SSETransport(base_url=server_sse_url)  # ❌ NEW EVERY TIME
    await transport.connect()
    await session_registry.add_session(transport.session_id, transport)
```
Session ID Generation (`mcpgateway/transports/sse_transport.py:114`):

```python
def __init__(self, base_url: str = None):
    self._session_id = str(uuid.uuid4())  # ❌ FRESH UUID PER INSTANCE
```
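A pooled alternative would key sessions by `(user, server)` instead of minting a fresh UUID per request. The sketch below is illustrative only, not Context Forge's implementation; `SessionPool`, `PooledSession`, and the idle-timeout behavior are all hypothetical:

```python
import asyncio
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class PooledSession:
    session_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    last_used: float = field(default_factory=time.monotonic)


class SessionPool:
    """Hypothetical pool: one reusable session per (user, server) pair."""

    def __init__(self, idle_timeout: float = 300.0):
        self._sessions: dict[tuple[str, str], PooledSession] = {}
        self._idle_timeout = idle_timeout
        self._lock = asyncio.Lock()

    async def acquire(self, user_id: str, server_id: str) -> PooledSession:
        key = (user_id, server_id)
        async with self._lock:
            session = self._sessions.get(key)
            now = time.monotonic()
            if session is None or now - session.last_used > self._idle_timeout:
                # Cache miss: create a session (the MCP init sequence would run here).
                session = PooledSession()
                self._sessions[key] = session
            # Cache hit: reuse the session and skip the init round trips.
            session.last_used = now
            return session
```

Repeated calls for the same `(user, server)` return the same `session_id`; different users never share a session, preserving the isolation boundary.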
Current Configuration Landscape
Existing Stateful Sessions (already implemented but disabled):
```
# Session behavior flag (used by streamable HTTP; not yet wired for SSE/WS)
USE_STATEFUL_SESSIONS=false

# Cache backend selection (impacts session storage)
CACHE_TYPE=database
# CACHE_TYPE=redis
```
Current Limitations:
- No session pooling configuration options
- No per-server session strategy control
- No session reuse timeout settings
- No user-scoped session limits
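The gaps above suggest configuration knobs along these lines; every setting name below is hypothetical and shown only to make the missing surface concrete:

```
# Hypothetical settings — none of these exist today
SESSION_POOLING_ENABLED=true      # global pooling on/off
SESSION_REUSE_TIMEOUT=300         # idle seconds before a pooled session is discarded
SESSION_MAX_PER_USER=10           # cap on concurrent pooled sessions per user
# Per-server override (hypothetical), e.g. in server config:
# servers.<server_id>.session_strategy = "pooled" | "per-request"
```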
Quick Win Opportunity
Enable existing stateful sessions for streamable HTTP transport:
- Set `USE_STATEFUL_SESSIONS=true` in `.env`
- Behavior: a single client workflow reuses one MCP session across multiple tool invocations
- Reference: `mcpgateway/transports/streamablehttp_transport.py` already supports this
Acceptance Criteria
- Streamable HTTP with `USE_STATEFUL_SESSIONS=true` reuses a single MCP session across multiple tool calls
- SSE/WS with pooling enabled reuses sessions for the same `(user, server)` pair across requests
- A per-server override can enable/disable pooling regardless of global defaults
- User isolation maintained: sessions never shared across users or tenants
- Auth context preserved and validated on session reuse
- Observability: metrics expose pool hit/miss, active/idle counts, cleanup events
- Backward compatibility: with pooling disabled, behavior matches current per-request sessions
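The observability criterion implies a small set of counters. A minimal sketch of what such metrics could look like (the `PoolMetrics` class and field names are hypothetical, not an existing Context Forge API):

```python
from dataclasses import dataclass


@dataclass
class PoolMetrics:
    """Hypothetical counters for the observability criterion above."""
    hits: int = 0      # session reused from the pool
    misses: int = 0    # new session had to be created
    active: int = 0    # sessions currently checked out
    idle: int = 0      # sessions sitting in the pool
    cleanups: int = 0  # idle sessions expired by cleanup

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


m = PoolMetrics(hits=80, misses=20)
print(m.hit_rate())  # 0.8
```

In practice these would be exported through whatever metrics backend the gateway already uses; the hit rate doubles as a direct check on the "80%+ of calls skip init" expectation below.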
Expected Performance Improvements
- Response Time: 30%+ improvement expected
- Memory Usage: 70%+ reduction expected
- Network Calls: 75%+ reduction expected
- Initialization Skipped: 80%+ of calls should skip init sequence
Impact Analysis
Cache Type Compatibility:
- Memory backend (`CACHE_TYPE=memory`): ❌ Not suitable for multiple workers
- Redis backend (`CACHE_TYPE=redis`): ✅ Fully compatible with multi-worker deployments
- Database backend (`CACHE_TYPE=database`): ⚠️ Works but creates a DB performance bottleneck
Multi-Worker/Container Impact:
- Memory backend + Multiple Workers: ❌ Session loss across workers
- Redis backend + Multiple Workers: ✅ Sessions shared across workers
- Database backend + Multiple Workers: ⚠️ Works but ~3x DB load increase
Labels: enhancement, performance, session-management
Priority: high
Note: Design details, implementation specifications, and comprehensive testing strategy will be provided in follow-up comments.