Feature Request: Implement Session Persistence & Pooling for Improved Performance and State Continuity #975

@crivetimihai

Description

Executive Summary

This feature request proposes implementing session persistence across tool calls and session pooling to address performance bottlenecks and state continuity issues in Context Forge. The approach prioritizes using existing stateful session support (USE_STATEFUL_SESSIONS) and extending it to other transports.

Key Goals:

  • Session persistence: reuse one MCP session across multiple tool calls for the same client workflow
  • Session pooling: cache sessions for reuse across separate HTTP requests where a persistent channel is not used
  • Enable state continuity without weakening isolation or multi-tenant boundaries

Problem Statement

Context Forge currently implements a per-request session strategy for SSE transport, where every tool invocation creates a new MCP session. While this prioritizes isolation and fault tolerance, it introduces significant performance penalties and breaks stateful tool workflows.

Current Behavior Issues:

  • Network Overhead: Each tool call requires 4-6 network round trips before execution
  • State Loss: Conversational context lost between sequential tool calls
  • Resource Consumption: Memory and TCP connections grow linearly with tool invocation frequency
  • Protocol Compliance: Violates MCP specification expectations for stateful sessions

Performance Impact:

  • Resource Consumption Per Session: 2-4KB memory (asyncio.Queue + Event + registry entry)
  • Network Overhead: 4-6 HTTP round trips (initialize + initialized + 3x notifications)
  • Scaling Issues: at 100 tool calls/second, roughly 200-400 KB/sec of additional memory plus 400-600 network calls/sec (quick check below)
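
A quick arithmetic check of that scaling claim, using the per-session figures above (the 100 calls/second rate is illustrative):

# Back-of-the-envelope check using the per-session figures above.
calls_per_second = 100
memory_per_session_kb = (2, 4)        # asyncio.Queue + Event + registry entry
round_trips_per_call = (4, 6)         # initialize + initialized + 3x notifications

print("memory churn (KB/sec):", [kb * calls_per_second for kb in memory_per_session_kb])   # [200, 400]
print("network calls per sec:", [rt * calls_per_second for rt in round_trips_per_call])    # [400, 600]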

Root Cause Analysis:

SSE Endpoint (mcpgateway/main.py:1313-1347):

@server_router.get("/{server_id}/sse")
async def sse_endpoint(...):
    transport = SSETransport(base_url=server_sse_url)  # ❌ NEW EVERY TIME
    await transport.connect()
    await session_registry.add_session(transport.session_id, transport)

Session ID Generation (mcpgateway/transports/sse_transport.py:114):

def __init__(self, base_url: str = None):
    self._session_id = str(uuid.uuid4())  # ❌ FRESH UUID PER INSTANCE
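
A pooled variant would check for an existing session before constructing a new transport. The sketch below is hypothetical: the pool object and its get/put methods do not exist in the codebase today, and only SSETransport and session_registry come from the excerpts above.

# Hypothetical sketch only -- pool.get/pool.put are not existing APIs.
from mcpgateway.transports.sse_transport import SSETransport  # module path per the excerpt above

async def get_or_create_transport(pool, session_registry, user_id: str, server_id: str, base_url: str):
    key = (user_id, server_id)                        # sessions are never shared across users/tenants
    transport = await pool.get(key)                   # pool hit: skip the 4-6 round-trip initialize sequence
    if transport is None:                             # pool miss: fall back to today's per-request behavior
        transport = SSETransport(base_url=base_url)
        await transport.connect()
        await session_registry.add_session(transport.session_id, transport)
        await pool.put(key, transport)                # make the session reusable for the next call
    return transport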

Current Configuration Landscape

Existing Stateful Sessions (already implemented but disabled):

# Session behavior flag (used by streamable HTTP; not yet wired for SSE/WS)
USE_STATEFUL_SESSIONS=false

# Cache backend selection (impacts session storage)
CACHE_TYPE=database
# CACHE_TYPE=redis

Current Limitations (illustrated by the hypothetical sketch after this list):

  • No session pooling configuration options
  • No per-server session strategy control
  • No session reuse timeout settings
  • No user-scoped session limits
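
To make these gaps concrete, the missing knobs might look something like the following. None of these settings exist today; every name is hypothetical and shown for illustration only.

# Hypothetical settings -- not implemented; names are illustrative only
SESSION_POOLING_ENABLED=false         # global default for SSE/WS session pooling
SESSION_REUSE_TIMEOUT=300             # seconds an idle session may be reused before cleanup
SESSION_MAX_PER_USER=10               # cap on pooled sessions per (user, server) pair
# A per-server override (e.g. in server settings) would take precedence over the global flag.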

Quick Win Opportunity

Enable existing stateful sessions for streamable HTTP transport:

  • Set USE_STATEFUL_SESSIONS=true in .env (example below)
  • Behavior: single client workflow reuses one MCP session across multiple tool invocations
  • Reference: mcpgateway/transports/streamablehttp_transport.py already supports this
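
A minimal .env for the quick win, using only the two existing settings shown above (the Redis backend is optional for a single worker, but recommended for multi-worker deployments per the impact analysis below):

USE_STATEFUL_SESSIONS=true    # reuse one MCP session per client workflow (streamable HTTP)
CACHE_TYPE=redis              # shared session storage for multi-worker deployments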

Acceptance Criteria

  • Streamable HTTP with USE_STATEFUL_SESSIONS=true reuses single MCP session across multiple tool calls
  • SSE/WS with pooling enabled reuses sessions for the same (user, server) pair across requests
  • Per-server override can enable/disable pooling regardless of global defaults
  • User isolation maintained: sessions never shared across users or tenants
  • Auth context preserved and validated on session reuse
  • Observability: metrics expose pool hit/miss, active/idle counts, cleanup events (see the pool sketch after this list)
  • Backward compatibility: with pooling disabled, behavior matches current per-request sessions
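
To make the pooling and observability criteria concrete, here is a minimal sketch of what a per-user session pool might look like. Everything here is hypothetical (class names, fields, metric counters); it is an illustration of (user, server) keying, reuse timeouts, and hit/miss accounting, not a design commitment.

import asyncio
import time
from dataclasses import dataclass, field

@dataclass
class PooledSession:
    transport: object                  # e.g. an initialized SSETransport
    last_used: float = field(default_factory=time.monotonic)

class SessionPool:
    """Hypothetical pool keyed by (user_id, server_id); entries are never shared across users."""

    def __init__(self, reuse_timeout: float = 300.0):
        self._sessions: dict[tuple[str, str], PooledSession] = {}
        self._lock = asyncio.Lock()
        self.reuse_timeout = reuse_timeout
        self.hits = 0                  # exported as a pool-hit metric
        self.misses = 0                # exported as a pool-miss metric

    async def get(self, key: tuple[str, str]):
        """Return a reusable transport for this (user, server) key, or None on a miss."""
        async with self._lock:
            entry = self._sessions.get(key)
            if entry and time.monotonic() - entry.last_used < self.reuse_timeout:
                self.hits += 1
                entry.last_used = time.monotonic()
                return entry.transport
            self.misses += 1
            return None

    async def put(self, key: tuple[str, str], transport) -> None:
        """Register a freshly initialized transport for later reuse."""
        async with self._lock:
            self._sessions[key] = PooledSession(transport)

    async def cleanup_idle(self) -> int:
        """Drop sessions idle past the reuse timeout; the count feeds a cleanup-events metric."""
        now = time.monotonic()
        async with self._lock:
            stale = [k for k, v in self._sessions.items() if now - v.last_used >= self.reuse_timeout]
            for k in stale:
                self._sessions.pop(k)
            return len(stale)

A pool like this could back the get_or_create_transport sketch shown earlier; auth-context validation on reuse and the per-server override would layer on top of get().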

Expected Performance Improvements

  • Response Time: 30%+ improvement expected
  • Memory Usage: 70%+ reduction expected
  • Network Calls: 75%+ reduction expected
  • Initialization Skipped: 80%+ of calls should skip init sequence

Impact Analysis

Cache Type Compatibility:

  • Memory Backend (CACHE_TYPE=memory): ❌ Not suitable for multiple workers
  • Redis Backend (CACHE_TYPE=redis): ✅ Fully compatible with multi-worker deployments
  • Database Backend (CACHE_TYPE=database): ⚠️ Works but creates DB performance bottleneck

Multi-Worker/Container Impact (example configuration after the list):

  • Memory backend + Multiple Workers: ❌ Session loss across workers
  • Redis backend + Multiple Workers: ✅ Sessions shared across workers
  • Database backend + Multiple Workers: ⚠️ Works but 3x DB load increase
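
As a hedged example, a multi-worker deployment would combine the flags above with the Redis backend so every worker sees the same sessions. The Redis connection variable shown is an assumption; use whatever your deployment already defines.

CACHE_TYPE=redis                      # sessions shared across workers
# REDIS_URL=redis://redis:6379/0      # assumed setting name -- adjust to your deployment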

Labels: enhancement, performance, session-management
Priority: high

Note: Design details, implementation specifications, and comprehensive testing strategy will be provided in follow-up comments.
