Description
Executive Summary
This feature request proposes implementing session persistence across tool calls and session pooling to address performance bottlenecks and state-continuity issues in Context Forge. The approach prioritizes the existing stateful session support (`USE_STATEFUL_SESSIONS`) and extends it to other transports.
Key Goals:
- Session persistence: reuse one MCP session across multiple tool calls for the same client workflow
- Session pooling: cache sessions for reuse across separate HTTP requests where a persistent channel is not used
- Enable state continuity without weakening isolation or multi-tenant boundaries
Problem Statement
Context Forge currently implements a per-request session strategy for SSE transport, where every tool invocation creates a new MCP session. While this prioritizes isolation and fault tolerance, it introduces significant performance penalties and breaks stateful tool workflows.
Current Behavior Issues:
- Network Overhead: Each tool call requires 4-6 network round trips before execution
- State Loss: Conversational context is lost between sequential tool calls
- Resource Consumption: Memory and TCP connections grow linearly with tool invocation frequency
- Protocol Compliance: Violates MCP specification expectations for stateful sessions
Performance Impact:
- Resource Consumption Per Session: 2-4KB memory (asyncio.Queue + Event + registry entry)
- Network Overhead: 4-6 HTTP round trips (initialize + initialized + 3x notifications)
- Scaling Issues: At 100 tools/second = 400KB/sec memory + 400-600 network calls/sec
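The scaling figures above follow directly from the per-session estimates; a quick back-of-envelope check (using the upper bounds quoted above):

```python
# Sanity check of the per-session cost figures quoted above (upper estimates).
SESSION_MEMORY_KB = 4        # asyncio.Queue + Event + registry entry
ROUND_TRIPS_PER_CALL = 6     # initialize + initialized + ~3 notifications + the call

calls_per_second = 100

memory_kb_per_sec = calls_per_second * SESSION_MEMORY_KB         # new session state per second
network_calls_per_sec = calls_per_second * ROUND_TRIPS_PER_CALL  # extra round trips per second

print(memory_kb_per_sec, network_calls_per_sec)  # 400 600
```

At the lower bounds (2 KB, 4 round trips) the same arithmetic gives 200 KB/s and 400 calls/s, matching the ranges above.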
Root Cause Analysis:
SSE Endpoint (`mcpgateway/main.py:1313-1347`):

```python
@server_router.get("/{server_id}/sse")
async def sse_endpoint(...):
    transport = SSETransport(base_url=server_sse_url)  # ❌ NEW EVERY TIME
    await transport.connect()
    await session_registry.add_session(transport.session_id, transport)
```
Session ID Generation (`mcpgateway/transports/sse_transport.py:114`):

```python
def __init__(self, base_url: str = None):
    self._session_id = str(uuid.uuid4())  # ❌ FRESH UUID PER INSTANCE
```
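A pooled alternative would key sessions by `(user, server)` instead of minting a fresh UUID per request. The sketch below is illustrative only, not Context Forge's implementation; `SessionPool`, `PooledSession`, and the idle-timeout behavior are all hypothetical:

```python
import asyncio
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class PooledSession:
    session_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    last_used: float = field(default_factory=time.monotonic)


class SessionPool:
    """Hypothetical pool: one reusable session per (user, server) pair."""

    def __init__(self, idle_timeout: float = 300.0):
        self._sessions: dict[tuple[str, str], PooledSession] = {}
        self._idle_timeout = idle_timeout
        self._lock = asyncio.Lock()

    async def acquire(self, user_id: str, server_id: str) -> PooledSession:
        key = (user_id, server_id)
        async with self._lock:
            session = self._sessions.get(key)
            now = time.monotonic()
            if session is None or now - session.last_used > self._idle_timeout:
                # Cache miss: create a session (the MCP init sequence would run here).
                session = PooledSession()
                self._sessions[key] = session
            # Cache hit: reuse the session and skip the init round trips.
            session.last_used = now
            return session
```

Repeated calls for the same `(user, server)` return the same `session_id`; different users never share a session, preserving the isolation boundary.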
Current Configuration Landscape
Existing Stateful Sessions (already implemented but disabled):
```
# Session behavior flag (used by streamable HTTP; not yet wired for SSE/WS)
USE_STATEFUL_SESSIONS=false

# Cache backend selection (impacts session storage)
CACHE_TYPE=database
# CACHE_TYPE=redis
```
Current Limitations:
- No session pooling configuration options
- No per-server session strategy control
- No session reuse timeout settings
- No user-scoped session limits
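The gaps above suggest configuration knobs along these lines; every setting name below is hypothetical and shown only to make the missing surface concrete:

```
# Hypothetical settings — none of these exist today
SESSION_POOLING_ENABLED=true      # global pooling on/off
SESSION_REUSE_TIMEOUT=300         # idle seconds before a pooled session is discarded
SESSION_MAX_PER_USER=10           # cap on concurrent pooled sessions per user
# Per-server override (hypothetical), e.g. in server config:
# servers.<server_id>.session_strategy = "pooled" | "per-request"
```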
Quick Win Opportunity
Enable existing stateful sessions for streamable HTTP transport:
- Set `USE_STATEFUL_SESSIONS=true` in `.env`
- Behavior: a single client workflow reuses one MCP session across multiple tool invocations
- Reference: `mcpgateway/transports/streamablehttp_transport.py` already supports this
Acceptance Criteria
- Streamable HTTP with `USE_STATEFUL_SESSIONS=true` reuses a single MCP session across multiple tool calls
- SSE/WS with pooling enabled reuses sessions for the same `(user, server)` pair across requests
- A per-server override can enable/disable pooling regardless of global defaults
- User isolation maintained: sessions never shared across users or tenants
- Auth context preserved and validated on session reuse
- Observability: metrics expose pool hit/miss, active/idle counts, cleanup events
- Backward compatibility: with pooling disabled, behavior matches current per-request sessions
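The observability criterion implies a small set of counters. A minimal sketch of what such metrics could look like (the `PoolMetrics` class and field names are hypothetical, not an existing Context Forge API):

```python
from dataclasses import dataclass


@dataclass
class PoolMetrics:
    """Hypothetical counters for the observability criterion above."""
    hits: int = 0      # session reused from the pool
    misses: int = 0    # new session had to be created
    active: int = 0    # sessions currently checked out
    idle: int = 0      # sessions sitting in the pool
    cleanups: int = 0  # idle sessions expired by cleanup

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


m = PoolMetrics(hits=80, misses=20)
print(m.hit_rate())  # 0.8
```

In practice these would be exported through whatever metrics backend the gateway already uses; the hit rate doubles as a direct check on the "80%+ of calls skip init" expectation below.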
Expected Performance Improvements
- Response Time: 30%+ improvement expected
- Memory Usage: 70%+ reduction expected
- Network Calls: 75%+ reduction expected
- Initialization Skipped: 80%+ of calls should skip init sequence
Impact Analysis
Cache Type Compatibility:
- Memory backend (`CACHE_TYPE=memory`): ❌ Not suitable for multiple workers
- Redis backend (`CACHE_TYPE=redis`): ✅ Fully compatible with multi-worker deployments
- Database backend (`CACHE_TYPE=database`): ⚠️ Works but creates a DB performance bottleneck
Multi-Worker/Container Impact:
- Memory backend + Multiple Workers: ❌ Session loss across workers
- Redis backend + Multiple Workers: ✅ Sessions shared across workers
- Database backend + Multiple Workers: ⚠️ Works but ~3x DB load increase
Labels: enhancement, performance, session-management
Priority: high
Note: Design details, implementation specifications, and comprehensive testing strategy will be provided in follow-up comments.