gRPC proxy for multi-user Spark Connect access via KBase authentication.
Routes PySpark Spark Connect gRPC traffic from a single endpoint to the correct user's notebook pod based on the KBase token in the request metadata.
User (PySpark) → Ingress (spark.berdl.kbase.us:443) → Spark Connect Proxy → jupyter-{username}:15002
- User sends gRPC request with `x-kbase-token` metadata
- Proxy validates token, resolves username
- Proxy forwards gRPC to `jupyter-{username}.jupyterhub-prod.svc.cluster.local:15002`
- Responses stream back to user transparently
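
The routing decision in the steps above can be sketched in a few lines of Python. This is an illustration only, not the proxy's actual code: `extract_token`, `resolve_backend`, and the `validate_token` callable are hypothetical names, and the real KBase Auth2 lookup is replaced by a function you pass in. Only the `x-kbase-token` key and the backend address pattern come from the description above.

```python
from typing import Callable, Iterable, Optional, Tuple

# Metadata key named in the request flow above.
TOKEN_HEADER = "x-kbase-token"

# gRPC metadata modeled as an iterable of (key, value) pairs.
Metadata = Iterable[Tuple[str, str]]


def extract_token(metadata: Metadata) -> Optional[str]:
    """Return the first non-empty x-kbase-token value, or None if absent."""
    for key, value in metadata:
        # Compare case-insensitively; gRPC metadata keys are normally lowercase.
        if key.lower() == TOKEN_HEADER and value:
            return value
    return None


def resolve_backend(
    metadata: Metadata,
    validate_token: Callable[[str], Optional[str]],
    namespace: str = "jupyterhub-prod",
    port: int = 15002,
) -> str:
    """Map request metadata to the host:port of the user's notebook pod.

    validate_token is a hypothetical stand-in for the KBase auth check:
    it returns the username for a valid token and None otherwise.
    """
    token = extract_token(metadata)
    if token is None:
        raise PermissionError("missing x-kbase-token metadata")
    username = validate_token(token)
    if not username:
        raise PermissionError("invalid or expired KBase token")
    return f"jupyter-{username}.{namespace}.svc.cluster.local:{port}"
```

With a dictionary lookup standing in for the auth service, `resolve_backend([("x-kbase-token", "good-token")], {"good-token": "alice"}.get)` returns `jupyter-alice.jupyterhub-prod.svc.cluster.local:15002`. Keeping the validator injectable lets the routing logic be tested without network access.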
| Env Var | Description | Default |
|---|---|---|
| `KBASE_AUTH_URL` | KBase Auth2 service URL | `https://kbase.us/services/auth/` |
| `PROXY_LISTEN_PORT` | Port the proxy listens on | `15002` |
| `BACKEND_PORT` | Spark Connect port on notebooks | `15002` |
| `BACKEND_NAMESPACE` | K8s namespace for notebooks | `jupyterhub-prod` |
| `SERVICE_TEMPLATE` | Backend service pattern | `jupyter-{username}.{namespace}.svc.cluster.local` |
| `TOKEN_CACHE_TTL` | Token cache TTL (seconds) | `300` |
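
To make the settings concrete, here is a minimal sketch of how they might be read and applied. The `ProxyConfig` and `TokenCache` classes are hypothetical and are not taken from the `spark_connect_proxy` package; the variable names and defaults are the ones in the table. Appending `BACKEND_PORT` to the formatted `SERVICE_TEMPLATE` is an assumption based on the backend address shown in the request flow.

```python
import os
import time
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional, Tuple


@dataclass(frozen=True)
class ProxyConfig:
    """Hypothetical container for the environment variables in the table."""

    kbase_auth_url: str
    proxy_listen_port: int
    backend_port: int
    backend_namespace: str
    service_template: str
    token_cache_ttl: int

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "ProxyConfig":
        # Read from os.environ unless a mapping is supplied (useful in tests).
        env = os.environ if env is None else env
        return cls(
            kbase_auth_url=env.get("KBASE_AUTH_URL", "https://kbase.us/services/auth/"),
            proxy_listen_port=int(env.get("PROXY_LISTEN_PORT", "15002")),
            backend_port=int(env.get("BACKEND_PORT", "15002")),
            backend_namespace=env.get("BACKEND_NAMESPACE", "jupyterhub-prod"),
            service_template=env.get(
                "SERVICE_TEMPLATE",
                "jupyter-{username}.{namespace}.svc.cluster.local",
            ),
            token_cache_ttl=int(env.get("TOKEN_CACHE_TTL", "300")),
        )

    def backend_target(self, username: str) -> str:
        # Assumption: the port is appended to the formatted service name.
        host = self.service_template.format(
            username=username, namespace=self.backend_namespace
        )
        return f"{host}:{self.backend_port}"


class TokenCache:
    """Hypothetical in-memory token-to-username cache with a fixed TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, token: str, username: str) -> None:
        # Store the username together with its expiry time.
        self._entries[token] = (username, self._clock() + self._ttl)

    def get(self, token: str) -> Optional[str]:
        entry = self._entries.get(token)
        if entry is None:
            return None
        username, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the token is validated again.
            del self._entries[token]
            return None
        return username
```

With no variables set, `ProxyConfig.from_env({}).backend_target("alice")` yields `jupyter-alice.jupyterhub-prod.svc.cluster.local:15002`. A cache built with `TokenCache(cfg.token_cache_ttl)` would then skip repeated auth lookups for the same token for up to 300 seconds by default.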
# Install
uv sync --dev
# Run tests
uv run pytest
# Run locally
uv run python -m spark_connect_proxy