OpenGradient TEE-gateway is an LLM routing service designed to run within AWS Nitro Enclave TEE (Trusted Execution Environment). It provides a secure, cryptographically verifiable interface to multiple LLM providers (OpenAI, Anthropic, Google Gemini, xAI Grok) with remote attestation, response signing, and x402v2 micropayment access control. The tee-gateway is a part of the decentralized OpenGradient network providing verifiable inference.
The repo must provide a stable AWS Nitro PCR when the code doesn't change in order to allow anyone to reproduce the PCRs locally by building the image as a way to verify what code we are running and also for 3rd party operators to set up their own tee-gateway nodes with the same PCRs in order to participate in the network.
├── tee_gateway/ # Main application package (Flask/connexion)
│ ├── __main__.py # Entry point: app factory, x402 middleware setup, key injection
│ ├── llm_backend.py # LLM provider routing via LangChain, HTTP client management
│ ├── tee_manager.py # TEE key generation, nitriding registration, response signing
│ ├── model_registry.py # Model config and per-token pricing
│ ├── definitions.py # On-chain addresses, network IDs, payment amounts
│ ├── facilitator_api.py # x402 facilitator API client
│ ├── heartbeat/ # Heartbeat/health monitoring
│ ├── controllers/ # Request handlers (chat, completions, security)
│ ├── models/ # OpenAI-compatible Pydantic models
│ ├── openapi/ # openapi.yaml spec
│ └── test/ # Unit tests
├── scripts/
│ ├── start.sh # Enclave startup script (nitriding + server)
│ ├── run-enclave.sh # EC2 host launcher (gvproxy, EIF, key injection)
├── pyproject.toml # Project metadata and dependencies (managed by uv)
├── Dockerfile # Multi-stage: nitriding builder + python:3.12-slim-bullseye + uv
├── Makefile
└── measurements.txt # PCR measurements for the deployed enclave image
# Dependency management (uses uv — https://docs.astral.sh/uv/)
uv sync # Install/update dependencies from uv.lock
uv add <package> # Add a new dependency
uv lock # Regenerate lockfile after editing pyproject.toml
# IMPORTANT: uv.lock is baked into the Docker image and affects PCR measurements.
# Only regenerate the lockfile when intentionally changing dependencies.
# Run server locally for development (without TEE)
make test-local # Runs: uv run python -m tee_gateway
# Linting and type checking
make lint # Run ruff format + ruff check + mypy
make mypy # Run mypy type checker only
# Build enclave image
make image # Build Docker image as TAR using Kaniko
# Build EIF and run in Nitro Enclave
make run # or: make all
# Clean build artifacts
make clean
# Show all available targets
make helpAPI keys (injected at runtime via POST /v1/keys — do NOT bake into the image):
OPENAI_API_KEYANTHROPIC_API_KEYGOOGLE_API_KEYXAI_API_KEYARK_API_KEY(BytePlus / ByteDance ModelArk; injected asbytedance_api_key)NOUS_API_KEY(Nous Research / Nous Portal; injected asnous_api_key)
Server configuration:
API_SERVER_PORT(default: 8000)API_SERVER_HOST(default: 0.0.0.0)EVM_PAYMENT_ADDRESS— wallet address to receive x402 paymentsFACILITATOR_URL— x402 facilitator endpoint
- TEEKeyManager (
tee_manager.py) generates RSA-2048 key pair on startup and registers the public key hash with the nitriding daemon - Incoming requests pass through x402 payment middleware before reaching handlers
- Requests are routed to the appropriate LLM provider via LangChain (
llm_backend.py) - All responses are signed with RSA-PSS-SHA256 over
keccak256(requestHash || outputHash || timestamp) - Clients verify attestation → get public key → verify signatures
tee_manager.py: RSA key generation, nitriding registration (/enclave/hash), response signingllm_backend.py: LangChain model instantiation, HTTP client management, provider routing from model namemodel_registry.py: Maps model names to providers and per-token USD pricing (used by dynamic cost calculator)definitions.py: On-chain constants (addresses, network IDs, payment amounts) — configure here for your deploymentutil.py:dynamic_session_cost_calculatorconverts actual token usage to x402 payment amounts
| Endpoint | Purpose |
|---|---|
/health |
Health check (status, version, tee_enabled) |
/signing-key |
TEE public key (PEM) and tee_id |
/enclave/attestation |
Nitro attestation document (served by nitriding) |
/v1/keys |
One-time API key injection (POST, loopback-only) |
/v1/completions |
Text completion (signed) |
/v1/chat/completions |
Chat completion with tool support (signed) |
- Nitriding daemon runs on localhost:8080, provides TLS termination (port 443 externally)
- Endpoints
/enclave/readyand/enclave/hashused for nitriding registration - PCR measurements in
measurements.txtfingerprint the exact enclave image
Model name prefixes determine routing:
- OpenAI: gpt-4.1, gpt-5, gpt-5-mini, gpt-5.2, o4-mini
- Anthropic: claude-sonnet-4-0/4-5/4-6, claude-haiku-4-5, claude-opus-4-5/4-6/4-7/4-8, claude-fable-5, claude-3-7-sonnet, claude-3-5-haiku
- Google: gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.5-pro, gemini-3-pro-preview, gemini-3-flash-preview, gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview, gemini-3.5-flash; image generation: gemini-2.5-flash-image, gemini-3.1-flash-image
- xAI: grok-2, grok-3, grok-3-mini, grok-4, grok-4-fast, grok-4-1-fast; image generation: grok-2-image
- ByteDance (BytePlus ModelArk, OpenAI-compatible, ap-southeast): seed-1.6, seed-1.8, seed-2.0-lite; image generation: seedream-4.0
- Nous Research (Nous Portal, OpenAI-compatible): hermes-4-405b, hermes-4-70b
Image generation via xAI (grok-2-image) and ByteDance (seedream-4.0) is served
through a provider /images/generations endpoint rather than the chat path, but
is surfaced on /v1/chat/completions exactly like Gemini's inline-image models
(images returned out-of-band under the message images key). These models are
billed a flat per-image price (see per_image_price_usd in model_registry.py),
not per token.
examples/verify_attestation.py— Validates AWS Nitro attestation documents against the root CAexamples/verify_signature_example.py— Demonstrates request hash and RSA-PSS signature verification
Multi-stage Docker build: nitriding compiled from source (brave/nitriding-daemon), then copied into python:3.12.10-slim-bullseye. Dependencies are installed via uv sync --frozen from the lockfile for reproducible builds. Enclave launched via scripts/run-enclave.sh with gvproxy as the vsock network bridge, allocating 2 CPUs and 8GB memory.
Port 8000 is forwarded to 127.0.0.1 only on the EC2 host (loopback-only for key injection). Port 443 is forwarded publicly via gvproxy.