Project Overview

OpenGradient TEE-gateway is an LLM routing service designed to run within an AWS Nitro Enclave TEE (Trusted Execution Environment). It provides a secure, cryptographically verifiable interface to multiple LLM providers (OpenAI, Anthropic, Google Gemini, xAI Grok, ByteDance, Nous Research) with remote attestation, response signing, and x402v2 micropayment access control. The tee-gateway is part of the decentralized OpenGradient network, which provides verifiable inference.

The repo must produce stable AWS Nitro PCR values whenever the code is unchanged. This lets anyone reproduce the PCRs locally by building the image, both to verify what code is running and to let third-party operators set up their own tee-gateway nodes with the same PCRs so they can participate in the network.

Project Structure highlighting core files

├── tee_gateway/             # Main application package (Flask/connexion)
│   ├── __main__.py          # Entry point: app factory, x402 middleware setup, key injection
│   ├── llm_backend.py       # LLM provider routing via LangChain, HTTP client management
│   ├── tee_manager.py       # TEE key generation, nitriding registration, response signing
│   ├── model_registry.py    # Model config and per-token pricing
│   ├── definitions.py       # On-chain addresses, network IDs, payment amounts
│   ├── facilitator_api.py   # x402 facilitator API client
│   ├── heartbeat/           # Heartbeat/health monitoring
│   ├── controllers/         # Request handlers (chat, completions, security)
│   ├── models/              # OpenAI-compatible Pydantic models
│   ├── openapi/             # openapi.yaml spec
│   └── test/                # Unit tests
├── scripts/
│   ├── start.sh             # Enclave startup script (nitriding + server)
│   └── run-enclave.sh       # EC2 host launcher (gvproxy, EIF, key injection)
├── pyproject.toml           # Project metadata and dependencies (managed by uv)
├── Dockerfile               # Multi-stage: nitriding builder + python:3.12-slim-bullseye + uv
├── Makefile
└── measurements.txt         # PCR measurements for the deployed enclave image

Common Commands

# Dependency management (uses uv — https://docs.astral.sh/uv/)
uv sync                      # Install/update dependencies from uv.lock
uv add <package>             # Add a new dependency
uv lock                      # Regenerate lockfile after editing pyproject.toml
# IMPORTANT: uv.lock is baked into the Docker image and affects PCR measurements.
# Only regenerate the lockfile when intentionally changing dependencies.

# Run server locally for development (without TEE)
make test-local              # Runs: uv run python -m tee_gateway

# Linting and type checking
make lint                    # Run ruff format + ruff check + mypy
make mypy                    # Run mypy type checker only

# Build enclave image
make image                   # Build Docker image as TAR using Kaniko

# Build EIF and run in Nitro Enclave
make run                     # or: make all

# Clean build artifacts
make clean

# Show all available targets
make help

Environment Variables

API keys (injected at runtime via POST /v1/keys — do NOT bake into the image):

OPENAI_API_KEY
ANTHROPIC_API_KEY
GOOGLE_API_KEY
XAI_API_KEY
ARK_API_KEY (BytePlus / ByteDance ModelArk; injected as bytedance_api_key)
NOUS_API_KEY (Nous Research / Nous Portal; injected as nous_api_key)
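
As a rough illustration of runtime key injection, the sketch below builds (but does not send) a POST request to /v1/keys on the loopback interface. The JSON body shape is an assumption: only the field names bytedance_api_key and nous_api_key appear in this document, and the helper name build_key_injection_request is hypothetical.

```python
import json
import urllib.request
from typing import Dict


def build_key_injection_request(
    keys: Dict[str, str], port: int = 8000
) -> urllib.request.Request:
    """Build a POST request for the one-time key injection endpoint.

    ASSUMPTION: the endpoint accepts a flat JSON object of key names to
    values. Check the openapi.yaml spec for the real request schema.
    """
    body = json.dumps(keys).encode("utf-8")
    return urllib.request.Request(
        url=f"http://127.0.0.1:{port}/v1/keys",  # loopback-only endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values only; real keys should come from a secret store.
request = build_key_injection_request(
    {"bytedance_api_key": "placeholder", "nous_api_key": "placeholder"}
)
# urllib.request.urlopen(request) would perform the actual POST.
```

Building the request separately from sending it keeps the secret-handling code easy to inspect before anything leaves the host.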

Server configuration:

API_SERVER_PORT (default: 8000)
API_SERVER_HOST (default: 0.0.0.0)
EVM_PAYMENT_ADDRESS — wallet address to receive x402 payments
FACILITATOR_URL — x402 facilitator endpoint

Architecture

Core Flow

TEEKeyManager (tee_manager.py) generates an RSA-2048 key pair on startup and registers the public key hash with the nitriding daemon
Incoming requests pass through x402 payment middleware before reaching handlers
Requests are routed to the appropriate LLM provider via LangChain (llm_backend.py)
All responses are signed with RSA-PSS-SHA256 over keccak256(requestHash || outputHash || timestamp)
Clients verify attestation → get public key → verify signatures
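
The digest that gets signed in the flow above can be sketched as follows. Several details are assumptions made for illustration: that both hashes are 32 bytes, that the timestamp is encoded as a 32-byte big-endian integer, and the function name build_signed_digest. The hash function is passed in because Keccak-256 is not part of the Python standard library (hashlib.sha3_256 is NIST SHA-3, which uses different padding and yields different digests); SHA-256 appears here only as a stand-in so the sketch runs without extra packages.

```python
import hashlib
from typing import Callable

HashFn = Callable[[bytes], bytes]


def build_signed_digest(
    request_hash: bytes, output_hash: bytes, timestamp: int, hash_fn: HashFn
) -> bytes:
    """Return hash_fn(requestHash || outputHash || timestamp)."""
    if len(request_hash) != 32 or len(output_hash) != 32:
        raise ValueError("expected 32-byte request and output hashes")
    # ASSUMPTION: 32-byte big-endian timestamp; see tee_manager.py for the
    # encoding the gateway actually uses.
    message = request_hash + output_hash + timestamp.to_bytes(32, "big")
    return hash_fn(message)


def sha256_stand_in(data: bytes) -> bytes:
    """Stand-in only. A real verifier needs a Keccak-256 implementation."""
    return hashlib.sha256(data).digest()


digest = build_signed_digest(b"\x01" * 32, b"\x02" * 32, 1_700_000_000, sha256_stand_in)
```

A client would then check the RSA-PSS-SHA256 signature over this digest with the public key obtained from /signing-key, after validating the attestation document.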

Key Components

tee_manager.py: RSA key generation, nitriding registration (/enclave/hash), response signing
llm_backend.py: LangChain model instantiation, HTTP client management, provider routing from model name
model_registry.py: Maps model names to providers and per-token USD pricing (used by dynamic cost calculator)
definitions.py: On-chain constants (addresses, network IDs, payment amounts) — configure here for your deployment
util.py: dynamic_session_cost_calculator converts actual token usage to x402 payment amounts
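
To make the pricing path concrete, here is a minimal sketch of turning token usage into a payment amount. The price table, the model name, the function name session_cost_atomic, and the 6-decimal atomic unit are all assumptions for illustration; the real values live in model_registry.py and definitions.py, and dynamic_session_cost_calculator may compute things differently.

```python
from decimal import Decimal, ROUND_CEILING

# Hypothetical per-token USD prices, not taken from model_registry.py.
PRICING_USD_PER_TOKEN = {
    "example-model": {"input": Decimal("0.000002"), "output": Decimal("0.000008")},
}

# ASSUMPTION: payments settle in a token with 6 decimals.
ATOMIC_UNITS_PER_USD = Decimal(10) ** 6


def session_cost_atomic(model: str, prompt_tokens: int, completion_tokens: int) -> int:
    """Convert token usage to an integer payment amount, rounding up."""
    price = PRICING_USD_PER_TOKEN[model]
    usd = price["input"] * prompt_tokens + price["output"] * completion_tokens
    atomic = (usd * ATOMIC_UNITS_PER_USD).to_integral_value(rounding=ROUND_CEILING)
    return int(atomic)


cost = session_cost_atomic("example-model", prompt_tokens=1000, completion_tokens=500)
# 1000 * 0.000002 + 500 * 0.000008 = 0.006 USD, i.e. 6000 atomic units
```

Decimal avoids binary floating-point drift in the price arithmetic, and rounding up ensures a fractional unit is never undercharged.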

API Endpoints

Endpoint	Purpose
`/health`	Health check (status, version, tee_enabled)
`/signing-key`	TEE public key (PEM) and tee_id
`/enclave/attestation`	Nitro attestation document (served by nitriding)
`/v1/keys`	One-time API key injection (POST, loopback-only)
`/v1/completions`	Text completion (signed)
`/v1/chat/completions`	Chat completion with tool support (signed)

TEE Integration

The nitriding daemon runs on localhost:8080 and provides TLS termination (port 443 externally)
Endpoints /enclave/ready and /enclave/hash are used for nitriding registration
PCR measurements in measurements.txt fingerprint the exact enclave image
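
A reproducibility check might compare the published measurements against a local build, roughly as sketched below. This assumes measurements.txt holds JSON in the shape printed by nitro-cli build-enclave (a Measurements object containing PCR0, PCR1, PCR2); if the file uses another layout, the parsing needs adjusting. The function names are hypothetical.

```python
import json
from typing import Dict, List


def load_pcrs(text: str) -> Dict[str, str]:
    """Extract PCR values from JSON text, normalizing hex to lowercase."""
    data = json.loads(text)
    # Accept either {"Measurements": {...}} or a bare {"PCR0": ...} object.
    measurements = data.get("Measurements", data)
    return {k: v.lower() for k, v in measurements.items() if k.startswith("PCR")}


def pcr_mismatches(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Return the sorted names of PCRs that differ or are missing on one side."""
    names = expected.keys() | actual.keys()
    return sorted(n for n in names if expected.get(n) != actual.get(n))


published = load_pcrs('{"Measurements": {"HashAlgorithm": "Sha384 { ... }", "PCR0": "AB12", "PCR1": "cd34"}}')
local = load_pcrs('{"Measurements": {"PCR0": "ab12", "PCR1": "ffff"}}')
mismatched = pcr_mismatches(published, local)
```

An empty result means the local build reproduces the published image; any listed PCR points to a difference in the image, kernel, or application contents.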

Supported Providers

Model name prefixes determine routing:

OpenAI: gpt-4.1, gpt-5, gpt-5-mini, gpt-5.2, o4-mini
Anthropic: claude-sonnet-4-0/4-5/4-6, claude-haiku-4-5, claude-opus-4-5/4-6/4-7/4-8, claude-fable-5, claude-3-7-sonnet, claude-3-5-haiku
Google: gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.5-pro, gemini-3-pro-preview, gemini-3-flash-preview, gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview, gemini-3.5-flash; image generation: gemini-2.5-flash-image, gemini-3.1-flash-image
xAI: grok-2, grok-3, grok-3-mini, grok-4, grok-4-fast, grok-4-1-fast; image generation: grok-2-image
ByteDance (BytePlus ModelArk, OpenAI-compatible, ap-southeast): seed-1.6, seed-1.8, seed-2.0-lite; image generation: seedream-4.0
Nous Research (Nous Portal, OpenAI-compatible): hermes-4-405b, hermes-4-70b

Image generation via xAI (grok-2-image) and ByteDance (seedream-4.0) is served through a provider /images/generations endpoint rather than the chat path, but is surfaced on /v1/chat/completions exactly like Gemini's inline-image models (images returned out-of-band under the message images key). These models are billed a flat per-image price (see per_image_price_usd in model_registry.py), not per token.
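
Prefix-based routing over the model names listed above can be sketched as a simple lookup. The prefix table and the provider_for function are illustrative assumptions derived from that list; the authoritative mapping is in model_registry.py and llm_backend.py.

```python
from typing import Tuple

# Hypothetical prefix table inferred from the model names in this document.
PREFIX_TO_PROVIDER: Tuple[Tuple[str, str], ...] = (
    ("gpt-", "openai"),
    ("o4-", "openai"),
    ("claude-", "anthropic"),
    ("gemini-", "google"),
    ("grok-", "xai"),
    ("seedream-", "bytedance"),
    ("seed-", "bytedance"),
    ("hermes-", "nous"),
)


def provider_for(model: str) -> str:
    """Return the provider whose prefix matches the model name."""
    name = model.lower()
    for prefix, provider in PREFIX_TO_PROVIDER:
        if name.startswith(prefix):
            return provider
    raise ValueError(f"no provider registered for model {model!r}")


provider = provider_for("claude-haiku-4-5")  # "anthropic"
```

Raising on an unknown prefix, rather than falling back to a default provider, keeps a mistyped model name from being silently billed against the wrong backend.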

Verification Examples

examples/verify_attestation.py — Validates AWS Nitro attestation documents against the root CA
examples/verify_signature_example.py — Demonstrates request hash and RSA-PSS signature verification

Deployment

Multi-stage Docker build: nitriding is compiled from source (brave/nitriding-daemon), then copied into python:3.12.10-slim-bullseye. Dependencies are installed via uv sync --frozen from the lockfile for reproducible builds. The enclave is launched via scripts/run-enclave.sh with gvproxy as the vsock network bridge, allocating 2 CPUs and 8 GB of memory.

Port 8000 is forwarded to 127.0.0.1 only on the EC2 host (loopback-only for key injection). Port 443 is forwarded publicly via gvproxy.
