
Commit f6c7ed0

feat: Add NGINX AI proxy demo (#45)

alessfg and Copilot authored
Co-authored-by: Copilot <[email protected]>
1 parent c2484f0 commit f6c7ed0

File tree

9 files changed: +772 −7 lines changed


.github/CODEOWNERS

Lines changed: 3 additions & 2 deletions

```diff
@@ -3,13 +3,14 @@
 ###############
 
 nginx/advanced-healthchecks @fabriziofiorucci
+nginx/ai-proxy @pleshakov
 nginx/api-gateway @alessfg
 nginx/api-steering @fabriziofiorucci
 nginx/docker-image-builder @fabriziofiorucci
 nginx/multicloud-gateway @fabriziofiorucci
 nginx/soap-to-rest @fabriziofiorucci
 nginx-gateway-fabric/traffic-splitting @sjberman
-nginx-ingress-controller/ingress-deployment @DylenTurnbull
+nginx-ingress-controller/ingress-deployment @nginx/demos
 nginx-instance-manager/docker-deployment @fabriziofiorucci
-nginx-workshops @apcurrier @chrisakker @sdutta9
+nginx-workshops @nginx/demos
 * @nginx/demos
```

README.md

Lines changed: 6 additions & 5 deletions

```diff
@@ -22,6 +22,7 @@ Each demo might have unique deployment requirements. Please refer to each indivi
 |Title|Description|Owner|
 |-----|-----------|-----|
 |[NGINX advanced healthchecks](nginx/advanced-healthchecks/)|Advanced active healthchecks for NGINX Plus|@fabriziofiorucci|
+|[NGINX AI Proxy](nginx/ai-proxy)|Configure NGINX as a simple AI proxy|@pleshakov|
 |[NGINX API gateway](nginx/api-gateway/)|Configure NGINX as an API gateway|@alessfg|
 |[NGINX API steering](nginx/api-steering/)|NGINX as an API gateway using an external data source for authentication, authorization and steering|@fabriziofiorucci|
 |[NGINX Docker image builder](nginx/docker-image-builder/)|Tool to build several Docker images for NGINX Plus, NGINX App Protect, NGINX Agent|@fabriziofiorucci|
@@ -38,7 +39,7 @@ Each demo might have unique deployment requirements. Please refer to each indivi
 
 |Title|Description|Owner|
 |-----|-----------|-----|
-|[NGINX Ingress Controller deployment](nginx-ingress-controller/ingress-deployment/)|Simple overview of deploying and configuring NGINX Ingress Controller|@DylenTurnbull|
+|[NGINX Ingress Controller deployment](nginx-ingress-controller/ingress-deployment/)|Simple overview of deploying and configuring NGINX Ingress Controller|TBD|
 
 ### NGINX Instance Manager (NIM)
 
@@ -50,10 +51,10 @@ Each demo might have unique deployment requirements. Please refer to each indivi
 
 |Title|Description|Owner|
 |-----|-----------|-----|
-|[NGINX Basics](nginx-workshops/README.md)|A 101 level introduction to NGINX|@apcurrier, @chrisakker, @sdutta9|
-|[NGINX Ingress Controller](nginx-workshops/README.md)|Learn everything you need to get started with NGINX Ingress Controller and its capabilities|@apcurrier, @chrisakker, @sdutta9|
-|[NGINXaaS for Azure](nginx-workshops/README.md)|Learn everything you need to get started with NGINX as a Service for Azure (NGINXaaS) and its capabilities|@apcurrier, @chrisakker, @sdutta9|
-|[NGINX One Console](nginx-workshops/README.md)|Learn everything you need to get started with NGINX One Console and its capabilities|@apcurrier, @chrisakker, @sdutta9|
+|[NGINX Basics](nginx-workshops/README.md)|A 101 level introduction to NGINX|TBD|
+|[NGINX Ingress Controller](nginx-workshops/README.md)|Learn everything you need to get started with NGINX Ingress Controller and its capabilities|TBD|
+|[NGINXaaS for Azure](nginx-workshops/README.md)|Learn everything you need to get started with NGINX as a Service for Azure (NGINXaaS) and its capabilities|TBD|
+|[NGINX One Console](nginx-workshops/README.md)|Learn everything you need to get started with NGINX One Console and its capabilities|TBD|
 |[NGINX Ingress Controller Lab](nginx-workshops/README.md)|NGINX Ingress Controller lab|@fabriziofiorucci|
 |[NGINX Gateway Fabric Lab](nginx-workshops/README.md)|NGINX Gateway Fabric lab|@fabriziofiorucci|
```

nginx/ai-proxy/README.md

Lines changed: 328 additions & 0 deletions
# NGINX AI Proxy

## Demo Overview

This demo showcases how to use NGINX and NGINX JavaScript (NJS) as a simple AI proxy. It covers how to use NGINX to provide the following AI proxy capabilities:

- User-based AI model access control.
- AI model abstraction (OpenAI ↔ Anthropic) with request/response translation (see the sketch below).
- Per-model failover.
- AI model token usage extraction into access logs.

This demo has the following limitations:

- The JSON config is statically loaded (no dynamic reload logic).
- Only a subset of OpenAI → Anthropic fields is properly translated (enough for basic prompts).
- No handling of AI streaming.
- Users are identified via the `X-User` header; there is no actual authentication.
- Failover only triggers on a non-200 HTTP status.
- No rate limiting or caching.
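
To make the abstraction concrete, here is a minimal sketch of the kind of request/response translation involved. The function names are illustrative, not the actual `njs/aiproxy.js` API, and only basic fields are handled, matching the limitations above:

```javascript
// Hypothetical translation helpers (illustrative names, not the real aiproxy.js API).

// OpenAI chat completion request -> Anthropic Messages API request.
function toAnthropicRequest(openaiReq) {
    return {
        model: openaiReq.model,
        // Anthropic requires max_tokens; pick a default if the client omits it.
        max_tokens: openaiReq.max_tokens || 1024,
        system: (openaiReq.messages.find((m) => m.role === 'system') || {}).content,
        messages: openaiReq.messages.filter((m) => m.role !== 'system'),
    };
}

// Anthropic Messages API response -> OpenAI chat completion response.
function toOpenAIResponse(anthropicRes) {
    return {
        id: anthropicRes.id,
        object: 'chat.completion',
        model: anthropicRes.model,
        choices: [{
            index: 0,
            // Anthropic's stop_reason (e.g. "end_turn") is passed through, which
            // is why the example responses below show finish_reason "end_turn".
            finish_reason: anthropicRes.stop_reason,
            message: {role: 'assistant', content: anthropicRes.content[0].text},
        }],
        usage: {
            prompt_tokens: anthropicRes.usage.input_tokens,
            completion_tokens: anthropicRes.usage.output_tokens,
            total_tokens: anthropicRes.usage.input_tokens + anthropicRes.usage.output_tokens,
        },
    };
}
```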

## Demo Walkthrough

### Prerequisites

Before you can run this demo, you will need:

- An OpenAI API key exported as an environment variable:

  ```bash
  export OPENAI_API_KEY=<API_KEY>
  ```

- An Anthropic API key exported as an environment variable:

  ```bash
  export ANTHROPIC_API_KEY=<API_KEY>
  ```

- A functional Docker installation.

### Launching the Container Demo Environment on Docker

1. Clone this repo and change directory to the AI proxy directory inside the cloned repo:

   ```bash
   git clone https://github.com/nginx/nginx-demos
   cd nginx-demos/nginx/ai-proxy
   ```

2. Create a persistent volume for the generated key snippets:

   ```bash
   docker volume create nginx-keys
   ```

3. Launch the NGINX Docker container with all the necessary configuration settings:

   ```bash
   docker run -it --rm -p 4242:4242 \
       -v $(pwd)/config:/etc/nginx \
       -v $(pwd)/njs:/etc/njs \
       -v $(pwd)/templates:/etc/nginx-ai-proxy/templates \
       -v nginx-keys:/etc/nginx-ai-proxy/keys \
       -e NGINX_ENVSUBST_TEMPLATE_DIR=/etc/nginx-ai-proxy/templates \
       -e NGINX_ENVSUBST_OUTPUT_DIR=/etc/nginx-ai-proxy/keys \
       -e OPENAI_API_KEY \
       -e ANTHROPIC_API_KEY \
       --name nginx-ai-proxy \
       nginx:1.29.1
   ```

The official NGINX image entrypoint runs `envsubst` on the templates and creates the `openai-key.conf` and `anthropic-key.conf` NGINX config files under `/etc/nginx-ai-proxy/keys/`, which are then included by the `aiproxy.conf` NGINX config file.

### Testing Basic Requests

1. Try sending a request as `user-a` to the OpenAI model:

   ```bash
   curl -s -X POST http://localhost:4242/v1/chat/completions \
       -H 'Content-Type: application/json' \
       -H 'X-User: user-a' \
       -d '{"model":"gpt-5","messages":[{"role":"user","content":"Hello"}]}'
   ```

   Expected response:

   ```json
   {
     "id": "...",
     "object": "chat.completion",
     "created": ...,
     "model": "gpt-5-2025-08-07",
     "choices": [
       {
         "index": 0,
         "message": {
           "role": "assistant",
           "content": "Hello! How can I help you today?",
           "refusal": null,
           "annotations": []
         },
         "finish_reason": "stop"
       }
     ],
     "usage": {
       "prompt_tokens": 7,
       "completion_tokens": 82,
       "total_tokens": 89,
       "prompt_tokens_details": {
         "cached_tokens": 0,
         "audio_tokens": 0
       },
       "completion_tokens_details": {
         "reasoning_tokens": 64,
         "audio_tokens": 0,
         "accepted_prediction_tokens": 0,
         "rejected_prediction_tokens": 0
       }
     },
     "service_tier": "default",
     "system_fingerprint": null
   }
   ```

2. Send a different request as `user-a` to the Anthropic model (still using the OpenAI schema, as the AI model translation happens server-side in the NJS code):

   ```bash
   curl -s -X POST http://localhost:4242/v1/chat/completions \
       -H 'Content-Type: application/json' \
       -H 'X-User: user-a' \
       -d '{"model":"claude-sonnet-4-20250514","messages":[{"role":"user","content":"Hello"}]}'
   ```

   Expected response:

   ```json
   {
     "id": "...",
     "object": "chat.completion",
     "model": "claude-sonnet-4-20250514",
     "choices": [
       {
         "index": 0,
         "finish_reason": "end_turn",
         "message": {
           "role": "assistant",
           "content": "Hello! How are you doing today? Is there anything I can help you with?"
         }
       }
     ],
     "usage": {
       "prompt_tokens": 8,
       "completion_tokens": 20,
       "total_tokens": 28
     }
   }
   ```

3. Send a request as `user-b`. This user does not have access to Anthropic:

   ```bash
   curl -s -X POST http://localhost:4242/v1/chat/completions \
       -H 'Content-Type: application/json' \
       -H 'X-User: user-b' \
       -d '{"model":"claude-sonnet-4-20250514","messages":[{"role":"user","content":"Hello"}]}'
   ```

   Expected response:

   ```json
   {
     "error": {
       "message": "The model 'claude-sonnet-4-20250514' was not found or is not accessible to the user"
     }
   }
   ```

### Testing the Failover Mechanism

1. Stop the previously launched NGINX AI proxy Docker container. Since it was started with `--rm`, it is removed automatically:

   ```bash
   docker stop nginx-ai-proxy
   ```

2. Start a new Docker container with an invalid OpenAI key to force a failure:

   ```bash
   docker run -it --rm -p 4242:4242 \
       -v $(pwd)/config:/etc/nginx \
       -v $(pwd)/njs:/etc/njs \
       -v $(pwd)/templates:/etc/nginx-ai-proxy/templates \
       -v nginx-keys:/etc/nginx-ai-proxy/keys \
       -e NGINX_ENVSUBST_TEMPLATE_DIR=/etc/nginx-ai-proxy/templates \
       -e NGINX_ENVSUBST_OUTPUT_DIR=/etc/nginx-ai-proxy/keys \
       -e OPENAI_API_KEY=bad \
       -e ANTHROPIC_API_KEY \
       --name nginx-ai-proxy \
       nginx:1.29.1
   ```

3. Send a request as `user-a` to the OpenAI model. `user-a` has Anthropic configured as a failover model:

   ```bash
   curl -s -X POST http://localhost:4242/v1/chat/completions \
       -H 'Content-Type: application/json' \
       -H 'X-User: user-a' \
       -d '{"model":"gpt-5","messages":[{"role":"user","content":"Hello"}]}'
   ```

   Expected response:

   ```json
   {
     "id": "...",
     "object": "chat.completion",
     "model": "claude-sonnet-4-20250514",
     "choices": [
       {
         "index": 0,
         "finish_reason": "end_turn",
         "message": {
           "role": "assistant",
           "content": "Hello! How are you doing today? Is there anything I can help you with?"
         }
       }
     ],
     "usage": {
       "prompt_tokens": 8,
       "completion_tokens": 20,
       "total_tokens": 28
     }
   }
   ```

   The output shows the `claude-sonnet-4-20250514` model, indicating the fallback to Anthropic took place.

4. Send a request as `user-b` to the OpenAI model. `user-b` has no failover models available:

   ```bash
   curl -s -X POST http://localhost:4242/v1/chat/completions \
       -H 'Content-Type: application/json' \
       -H 'X-User: user-b' \
       -d '{"model":"gpt-5","messages":[{"role":"user","content":"Hello"}]}'
   ```

   Expected response:

   ```json
   {
     "error": {
       "message": "Incorrect API key provided: bad. You can find your API key at https://platform.openai.com/account/api-keys.",
       "type": "invalid_request_error",
       "param": null,
       "code": "invalid_api_key"
     }
   }
   ```

## Cleanup

1. Stop the running NGINX AI proxy Docker container. Since it was started with `--rm`, it is removed automatically:

   ```bash
   docker stop nginx-ai-proxy
   ```

2. Clean up the Docker key volume created at the start of the demo:

   ```bash
   docker volume rm nginx-keys
   ```

## Demo Structure

### Files

| Path | Purpose |
|------|---------|
| [`config/nginx.conf`](config/nginx.conf) | The default `nginx.conf` file with a few modifications: it loads the NJS module, tweaks the log format to include the token variables, and includes the AI proxy NGINX config (`aiproxy.conf`) |
| [`config/aiproxy.conf`](config/aiproxy.conf) | Defines upstream blocks for OpenAI/Anthropic with dynamic DNS resolution, sets up a server listening on port 4242, loads a JSON config into the `$ai_proxy_config` variable using NJS (sketched below), exposes the `/v1/chat/completions` entrypoint location, and sets up the internal locations for the `/openai` and `/anthropic` models |
| [`config/rbac.json`](config/rbac.json) | The RBAC data in JSON format; see the section below for more information |
| [`njs/aiproxy.js`](njs/aiproxy.js) | NJS script containing the JSON RBAC parsing and the AI proxy routing logic (authorization, model lookup, model failover, provider-specific transforms, and token extraction) |
| [`templates/*.template`](templates/) | `envsubst` templates used to inject the API keys into the included snippets |
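
One plausible way to wire up the `$ai_proxy_config` variable mentioned above is a `js_set` handler that reads the file, as sketched below. This is an assumption for illustration rather than the demo's actual wiring; the `/etc/nginx/rbac.json` path follows from the `config:/etc/nginx` volume mount used earlier.

```javascript
// Hypothetical loader for $ai_proxy_config; assumes a directive like
// `js_set $ai_proxy_config aiproxy.config;` in aiproxy.conf.
import fs from 'fs';

function config(r) {
    // The demo loads the JSON config statically (no reload logic),
    // per the limitations above.
    return fs.readFileSync('/etc/nginx/rbac.json', 'utf8');
}

export default {config};
```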

### RBAC JSON Configuration Model

The [JSON RBAC model](config/rbac.json) looks like this:

```json
{
  "users": {
    "user-a": {
      "models": [
        {"name": "gpt-5", "failover": "claude-sonnet-4-20250514"},
        {"name": "claude-sonnet-4-20250514"}
      ]
    },
    "user-b": {
      "models": [{"name": "gpt-5"}]
    }
  },
  "models": {
    "gpt-5": {"provider": "openai", "location": "/openai"},
    "claude-sonnet-4-20250514": {"provider": "anthropic", "location": "/anthropic"}
  }
}
```

Each user entry contains a list of allowed models, each with an optional `failover` model. The `models` section maps logical model names to a provider name and the internal NGINX location used to reach it.
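
A minimal sketch of how this structure can be resolved for an incoming request follows; `resolveModel` is a hypothetical helper for illustration, not the actual `aiproxy.js` function:

```javascript
// Hypothetical lookup against the rbac.json structure shown above.
function resolveModel(cfg, user, modelName) {
    var u = cfg.users[user];
    // Unknown users and unlisted models are both rejected, producing the
    // "not found or is not accessible" error shown earlier.
    var grant = u && u.models.find((m) => m.name === modelName);
    if (!grant) {
        return null;
    }
    var model = cfg.models[modelName];
    return {
        provider: model.provider,    // "openai" or "anthropic"
        location: model.location,    // internal location, e.g. "/openai"
        failover: grant.failover ? cfg.models[grant.failover] : null,
    };
}
```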

### NGINX Request Processing Flow

1. A client POSTs an OpenAI chat completion request containing the appropriate JSON data to `/v1/chat/completions`. The `X-User` header identifies which user the client corresponds to.
2. The `aiproxy.js` NJS script validates the user and their access to the requested model.
3. NGINX proxies the request to the appropriate model via an internal location block (`/openai` or `/anthropic`).
4. If the provider is Anthropic, the NJS script transforms the request into an Anthropic API compatible request, then transforms the response back into an OpenAI compatible response.
5. If the primary model returns a non-200 status code and a `failover` model is defined, a second attempt is made against the `failover` model (see the sketch below).
6. Once a request completes successfully, token counts are extracted from the response and logged in the NGINX access log.
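
Steps 2, 3, and 5 can be condensed into an njs handler shaped roughly like the sketch below. It is a simplification of the actual `njs/aiproxy.js` logic (the Anthropic transforms from step 4 are omitted) and reuses the hypothetical `resolveModel` helper from the previous section:

```javascript
// Simplified routing sketch; assumes `js_content aiproxy.route;` on the
// /v1/chat/completions location and the resolveModel helper sketched above.
async function route(r) {
    var cfg = JSON.parse(r.variables.ai_proxy_config);
    var req = JSON.parse(r.requestText);
    var target = resolveModel(cfg, r.headersIn['X-User'], req.model);

    if (!target) {
        r.return(404, JSON.stringify({error: {message: "The model '" + req.model +
            "' was not found or is not accessible to the user"}}));
        return;
    }

    // First attempt against the primary model's internal location.
    var reply = await r.subrequest(target.location,
        {method: 'POST', body: r.requestText});

    // Failover only triggers on a non-200 status (see the limitations above).
    if (reply.status !== 200 && target.failover) {
        // The real code also rewrites the model name and applies the
        // provider transforms from step 4 before retrying.
        reply = await r.subrequest(target.failover.location,
            {method: 'POST', body: r.requestText});
    }

    r.return(reply.status, reply.responseText);
}

export default {route};
```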

### Token Usage Logging in NGINX

Token usage data is saved into NGINX variables using NJS. These variables, `$ai_proxy_response_prompt_tokens`, `$ai_proxy_response_completion_tokens`, and `$ai_proxy_response_total_tokens`, are then included in the access log format in the core NGINX config file (`nginx.conf`). Failed requests produce empty values. The resulting access log entries look something like this:

```console
... 401 ... prompt_tokens= completion_tokens= total_tokens=
... 200 ... prompt_tokens=13 completion_tokens=39 total_tokens=52
```
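
The extraction side of this can look like the sketch below, assuming the three variables are declared with `js_var` in `nginx.conf` so the script is allowed to assign them (the actual demo code may differ):

```javascript
// Hypothetical token extraction; assumes declarations such as
// `js_var $ai_proxy_response_prompt_tokens;` in nginx.conf.
function recordTokenUsage(r, responseText) {
    try {
        var usage = JSON.parse(responseText).usage || {};
        r.variables.ai_proxy_response_prompt_tokens = String(usage.prompt_tokens || '');
        r.variables.ai_proxy_response_completion_tokens = String(usage.completion_tokens || '');
        r.variables.ai_proxy_response_total_tokens = String(usage.total_tokens || '');
    } catch (e) {
        // Leave the variables empty on failures, which is why failed requests
        // log prompt_tokens= completion_tokens= total_tokens=.
    }
}
```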
