Pizza Palace — Voice Ordering Workshop

A hands-on workshop for learning Deepgram's Voice Agent API. Order pizza by talking to an AI agent powered by real-time speech-to-text, an LLM, and text-to-speech — all over a single WebSocket.

Prerequisites

Node.js 18+ (download from nodejs.org)
Deepgram API key (a free key is available at console.deepgram.com)
A modern browser (Chrome, Firefox, Edge, Safari)
A working microphone

Demo

Pizza.Palace.Demo.mov

Quick Start

# 1. Install dependencies
npm install

# 2. (Optional) Add your API key to .env so you don't have to enter it in the UI
echo "DEEPGRAM_API_KEY=your_key_here" > .env

# 3. Start the server
npm start

Open http://localhost:3000 in your browser.

If you added a .env key, just click Connect. Otherwise, paste your API key into the input field first.

The microphone starts automatically when you connect. Talk to order, click the button to mute/unmute, and click End to disconnect.
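
Browsers deliver microphone samples as 32-bit floats between -1 and 1, while speech APIs commonly expect 16-bit PCM. The sketch below shows one way the mic path could convert a buffer before sending it, and how muting could simply skip the send. The function names and the linear16 encoding are assumptions for illustration and are not taken from the project source.

```javascript
// Convert Web Audio samples (Float32, -1..1) to 16-bit signed PCM.
// Assumption: the agent is configured for linear16 input.
function floatTo16BitPCM(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // The negative int16 range is one step larger than the positive range.
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Hypothetical send gate: while muted, mic audio is dropped instead of sent.
function sendMicChunk(socket, float32Samples, muted) {
  if (muted) return false;
  socket.send(floatTo16BitPCM(float32Samples).buffer);
  return true;
}
```

Dropping chunks while muted keeps the WebSocket open, so unmuting resumes the conversation without reconnecting.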

How It Works

Browser (mic) ──audio──▶ Express server ──proxy──▶ Deepgram Voice Agent
Browser (speaker) ◀──audio── Express server ◀──proxy── Deepgram Voice Agent

The Express server proxies WebSocket connections to Deepgram so the API key never reaches the browser. All audio and JSON messages pass through unchanged.
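
The pass-through behavior described above can be sketched as a small relay between two sockets. This is a minimal illustration, not the project's server.js: `pipeSockets` is a hypothetical helper, and it assumes socket objects shaped like those from the `ws` package, where `message` handlers receive `(data, isBinary)` and `send` accepts a `{ binary }` option.

```javascript
// Hypothetical helper: forward every message between two socket-like objects.
// Both arguments only need on(event, handler), send(data, options), and close().
function pipeSockets(client, upstream) {
  // Browser -> Deepgram: mic audio (binary) and JSON control messages.
  client.on("message", (data, isBinary) => upstream.send(data, { binary: isBinary }));
  // Deepgram -> browser: agent audio (binary) and JSON events.
  upstream.on("message", (data, isBinary) => client.send(data, { binary: isBinary }));
  // When either side closes, close the other so no connection is left dangling.
  client.on("close", () => upstream.close());
  upstream.on("close", () => client.close());
}
```

Because the API key is attached only when the server opens the upstream connection, nothing in this relay needs to inspect or rewrite the messages it forwards.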

Pipeline:

Stage	Model	Role
Listen (STT)	Flux (`flux-general-en` v2)	Transcribes your speech
Think (LLM)	GPT-4o-mini	Decides what to say, calls functions
Speak (TTS)	Aura-2 (`aura-2-thalia-en`)	Speaks the response
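
The three stages in the table are typically selected by a single Settings message sent right after the WebSocket opens. The sketch below shows how the models named above might be wired together. The field names reflect one reading of the Voice Agent Settings shape and should be checked against the current Deepgram API reference; the audio encodings, sample rates, prompt text, and the `buildSettings` name are placeholders.

```javascript
// Build the Settings message sent once after the WebSocket opens.
// Assumption: field names follow the Voice Agent Settings shape; audio
// values and the prompt are placeholders, not the project's real config.
function buildSettings(functions = []) {
  return {
    type: "Settings",
    audio: {
      input: { encoding: "linear16", sample_rate: 16000 },
      output: { encoding: "linear16", sample_rate: 24000, container: "none" },
    },
    agent: {
      // Listen: speech-to-text model.
      listen: { provider: { type: "deepgram", version: "v2", model: "flux-general-en" } },
      // Think: LLM, system prompt, and callable function definitions.
      think: {
        provider: { type: "open_ai", model: "gpt-4o-mini" },
        prompt: "You take pizza orders for Pizza Palace.",
        functions,
      },
      // Speak: text-to-speech voice.
      speak: { provider: { type: "deepgram", model: "aura-2-thalia-en" } },
    },
  };
}
```

A client would send it as text, for example `socket.send(JSON.stringify(buildSettings()))`, before streaming any audio.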

Three functions are registered with the agent:

Function	Description	Status
`getMenuItems()`	Returns the full menu with prices	Implemented
`addToOrder(item, quantity)`	Adds items to the order, updates the UI	Implemented
`removeFromOrder(item)`	Removes items from the order	Skeleton (workshop exercise)
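
To make the function round trip concrete, here is a minimal sketch of the two implemented handlers and a dispatcher that answers a function call from the agent. It is not the project's functions.js. The menu items come from the sample phrases in this README, but the prices are made up, and the request and response shapes (`id`, `name`, `arguments` as a JSON string, a `FunctionCallResponse` reply) are assumptions to verify against the Voice Agent API reference.

```javascript
// Placeholder menu: item names come from the sample phrases, prices are made up.
const MENU = [
  { name: "pepperoni pizza", price: 12.0 },
  { name: "caesar salad", price: 8.0 },
  { name: "lemonade", price: 3.0 },
];

// Order state: item name -> quantity.
const order = new Map();

function getMenuItems() {
  return MENU;
}

function addToOrder(item, quantity = 1) {
  const key = String(item).toLowerCase();
  const entry = MENU.find((m) => m.name === key);
  if (!entry) return { ok: false, error: `Unknown item: ${item}` };
  order.set(key, (order.get(key) || 0) + quantity);
  return { ok: true, item: key, quantity: order.get(key) };
}

// Map function names requested by the agent to local handlers.
const handlers = {
  getMenuItems: () => getMenuItems(),
  addToOrder: (args) => addToOrder(args.item, args.quantity),
};

// Turn one requested call into a response message for the agent.
// Assumption: call = { id, name, arguments } with arguments as a JSON string.
function handleFunctionCall(call) {
  const handler = handlers[call.name];
  const args = call.arguments ? JSON.parse(call.arguments) : {};
  const result = handler
    ? handler(args)
    : { ok: false, error: `No handler for ${call.name}` };
  return {
    type: "FunctionCallResponse",
    id: call.id,
    name: call.name,
    content: JSON.stringify(result),
  };
}
```

Returning an error object for unknown items or unknown function names, rather than throwing, gives the LLM something it can explain to the customer.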

Project Structure

├── server.js              # Express + WebSocket proxy to Deepgram
├── package.json
├── .env                   # Optional: DEEPGRAM_API_KEY (gitignored)
├── public/
│   ├── index.html         # Single-page UI
│   ├── css/styles.css     # Pizza-themed styling
│   └── js/
│       ├── app.js         # Entry point, wires modules together
│       ├── audio.js       # Mic capture + agent audio playback
│       ├── deepgram.js    # WebSocket connection, Settings, function dispatch
│       ├── functions.js   # Menu data, order state, function handlers
│       └── ui.js          # DOM rendering
├── EXTENSIONS.md          # Workshop expansion ideas
└── RESEARCH.md            # Deepgram Voice Agent API research notes

Try Saying

"What's on the menu?"
"I'll have two pepperoni pizzas and a Caesar salad"
"Add a lemonade"
"What's my total?"
"Remove the salad" (requires completing the workshop exercise)

Workshop Exercises

See EXTENSIONS.md for 7 expansion ideas ranging from beginner to advanced:

Implement removeFromOrder() — Complete the skeleton function
Voice picker — Change the agent's voice mid-conversation
Order summary function — Add a new callable function
Live transcript panel — Display the conversation as chat bubbles
Delivery or pickup — Add address collection and fulfillment method
Order status tracker — Time-based progress from kitchen to delivery
Dynamic upselling — Use prompt engineering to suggest pairings
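
As a starting point for the first exercise, the sketch below shows one possible shape for `removeFromOrder()`. It assumes the order is a Map of item name to quantity and takes that Map as a second parameter so the example runs on its own. In the project, the state presumably lives alongside the other handlers and the skeleton's single-argument signature would be kept.

```javascript
// One possible approach to the exercise, not the reference solution.
// Assumption: order is a Map of item name -> quantity.
function removeFromOrder(item, order) {
  const key = String(item).trim().toLowerCase();
  if (!order.has(key)) {
    // Report the miss so the agent can tell the customer.
    return { ok: false, error: `${item} is not in the order` };
  }
  order.delete(key);
  return { ok: true, removed: key, remaining: [...order.keys()] };
}
```

A fuller version might decrement the quantity instead of deleting the whole entry, and would also refresh the order panel in the UI.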

Troubleshooting

Problem	Fix
No audio / static noise	Check that your mic is set to the correct input device in system settings
"WebSocket error" on connect	Verify your API key is valid at console.deepgram.com
Agent doesn't respond	The button should show green "Listening"; if it is red (muted), click it to unmute
Port 3000 in use	Kill the other process or set `PORT=3001 npm start`
Browser blocks mic	Must be on `localhost` or HTTPS — `file://` won't work
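
The `PORT=3001 npm start` fix relies on the server reading its port from the environment. A minimal sketch of that lookup, assuming the server falls back to 3000 when `PORT` is missing or invalid (the `resolvePort` helper is hypothetical):

```javascript
// Pick the listening port: a valid PORT environment variable wins, else 3000.
function resolvePort(env) {
  const port = Number.parseInt(env.PORT, 10);
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : 3000;
}
```

It would be called as `resolvePort(process.env)` when starting the HTTP server.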

Security

This project is designed for local workshop use only. Do not deploy to a public network without adding:

Authentication on the WebSocket proxy endpoint
Rate limiting to prevent API credit abuse
TLS (HTTPS/WSS) — API keys sent via the browser input travel as WebSocket query parameters in cleartext over ws://

The recommended approach is to set your API key in .env (server-side) rather than entering it in the browser. See .env.example for the expected format.
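
One way the server could apply this preference is to use a key from the browser only when no server-side key is configured. The sketch below is an illustration under stated assumptions: the query parameter name `key` and the `resolveApiKey` helper are hypothetical, not taken from server.js.

```javascript
// Choose the API key for an incoming proxy connection.
// Assumptions: the browser sends its key in a query parameter named "key"
// (hypothetical), and a server-side DEEPGRAM_API_KEY always takes priority.
function resolveApiKey(env, requestUrl) {
  if (env.DEEPGRAM_API_KEY) return env.DEEPGRAM_API_KEY;
  // requestUrl is path-only on the server, so a base is needed to parse it.
  const url = new URL(requestUrl, "http://localhost");
  return url.searchParams.get("key"); // null when no key was supplied
}
```

With the key in .env, the browser never has to send it, which avoids the cleartext exposure over ws:// noted above.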
