Dream Server Quick Start

One command to a fully running local AI stack. No manual config, no dependency hell.

This quickstart covers Linux, Windows, and macOS. For Windows, see the Windows install section below. For macOS, see the macOS Quickstart.

Prerequisites

Linux (NVIDIA GPU):

Docker with Compose v2+ (Install)
NVIDIA GPU with 8GB+ VRAM (16GB+ recommended)
NVIDIA Container Toolkit (Install)
40GB+ disk space (for models)

Linux (AMD Strix Halo):

Docker with Compose v2+ (Install)
AMD Ryzen AI MAX+ APU with 64GB+ unified memory
ROCm-compatible kernel (6.17+ recommended, 6.18.4+ ideal)
/dev/kfd and /dev/dri accessible (user in video + render groups)
60GB+ disk space (for GGUF model files)

Windows: See Windows section below.

Step 1: Run the Installer (Linux)

./install.sh

The installer will:

Detect your GPU and auto-select the right tier:
- AMD Strix Halo (unified memory):
  - SH_LARGE (90GB+): qwen3-coder-next (80B MoE), 128K context
  - SH_COMPACT (64-89GB): qwen3-30b-a3b (30B MoE), 128K context
- NVIDIA (discrete GPU):
  - Tier 1 (Entry): <12GB VRAM → qwen2.5-7b-instruct (GGUF Q4_K_M), 16K context
  - Tier 2 (Prosumer): 12-20GB VRAM → qwen2.5-14b-instruct (GGUF Q4_K_M), 16K context
  - Tier 3 (Pro): 20-40GB VRAM → qwen2.5-32b-instruct (GGUF Q4_K_M), 32K context
  - Tier 4 (Enterprise): 40GB+ VRAM → qwen2.5-72b-instruct (GGUF Q4_K_M), 32K context
Check Docker and GPU toolkit (NVIDIA Container Toolkit or ROCm devices)
Ask which optional components to enable (voice, workflows, RAG)
Generate secure passwords and configuration
Apply system tuning (AMD: sysctl, amdgpu modprobe, etc.)
Start all services

Override tier manually: ./install.sh --tier 3

Time Estimate: 5-10 minutes interactive setup, plus 10-30 minutes for first model download.

Step 2: Wait for Model Download

NVIDIA: First run downloads the LLM (~20GB for 32B GGUF). Watch progress:

docker compose logs -f llama-server

When you see server is listening on, you're ready!

AMD Strix Halo: The GGUF model downloads in the background (~25-52GB). Watch progress:

tail -f ~/dream-server/logs/model-download.log

# Or check llama-server readiness:
docker compose -f docker-compose.base.yml -f docker-compose.amd.yml logs -f llama-server

When you see server is listening on, the model is loaded and ready.

Step 3: Validate Installation

Verify everything is working:

./scripts/dream-preflight.sh

This tests all services and confirms Dream Server is ready. You should see green checkmarks for each test.

For comprehensive testing:

./scripts/dream-test.sh

This runs the full validation suite including load tests.

Step 4: Open Chat UI

Visit: http://localhost:3000

Create an account (first user becomes admin)
Select a model from the dropdown
Start chatting!

Step 5: Test the API

NVIDIA:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-32b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

AMD Strix Halo:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder-next",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Hardware Tiers

The installer auto-detects your GPU and selects the optimal configuration:

AMD Strix Halo:

Tier	Unified VRAM	Model	Hardware
SH_LARGE	90GB+	qwen3-coder-next (80B MoE)	Ryzen AI MAX+ (96GB config)
SH_COMPACT	64-89GB	qwen3:30b-a3b (30B MoE)	Ryzen AI MAX+ (64GB config)

NVIDIA:

Tier	VRAM	Model	Example GPUs
1 (Entry)	<12GB	Qwen2.5-7B	RTX 3080, RTX 4070
2 (Prosumer)	12-20GB	Qwen2.5-14B (GGUF Q4_K_M)	RTX 3090, RTX 4080
3 (Pro)	20-40GB	Qwen2.5-32B (GGUF Q4_K_M)	RTX 4090, A6000
4 (Enterprise)	40GB+	Qwen2.5-72B (GGUF Q4_K_M)	A100, H100

To check what tier you'd get without installing:

./scripts/detect-hardware.sh

Common Issues

"OOM" or "CUDA out of memory" (NVIDIA)

Reduce context window in .env:

CTX_SIZE=4096  # or even 2048

Or switch to a smaller model:

LLM_MODEL=qwen2.5-7b-instruct

AMD: llama-server crash loop

Check logs: docker compose -f docker-compose.base.yml -f docker-compose.amd.yml logs llama-server

Common causes:

GGUF file not found: ensure data/models/*.gguf exists
Wrong GGUF format: use upstream llama.cpp GGUFs (NOT Ollama blobs)
Missing ROCm env vars: HSA_OVERRIDE_GFX_VERSION=11.5.1 must be set

Model download fails

Check disk space: df -h
NVIDIA: Try again: docker compose restart llama-server
AMD: Resume download: wget -c -O data/models/<model>.gguf <url>

WebUI shows "No models available"

The inference engine is still loading.

NVIDIA: Check: docker compose logs llama-server
AMD: Check: docker compose -f docker-compose.base.yml -f docker-compose.amd.yml logs llama-server

Port conflicts

Edit .env to change ports:

WEBUI_PORT=3001
OLLAMA_PORT=8081          # LLM inference port

Next Steps

Add workflows: Open n8n at http://localhost:5678 to create custom automation workflows
Connect OpenClaw: Use this as your local inference backend at http://localhost:7860
Dashboard: Monitor services, GPU, and health at http://localhost:3001

Windows

Prerequisites

Windows 10/11 with Docker Desktop (WSL2 backend enabled)
NVIDIA GPU with 8GB+ VRAM, or AMD Strix Halo APU
40GB+ free disk space

Install

git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer
.\install.ps1

The installer handles everything — GPU detection, tier selection, Docker setup, credential generation, service startup, and Desktop/Start Menu shortcuts.

Manage

.\dream-server\installers\windows\dream.ps1 status           # Health checks + GPU status
.\dream-server\installers\windows\dream.ps1 start            # Start all services
.\dream-server\installers\windows\dream.ps1 stop             # Stop all services
.\dream-server\installers\windows\dream.ps1 restart           # Restart all services
.\dream-server\installers\windows\dream.ps1 logs llm          # Tail LLM logs

Common Windows Issues

Ollama conflict: If you set OLLAMA_PORT=11434 and Ollama Desktop is running, ports will conflict. Keep OLLAMA_PORT=8080 (default) or stop Ollama.

Docker Desktop not running: Start Docker Desktop from the Start Menu before running the installer.

WSL2 backend not enabled: Open Docker Desktop > Settings > General > check "Use WSL 2 based engine".

Stopping

# NVIDIA
docker compose down

# AMD Strix Halo
docker compose -f docker-compose.base.yml -f docker-compose.amd.yml down

# Windows
.\dream-server\installers\windows\dream.ps1 stop

Updating

# NVIDIA
docker compose pull
docker compose up -d

# AMD Strix Halo
docker compose -f docker-compose.base.yml -f docker-compose.amd.yml pull
docker compose -f docker-compose.base.yml -f docker-compose.amd.yml up -d --build

# Windows
.\dream-server\installers\windows\dream.ps1 update

Built by The Collective • DreamServer

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dream Server Quick Start

Prerequisites

Step 1: Run the Installer (Linux)

Step 2: Wait for Model Download

Step 3: Validate Installation

Step 4: Open Chat UI

Step 5: Test the API

Hardware Tiers

Common Issues

"OOM" or "CUDA out of memory" (NVIDIA)

AMD: llama-server crash loop

Model download fails

WebUI shows "No models available"

Port conflicts

Next Steps

Windows

Prerequisites

Install

Manage

Common Windows Issues

Stopping

Updating

FilesExpand file tree

QUICKSTART.md

Latest commit

History

QUICKSTART.md

File metadata and controls

Dream Server Quick Start

Prerequisites

Step 1: Run the Installer (Linux)

Step 2: Wait for Model Download

Step 3: Validate Installation

Step 4: Open Chat UI

Step 5: Test the API

Hardware Tiers

Common Issues

"OOM" or "CUDA out of memory" (NVIDIA)

AMD: llama-server crash loop

Model download fails

WebUI shows "No models available"

Port conflicts

Next Steps

Windows

Prerequisites

Install

Manage

Common Windows Issues

Stopping

Updating