feat(amd): integrate AMD Lemonade as inference backend #579

Merged
Lightheartdevs merged 3 commits into main from feat/lemonade-amd-backend on Mar 23, 2026
Conversation

@Lightheartdevs (Collaborator)

Summary

  • Adds AMD Lemonade Server as the preferred inference backend when AMD hardware is detected
  • Lemonade provides native NPU + Vulkan + ROCm acceleration, enabling hybrid NPU+GPU execution on Strix Halo
  • Windows: silent MSI install with user prompt, llama-server Vulkan fallback if declined
  • Linux: Lemonade Docker image with ROCm passthrough replaces toolbox image
  • NPU detection added for Ryzen AI (Win32_PnPEntity + sysfs)
  • NVIDIA and CPU-only paths completely untouched

Key changes

New files:

  • config/litellm/lemonade.yaml — LiteLLM routing config for Lemonade
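A minimal shape for such a routing config might look like the following sketch; the model name, port, and API key value are illustrative assumptions — only the /api/v1 base path and the placeholder `lemonade` api_key are mentioned elsewhere in this PR:

```yaml
model_list:
  - model_name: local-llm                 # name clients request via LiteLLM (assumed)
    litellm_params:
      model: openai/default               # Lemonade exposes an OpenAI-compatible API
      api_base: http://lemonade:8000/api/v1   # Lemonade's /api/v1 base path (host/port assumed)
      api_key: lemonade                   # placeholder key, as referenced in strix-halo-config.yaml
```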

Windows installer (install-windows.ps1):

  • Phase 8: prompts user → MSI install → lemonade-server serve --extra-models-dir → health check
  • Falls back to Vulkan llama-server if user declines or install fails
  • Patches .env to correct backend/path on fallback

Docker compose (docker-compose.amd.yml):

  • ghcr.io/lemonade-sdk/lemonade-server:latest with ROCm, --no-tray, --extra-models-dir
  • 3 persistent volumes (model cache, llama binaries, recipes)
  • Healthcheck override for /api/v1/health
  • Open WebUI override for /api/v1 base path
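Taken together, the overrides could look roughly like this in docker-compose.amd.yml; the service names, volume names, container paths, and device mappings are illustrative assumptions — only the image, the `--no-tray` / `--extra-models-dir` flags, the /api/v1/health path, and the Open WebUI base-URL override come from this PR:

```yaml
services:
  llm:
    image: ghcr.io/lemonade-sdk/lemonade-server:latest
    command: ["lemonade-server", "serve", "--no-tray", "--extra-models-dir", "/models"]
    devices:
      - /dev/kfd        # ROCm compute interface
      - /dev/dri        # GPU render nodes
    volumes:
      - lemonade-cache:/root/.cache/lemonade   # model cache (path assumed)
      - lemonade-llama:/opt/llama              # llama binaries (path assumed)
      - lemonade-recipes:/opt/recipes          # recipes (path assumed)
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/v1/health"]
  open-webui:
    environment:
      OPENAI_API_BASE_URL: http://llm:8000/api/v1   # base compose hardcodes /v1

volumes:
  lemonade-cache:
  lemonade-llama:
  lemonade-recipes:
```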

CLI (dream.ps1):

  • Backend-aware process management (Lemonade vs llama-server)
  • Correct health check endpoints, startup args, chat API paths
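The backend-aware switching largely reduces to selecting per-backend endpoints. A sketch in shell (the real logic lives in dream.ps1; the variable names and default are illustrative):

```shell
# Pick health-check and chat endpoints based on the configured backend.
backend="${LLM_BACKEND:-lemonade}"
case "$backend" in
  lemonade)
    health_path="/api/v1/health"            # Lemonade health endpoint (from this PR)
    chat_path="/api/v1/chat/completions"    # Lemonade chat API lives under /api/v1
    ;;
  llama-server)
    health_path="/health"                   # llama-server health endpoint
    chat_path="/v1/chat/completions"        # llama-server OpenAI-compatible path
    ;;
  *)
    echo "unknown backend: $backend" >&2
    exit 1
    ;;
esac
echo "$backend -> health=$health_path chat=$chat_path"
```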

Detection:

  • NPU detection on Windows (Win32_PnPEntity) and Linux (sysfs/lspci)
  • HasNpu flag for Strix Halo hybrid mode
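On the Linux side, the check amounts to scanning PCI device listings for the Ryzen AI NPU (typically surfaced as a signal-processing/IPU device). A hedged sketch — the `has_npu` helper and the exact device strings are assumptions, not the code in detection.sh, which may also consult sysfs driver entries:

```shell
# Hypothetical helper: decide the HasNpu flag from lspci-style output.
has_npu() {
  printf '%s\n' "$1" | grep -qiE 'amd.*(npu|ipu|signal processing)'
}

# Example device line (format assumed from typical lspci output):
sample="c7:00.1 Signal processing controller: Advanced Micro Devices [AMD] NPU"
if has_npu "$sample"; then echo "HasNpu=true"; else echo "HasNpu=false"; fi
```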

Context

AMD is offering DreamServer hardware and is running the Lemonade Developer Challenge. Integrating Lemonade provides native AMD optimization, contest eligibility, and a partnership story.

Test plan

  • Windows AMD: install with Lemonade accepted → verify health, chat UI, dashboard
  • Windows AMD: install with Lemonade declined → verify llama-server fallback works
  • Windows NVIDIA: verify zero behavioral change
  • Windows no-GPU: verify zero behavioral change
  • Linux AMD: docker compose -f docker-compose.base.yml -f docker-compose.amd.yml config validates
  • Verify MSI install path matches "C:\Program Files\Lemonade Server\bin\lemonade-server.exe"

🤖 Generated with Claude Code

Lightheartdevs and others added 3 commits March 23, 2026 16:10
Replace llama-server with AMD Lemonade Server when AMD hardware is detected.
Lemonade provides native NPU + Vulkan + ROCm acceleration, enabling hybrid
NPU+GPU execution on Strix Halo and optimized inference on all AMD silicon.

Windows: silent MSI install with user prompt, llama-server Vulkan fallback
Linux: Lemonade Docker image with ROCm passthrough replaces toolbox image
NPU detection added for Ryzen AI (Win32_PnPEntity + sysfs)

New files:
- config/litellm/lemonade.yaml (LiteLLM routing for Lemonade backend)

Modified:
- constants.ps1: Lemonade MSI URL, paths, health endpoint
- detection.ps1/sh: NPU detection for Ryzen AI
- env-generator.ps1: LLM_BACKEND and LLM_API_BASE_PATH variables
- install-windows.ps1: Lemonade install + fallback flow in Phase 8
- dream.ps1: Lemonade-aware process management and health checks
- docker-compose.windows-amd.yml: API path support for Lemonade
- docker-compose.amd.yml: Lemonade Docker image with ROCm
- amd.json: backend contract updated for Lemonade
- .env.example: document LLM_BACKEND variable

NVIDIA and CPU-only paths are completely untouched.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
P0 fixes:
- MSI install path: use "Lemonade Server" (with space) under Program Files,
  add ALLUSERS=1 for admin install, exe is in bin/ subdirectory
- Remove broken /api/v1/load call: use --extra-models-dir flag instead,
  Lemonade auto-discovers GGUFs and loads on first request
- Patch .env when user declines Lemonade: LLM_BACKEND and LLM_API_BASE_PATH
  are corrected to llama-server values using the bootstrap patching pattern

P1 fixes:
- Docker compose: add --no-tray (headless), --extra-models-dir /models,
  persistent volumes (cache, llama binaries, recipes), healthcheck override
- OpenClaw provider URL: add /api prefix for Lemonade's /api/v1 endpoint
- dream.ps1: add --no-tray, --llamacpp vulkan, --extra-models-dir to
  Lemonade startup args
- strix-halo-config.yaml: update to /api/v1 endpoint with lemonade api_key

P2 fixes:
- .env.example: document LLM_API_BASE_PATH variable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- docker-compose.amd.yml: override Open WebUI OPENAI_API_BASE_URL to
  /api/v1 for Lemonade (base compose hardcodes /v1)
- dashboard-api setup.py: read LLM_API_BASE_PATH env var instead of
  hardcoding /v1/chat/completions
- Both AMD overlays: pass LLM_API_BASE_PATH to dashboard-api container
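The dashboard-api change is a one-line pattern: read the base path from the environment, with the old hardcoded value as the fallback. A sketch in Python — the host/port and the `/v1` default are assumptions; only the LLM_API_BASE_PATH variable name comes from the commit:

```python
import os

# Build the chat-completions URL from LLM_API_BASE_PATH instead of
# hardcoding /v1/chat/completions; fall back to the old /v1 default.
base_path = os.environ.get("LLM_API_BASE_PATH", "/v1").rstrip("/")
chat_url = f"http://llm:8000{base_path}/chat/completions"
print(chat_url)
```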

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lightheartdevs merged commit 45512c7 into main on Mar 23, 2026
15 of 22 checks passed
Lightheartdevs added a commit that referenced this pull request Mar 23, 2026
The Linux installer's image pre-pull list (08-images.sh) still referenced
the old kyuz0/amd-strix-halo-toolboxes:rocm-7.2 image, while
docker-compose.amd.yml was updated to ghcr.io/lemonade-sdk/lemonade-server
in PR #579. This caused the installer to download ~8GB of the wrong image,
then docker compose up had to separately pull Lemonade at startup.

Confirmed on Strix Halo (Ubuntu, 124GB RAM, Ryzen AI MAX+ 395).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>