fix(amd): dashboard token metrics via Lemonade inner llama-server #607

Merged
Lightheartdevs merged 1 commit into main from fix/dashboard-lemonade-metrics
Mar 24, 2026

@Lightheartdevs
Collaborator

Summary

  • Dashboard "Tokens/sec" and "Tokens Generated" never showed live values on AMD/Lemonade (the counter stayed at 0)
  • Lemonade wraps llama.cpp but doesn't proxy /metrics through its main port (8080)
  • Pass --metrics --host 0.0.0.0 to the inner llama-server via --llamacpp-args
  • Expose port 8001 (inner llama-server) on the Docker network
  • LLAMA_METRICS_PORT=8001 tells dashboard-api to query the inner process directly

Changes

  • docker-compose.amd.yml: added --llamacpp-args, expose: ["8001"], LLAMA_METRICS_PORT
  • helpers.py: metrics_port = int(os.environ.get("LLAMA_METRICS_PORT", port))

Backwards compatibility

LLAMA_METRICS_PORT defaults to the main service port. NVIDIA/CPU setups don't set it, so their behavior is unchanged.
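
For reviewers, the fallback can be sketched in isolation. This is a minimal illustration, not the actual helpers.py code: the function name `resolve_metrics_port` and the injectable `env` parameter are hypothetical and exist only to make the behavior easy to check.

```python
import os
from typing import Mapping, Optional


def resolve_metrics_port(service_port: int,
                         env: Optional[Mapping[str, str]] = None) -> int:
    """Return the port to query for /metrics.

    Hypothetical helper mirroring the one-line change in helpers.py:
    use LLAMA_METRICS_PORT when it is set, otherwise fall back to the
    main service port.
    """
    env = os.environ if env is None else env
    return int(env.get("LLAMA_METRICS_PORT", service_port))


# NVIDIA/CPU: the variable is unset, so the main port is used.
nvidia_port = resolve_metrics_port(8080, env={})

# AMD/Lemonade: the compose file sets LLAMA_METRICS_PORT=8001.
amd_port = resolve_metrics_port(8080, env={"LLAMA_METRICS_PORT": "8001"})
```

With the variable unset the result is the main port (8080 here); with `LLAMA_METRICS_PORT=8001` it is 8001, which is the inner llama-server.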

Test plan

  • AMD: dashboard shows tokens/sec after first inference
  • NVIDIA: dashboard metrics unchanged
  • Verify first-inference delay (Lemonade lazily spawns inner llama-server)
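
The first-inference delay in the test plan suggests the dashboard should treat an unreachable metrics port as "no data yet" rather than as an error. The sketch below shows one way to do that. It is an assumption-laden illustration, not the dashboard-api implementation: the function names and the example host `llama-server` are hypothetical, and the metric names in the sample are assumed to match what llama-server emits when started with --metrics.

```python
import http.client
import urllib.request
from typing import Dict, Optional


def parse_prometheus_text(body: str) -> Dict[str, float]:
    """Parse simple 'name value' lines of Prometheus text output.

    Assumes label values contain no spaces; comment lines and lines
    that do not parse are skipped.
    """
    metrics: Dict[str, float] = {}
    for raw in body.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name = parts[0].split("{", 1)[0]  # drop any label set
        try:
            metrics[name] = float(parts[1])
        except ValueError:
            continue
    return metrics


def fetch_metrics(host: str, port: int,
                  timeout: float = 2.0) -> Optional[Dict[str, float]]:
    """Return parsed metrics, or None if the endpoint is not up yet.

    Before the first inference, Lemonade has not spawned the inner
    llama-server, so a refused connection is expected, not fatal.
    """
    url = f"http://{host}:{port}/metrics"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            text = resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are OSError subclasses.
        return None
    return parse_prometheus_text(text)


# Sample payload; the metric names are assumptions for illustration.
SAMPLE = """\
# TYPE llamacpp:tokens_predicted_total counter
llamacpp:tokens_predicted_total 128
llamacpp:predicted_tokens_seconds 42.5
not-a-metric-line
"""
sample_metrics = parse_prometheus_text(SAMPLE)
```

A caller could then run `fetch_metrics("llama-server", 8001)` and keep showing the placeholder values while the result is `None`.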

🤖 Generated with Claude Code

Lemonade wraps llama.cpp and doesn't proxy /metrics through its main
API port. Pass --metrics --host 0.0.0.0 to the inner llama-server via
--llamacpp-args, expose port 8001, and add LLAMA_METRICS_PORT env var
to dashboard-api so it queries the inner process directly.

Backwards-compatible: LLAMA_METRICS_PORT defaults to the main service
port on non-Lemonade setups.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Lightheartdevs Lightheartdevs merged commit 5afa637 into main Mar 24, 2026
15 of 22 checks passed