Add real-time /status dashboard with SSE push by howard0su · Pull Request #322 · Luce-Org/lucebox-hub

howard0su · 2026-05-31T11:38:16Z

Summary

Adds a live server status page at GET /status that shows the current inference state, request details, generated output, and performance history — all pushed in real time via Server-Sent Events.

What it does

/status — Serves a self-contained HTML dashboard (dark theme, no CDN deps)
/status/events — SSE endpoint pushing event: status (state snapshot) and event: token (incremental output text)
/status/json — JSON snapshot for programmatic access
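As a rough illustration of what the `/status/events` stream carries, the sketch below formats one Server-Sent Events frame. The event names `status` and `token` come from the description above; the helper name `sse_frame` is hypothetical and is not taken from the PR.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the PR): build one SSE frame.
// Each line of the payload gets its own "data:" field, and a blank
// line terminates the frame, as the SSE wire format requires.
std::string sse_frame(const std::string & event, const std::string & data) {
    std::string out = "event: " + event + "\n";
    std::size_t start = 0;
    while (true) {
        const std::size_t nl = data.find('\n', start);
        out += "data: ";
        out.append(data, start,
                   nl == std::string::npos ? std::string::npos : nl - start);
        out += "\n";
        if (nl == std::string::npos) break;
        start = nl + 1;
    }
    out += "\n";  // blank line ends the event
    return out;
}
```

A browser `EventSource` dispatches each frame to the listener registered for its event name, so `status` snapshots and `token` increments can be handled separately.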

Dashboard shows

Section	Details
Phase	idle / prefill / decode badge
Request params	model, format, temperature, top_p/k, max_output, session_id
Feature tags	cache hit, pflash, spec decode, stream, thinking
Live stats	prompt tokens, completion tokens, elapsed time, live tok/s
Draft tokens	Current spec-decode candidate tokens (updated per step)
Request messages	Chat messages JSON (truncated, scrollable)
Response output	Token-by-token output accumulated client-side
Perf charts	Prefill tok/s and decode tok/s + accept rate (last 50 requests)
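The perf charts cover only the last 50 requests, which suggests a bounded history. A minimal sketch of such a buffer follows; the `PerfSample` and `PerfHistory` names and fields are assumptions for illustration, not the types in server_status.h.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical per-request sample; field names are assumptions.
struct PerfSample {
    double prefill_tps;
    double decode_tps;
    double accept_rate;
};

// Keeps at most `capacity` samples, dropping the oldest first.
class PerfHistory {
public:
    explicit PerfHistory(std::size_t capacity = 50) : capacity_(capacity) {}

    void push(const PerfSample & s) {
        if (capacity_ == 0) return;
        if (samples_.size() == capacity_) samples_.pop_front();
        samples_.push_back(s);
    }

    const std::deque<PerfSample> & samples() const { return samples_; }

private:
    std::size_t capacity_;
    std::deque<PerfSample> samples_;
};
```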

Add a real-time server status dashboard accessible at GET /status:
- Serves standalone HTML from server/share/status.html (editable without recompile)
- GET /status/events SSE endpoint pushes live JSON updates per spec-decode step
- GET /status/json provides a snapshot for non-SSE clients

Status tracking (server_status.h):
- Current phase (idle/prefill/decode) with prompt excerpt and token counts
- Draft tokens being verified (updated each spec-decode step)
- Performance history (last 50 requests): prefill tok/s, decode tok/s, accept rate
- RAII StatusGuard ensures status resets to idle on all exit paths

Backend instrumentation (InferenceObserver on DaemonIO):
- Observer callback in model_backend.h, called at each draft/verify step
- Instrumented in qwen35_backend.cpp and generic dflash_spec_decode.cpp
- Zero overhead when no SSE clients are connected (empty std::function check)

Dashboard features (status.html):
- Dark-themed responsive UI with phase badges and live counters
- Draft token display updated per spec-decode step
- SVG-based performance charts (prefill tok/s, decode tok/s, accept rate)
- Auto-reconnecting EventSource with connection status indicator
- No external CDN dependencies

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
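The two mechanisms named in this commit, the RAII StatusGuard and the empty std::function check, could look roughly like the sketch below. Only the names `StatusGuard` and `InferenceObserver` come from the commit message; the `Phase` and `ServerStatus` types, the observer signature, and `run_steps` are assumptions made for illustration.

```cpp
#include <functional>

// Hypothetical status model; the real one lives in server_status.h.
enum class Phase { Idle, Prefill, Decode };
struct ServerStatus { Phase phase = Phase::Idle; };

// RAII guard: sets a phase on entry and restores Idle on every exit
// path, including early returns and exceptions.
class StatusGuard {
public:
    StatusGuard(ServerStatus & s, Phase p) : status_(s) { status_.phase = p; }
    ~StatusGuard() { status_.phase = Phase::Idle; }
    StatusGuard(const StatusGuard &) = delete;
    StatusGuard & operator=(const StatusGuard &) = delete;

private:
    ServerStatus & status_;
};

// Assumed observer signature: one call per draft/verify step.
using InferenceObserver = std::function<void(int step)>;

// Stand-in for a decode loop. An empty std::function converts to
// false, so the per-step cost without an observer is a single branch.
void run_steps(int n, const InferenceObserver & observer) {
    for (int step = 0; step < n; ++step) {
        if (observer) observer(step);
    }
}
```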

- Add CMake POST_BUILD rule to copy share/status.html into build/share/
- Add exe_dir/share/status.html as a search path (build dir layout)
- Keeps existing ../share/ and ./share/ fallbacks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Add request params to status: model, format, temperature, top_p/k, max_output, thinking_enabled, session_id, cache/pflash/spec_decode flags
- Add incremental 'event: token' SSE events (browser accumulates output)
- Add messages JSON to status event (sent once per request)
- Redesigned HTML: two-column request/response view, params grid, feature tags (cache hit, pflash, spec decode, stream, thinking), live tok/s
- All state accumulated client-side; server stays stateless for output text

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Token text can contain partial UTF-8 sequences (tokens split multi-byte codepoints). Use json::error_handler_t::replace in all dump() calls on status paths so invalid bytes become U+FFFD instead of throwing type_error 316. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
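To show why a token boundary can produce invalid UTF-8 and what replacing it with U+FFFD means, here is a self-contained stand-in. It is not the nlohmann/json implementation and may emit a different number of replacement characters than `error_handler_t::replace` does for the same input. It is also a simplified check that does not reject overlong or surrogate encodings.

```cpp
#include <cstddef>
#include <string>

// Expected sequence length from a UTF-8 lead byte, or 0 if the byte
// cannot start a sequence (continuation byte or out-of-range lead).
static std::size_t utf8_seq_len(unsigned char b) {
    if (b < 0x80) return 1;
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return 0;
}

// Hypothetical helper: copy valid sequences, and replace each byte that
// is not part of a complete sequence with U+FFFD (EF BF BD in UTF-8).
std::string sanitize_utf8(const std::string & in) {
    static const char kReplacement[] = "\xEF\xBF\xBD";
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        const std::size_t len = utf8_seq_len(static_cast<unsigned char>(in[i]));
        bool ok = len != 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            ok = (static_cast<unsigned char>(in[i + k]) & 0xC0) == 0x80;
        }
        if (ok) {
            out.append(in, i, len);
            i += len;
        } else {
            out += kReplacement;
            i += 1;
        }
    }
    return out;
}
```

A three-byte codepoint split after its second byte, for example, arrives as two bytes that do not form a complete sequence, so this sketch emits one replacement character for each of them.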

cubic-dev-ai

8 issues found across 8 files


Record the post-push discovery and merge of PR Luce-Org#322, the conflict resolution with the MTP hook, and updated validation/classification.

- Add /status/json to kApiEndpoints registry
- Replace raw ::send() with sse_try_send() helper that handles partial writes via poll loop with a short 1s timeout (avoids stalling worker)
- Add sse_heartbeat() to prune disconnected SSE clients during idle periods (worker dequeue uses timed wait, sends heartbeat every 30s)
- Use $<TARGET_FILE_DIR:dflash_server> in CMake POST_BUILD copy rule for correct output path with multi-config generators
- Add install(FILES) rule for status.html
- Clear messages panel when a new request starts in the browser
- Use incremental DOM append (createTextNode) for token events instead of re-rendering full output text on each token

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When a prefix cache hit occurs, the backend only prefills the delta tokens beyond the cached prefix. The previous calculation divided the full prompt token count by the delta prefill time, giving either 0 (full cache hit, no delta) or a wildly inflated number (partial hit).

Now uses the actual number of tokens that were prefilled:
- Full cache hit: 0 tok/s (correct: no prefill work done)
- Partial cache hit: delta_tokens / prefill_time
- No cache hit: effective_prompt.size() / prefill_time

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
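The three cases above reduce to dividing the number of tokens actually prefilled by the prefill time. A minimal sketch follows; the function and parameter names are assumptions, not the PR's code.

```cpp
#include <cstddef>

// Hypothetical helper: prefill throughput based on the tokens that were
// really processed, i.e. the prompt minus the cached prefix.
double prefill_tok_per_sec(std::size_t prompt_tokens,
                           std::size_t cached_prefix_tokens,
                           double prefill_seconds) {
    const std::size_t prefilled =
        prompt_tokens > cached_prefix_tokens ? prompt_tokens - cached_prefix_tokens : 0;
    // Full cache hit (or a zero timer reading): report 0 rather than divide.
    if (prefilled == 0 || prefill_seconds <= 0.0) return 0.0;
    return static_cast<double>(prefilled) / prefill_seconds;
}
```

For a 1000-token prompt with 800 tokens cached and 0.5 s of prefill, this gives 200 / 0.5 = 400 tok/s, where the old formula would have reported 1000 / 0.5 = 2000 tok/s.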

cubic-dev-ai

1 issue found across 4 files (changes from recent commits).


Record the 2026-06-01 06:09 auto-integration refresh, including exact integration of current PR Luce-Org#322 and Luce-Org#294 heads, fresh direct-merge probes for remaining selective-port candidates, and validation results.

Replace blocking sse_try_send() (1s timeout per client) with MSG_DONTWAIT send in sse_heartbeat(). The 12-byte heartbeat ping will succeed instantly for any healthy client; slow clients with full buffers are pruned immediately instead of stalling the worker thread. This eliminates up to N×1s latency on idle-to-active transitions when slow SSE clients are connected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Merge advanced status dashboard head 4b40aa1 after post-push enumeration detected the PR moved during the cron run.

Record post-push detection and integration of the advanced PR Luce-Org#322 head plus validation for the refreshed stack.

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="server/src/server/http_server.cpp">

<violation number="1" location="server/src/server/http_server.cpp:591">
P1: Heartbeat non-blocking send does not handle partial writes, risking SSE stream corruption</violation>
</file>


cubic-dev-ai · 2026-06-01T10:40:06Z

+    for (int fd : sse_fds_) {
+        // Non-blocking send: if the socket buffer can't accept 12 bytes
+        // immediately, the client is too far behind — treat as dead.
+        ssize_t n = ::send(fd, ping, sizeof(ping) - 1, MSG_NOSIGNAL | MSG_DONTWAIT);


P1: Heartbeat non-blocking send does not handle partial writes, risking SSE stream corruption

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At server/src/server/http_server.cpp, line 591:
<comment>Heartbeat non-blocking send does not handle partial writes, risking SSE stream corruption</comment>
<file context>
@@ -580,12 +580,16 @@ void HttpServer::broadcast_token(const std::string & text) {
-        if (!sse_try_send(fd, ping, sizeof(ping) - 1)) {
+        // Non-blocking send: if the socket buffer can't accept 12 bytes
+        // immediately, the client is too far behind — treat as dead.
+        ssize_t n = ::send(fd, ping, sizeof(ping) - 1, MSG_NOSIGNAL | MSG_DONTWAIT);
+        if (n <= 0) {
             dead.push_back(fd);
</file context>
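One possible way to address the reviewer's concern, not necessarily what the PR ended up doing, is to treat any short write as a dead client. A partially written ping would leave the stream mid-frame, so the connection is dropped instead of continued. In the sketch below the function name `heartbeat_prune` is hypothetical, and the 12-byte payload is an assumption because the actual `ping` contents are not shown.

```cpp
#include <sys/socket.h>
#include <sys/types.h>

#include <cstddef>
#include <vector>

// Hypothetical helper: send a non-blocking heartbeat to every client and
// return the fds that should be closed and removed by the caller.
std::vector<int> heartbeat_prune(const std::vector<int> & fds) {
    static const char ping[] = ":heartbeat\n\n";  // assumed payload (SSE comment line)
    const std::size_t len = sizeof(ping) - 1;
    std::vector<int> dead;
    for (int fd : fds) {
        const ssize_t n = ::send(fd, ping, len, MSG_NOSIGNAL | MSG_DONTWAIT);
        // Errors (n < 0), zero-length writes and short writes all prune the
        // client, since a half-sent frame cannot be completed later here.
        if (n != static_cast<ssize_t>(len)) dead.push_back(fd);
    }
    return dead;
}
```

The caller is expected to close every returned descriptor, so that a client which received a truncated frame reconnects with a clean stream.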

howard0su and others added 4 commits May 31, 2026 15:12

cubic-dev-ai Bot reviewed May 31, 2026


easel pushed a commit to easel/lucebox-hub that referenced this pull request May 31, 2026

docs: note pr322 integration

7c190bc


howard0su and others added 2 commits May 31, 2026 19:51

cubic-dev-ai Bot reviewed Jun 1, 2026


Comment thread server/src/server/http_server.cpp

easel pushed a commit to easel/lucebox-hub that referenced this pull request Jun 1, 2026

Merge PR Luce-Org#322 into auto-integration

d560824


easel pushed a commit to easel/lucebox-hub that referenced this pull request Jun 1, 2026

docs: refresh auto-integration manifest

089cb77


cubic-dev-ai Bot reviewed Jun 1, 2026


howard0su wants to merge 7 commits into Luce-Org:main from howard0su:status_html
