[LuceBox][DFlash][lucebox-pr314-common-empty-fallback][2/n] Default empty spec retry in backend calls by OmarB97 · Pull Request #319 · Luce-Org/lucebox-hub

OmarB97 · 2026-05-31T09:22:27Z

Why

Howard's PR #314 review asked that the empty speculative-decode retry stay behind the normal backend call name, especially for restore_and_generate, with the retry enabled by default instead of exposed as a separate call-site helper.

What changed

Kept ModelBackend::generate and ModelBackend::restore_and_generate as the public default call surface.
Moved backend-specific implementations behind generate_impl and restore_and_generate_impl.
Updated daemon, HTTP, backend subclasses, and unit tests to use the default call names while preserving the centralized zero-token speculative retry.
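
For reviewers who want the shape of the change at a glance, here is a minimal sketch of the wrapper pattern described above. It is not the code in model_backend.h: the GenerateRequest and GenerateResult types, the use_speculative flag, and FakeBackend are hypothetical stand-ins, and the retry condition (empty token list, then one non-speculative retry) is an assumption drawn from the PR #314 title.

```cpp
#include <string>
#include <vector>

// Hypothetical request/result types; the real signatures may differ.
struct GenerateRequest {
    std::string prompt;
    bool use_speculative = true;  // assumed flag, for illustration only
};

struct GenerateResult {
    std::vector<int> tokens;
};

class ModelBackend {
public:
    virtual ~ModelBackend() = default;

    // Public default call surface: callers get the retry without asking for it.
    GenerateResult generate(const GenerateRequest& req) {
        GenerateResult out = generate_impl(req);
        if (out.tokens.empty() && req.use_speculative) {
            // Assumed fallback: rerun once without speculative decoding.
            GenerateRequest retry = req;
            retry.use_speculative = false;
            out = generate_impl(retry);
        }
        return out;
    }

protected:
    // Backend-specific work lives behind the _impl name.
    virtual GenerateResult generate_impl(const GenerateRequest& req) = 0;
};

// Toy backend that yields no tokens on the speculative path.
class FakeBackend : public ModelBackend {
public:
    int calls = 0;

protected:
    GenerateResult generate_impl(const GenerateRequest& req) override {
        ++calls;
        GenerateResult out;
        if (!req.use_speculative) {
            out.tokens = {1, 2, 3};
        }
        return out;
    }
};
```

With this layout a call site only ever writes backend.generate(req); the second generate_impl call happens inside the base class, which is why no separate retry helper is needed at the daemon or HTTP layer.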

How to review

Start with server/src/common/model_backend.h to verify the default retry wrapper, then scan the call-site cleanup in server/src/common/daemon_loop.cpp and server/src/server/http_server.cpp. The remaining backend changes are mechanical override renames.

Evidence

Follow-up target: fix(common): retry empty spec-decode output through AR #314 (comment)
test_server_unit stdout on taro: Results: 1620 assertions, 0 failures and ALL PASSED.
dflash_server build on taro: [100%] Built target dflash_server.
No visual evidence: backend C++ API/behavior cleanup only; no UI or rendering surface changed.

Verification

git diff --check
On taro, from /tmp/lucebox-pr314-restore-default-a079a4b: cmake -S server -B server/build -DCMAKE_BUILD_TYPE=Release -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc -DCUDAToolkit_ROOT=/usr/local/cuda -DCMAKE_CUDA_ARCHITECTURES=120 -DDFLASH27B_FA_ALL_QUANTS=OFF
On taro: cmake --build server/build --target test_server_unit -j$(nproc)
On taro: ./server/build/test_server_unit -> 1620 assertions, 0 failures
On taro: cmake --build server/build --target dflash_server -j$(nproc)

Risks / gaps

The implementation rename touches every backend subclass, so the main risk is a missed override or call site. No follow-up task is needed: the focused CUDA build compiled dflash_common, test_server_unit, and dflash_server on sm_120, which covers the touched call surface.
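
As a rough illustration of why a clean build is reasonable evidence here: if generate_impl is declared pure virtual (an assumption, not confirmed by this description), a subclass that missed the rename stays abstract and cannot be instantiated, so the compiler flags it. The class names below are hypothetical.

```cpp
#include <type_traits>

struct Backend {
    virtual ~Backend() = default;
    int generate() { return generate_impl(); }  // non-virtual default wrapper

protected:
    virtual int generate_impl() = 0;  // assumed pure virtual
};

// Missed rename: still defines generate(), never provides generate_impl().
struct StaleBackend : Backend {
    int generate() { return 1; }  // hides the wrapper, overrides nothing
};

// Correctly renamed override.
struct RenamedBackend : Backend {
protected:
    int generate_impl() override { return 1; }
};

// StaleBackend cannot be constructed; RenamedBackend can.
static_assert(std::is_abstract<StaleBackend>::value,
              "a missed rename leaves the class abstract");
static_assert(!std::is_abstract<RenamedBackend>::value,
              "a renamed override makes the class concrete");
```

If generate_impl instead has a default body in the base class, this compile-time guarantee does not hold, and a missed rename would only show up in tests.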

Collaborators

Omar Baradei requested this follow-up to PR #314 (fix(common): retry empty spec-decode output through AR) from ko-mac on May 31, 2026.
Codex on ko-mac (ko-mac.codex#629f416c13) implemented and verified the change for MeshBoard task lucebox-pr314-common-empty-fallback, using taro only as the CUDA build host.

cubic-dev-ai

No issues found across 16 files


fix(common): default empty spec retry in backend calls

a079a4b

cubic-dev-ai Bot reviewed May 31, 2026


Omar Baradei and others added 2 commits May 31, 2026 08:35

fix: retry dflash generations with no visible output

74c1483

Merge pull request #7 from OmarB97/codex/visible-empty-dflash-retry

de1c77f

OmarB97 wants to merge 3 commits into Luce-Org:main from OmarB97:codex/pr314-restore-default
