docs: local hardware-in-the-loop FT8/WSPR bench plan (#171) by ringof · Pull Request #172 · ringof/rx888-firmware

ringof · 2026-06-09T01:05:00Z

Adds docs/local-hwil-plan.md — Plan 1 for a local, closed-RF
hardware-in-the-loop test bench, drivable by a local Claude Code /goal.

Closes #171 (plan deliverable; implementation tracked there).

What this is

A plan (docs-only, no code) for a bench that exercises the full real signal
path of this firmware end to end:

QDX (TX, known message) -> attenuators -> RX888 -> FX3 firmware (this repo)
  -> radiod (rx888.so) -> decoder -> decoded message -> single pass/fail line

The orchestrator emits one grep-stable verdict line, which is what makes a
/goal honest — the goal evaluator judges only what's in the transcript, and
that line is the output of a real RF decode.
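
A minimal sketch of the verdict-line contract described above. The `HWIL_VERDICT` prefix and the `mode=`/`status=`/`detail=` fields are hypothetical; the bench scripts define the real format. The idea is one stable, single-line string a `grep` can anchor on, paired with a 0/1 exit status:

```python
def verdict_line(mode: str, ok: bool, detail: str = "") -> str:
    """Format one single-line verdict. The 'HWIL_VERDICT' prefix is hypothetical."""
    status = "OK" if ok else "FAIL"
    # Collapse all whitespace so the verdict can never span several lines.
    detail = " ".join(detail.split())
    return f"HWIL_VERDICT mode={mode} status={status} detail={detail}"


def finish(mode: str, ok: bool, detail: str = "") -> int:
    """Print the verdict and return the process exit status (0 = pass, 1 = fail)."""
    print(verdict_line(mode, ok, detail), flush=True)
    return 0 if ok else 1
```

A caller would end with `sys.exit(finish("ft8", decoded_ok, message))`, so the transcript line and the exit status always agree.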

Two phases, one shared bench

Phase A — FT8 (ft8_lib encode+decode, ~15–30 s/run): fast iteration gate.
Phase B — WSPR via the actual wsprdaemon → wsprd → spot pipeline:
deployment-representative sign-off.

Only the TX mode and decoder tail differ between phases; the orchestrator is
parameterized by MODE={ft8,wspr}. Builds on the existing
docker/ka9q-radio/ harness (already builds radiod + rx888.so, flashes
SDDC_FX3.img).

Explicitly deferred to Plan 2

Remote self-hosted GitHub Actions runner, Actions/cron wiring, and the runner
security model (untrusted-fork code on a machine wired to a transmitter).
That surface is large enough to warrant its own plan.

Scope of this PR

Documentation only: docs/local-hwil-plan.md. No source/build/config changes;
firmware build and host tests are unaffected (build.yml lists docs/** under
paths-ignore). Implementation lands in follow-up PRs against #171.

Open operator questions to resolve before implementation: initial FT8 band,
reserved test callsign/grid, availability of a USB-switchable hub for
self-reset, and single- vs sibling-container split.

https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT

Generated by Claude Code

Plan 1: a local, closed-RF test bench (QDX -> attenuators -> RX888 -> FX3 firmware -> radiod -> decoder) drivable by a local Claude Code /goal. Two phases sharing one bench: FT8 (ft8_lib) as the fast iteration loop, WSPR via real wsprdaemon as the deployment-representative sign-off. Documents the operator interface contract (RF safety / attenuation power rating, QDX/RX888 config, pass criteria), unattended-loop hygiene, and closed-system etiquette (no public spots). Remote self-hosted-runner CI and its security model are explicitly deferred to a separate plan (Plan 2). Docs-only; no code, build, or behavior changes.
https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT

Add the staged goal ladder (G0 software encode/decode + fixtures, then rungs 1–7) with a hardware-dependency map; record decisions: HITL image built FROM the ka9q-radio image; rung 2 split into 2a (CAT) / 2b (soundcard); rung 3 CW carrier ~10 MHz with numeric PASS (peak within 300 Hz, >=20 dB over noise); red DANGER!! pre-gate banner on every rung that keys the QDX into the RX888; and a generated known-content audio fixtures section (reference-tool authored, encoder doubles as TX stimulus, same-tool self-loopback caveat). First implementation step is now G0. Docs-only; no code, build, or behavior changes. https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT
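
The rung 3 numeric PASS (peak within 300 Hz, >=20 dB over noise) could be evaluated roughly as below. This is a sketch under stated assumptions: the function name is hypothetical, and estimating the noise floor as the median bin power is a choice made here, not necessarily what the bench script does.

```python
from statistics import median


def cw_carrier_pass(freqs_hz, powers_db, expected_hz,
                    max_offset_hz=300.0, min_margin_db=20.0):
    """Return (ok, peak_hz, margin_db) for one spectrum snapshot.

    The noise floor is taken as the median bin power (an assumption of
    this sketch); the peak is the strongest single bin.
    """
    if len(freqs_hz) != len(powers_db) or not freqs_hz:
        raise ValueError("need equal-length, non-empty bin lists")
    peak_index = max(range(len(powers_db)), key=powers_db.__getitem__)
    peak_hz = freqs_hz[peak_index]
    margin_db = powers_db[peak_index] - median(powers_db)
    ok = abs(peak_hz - expected_hz) <= max_offset_hz and margin_db >= min_margin_db
    return ok, peak_hz, margin_db
```

Using the median keeps a single strong carrier from inflating the noise estimate, which a plain mean over all bins would do.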

First implementation rung (G0) of the local HITL bench. Hardware-free audio encode->decode self-tests that print one grep-stable verdict line and exit 0/1 (the contract a local /goal reads):
- run_g0_ft8.sh: ft8_lib gen_ft8 -> wav -> decode_ft8
- run_g0_wspr.sh: wspr-cui wsprsimwav -> wav -> (sox 12k) -> wsprd
- gen_ft8_wav.sh / gen_wspr_wav.sh: known-content audio generators (also the TX stimulus source for the on-bench rungs)
Real tools own protocol encode/decode; sox only converts sample rate (48k<->12k). External sources are cloned+built under .build/ (gitignored), pinned: ft8_lib@9fec6ca, wspr-cui@839b86f (mirrors the Dockerfile SHA-pin convention). Each rung decodes an authoritative fixture if present, else a self-loop. Committed FT8 fixture (12 kHz, 352K); WSPR runs self-loop (audio generated on the fly) to avoid a multi-MB blob.
Host deps: gcc gfortran libfftw3-dev sox libgfortran5 libfftw3-single3. No FX3 SDK or hardware.
Validated: both rungs OK/exit 0; wrong .expected -> FAIL/exit 1.
https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT

Each run now writes its self-loop audio to a visible out/ dir (gitignored) and prints the path + a copy-pastable decode command, so the operator has a real audio file to verify with their own tools (jt9 / wsprd) instead of trusting the harness verdict. WSPR previously surfaced no file at all.
- out/g0_ft8_selfloop.wav (12 kHz)
- out/g0_wspr_selfloop_48k.wav (48 kHz, QDX TX rate)
- out/g0_wspr_selfloop_12k.wav (12 kHz, wsprd input)
Honors the earlier "FT8 fixture only" choice: out/ is gitignored, no multi-MB blob committed.
https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT

Plan drifted from what G0 became; bring it in line:
- Status: approved/in-progress, G0 implemented (PR #172).
- Name the real tooling: gen_ft8 (ft8_lib), wsprsimwav (wspr-cui) for WSPR audio, sox for rate conversion only (not synthesis).
- Rewrite the fixtures/independence section to the operator-as-verifier model: encoder independence waived; each run emits audio to out/ for the operator to decode with their own jt9/wsprd. Document committed-FT8-fixture vs on-the-fly-WSPR.
- Next step -> rung 1; mark test callsign/grid resolved (T1ABC/FN20).
- tests/bench/README: add a copy-paste Prerequisites apt line so a clean checkout reproduces the run for independent verification.
Docs-only.
https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT

First QDX-touching rung in the HWIL ladder. Proves the bench host can set frequency, key/unkey PTT, and read CAT responses from the QDX over USB CDC serial. The reusable qdx_cat.py module is used by all later QDX-touching rungs (2b, 3, 5, 7).
New files:
- tests/bench/qdx_cat.py: QdxCat class (context manager, safety RX on exit)
- tests/bench/rung2a_cat_test.py: 6-check test (ID, freq, PTT, IF)
- tests/bench/run_rung2a.sh: shell wrapper with preflight checks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Kenwood CAT set commands (FA<freq>;, TX;, RX;) do not produce a serial response — only query commands do. Added send_set() for these and switched set_freq, tx_on, tx_off to use it. Validated against real QDX hardware (fw 1_09): all 6 rung 2a checks pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
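
The set-versus-query distinction can be sketched as follows. The `CatLink` and `FakePort` names are hypothetical stand-ins (the real module is qdx_cat.py with its QdxCat class and send_set()), and the 11-digit `FA` frequency form is assumed from the Kenwood convention. `FakePort` mimics only the `write()`/`read()` subset of a serial port so the sketch runs without hardware.

```python
class CatLink:
    """Kenwood-style CAT framing over any object with write() and read(n)."""

    def __init__(self, port):
        self.port = port

    def send_set(self, command: str) -> None:
        # Set commands such as 'FA00014074000;' or 'TX;' get no reply,
        # so write and return without waiting on the port.
        self.port.write(command.encode("ascii"))

    def query(self, command: str, max_len: int = 64) -> str:
        # Query commands such as 'FA;' are answered with a ';'-terminated reply.
        self.port.write(command.encode("ascii"))
        reply = bytearray()
        while len(reply) < max_len:
            byte = self.port.read(1)
            if not byte:  # timeout or no more data
                break
            reply += byte
            if byte == b";":
                break
        return reply.decode("ascii")


class FakePort:
    """In-memory stand-in for a serial port: replies only to known queries."""

    def __init__(self, replies):
        self.written = []
        self._pending = b""
        self._replies = replies

    def write(self, data):
        self.written.append(bytes(data))
        self._pending += self._replies.get(bytes(data), b"")
        return len(data)

    def read(self, n=1):
        out, self._pending = self._pending[:n], self._pending[n:]
        return out
```

Waiting for a reply after a set command would only ever end in a timeout, which is why the two paths are kept separate.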

Proves the bench host can play audio into and capture audio from the QDX over its USB Audio Class device. Validates device discovery, capture, playback, and PTT + playback integration, the audio plumbing reused by every TX rung (3, 5, 7).
New files:
- qdx_audio.py: reusable audio discovery + playback helpers
- rung2b_audio_test.py: test sequence (4 checks)
- run_rung2b.sh: shell wrapper with preflight checks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

qdx_cat.py:
- Fix __exit__ crash: write RX; directly to port, bypassing send_set() whose reset_input_buffer() throws I/O errors on a wedged port.
qdx_audio.py:
- generate_tone: produce S24 stereo 48 kHz (QDX native format) instead of S16 mono that hw: rejects.
- play_to_qdx: use hw: directly instead of plughw: so format mismatches fail loud.
rung2a_cat_test.py:
- Lead with FA; (proven on hardware) instead of gating on ID;.
- ID/VN are informational, not gating.
rung2b_audio_test.py:
- Capture check verifies actual audio energy, not just file size > 0.
- Honest verdict: explicitly states RF output is NOT verified by this test and requires manual confirmation with a separate receiver.
All checks validated on real QDX hardware with 12V DC supply. RF output confirmed by operator using independent HF receiver.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
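
For illustration, a 24-bit stereo 48 kHz tone can be written with only the Python standard library, as sketched below. This is not the actual generate_tone from qdx_audio.py; the function name and defaults are hypothetical, and the packed 3-byte sample layout is what the WAV format uses for 24-bit PCM, which may or may not match how the real helper produces its audio.

```python
import math
import wave


def write_tone_s24_stereo(path, freq_hz=1500.0, seconds=1.0,
                          rate=48_000, amplitude=0.5):
    """Write a sine tone as 24-bit signed little-endian stereo PCM."""
    full_scale = 2 ** 23 - 1
    frames = bytearray()
    for n in range(int(rate * seconds)):
        value = amplitude * full_scale * math.sin(2 * math.pi * freq_hz * n / rate)
        packed = int(value).to_bytes(3, "little", signed=True)
        frames += packed + packed  # same sample on left and right channels
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(3)  # 3 bytes per sample = 24 bits
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return path
```

Writing the device's native format up front is what lets playback go through hw: directly, so a mismatch surfaces as an error instead of being silently converted.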

- bench_rf_test.py: new manual RF verification tool (--port, --freq, --tone, --duration). Reads current QDX freq by default.
- rung2a_cat_test.py: env vars → argparse. Wrap ID/VN and IF queries in try/except so they're non-fatal.
- rung2b_audio_test.py: env vars → argparse. Print dial freq and expected carrier before PTT test.
- README.md: add rungs 2a/2b, bench_rf_test, helper modules, usage.
- local-hwil-plan.md: update status through rung 3 (confirmed manually).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…to image
Move the Docker build context from docker/ka9q-radio/ to the project root so the Dockerfile can COPY tests/bench/ tools directly. This makes the container self-contained for rung 2 testing and future CI: no volume mounts needed for the QDX bench scripts.
- Prefix existing COPY paths (patches/, rx888-test.conf, entrypoint.sh) with docker/ka9q-radio/ for the new context
- Add python3, python3-serial, sox, alsa-utils to runtime image
- COPY 5 bench tools into /usr/local/lib/bench/
- Add ka9q.sh build subcommand with the new -f Dockerfile syntax
- Add conditional /dev/ttyACM0 passthrough in ka9q.sh start
- Update build command in 8 docs/scripts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

QDX transmits a known tone (1500 Hz at 14,095,600 Hz dial), the ka9q-radio `powers` tool captures the spectrum through the RX888, and the script validates that the tone appears at the expected carrier (14,097,100 Hz, the dial frequency plus the 1500 Hz tone) above the noise floor. First fully closed-loop automated RF test in the bench ladder. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The powers output format is a single line per snapshot: timestamp, low_freq, high_freq, binwidth, num_bins, p0, p1, ... not one freq,power pair per line. Fixed the parser to compute bin frequencies from low_freq + i * binwidth. Also added a one-shot retry when the first powers invocation returns no parseable bins — common on fresh container start. Verified on real hardware via docker exec (43.1 dB margin, peak dead on 14,097,100 Hz). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
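
A sketch of the corrected parsing, assuming comma-separated fields in the order listed above and a timestamp that itself contains no commas (both are assumptions; the function name is hypothetical). Bin frequencies are computed as low_freq + i * binwidth:

```python
def parse_powers_line(line: str):
    """Parse one snapshot line into (freqs_hz, powers).

    Assumed field order, taken from the commit message:
    timestamp, low_freq, high_freq, binwidth, num_bins, p0, p1, ...
    """
    fields = [field.strip() for field in line.strip().split(",")]
    if len(fields) < 6:
        raise ValueError("not a powers snapshot line")
    low_hz = float(fields[1])
    bin_width_hz = float(fields[3])
    num_bins = int(fields[4])
    powers = [float(p) for p in fields[5:5 + num_bins]]
    if len(powers) != num_bins:
        raise ValueError("bin count does not match num_bins")
    # Each bin's frequency is derived from its index, not read from the line.
    freqs_hz = [low_hz + i * bin_width_hz for i in range(num_bins)]
    return freqs_hz, powers
```

Raising on a short or malformed line gives the caller a clear signal for the one-shot retry instead of silently returning an empty spectrum.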

Routine pin bump — no patches, no config changes. Key upstream deltas since the previous pin (87567fa): USB watchdog resets only after a successful transfer, rx888 globals moved into struct sdrstate, isfinite() float-exception guard, TESTFX3 query failure now non-fatal, -march=native off by default. Also fixes stale "active patch 04" wording in README (patch 04 was upstreamed at 87567fa). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ple rate Add [FT8] and [WSPR] channel sections to rx888-test.conf covering the four QDX bands (80/40/30/20m) at standard dial frequencies. Data groups ft8-pcm.local and wspr-pcm.local give bench scripts a capture target for decode rungs 4-7. Replace FFTW wisdom generation (impractical in containers) with a runtime ADC_SAMPRATE env var (default 64m8, optional 129m6 for full-rate). Entrypoint sed-substitutes the sample rate into the config at startup. Existing smoke test (ka9q_smoke.sh) and unit test (ka9q_test.sh) are unaffected — they use ad-hoc powers queries via SSRC 30303, independent of named channel sections. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
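
The entrypoint does this substitution with sed; the Python below is only an illustrative equivalent. It assumes the config carries a `samprate = ...` line (the key name is an assumption here) and that values such as 64m8 use the letter as a decimal point with an MHz multiplier, so 64m8 means 64.8 MHz. Both function names are hypothetical.

```python
import os
import re

_MULTIPLIERS = {"k": 1e3, "m": 1e6, "g": 1e9}


def parse_rate(text: str) -> int:
    """Convert '64m8' to 64800000 Hz; plain integers pass through unchanged."""
    match = re.fullmatch(r"(\d+)([kmg])(\d*)", text.strip().lower())
    if match is None:
        return int(text)
    whole, unit, frac = match.groups()
    return round(float(f"{whole}.{frac or '0'}") * _MULTIPLIERS[unit])


def apply_samprate(config_text: str, env=os.environ) -> str:
    """Substitute ADC_SAMPRATE (default 64m8) into the first samprate line."""
    rate = env.get("ADC_SAMPRATE", "64m8")
    pattern = re.compile(r"^([ \t]*samprate[ \t]*=[ \t]*).*$", re.MULTILINE)
    return pattern.sub(lambda m: m.group(1) + rate, config_text, count=1)
```

Defaulting inside the substitution step means an unset ADC_SAMPRATE still yields a valid config.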

pcmrecord is ka9q-radio's native RTP-to-WAV recorder with built-in FT8 (-8) and WSPR (-w) slot alignment. Needed for rung 4+ bench tests to capture demodulated audio from the ft8-pcm/wspr-pcm channel groups. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add FT8 roundtrip test: QDX transmits a randomly-generated FT8 message (unique callsign + grid per run) on each of four bands, RX888/radiod demodulates, pcmrecord captures slot-aligned WAV, decode_ft8 asserts the message decodes. Verified on live hardware: all 4 bands pass.
Dockerfile: build ft8_lib (gen_ft8 + decode_ft8) pinned to 9fec6ca, copy binaries + rung4 script into runtime image.
Key fixes found during bring-up:
- pcmrecord writes WAVE_FORMAT_EXTENSIBLE WAV that ft8_lib can't parse; normalize through sox before decode_ft8.
- SIGTERM on pcmrecord leaves a short partial for the next slot; try all captured WAVs instead of just the last one.
- Clean capture directory per band to avoid stale files from prior runs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…script Build wsprsimwav + wsprd (jj1bdx/wspr-cui, pinned 839b86f) in the Docker image alongside ft8_lib. Add gfortran to builder, libgfortran5 to runtime. New wspr_roundtrip_test.py orchestrates a single-band (40m, 7.0386 MHz) WSPR encode→TX→capture→decode loop using pcmrecord -w (120s slot-aligned captures) and wsprd. Reports SNR so the operator can calibrate attenuation to the -10 to -15 dB target range. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ware Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Two bugs found during first hardware validation of the WSPR roundtrip:
1. SSRC mismatch: radiod rounds freq-in-kHz (7038.6 → 7039) but the script truncated (7038600 // 1000 = 7038). pcmrecord found no matching stream → no captures. Fix: use round() instead of //.
2. TX audio too quiet: wsprsimwav's -6 dB output was below the QDX's modulation threshold: no RF despite confirmed PTT. Fix: normalize via sox gain -n during the mono→stereo conversion. Exposed as --drive (default -1 dB) so the operator can tune the level.
Hardware-validated: WSPR roundtrip decode confirmed at SNR +29 dB on 40m (7.038600 MHz) with QDX → attenuator → RX888 → radiod → wsprd.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
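
The SSRC bug comes down to truncation versus rounding, shown side by side below (function names are hypothetical):

```python
def ssrc_truncated(dial_hz: int) -> int:
    """Original derivation: floor division drops the fractional kHz."""
    return dial_hz // 1000


def ssrc_rounded(dial_hz: int) -> int:
    """Fixed derivation: round the dial frequency in kHz to the nearest integer."""
    return round(dial_hz / 1000)
```

For the 40m WSPR dial, `ssrc_truncated(7_038_600)` gives 7038 while `ssrc_rounded(7_038_600)` gives 7039. One caveat: Python's round() rounds exact halves to the nearest even integer, so a dial frequency ending in exactly 500 Hz may not match radiod's rounding; none of the frequencies named in this PR hit that case.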

Set AD8370 VGA gain to 0 dB (was 10) and PE4304 attenuator to 31.5 dB (was 0) to reduce signal level for the QDX→RX888 bench loopback. WSPR roundtrip SNR dropped from +29 to +6 dB; an additional inline 20 dB pad is needed to reach the -10 to -15 dB target. Update HWIL plan to reflect WSPR hardware validation complete. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
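
The pad arithmetic can be checked with a short sketch. It assumes the noise floor is set by the receiver, so a pad ahead of the RX888 lowers the signal dB-for-dB without lowering the noise; that assumption and the function names are not from the source.

```python
def snr_after_pad(snr_db: float, pad_db: float) -> float:
    """Estimated SNR after adding an inline pad (receiver-limited noise assumed)."""
    return snr_db - pad_db


def in_target(snr_db: float, low_db: float = -15.0, high_db: float = -10.0) -> bool:
    """True when the SNR sits inside the -10 to -15 dB calibration window."""
    return low_db <= snr_db <= high_db
```

Under that assumption, the measured +6 dB with a 20 dB pad lands at -14 dB, inside the target window.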

Drop rung2a_/rung2b_/rung3_/rung4_/wspr_roundtrip_ prefixes from bench test scripts in favor of descriptive names (cat_test, audio_test, loopback_test, ft8_test, wspr_test, rf_test). Update all verdict strings, log prefixes, temp dirs, output filenames, docstrings, Dockerfile COPY lines, and docs to match. Add bench.sh — a unified dispatcher that runs host tests directly and container tests via docker exec, with `bench.sh all` for full runs and `bench.sh list` for discovery. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…utput
- bench.sh: SKIP_AUDIO=1 skips audio test in `bench.sh all`; all python3 invocations now use -u for unbuffered stdout/stderr so output streams live through `docker exec` without a TTY.
- ft8_test.py: add --passes arg (default 3). Each band runs N passes with a fresh random message per pass; all must decode for a band to pass. Fail-fast on first decode failure. Verdict shows pass counts (e.g. "20m(1/3)").
- wspr_test.py: expand from 40m-only to 80/40/30/20m matching the radiod [WSPR] config. Loop over bands with per-band message generation. Remove --freq arg (replaced by built-in bands list). SSRC derived automatically from dial frequency.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
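
The multi-pass logic in ft8_test.py might look roughly like the sketch below. All names are hypothetical, and `attempt` stands in for one transmit/capture/decode cycle. Whether the real script moves on to the next band after a failure is not stated; this sketch does.

```python
from typing import Callable, List, Tuple


def run_band(band: str, passes: int,
             attempt: Callable[[str, int], bool]) -> Tuple[bool, int]:
    """Run up to `passes` attempts on one band, stopping at the first failure."""
    succeeded = 0
    for index in range(passes):
        if not attempt(band, index):
            break  # fail-fast: skip the remaining passes on this band
        succeeded += 1
    return succeeded == passes, succeeded


def run_all(bands: List[str], passes: int,
            attempt: Callable[[str, int], bool]) -> Tuple[bool, str]:
    """Return (all_ok, summary) with per-band counts such as '20m(1/3)'."""
    parts = []
    all_ok = True
    for band in bands:
        ok, succeeded = run_band(band, passes, attempt)
        parts.append(f"{band}({succeeded}/{passes})")
        all_ok = all_ok and ok
    return all_ok, " ".join(parts)
```

Passing the decode cycle in as a callable keeps the counting logic testable without a radio attached.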

…00 samples pcmrecord occasionally emits slot captures with a few extra samples beyond 180000 (e.g. 180005 — ~0.4 ms of timing jitter). decode_ft8 (ft8_lib) crashes with exit 255 ("cannot load wave file") when the sample count exceeds 180000. This caused the FT8 bench test to fail nondeterministically on 20m while 80m/40m/30m passed — the 20m capture happened to land 5 samples over the limit. The signal was fine (+15.5 dB SNR, decoded after trim). Add "trim 0 15.0" to the sox normalization in decode_and_check(), capping output at exactly 180000 samples. No-op on files already at or under that count. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
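
The fix itself is the sox `trim 0 15.0` effect. Purely to illustrate the arithmetic (15.0 s at 12 kHz is 180000 samples), here is a standard-library analogue with a hypothetical function name. It handles plain PCM WAV only, since Python's wave module may not read WAVE_FORMAT_EXTENSIBLE files, so it is not a replacement for the sox step.

```python
import wave

FT8_SLOT_SECONDS = 15.0
RATE_HZ = 12_000
MAX_FRAMES = int(FT8_SLOT_SECONDS * RATE_HZ)  # 180000


def trim_wav(src: str, dst: str, max_frames: int = MAX_FRAMES) -> int:
    """Copy src to dst, keeping at most max_frames frames; return frames written."""
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames = reader.readframes(min(reader.getnframes(), max_frames))
    with wave.open(dst, "wb") as writer:
        writer.setparams(params)
        writer.writeframes(frames)  # header frame count is patched on close
    with wave.open(dst, "rb") as check:
        return check.getnframes()
```

As with the sox effect, files already at or under the limit pass through unchanged.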

claude added 4 commits June 9, 2026 01:04

ringof mentioned this pull request Jun 9, 2026

Local hardware-in-the-loop (HITL) test bench: closed-RF FT8/WSPR loopback drivable by a local /goal #171

Open

9 tasks

claude and others added 20 commits June 9, 2026 02:35

docs(#171): update HWIL plan — rungs 3–5 complete, WSPR awaiting hard…

28d7d88

…ware Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ringof wants to merge 24 commits into main from claude/171-local-hitl
