docs: local hardware-in-the-loop FT8/WSPR bench plan (#171) #172
Draft
ringof wants to merge 24 commits into
Plan 1: a local, closed-RF test bench (QDX -> attenuators -> RX888 -> FX3 firmware -> radiod -> decoder) drivable by a local Claude Code /goal. Two phases sharing one bench: FT8 (ft8_lib) as the fast iteration loop, WSPR via real wsprdaemon as the deployment-representative sign-off. Documents the operator interface contract (RF safety / attenuation power rating, QDX/RX888 config, pass criteria), unattended-loop hygiene, and closed-system etiquette (no public spots). Remote self-hosted-runner CI and its security model are explicitly deferred to a separate plan (Plan 2). Docs-only; no code, build, or behavior changes. https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT
Add the staged goal ladder (G0 software encode/decode + fixtures, then rungs 1–7) with a hardware-dependency map; record decisions: HITL image built FROM the ka9q-radio image; rung 2 split into 2a (CAT) / 2b (soundcard); rung 3 CW carrier ~10 MHz with numeric PASS (peak within 300 Hz, >=20 dB over noise); red DANGER!! pre-gate banner on every rung that keys the QDX into the RX888; and a generated known-content audio fixtures section (reference-tool authored, encoder doubles as TX stimulus, same-tool self-loopback caveat). First implementation step is now G0. Docs-only; no code, build, or behavior changes. https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT
First implementation rung (G0) of the local HITL bench. Hardware-free
audio encode->decode self-tests that print one grep-stable verdict line
and exit 0/1 (the contract a local /goal reads):
- run_g0_ft8.sh : ft8_lib gen_ft8 -> wav -> decode_ft8
- run_g0_wspr.sh: wspr-cui wsprsimwav -> wav -> (sox 12k) -> wsprd
- gen_ft8_wav.sh / gen_wspr_wav.sh: known-content audio generators
(also the TX stimulus source for the on-bench rungs)
Real tools own protocol encode/decode; sox only converts sample rate
(48k<->12k). External sources are cloned+built under .build/ (gitignored),
pinned: ft8_lib@9fec6ca, wspr-cui@839b86f (mirrors the Dockerfile SHA-pin
convention). Each rung decodes an authoritative fixture if present, else a
self-loop. Committed FT8 fixture (12 kHz, 352K); WSPR runs self-loop
(audio generated on the fly) to avoid a multi-MB blob.
Host deps: gcc gfortran libfftw3-dev sox libgfortran5 libfftw3-single3.
No FX3 SDK or hardware. Validated: both rungs OK/exit 0; wrong .expected
-> FAIL/exit 1.
https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT
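The verdict contract above (one grep-stable line, exit 0/1) can be sketched as follows. This is a minimal illustration, not the code in run_g0_ft8.sh: the `VERDICT` prefix, the function name, and the shape of the decoder output line are assumptions; only the OK/FAIL and exit 0/1 behavior comes from the commit message.

```python
def verdict_line(rung: str, decoded_lines: list[str], expected: str) -> tuple[str, int]:
    """Return (verdict line, exit code) for one encode->decode self-test.

    The line has a fixed prefix and no timestamps, so `grep '^VERDICT'`
    always finds exactly one stable line per run (hypothetical format).
    """
    ok = any(expected in line for line in decoded_lines)
    status = "OK" if ok else "FAIL"
    return f"VERDICT {rung} {status} expected={expected!r}", 0 if ok else 1


# Hypothetical decoder output; T1ABC/FN20 is the test callsign/grid from the plan.
decoder_output = ["000000 -1 0.0 1500 ~ CQ T1ABC FN20"]
line, exit_code = verdict_line("G0_FT8", decoder_output, "CQ T1ABC FN20")
print(line)
# A real wrapper would finish with sys.exit(exit_code).
```

Returning the exit code instead of calling `sys.exit()` inside the helper keeps the check importable from other rungs.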
Each run now writes its self-loop audio to a visible out/ dir (gitignored) and prints the path + a copy-pastable decode command, so the operator has a real audio file to verify with their own tools (jt9 / wsprd) instead of trusting the harness verdict. WSPR previously surfaced no file at all.
- out/g0_ft8_selfloop.wav (12 kHz)
- out/g0_wspr_selfloop_48k.wav (48 kHz, QDX TX rate)
- out/g0_wspr_selfloop_12k.wav (12 kHz, wsprd input)
Honors the earlier "FT8 fixture only" choice: out/ is gitignored, no multi-MB blob committed.
https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT
Plan drifted from what G0 became; bring it in line:
- Status: approved/in-progress, G0 implemented (PR #172).
- Name the real tooling: gen_ft8 (ft8_lib), wsprsimwav (wspr-cui) for WSPR audio, sox for rate conversion only (not synthesis).
- Rewrite the fixtures/independence section to the operator-as-verifier model: encoder independence waived; each run emits audio to out/ for the operator to decode with their own jt9/wsprd. Document committed-FT8-fixture vs on-the-fly-WSPR.
- Next step -> rung 1; mark test callsign/grid resolved (T1ABC/FN20).
- tests/bench/README: add a copy-paste Prerequisites apt line so a clean checkout reproduces the run for independent verification.
Docs-only.
https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT
First QDX-touching rung in the HWIL ladder. Proves the bench host can set frequency, key/unkey PTT, and read CAT responses from the QDX over USB CDC serial. The reusable qdx_cat.py module is used by all later QDX-touching rungs (2b, 3, 5, 7).
New files:
- tests/bench/qdx_cat.py: QdxCat class (context manager, safety RX on exit)
- tests/bench/rung2a_cat_test.py: 6-check test (ID, freq, PTT, IF)
- tests/bench/run_rung2a.sh: shell wrapper with preflight checks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
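The "context manager, safety RX on exit" idea can be illustrated with a small sketch. It is not the actual QdxCat class: `CatSession` and `FakePort` are hypothetical names, and the fake port stands in for a real serial port so the example runs without hardware.

```python
class FakePort:
    """Stand-in for a serial port; records every byte string written."""

    def __init__(self) -> None:
        self.written: list[bytes] = []

    def write(self, data: bytes) -> int:
        self.written.append(data)
        return len(data)

    def close(self) -> None:
        pass


class CatSession:
    """Hypothetical CAT session that always returns the rig to receive on exit."""

    def __init__(self, port) -> None:
        self.port = port

    def __enter__(self) -> "CatSession":
        return self

    def tx_on(self) -> None:
        self.port.write(b"TX;")

    def __exit__(self, exc_type, exc, tb) -> bool:
        try:
            self.port.write(b"RX;")  # safety: unkey even if the test body raised
        finally:
            self.port.close()
        return False  # never swallow the original exception


port = FakePort()
try:
    with CatSession(port) as cat:
        cat.tx_on()
        raise RuntimeError("simulated failure while keyed")
except RuntimeError:
    pass
print(port.written)  # [b'TX;', b'RX;']
```

The point of the pattern is that a failing check cannot leave the transmitter keyed into the receiver.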
Kenwood CAT set commands (FA<freq>;, TX;, RX;) do not produce a serial response — only query commands do. Added send_set() for these and switched set_freq, tx_on, tx_off to use it. Validated against real QDX hardware (fw 1_09): all 6 rung 2a checks pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
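A minimal sketch of the set-versus-query split described above. The helper names are hypothetical and are not taken from qdx_cat.py; the 11-digit zero-padded frequency field is an assumption based on Kenwood TS-480-style CAT framing and should be checked against the QDX manual.

```python
# Bare mnemonics that are still "set" commands and therefore stay silent.
SET_ONLY = {b"TX;", b"RX;"}


def fa_set_command(freq_hz: int) -> bytes:
    """Build a VFO A set command, e.g. FA00014095600; (assumed 11-digit field)."""
    if not 0 <= freq_hz < 10**11:
        raise ValueError(f"frequency out of range: {freq_hz}")
    return f"FA{freq_hz:011d};".encode("ascii")


def parse_fa_reply(reply: bytes) -> int:
    """Parse the reply to the FA; query and return the frequency in Hz."""
    text = reply.decode("ascii").strip()
    if not (text.startswith("FA") and text.endswith(";")):
        raise ValueError(f"not an FA reply: {reply!r}")
    return int(text[2:-1])


def expects_reply(command: bytes) -> bool:
    """True for bare queries such as FA; or IF;, False for set commands."""
    is_bare = len(command) == 3 and command.endswith(b";")
    return is_bare and command not in SET_ONLY


print(fa_set_command(14_095_600))         # b'FA00014095600;'
print(parse_fa_reply(b"FA00014095600;"))  # 14095600
print(expects_reply(b"FA;"), expects_reply(b"TX;"))  # True False
```

A caller would only block on a serial read when `expects_reply()` is true, which avoids the timeout a set command would otherwise cause.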
Proves the bench host can play audio into and capture audio from the QDX over its USB Audio Class device. Validates device discovery, capture, playback, and PTT + playback integration, the audio plumbing reused by every TX rung (3, 5, 7).
New files:
- qdx_audio.py: reusable audio discovery + playback helpers
- rung2b_audio_test.py: test sequence (4 checks)
- run_rung2b.sh: shell wrapper with preflight checks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
qdx_cat.py:
- Fix __exit__ crash: write RX; directly to port, bypassing send_set() whose reset_input_buffer() throws I/O errors on a wedged port.
qdx_audio.py:
- generate_tone: produce S24 stereo 48 kHz (QDX native format) instead of S16 mono that hw: rejects.
- play_to_qdx: use hw: directly instead of plughw: so format mismatches fail loud.
rung2a_cat_test.py:
- Lead with FA; (proven on hardware) instead of gating on ID;.
- ID/VN are informational, not gating.
rung2b_audio_test.py:
- Capture check verifies actual audio energy, not just file size > 0.
- Honest verdict: explicitly states RF output is NOT verified by this test and requires manual confirmation with a separate receiver.
All checks validated on real QDX hardware with 12V DC supply. RF output confirmed by operator using independent HF receiver.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
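The two audio changes above (an S24 stereo 48 kHz tone, and a capture check based on energy rather than file size) can be sketched in pure Python. The function names are hypothetical, and packed 3-byte little-endian samples are an assumption about what "S24" means for this device.

```python
import math

RATE_HZ = 48_000
FULL_SCALE = 2**23 - 1  # largest positive 24-bit sample


def generate_tone_s24_stereo(freq_hz: float, seconds: float, amplitude: float = 0.5) -> bytes:
    """Return interleaved stereo PCM, 24-bit packed little-endian (assumed layout)."""
    frames = bytearray()
    for n in range(int(RATE_HZ * seconds)):
        value = int(amplitude * FULL_SCALE * math.sin(2 * math.pi * freq_hz * n / RATE_HZ))
        packed = value.to_bytes(3, "little", signed=True)
        frames += packed + packed  # same sample on left and right
    return bytes(frames)


def rms_dbfs(pcm: bytes) -> float:
    """RMS level of packed 24-bit PCM in dB relative to full scale."""
    samples = [int.from_bytes(pcm[i:i + 3], "little", signed=True)
               for i in range(0, len(pcm) - 2, 3)]
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square / FULL_SCALE**2)


tone = generate_tone_s24_stereo(1500.0, 0.5)
silence = bytes(len(tone))
print(len(tone), rms_dbfs(tone) > -20.0, rms_dbfs(silence) > -20.0)
```

A non-empty file of silence passes a size check but fails the energy check, which is the failure mode the commit describes.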
- bench_rf_test.py: new manual RF verification tool (--port, --freq, --tone, --duration). Reads current QDX freq by default.
- rung2a_cat_test.py: env vars → argparse. Wrap ID/VN and IF queries in try/except so they're non-fatal.
- rung2b_audio_test.py: env vars → argparse. Print dial freq and expected carrier before PTT test.
- README.md: add rungs 2a/2b, bench_rf_test, helper modules, usage.
- local-hwil-plan.md: update status through rung 3 (confirmed manually).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…to image
Move the Docker build context from docker/ka9q-radio/ to the project root so the Dockerfile can COPY tests/bench/ tools directly. This makes the container self-contained for rung 2 testing and future CI: no volume mounts needed for the QDX bench scripts.
- Prefix existing COPY paths (patches/, rx888-test.conf, entrypoint.sh) with docker/ka9q-radio/ for the new context
- Add python3, python3-serial, sox, alsa-utils to runtime image
- COPY 5 bench tools into /usr/local/lib/bench/
- Add ka9q.sh build subcommand with the new -f Dockerfile syntax
- Add conditional /dev/ttyACM0 passthrough in ka9q.sh start
- Update build command in 8 docs/scripts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
QDX transmits a known tone (1500 Hz at 14,095,600 Hz dial), `powers` captures the spectrum via ka9q-radio/RX888, and the script validates that the tone appears at the expected carrier (14,097,100 Hz) above the noise floor. First fully closed-loop automated RF test in the bench ladder. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The powers output format is a single line per snapshot: timestamp, low_freq, high_freq, binwidth, num_bins, p0, p1, ... not one freq,power pair per line. Fixed the parser to compute bin frequencies from low_freq + i * binwidth. Also added a one-shot retry when the first powers invocation returns no parseable bins — common on fresh container start. Verified on real hardware via docker exec (43.1 dB margin, peak dead on 14,097,100 Hz). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
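The corrected parsing rule (bin frequency = low_freq + i * binwidth) can be sketched as below. This is not the bench script's parser: comma-separated fields, power values in dB, and the median as a noise-floor estimate are all assumptions, and the sample line is synthetic.

```python
import statistics


def parse_powers_line(line: str) -> list[tuple[float, float]]:
    """Return [(bin_frequency_hz, power_db), ...] for one snapshot, or [] if unparseable.

    Assumed field order: timestamp, low_freq, high_freq, binwidth, num_bins, p0, p1, ...
    """
    fields = [f.strip() for f in line.strip().split(",")]
    try:
        low_hz = float(fields[1])
        bin_hz = float(fields[3])
        count = int(fields[4])
        powers = [float(p) for p in fields[5:5 + count]]
    except (IndexError, ValueError):
        return []
    return [(low_hz + i * bin_hz, p) for i, p in enumerate(powers)]


def peak_and_margin(bins: list[tuple[float, float]]) -> tuple[float, float]:
    """Return (peak frequency, dB margin of the peak over the median bin)."""
    peak_hz, peak_db = max(bins, key=lambda b: b[1])
    noise_db = statistics.median(p for _, p in bins)
    return peak_hz, peak_db - noise_db


DIAL_HZ, TONE_HZ = 14_095_600, 1_500
expected_hz = DIAL_HZ + TONE_HZ  # USB: carrier sits at dial + audio tone

# Synthetic snapshot: 21 bins of 100 Hz, one strong bin at the expected carrier.
levels = [-100.0] * 21
levels[10] = -57.0
line = "t0, 14096100, 14098100, 100, 21, " + ", ".join(str(p) for p in levels)

bins = parse_powers_line(line)
peak_hz, margin_db = peak_and_margin(bins)
print(peak_hz == expected_hz, margin_db)  # True 43.0
```

Returning an empty list for an unparseable line gives the caller a simple condition for the one-shot retry mentioned above.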
Routine pin bump — no patches, no config changes. Key upstream deltas since the previous pin (87567fa): USB watchdog resets only after a successful transfer, rx888 globals moved into struct sdrstate, isfinite() float-exception guard, TESTFX3 query failure now non-fatal, -march=native off by default. Also fixes stale "active patch 04" wording in README (patch 04 was upstreamed at 87567fa). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ple rate Add [FT8] and [WSPR] channel sections to rx888-test.conf covering the four QDX bands (80/40/30/20m) at standard dial frequencies. Data groups ft8-pcm.local and wspr-pcm.local give bench scripts a capture target for decode rungs 4-7. Replace FFTW wisdom generation (impractical in containers) with a runtime ADC_SAMPRATE env var (default 64m8, optional 129m6 for full-rate). Entrypoint sed-substitutes the sample rate into the config at startup. Existing smoke test (ka9q_smoke.sh) and unit test (ka9q_test.sh) are unaffected — they use ad-hoc powers queries via SSRC 30303, independent of named channel sections. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pcmrecord is ka9q-radio's native RTP-to-WAV recorder with built-in FT8 (-8) and WSPR (-w) slot alignment. Needed for rung 4+ bench tests to capture demodulated audio from the ft8-pcm/wspr-pcm channel groups. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add FT8 roundtrip test: QDX transmits a randomly-generated FT8 message (unique callsign + grid per run) on each of four bands, RX888/radiod demodulates, pcmrecord captures slot-aligned WAV, decode_ft8 asserts the message decodes. Verified on live hardware: all 4 bands pass.
Dockerfile: build ft8_lib (gen_ft8 + decode_ft8) pinned to 9fec6ca, copy binaries + rung4 script into runtime image.
Key fixes found during bring-up:
- pcmrecord writes WAVE_FORMAT_EXTENSIBLE WAV that ft8_lib can't parse; normalize through sox before decode_ft8.
- SIGTERM on pcmrecord leaves a short partial for the next slot; try all captured WAVs instead of just the last one.
- Clean capture directory per band to avoid stale files from prior runs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…script Build wsprsimwav + wsprd (jj1bdx/wspr-cui, pinned 839b86f) in the Docker image alongside ft8_lib. Add gfortran to builder, libgfortran5 to runtime. New wspr_roundtrip_test.py orchestrates a single-band (40m, 7.0386 MHz) WSPR encode→TX→capture→decode loop using pcmrecord -w (120s slot-aligned captures) and wsprd. Reports SNR so the operator can calibrate attenuation to the -10 to -15 dB target range. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ware Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs found during first hardware validation of the WSPR roundtrip:
1. SSRC mismatch: radiod rounds freq-in-kHz (7038.6 → 7039) but the script truncated (7038600 // 1000 = 7038). pcmrecord found no matching stream → no captures. Fix: use round() instead of //.
2. TX audio too quiet: wsprsimwav's -6 dB output was below the QDX's modulation threshold, so there was no RF despite confirmed PTT. Fix: normalize via sox gain -n during the mono→stereo conversion. Exposed as --drive (default -1 dB) so the operator can tune the level.
Hardware-validated: WSPR roundtrip decode confirmed at SNR +29 dB on 40m (7.038600 MHz) with QDX → attenuator → RX888 → radiod → wsprd.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
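The SSRC bug comes down to truncation versus rounding of the dial frequency in kHz. A small sketch, with hypothetical function names:

```python
def ssrc_truncated(dial_hz: int) -> int:
    """Buggy derivation: floor division drops the fractional kHz."""
    return dial_hz // 1000


def ssrc_rounded(dial_hz: int) -> int:
    """Fixed derivation: round to the nearest kHz, as radiod is reported to do."""
    return round(dial_hz / 1000)


for dial_hz in (7_038_600, 14_095_600):
    print(dial_hz, ssrc_truncated(dial_hz), ssrc_rounded(dial_hz))
# 7038600 7038 7039
# 14095600 14095 14096
```

One caveat: Python's `round()` rounds exact .5 ties to the nearest even integer. That does not affect the dial frequencies shown here, but how radiod treats an exact half-kHz tie is not stated in this PR, so a dial frequency ending in exactly 500 Hz would need checking.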
Set AD8370 VGA gain to 0 dB (was 10) and PE4304 attenuator to 31.5 dB (was 0) to reduce signal level for the QDX→RX888 bench loopback. WSPR roundtrip SNR dropped from +29 to +6 dB; an additional inline 20 dB pad is needed to reach the -10 to -15 dB target. Update HWIL plan to reflect WSPR hardware validation complete. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
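The pad arithmetic above can be written out as a quick check. This sketch assumes an inline pad lowers the reported SNR dB-for-dB (signal attenuated, receiver noise floor unchanged); that is a simplification, and the constant and function names are hypothetical.

```python
TARGET_SNR_DB = (-15.0, -10.0)  # target range from the commit message


def snr_after_pad(snr_db: float, pad_db: float) -> float:
    """Predicted SNR after adding an inline attenuator (assumed dB-for-dB)."""
    return snr_db - pad_db


def in_target(snr_db: float, target: tuple[float, float] = TARGET_SNR_DB) -> bool:
    low, high = target
    return low <= snr_db <= high


measured_db = 6.0  # SNR after the VGA/attenuator change
predicted_db = snr_after_pad(measured_db, 20.0)
print(predicted_db, in_target(predicted_db))  # -14.0 True
```

The prediction is only a starting point; the WSPR test's reported SNR remains the measurement to calibrate against.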
Drop rung2a_/rung2b_/rung3_/rung4_/wspr_roundtrip_ prefixes from bench test scripts in favor of descriptive names (cat_test, audio_test, loopback_test, ft8_test, wspr_test, rf_test). Update all verdict strings, log prefixes, temp dirs, output filenames, docstrings, Dockerfile COPY lines, and docs to match. Add bench.sh — a unified dispatcher that runs host tests directly and container tests via docker exec, with `bench.sh all` for full runs and `bench.sh list` for discovery. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…utput
- bench.sh: SKIP_AUDIO=1 skips audio test in `bench.sh all`; all python3 invocations now use -u for unbuffered stdout/stderr so output streams live through `docker exec` without a TTY.
- ft8_test.py: add --passes arg (default 3). Each band runs N passes with a fresh random message per pass; all must decode for a band to pass. Fail-fast on first decode failure. Verdict shows pass counts (e.g. "20m(1/3)").
- wspr_test.py: expand from 40m-only to 80/40/30/20m matching the radiod [WSPR] config. Loop over bands with per-band message generation. Remove --freq arg (replaced by built-in bands list). SSRC derived automatically from dial frequency.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
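The multi-pass, fail-fast behavior and the "20m(1/3)" style of pass count can be sketched like this. It is an illustration rather than the code in ft8_test.py: the function names, the OK/FAIL prefix, and the choice to stop the whole run at the first failed band are assumptions.

```python
from typing import Callable


def run_band(band: str, passes: int, attempt: Callable[[str, int], bool]) -> tuple[bool, str]:
    """Run up to `passes` attempts on one band, stopping at the first decode failure."""
    decoded = 0
    for index in range(passes):
        if not attempt(band, index):
            break  # fail-fast: no point keying the transmitter again
        decoded += 1
    return decoded == passes, f"{band}({decoded}/{passes})"


def run_all(bands: list[str], passes: int, attempt: Callable[[str, int], bool]) -> tuple[bool, str]:
    """Run every band in order and build a one-line summary."""
    parts = []
    ok = True
    for band in bands:
        band_ok, text = run_band(band, passes, attempt)
        parts.append(text)
        if not band_ok:
            ok = False
            break  # assumption: a failed band ends the run
    return ok, ("OK " if ok else "FAIL ") + " ".join(parts)


def fake_attempt(band: str, index: int) -> bool:
    """Stand-in for encode -> TX -> capture -> decode; fails on 20m, second pass."""
    return not (band == "20m" and index == 1)


ok, verdict = run_all(["80m", "40m", "30m", "20m"], 3, fake_attempt)
print(verdict)  # FAIL 80m(3/3) 40m(3/3) 30m(3/3) 20m(1/3)
```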
…00 samples
pcmrecord occasionally emits slot captures with a few extra samples
beyond 180000 (e.g. 180005 — ~0.4 ms of timing jitter). decode_ft8
(ft8_lib) crashes with exit 255 ("cannot load wave file") when the
sample count exceeds 180000.
This caused the FT8 bench test to fail nondeterministically on 20m
while 80m/40m/30m passed — the 20m capture happened to land 5 samples
over the limit. The signal was fine (+15.5 dB SNR, decoded after trim).
Add "trim 0 15.0" to the sox normalization in decode_and_check(),
capping output at exactly 180000 samples. No-op on files already at
or under that count.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
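The numbers in this fix follow from the slot length: 15.0 s at 12 kHz is exactly 180000 samples, and 5 extra samples is about 0.4 ms. A sketch of that arithmetic, plus a sox argument list of the kind described: only `trim 0 15.0` comes from the commit message, while the 12 kHz capture rate is inferred from the sample count and the 16-bit signed output encoding is an assumption.

```python
FT8_RATE_HZ = 12_000
FT8_SLOT_S = 15.0
MAX_SAMPLES = int(FT8_RATE_HZ * FT8_SLOT_S)  # 180000


def excess_ms(n_samples: int) -> float:
    """Milliseconds of audio beyond one full slot (0.0 if at or under the cap)."""
    return max(0, n_samples - MAX_SAMPLES) / FT8_RATE_HZ * 1000


def sox_normalize_cmd(src: str, dst: str) -> list[str]:
    """Argument list for a sox call that rewrites the WAV and caps it at 15.0 s.

    The output-encoding flags are an assumption; only the trim effect is
    taken from the commit message.
    """
    return ["sox", src, "-e", "signed-integer", "-b", "16", dst, "trim", "0", "15.0"]


print(MAX_SAMPLES)                    # 180000
print(round(excess_ms(180_005), 2))   # 0.42
print(sox_normalize_cmd("slot.wav", "slot_norm.wav"))
```

The list form is meant to be passed to `subprocess.run(..., check=True)`, so no shell quoting is involved.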
Adds `docs/local-hwil-plan.md`: Plan 1 for a local, closed-RF hardware-in-the-loop test bench, drivable by a local Claude Code `/goal`.
Closes #171 (plan deliverable; implementation tracked there).
What this is
A plan (docs-only, no code) for a bench that exercises the full real signal
path of this firmware end to end: QDX -> attenuators -> RX888 -> FX3 firmware -> radiod -> decoder.
The orchestrator emits one grep-stable verdict line, which is what makes a
`/goal` honest: the goal evaluator judges only what's in the transcript, and that line is the output of a real RF decode.
Two phases, one shared bench
- FT8 (`ft8_lib` encode+decode, ~15–30 s/run): fast iteration gate.
- WSPR via real `wsprdaemon` → `wsprd` → spot pipeline: deployment-representative sign-off.
Only the TX mode and decoder tail differ between phases; the orchestrator is
parameterized by `MODE={ft8,wspr}`. Builds on the existing `docker/ka9q-radio/` harness (already builds `radiod` + `rx888.so`, flashes `SDDC_FX3.img`).
Explicitly deferred to Plan 2
Remote self-hosted GitHub Actions runner, Actions/cron wiring, and the runner
security model (untrusted-fork code on a machine wired to a transmitter).
That surface is large enough to warrant its own plan.
Scope of this PR
Documentation only: `docs/local-hwil-plan.md`. No source/build/config changes;
firmware build and host tests are unaffected (`build.yml` paths-ignores `docs/**`). Implementation lands in follow-up PRs against #171.
Open operator questions to resolve before implementation: initial FT8 band,
reserved test callsign/grid, availability of a USB-switchable hub for
self-reset, and single- vs sibling-container split.
https://claude.ai/code/session_014Q2buDF3FceHaXdJBovDCT
Generated by Claude Code