Doc validation harness: test the toggle + statusline instructions against the real binary #2
Open
Tombar wants to merge 18 commits into
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s assert) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ut never used)
Found by the doc-validation harness (test/docs/iterm.bats pyflakes check). The iTerm2 toggle registers its RPC off `connection`; `app` was dead code.
… distros
- tmux list-keys: accept C-M-t or M-C-t (older tmux reorders modifiers);
assert the stable toggle-path + #{pane_id} first.
- luacheck: --no-color so ANSI escapes don't split the '0 errors' substring.
- bats --print-output-on-failure for diagnosability.
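The two portability fixes above are easier to see with a small example. The following Python sketch is illustrative only: the sample strings are made up for the demonstration, not captured from tmux or luacheck, and the helper names (`key_variants`, `line_binds`, `strip_ansi`) are hypothetical. The PR itself fixes the luacheck case with `--no-color`; stripping escapes here only shows why the plain substring check failed.

```python
import re
from itertools import permutations

# Matches ANSI SGR colour/style sequences such as "\x1b[1m" or "\x1b[0m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text):
    """Remove ANSI SGR escape sequences from text."""
    return ANSI_SGR.sub("", text)


def key_variants(modifiers, key):
    """Return every modifier ordering for a key, e.g. {'C-M-t', 'M-C-t'}."""
    return {"-".join(order + (key,)) for order in permutations(modifiers)}


def line_binds(line, modifiers, key):
    """True if any whitespace-separated token on the line is an accepted key spelling."""
    accepted = key_variants(tuple(modifiers), key)
    return any(token in accepted for token in line.split())


# Hypothetical list-keys style line; the toggle path is a placeholder.
sample_binding = 'bind-key -T root M-C-t run-shell "/path/to/toggle #{pane_id}"'

# Hypothetical coloured summary line: the escapes sit between "0" and " errors".
coloured = "Total: \x1b[1m0\x1b[0m warnings / \x1b[1m0\x1b[0m errors in 1 file"

print(line_binds(sample_binding, ("C", "M"), "t"))
print("0 errors" in coloured, "0 errors" in strip_ansi(coloured))
```

Checking the stable parts of the binding first (the toggle path and `#{pane_id}`) and only then accepting either modifier order keeps the assertion strict about what matters and tolerant about what varies between tmux versions.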
…ngs REPORT
REPORT.md records per-claim PASS/STATIC/LIVE-MANUAL results and the one doc bug the harness caught and fixed (iterm.md unused async_get_app).
Summary
Adds a committed bats-core harness (`test/docs/`) that validates the copy-paste instructions in `docs/toggle/{wezterm,iterm,tmux}.md` and `docs/claude-statusline.md` by extracting the literal fenced code blocks from the docs and running them against the installed `failsafe` binary in a sandboxed `HOME`. Wired into a separate CI workflow so doc drift is caught on every push.

The harness already paid for itself: it caught a real bug in `docs/toggle/iterm.md` (an unused `app = await iterm2.async_get_app(connection)` line, flagged by pyflakes), fixed here.
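The extract-then-run idea in the summary can be sketched in a few lines. This is a minimal Python illustration, not the harness itself (which is bats and shell): `extract_block`, `run_in_sandbox`, and the sample document are hypothetical names and data, and the Python interpreter stands in for the `failsafe` binary. It assumes plain triple-backtick fences and treats any later heading as the end of the anchored section.

```python
import os
import subprocess
import sys
import tempfile

FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fence


def extract_block(markdown, lang, n=1, heading=None):
    """Return the body of the n-th (1-based) fenced block tagged `lang`.

    With `heading` set, only blocks between that heading and the next
    heading of any level are counted.
    """
    in_scope = heading is None
    in_block = False    # inside some fenced block
    capturing = False   # inside the block we were asked for
    seen = 0
    body = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if in_block:
            if stripped == FENCE:
                if capturing:
                    return "\n".join(body) + "\n"
                in_block = False
            elif capturing:
                body.append(line)  # keep the doc's text unmodified
            continue
        if heading is not None and stripped.startswith("#"):
            in_scope = stripped.lstrip("#").strip() == heading
            continue
        if stripped.startswith(FENCE):
            in_block = True
            if in_scope and stripped[3:].strip() == lang:
                seen += 1
                capturing = seen == n
    raise LookupError("block %d of %r not found" % (n, lang))


def run_in_sandbox(argv, script=None):
    """Run argv with HOME pointing at a throwaway directory, removed afterwards."""
    with tempfile.TemporaryDirectory() as home:
        env = dict(os.environ, HOME=home)
        return subprocess.run(argv, env=env, input=script, text=True,
                              capture_output=True, check=False)


# Hypothetical doc with two sections, each holding one fenced block.
DOC = "\n".join([
    "# Toggle", "",
    "## Install", "",
    FENCE + "sh", "echo install", FENCE, "",
    "## Keybinding", "",
    FENCE + "python", "import os", 'print(os.environ["HOME"])', FENCE, "",
])

snippet = extract_block(DOC, "python", 1, heading="Keybinding")
result = run_in_sandbox([sys.executable, "-"], script=snippet)
print("ran in sandbox:", result.stdout.strip() != os.environ.get("HOME"))
```

Because the test body is whatever the document currently contains, editing the snippet in the doc changes what gets executed, which is how drift shows up as a failing run rather than a stale copy.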
What's covered (34 headless tests, green in CI)
- read, rego matches "read", rw/ro alias normalization, tab-delimited `mode get`.
- 🔒 read/🔓 write glyphs, jq enrichment, graceful degrade without jq (proven: no cwd appended), single-line output.
- `#{pane_id}` == `$TMUX_PANE`, `C-M-t` registration, status-bar colors, no-script `failsafe toggle`.
- `toggle_mode` executed via a `wezterm` stub + driver and cross-checked against `failsafe mode get`; luacheck (runs in CI on lua5.4); sudo-timeout revert.
- `read_mode` exec'd from its AST, `py_compile` + `import iterm2`, pyflakes, no-python toggle.

How it works
- `lib/extract.sh` pulls the exact Nth fenced block of a language from a `.md` (heading-anchored), so tests run the doc's bytes, not copies; editing a snippet in a way that breaks its contract fails CI.
- `helpers.bash` sandboxes `HOME` per test (`mktemp -d`), so toggling never touches the real `~/.claude`; teardown only removes its own temp dirs.
- Tests whose dependencies are missing `skip` with a reason (never a vacuous pass); CI installs everything, so a skip in CI is itself a signal.

Not automated (honestly labeled)
GUI-only surfaces (WezTerm toasts/`format-tab-title`, iTerm2's Python-runtime keybinding) are STATIC / LIVE-MANUAL. A local-only `make validate-docs-live` launches real WezTerm (`wezterm show-keys`) to confirm the snippet loads and binds `Ctrl+Alt+t`. See `test/docs/REPORT.md` for the per-claim results.

Test Plan
- `make validate-docs`: 34/34 pass locally
- `go test ./...`: unaffected, all green
- `doc-validation` workflow green on this branch
- `make validate-docs-live`: real WezTerm config-load passes
- `ci.yml` untouched

🤖 Generated with Claude Code