Skip to content

ci: run spec_tests with fake_crypto#983

Open
shane-moore wants to merge 1 commit into
sigp:unstablefrom
shane-moore:chore/fake-crypto-in-makefile
Open

ci: run spec_tests with fake_crypto#983
shane-moore wants to merge 1 commit into
sigp:unstablefrom
shane-moore:chore/fake-crypto-in-makefile

Conversation

@shane-moore
Copy link
Copy Markdown
Member

Problem, Evidence, and Context

  • PR test: proposer consensus-data spec coverage #942 added ConsensusDataProposerTest whose success-path assertions (block blinded state, block root, block SSZ roundtrip) only run when bls/fake_crypto is enabled. Without it, success fixtures fall through to a weaker BLS-rejection branch because Lighthouse validates BLS during SSZ decode while Go's fastssz does not.
  • Today nothing in the Makefile or CI runs spec_tests with fake_crypto, so the branch can stay green while bypassing the strongest assertions of the new proposer test.
  • Closes Add a Makefile target for spec tests with fake_crypto coverage #938.

Change Overview

Test-infra only, two additions:

  • Makefile: new test-spec-tests / nextest-spec-tests targets that run spec_tests with --features fake_crypto; test-full extended to include them. Mirrors Lighthouse's ef_tests pattern.
  • CI: new spec-tests-fake-crypto job in test-suite.yml that runs make nextest-spec-tests, gated by test-suite-success. Runs in parallel with the existing release-tests job.

What did not change: no production code; no edits to the BLS-fallback heuristic in consensus_data_proposer.rs or is_bls_validation_error. Both paths remain valuable — the workspace test run continues to exercise the realistic BLS-rejection branch, and the new job covers the full-coverage path.

Risks, Trade-offs, and Mitigations

  • Adds a CI job with ~1–2 min compile + run; runs in parallel so it does not extend the critical path materially.
  • fake_crypto cannot be enabled at the workspace level (the anchor root crate does not declare it), so the target is scoped to -p spec_tests.
  • Trade-off vs folding inline into release-tests-ubuntu: separate job preserves independent failure visibility and parallel execution; cost is ~30–60s of duplicate Rust setup, accepted.

Validation

  • make nextest-spec-tests: 1 test, 114 fixtures, 0.06s runtime (after ~1m48s cold compile). Success fixtures now exercise verify_block (block root + SSZ roundtrip) instead of falling through to the BLS-rejection branch.
  • cargo nextest run --release -p spec_tests (no fake_crypto): still passes via the existing BLS-fallback path — confirms the workspace test run is unaffected.
  • .github/scripts/check_success_job.sh ./.github/workflows/test-suite.yml test-suite-success reports COMPLETENESS CHECK PASSED; yq parses the workflow YAML cleanly.

Rollback

N/A — test-infrastructure-only change.

Blockers / Dependencies

None.

@shane-moore
Copy link
Copy Markdown
Member Author

@claude, please take a look

@claude-code-actions-sigp
Copy link
Copy Markdown

claude-code-actions-sigp Bot commented Apr 30, 2026

Claude finished @shane-moore's task in 56s —— View job


Review of PR #983

  • Gather context and read changed files
  • Review Makefile changes
  • Review CI workflow changes
  • Review commands.md changes
  • Post review feedback

Reviewed the three changed files against the existing CI structure. This is clean, well-scoped test-infra work. A few observations:

Makefile (Makefile:92-99)

The new test-spec-tests and nextest-spec-tests targets look correct. The --features fake_crypto flag is properly scoped to -p spec_tests, and the spec_tests/Cargo.toml confirms fake_crypto = ["bls/fake_crypto"] exists. The test-full target at line 137 correctly includes test-spec-tests.

No issues.

CI workflow (.github/workflows/test-suite.yml:104-123)

The spec-tests-fake-crypto job structure mirrors release-tests-ubuntu appropriately. A few notes:

  1. SELF_HOSTED_RUNNERS guard (line 114): The if: env.SELF_HOSTED_RUNNERS == 'false' condition is carried over from release-tests-ubuntu. Currently SELF_HOSTED_RUNNERS is hardcoded to false (line 23), so this is fine, but if self-hosted runners are ever re-enabled, this job would skip Rust setup without having an alternative — same latent issue as in the existing jobs. Not introduced by this PR, so not actionable here.

  2. test-suite-success dependency (line 250): spec-tests-fake-crypto is correctly added to the needs array of the test-suite-success gate job, and the PR description confirms check_success_job.sh passes. This ensures the new job is required for merge.

  3. No env.GITHUB_TOKEN at job level: The release-tests-ubuntu job doesn't set it at job level either (it relies on the workflow-level env), so this is consistent. The debug-tests-ubuntu job does set it redundantly at job level — a pre-existing inconsistency, not introduced here.

No issues.

commands.md (.claude/rules/commands.md:15)

The new entry for make test-spec-tests / make nextest-spec-tests is accurate and consistent with the existing documentation format.

No issues.

Summary

No bugs, security issues, or correctness problems found. The PR is well-scoped, adds the targets and CI job exactly as described, and the test-suite-success gate is properly updated. LGTM.


|

@diegomrsantos
Copy link
Copy Markdown
Member

Stepping back from the mechanics of this PR — I want to flag a pattern I'd like us to push back on as a team.

When we hit friction with an upstream dependency or spec, our default has been to add a workaround inside Anchor: a feature flag, a heuristic, a fallback branch, a bypass in CI. Each one is individually reasonable, but they accumulate. The result is that the real fix — upstream, or in the shared contract — loses urgency, and we carry the complexity indefinitely.

This PR is a concrete example. The underlying issue is ssv-spec#622: the Capella/Deneb fixtures use placeholder BLS bytes that aren't valid compressed points. Anchor's decoder is correct to reject them. `fake_crypto` lets us sidestep that, and the `is_bls_validation_error` string match keeps the non-`fake_crypto` path green. Both paths will stay green regardless of whether the upstream fixture ever gets fixed, and we now have a CI job committing us to maintaining the bypass.

I'd like us to try, before adding a workaround, to:

  1. Confirm there's an open upstream issue and push on it.
  2. Consider fixing the problem at the boundary (e.g. regenerating or patching fixtures locally with valid inputs) so production code paths stay honest.
  3. Treat the workaround as temporary with a clear removal condition, not as the new steady state.

For this specific PR, I'm not blocking — the diff is small and the intent is good. But I'd like to see us revisit ssv-spec#622 and set a bar for when this CI job comes back out, rather than treat it as permanent infrastructure.

@shane-moore
Copy link
Copy Markdown
Member Author

Wanted to follow up on this directly — opened ssv-spec#627 against spec-test-ci to fix the upstream root cause for #622. Capella/Deneb proposer fixtures get rewritten so every BLS-typed field holds a valid compressed point (derived from an existing test SK; non-BLS fields like KZG commitments/proofs are preserved byte-for-byte via a typed-walk discovery). A new TestProposerFixturesBLSDecodable in ssv-spec gates the regen so this can't regress upstream.

The removal condition for the fake_crypto bypass and the is_bls_validation_error string match is already tracked in #953: once #627 lands and the spec-tests JSONs are regenerated against it, the Anchor submodule bump should let real blst decode the Capella/Deneb proposer fixtures without either escape hatch. I've updated #953 to name PR #627 as the explicit trigger and listed the seven follow-up deletions, so the bypass has a concrete removal path rather than living as steady state.

@shane-moore shane-moore force-pushed the chore/fake-crypto-in-makefile branch from ebdf75c to 26c30e6 Compare May 11, 2026 15:41
@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.46%. Comparing base (5950c16) to head (26c30e6).

Additional details and impacted files
@@            Coverage Diff            @@
##           unstable     #983   +/-   ##
=========================================
  Coverage     61.46%   61.46%           
=========================================
  Files           156      156           
  Lines         25688    25688           
=========================================
  Hits          15790    15790           
  Misses         9898     9898           
Flag Coverage Δ
rust 61.46% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants