Reduce CI fuzz iterations as we're now timing out #3691
Conversation
I've assigned @joostjager as a reviewer!
ACK, seems we hit yet another failing case in the router target now: #3692
```diff
 else
-    HFUZZ_RUN_ARGS="$HFUZZ_RUN_ARGS -N1000000"
+    HFUZZ_RUN_ARGS="$HFUZZ_RUN_ARGS -N500000"
```
Any specific reasoning behind the 10x, 10x and 2x reductions?
No. I probably could have benchmarked, but in general there's not a lot of value in full_stack_target in CI because it's just too dense to make any real progress, and similar for chanmon_consistency_target.
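As a rough illustration of what such a benchmark could look like (a sketch; the `-N` budget and the target list here are arbitrary, not anything the PR actually ran):

```sh
# Rough local timing per target to sanity-check iteration budgets (illustrative).
cd fuzz
export HFUZZ_RUN_ARGS="-N100000"
for TARGET in full_stack_target chanmon_consistency_target router_target; do
    echo "=== $TARGET ==="
    time cargo hfuzz run "$TARGET"
done
```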
I checked out a fuzzer run on the attr failures PR and grepped the logs. Interestingly it seems like indeed…
Maybe I am not interpreting it correctly though. Also, from this log, you'd say that all the fast tests (<30 sec) don't need their iteration count changed.
Force-pushed from 0244aa0 to 9f68f1b.
Ah, thanks for doing that. Yea, I was going on the results when I fuzz with real corpuses, where…
Huh, interesting, I went ahead and slowed this one down though.
Yep!
Force-pushed from c87c3f3 to ad4fb1e.
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #3691      +/-   ##
==========================================
+ Coverage   89.05%   89.92%   +0.86%
==========================================
  Files         155      156       +1
  Lines      122019   129760    +7741
  Branches   122019   129760    +7741
==========================================
+ Hits       108666   116683    +8017
+ Misses      10695    10441     -254
+ Partials     2658     2636      -22
```

☔ View full report in Codecov by Sentry.
Force-pushed from b93b1d6 to 4b4bad8.
LGTM
Fuzz failure is #3708
Great that fuzzing again found an issue. This is awesome.
Before merging this, I think we still want to see a successful fuzz run well within the timeout?
Yea, happy to wait until we can at least do a fuzzing run and check that all the timings make sense.
Force-pushed from 4b4bad8 to 4a78951.
Force-pushed from 4a78951 to 9c78c60.
It passed, but taking five hours for CI is excessive, so I reduced it further.
Now finished in 273 min (~4.55 hours). Is this acceptable?
Force-pushed from 9c78c60 to e95f1d8.
Mmm, that's still pretty aggressive. Divided a few jobs by two so hopefully this time's the charm.
Okay, it's now "only" 2h, which I think is kinda reasonable, or at least it's in line with what some of our other jobs take, so happy to land this now.
Distribution looks much better now:
```diff
     - name: Sanity check fuzz targets on Rust ${{ env.TOOLCHAIN }}
       run: |
         cd fuzz
-        RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz" cargo test --verbose --color always
+        RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz" cargo test --verbose --color always --lib --bins
```
Why these two additions?
Because RUSTFLAGS aren't applied to doctests, so we have to restrict tests to only run library and binary tests.
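A hedged illustration of the difference (the cfg flags are the ones from the CI step above; the point that doctests don't see RUSTFLAGS is from the explanation above):

```sh
# Builds and runs unit tests, integration tests, AND doctests; doctests are
# compiled without RUSTFLAGS, so code gated on --cfg=fuzzing can fail there.
RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz" cargo test

# Restricting to library and binary targets skips doctests entirely.
RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz" cargo test --lib --bins
```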
Could add comments when making changes like this. Make life easier for future devs.
```diff
+mkdir -p hfuzz_workspace/full_stack_target/input
+pushd write-seeds
+RUSTFLAGS="$RUSTFLAGS --cfg=fuzzing" cargo run ../hfuzz_workspace/full_stack_target/input
```
I wanted to verify that using the hardcoded seed indeed increases coverage, but it seems that honggfuzz always reports `branch_coverage_percent:0` in the log.
Not sure why that is? Otherwise it would be easy to compare with and without this hardcoded seed.
Or maybe there is another way to verify that it indeed works?
The fact that the full_stack_target got much slower is pretty good evidence :)
Okay, indirect evidence. But why is honggfuzz reporting 0?
Is it not using coverage to direct fuzzing?
Yea, that I don't know; it certainly does locally, and there's nothing different locally vs in CI. Also, CI does turn up (fairly shallow) fuzz bugs as well, so I'd be surprised to learn it was able to do that with just general patterns and no instrumentation...
Does it show you branch_coverage_percent as non-zero locally?
I'm not 100% sure about branch coverage specifically, but what I get from a running instance is: `Coverage : edge: 16,485/568,547 [2%] pc: 738 cmp: 794,641`. I assume "cmp" is the branch coverage.
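One hedged way to answer the with/without-seeds question above: run the same iteration budget twice, log honggfuzz's output to a file (its `-l`/`--logfile` flag), and compare the final coverage lines from each run. A sketch, assuming the flag spellings and the `branch_coverage_percent` log field quoted earlier; paths and the `-N` budget are illustrative:

```sh
# Illustrative with/without-seeds comparison for full_stack_target.
cd fuzz

# Run with an empty corpus.
rm -rf hfuzz_workspace/full_stack_target/input
mkdir -p hfuzz_workspace/full_stack_target/input
HFUZZ_RUN_ARGS="-N200000 --logfile no_seeds.log" cargo hfuzz run full_stack_target

# Re-populate the hard-coded seed corpus and run the same budget again.
(cd write-seeds && RUSTFLAGS="$RUSTFLAGS --cfg=fuzzing" cargo run ../hfuzz_workspace/full_stack_target/input)
HFUZZ_RUN_ARGS="-N200000 --logfile with_seeds.log" cargo hfuzz run full_stack_target

# Compare the last coverage stats each run reported.
grep -o 'branch_coverage_percent:[0-9]*' no_seeds.log | tail -n1
grep -o 'branch_coverage_percent:[0-9]*' with_seeds.log | tail -n1
```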
Force-pushed from e95f1d8 to 11dde05.
Force-pushed from 11dde05 to c97c283.
When we made `test_node_counter_consistency` run more aggressively, our `process_network_graph` fuzzer got materially slower, resulting in consistent fuzz CI job timeouts. Thus, here, we tweak the iteration count on all our fuzz jobs to get them running in more consistent times. Further, we reduce `full_stack_target` iterations in anticipation of a later commit which will start using our hard-coded fuzz seeds, creating substantially more coverage and slowing down fuzzing iterations.
In 3145168 we disabled `test_node_counter_consistency` in debug builds since it can make things very slow, including the `lightning-rapid-gossip-sync` tests. We should, however, have kept it when fuzzing, since that gives us testing of potential coverage gaps in normal tests.
This should materially improve our fuzzing coverage in CI.
Force-pushed from c97c283 to 57c0d26.
Went ahead and squashed the addition of a comment in the fuzz script.
Great to have a more reasonable run time for the fuzzer!
Still interested to find out what's going on with that coverage indicator at zero in CI.
I think GitHub has slowed down the runners, so now our fuzz tests are timing out. Here we just reduce the iteration count a bit.