-
Notifications
You must be signed in to change notification settings - Fork 945
Tracing tests, fixes, and optimizations #8223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracing tests, fixes, and optimizations #8223
Conversation
e60e873
to
ab4d1be
Compare
common/trace.c
Outdated
dev_trace_file, (unsigned)getpid()); | ||
trace_to_file = fopen(fname, "a+"); | ||
if (!trace_to_file) | ||
err(1, "Opening DEV_CLN_TRACE_FILE %s", fname); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bike shedding as usual, but should we adopt CLN_
as a common prefix for our envvars going forward? It's a minor thing, but could simplify things a bit going forward.
Great addition of a trace file by the way, very useful for simple setups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, trace file was your idea! And I agree with renaming env vars.
With an average runtime of 18.7674, this implies 1876ns per trace, which is far in excess of the 370ns claimed in doc/developers-guide/tracing-cln-performance.md. We also add a tag in there, so we measure that! Results on my laptop: real 0m18.524000-19.100000(18.7674+/-0.21)s user 0m16.171000-16.833000(16.424+/-0.26)s sys 0m2.259000-2.400000(2.337+/-0.059)s Signed-off-by: Rusty Russell <[email protected]>
I added this debugging because the next test revealed a mismatch, so I wanted to see where it was happening. The comment in lightningd suggests it's possible, but I can't see any code which suspends in the lightningd io_loop, so I cannot see how this is triggered. Signed-off-by: Rusty Russell <[email protected]>
Suggested-by: Christian Decker Signed-off-by: Rusty Russell <[email protected]>
Testing parenting handling revealed several issues: 1. By calling "trace_span_start" when CLN_TRACEPARENT is set produces a bogus entry, for which the span_id is overwritten so we never end it. 2. We don't need to close the remote parent when we close the first child: in fact, this causes the remaining traces to be detached from the parent! 3. Suspension should return current to the parent, not to NULL. Now the traces balance as we expect. Signed-off-by: Rusty Russell <[email protected]>
1. trace_span_start() is always called with a string literal, so no copy needed (and we can use a macro to enforce this). 2. trace_span_tag() name and value are always longer-lived than the span, so no need to copy these either. Before: real 0m18.524000-19.100000(18.7674+/-0.21)s user 0m16.171000-16.833000(16.424+/-0.26)s sys 0m2.259000-2.400000(2.337+/-0.059)s After: real 0m16.421000-18.407000(17.8128+/-0.72)s user 0m14.242000-16.041000(15.5382+/-0.67)s sys 0m2.179000-2.363000(2.273+/-0.061)s Signed-off-by: Rusty Russell <[email protected]>
Avoids allocations. Also assume that name and value parameters outlive the trace span, so don't copy. Before: real 0m16.421000-18.407000(17.8128+/-0.72)s user 0m14.242000-16.041000(15.5382+/-0.67)s sys 0m2.179000-2.363000(2.273+/-0.061)s After: real 0m13.441000-14.592000(14.2686+/-0.43)s user 0m11.265000-12.289000(11.9626+/-0.37)s sys 0m2.175000-2.381000(2.3048+/-0.072)s Signed-off-by: Rusty Russell <[email protected]>
There's an EBPF limit anyway, so stick with a 512-byte buffer. This brings us back to 621ns per trace: Before: real 0m13.441000-14.592000(14.2686+/-0.43)s user 0m11.265000-12.289000(11.9626+/-0.37)s sys 0m2.175000-2.381000(2.3048+/-0.072)s After: real 0m5.819000-6.472000(6.2064+/-0.26)s user 0m3.779000-4.101000(3.956+/-0.12)s sys 0m2.040000-2.431000(2.2496+/-0.15)s Signed-off-by: Rusty Russell <[email protected]>
This is much faster to give 64 bits of data, and we don't need cryptographic randomness. This brings us back to 413ns per trace. Before: real 0m5.819000-6.472000(6.2064+/-0.26)s user 0m3.779000-4.101000(3.956+/-0.12)s sys 0m2.040000-2.431000(2.2496+/-0.15)s After: real 0m3.981000-4.247000(4.1276+/-0.11)s user 0m3.979000-4.245000(4.126+/-0.11)s sys 0m0.000000-0.002000(0.001+/-0.00063)s Signed-off-by: Rusty Russell <[email protected]> Changelog-Fixed: lightingd: trimmed overhead of tracing infrastructure.
We don't ever actually close the remote span (we don't have its key, after all), and we keep a pointer to the parent. Signed-off-by: Rusty Russell <[email protected]>
This doesn't happen very often, but can with autoclean. However, we rarely traverse to the end, since we always expect to find what we're looking for, and we fill from the front. So even a large array (unless it's used) is fine. Subtle: when we injected a parent, we used "active_spans" as the (arbitrary) key. That can now change with reallocation, and so if that memory were reused we could have a key clash. So we use "&active_spans" which doesn't change. Signed-off-by: Rusty Russell <[email protected]>
ab4d1be
to
68da93d
Compare
Trivial rebase, and replaced DEV_CLN_TRACE_FILE with CLN_DEV_TRACE_FILE as per @cdecker suggestion. |
Recommended by @cdecker , tests were a great idea!