Bug 1953459 - Add profiler markers along the native HTTPS RR resolution path #4
Open
mxinden-bot wants to merge 3 commits into
…on path

On https://happy-eyeballs.net/tests/http3-availability/ Firefox does not connect via HTTP/3 even though the HTTPS RR advertising h3 arrives on the wire within ~10ms. The suspicion is that internal processing of the HTTPS record exceeds the happy eyeballs resolution delay, so the TCP attempt starts first and HTTP/2 wins over HTTP/3.

The existing happy eyeballs glue markers only show the full DNS round trip as one interval. Add markers that break that interval into its internal stages so the slow stage can be identified in a profile:

- HTTPSRR AsyncResolve (socket thread): the happy eyeballs glue issued, skipped, or failed to issue the nsIDNSService query.
- HTTPSRR ResolveHost (caller thread): how nsHostResolver handled the request (cache hit, negative cache hit, joined in-progress lookup, lookup started, failed to start).
- HTTPSRR queue (DNS resolver thread): time the record spent in the pending queue waiting for a resolver thread.
- HTTPSRR native resolve (DNS resolver thread): the full native resolution including response parsing.
- HTTPSRR OS query (DNS resolver thread): the raw platform DNS call (res_nquery / DnsQuery_A / DNSServiceQueryRecord / android_res_nquery).
- HTTPSRR notify listener (DNS resolver thread): the result is handed to the DNS listener.
- DNS OnLookupComplete dispatch (listener target thread): latency of dispatching the result runnable to the listener's target thread, for happy eyeballs the socket thread.

All markers carry the host name and a detail string so they can be correlated with the Happy Eyeballs markers on the timeline.
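How such per-stage intervals point at the slow stage can be sketched as follows. This is a minimal model and not profiler code: the function name, the tuple layout and the sample timestamps are assumptions made up for illustration, and only the marker names come from the list above.

```python
def slowest_stage(intervals):
    """Return (name of the longest stage, dict of all stage durations).

    intervals: list of (marker_name, start_ms, end_ms) tuples, one per
    non-overlapping stage of a single lookup.
    """
    durations = {name: end - start for name, start, end in intervals}
    slowest = max(durations, key=durations.get)
    return slowest, durations


# Made-up timestamps in ms, for illustration only (not measured data).
SAMPLE = [
    ("HTTPSRR queue", 0, 2),
    ("HTTPSRR OS query", 2, 50),
    ("HTTPSRR notify listener", 50, 51),
    ("DNS OnLookupComplete dispatch", 51, 52),
]
```

With a single interval covering the whole round trip, only the total is visible; with one interval per stage, the stage that dominates the total can be read off directly.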
…paths

Follow-up to the initial HTTPS RR markers, based on findings from profiling the happy eyeballs HTTPS RR race:

Replace the HTTPS-RR-only marker payload with a generic DNSQueryMarker carrying structured metadata: host, record type (A/AAAA/HTTPS/...), outcome, status, record count and a free-form detail field. This makes markers filterable per field in the profiler UI instead of encoding everything in one detail string.

Extend coverage from the HTTPS RR path to all record types:

- DNS ResolveHost (all consumers, all types): cache hit, negative cache hit, IP literal, loopback, joined in-progress lookup, lookup started, or failure, plus whether the request was speculative. This directly answers whether A/AAAA results came from the cache while the HTTPS record had to go to the network.
- DNS queue and DNS native resolve (DNS resolver threads): now also emitted for address lookups, with address counts.
- DNS notify listener: now emitted for all record types.
- HE DNS request (socket thread): emitted for A and AAAA in addition to HTTPS, making the glue's query issuance order visible. The happy eyeballs core issues the HTTPS query first by design (see send_dns_request in the happy-eyeballs crate); these markers verify that in every profile.

Make the DNS prefetch machinery observable, since whether the channel level prefetch runs decides whether the HTTPS record gets a head start over the resolution delay:

- DNS prefetch / DNS HTTPSSVC fetch (nsDNSPrefetch): the channel issued a speculative lookup.
- DNS prefetch blocked (nsDNSService and ChildDNSService): a speculative lookup was dropped because network.dns.disablePrefetch is set, e.g. by uBlock Origin's disable pre-fetching setting.
- Channel DNS prefetch skipped (nsHttpChannel): why the channel did not issue the HTTPS RR prefetch (pref disabled, record already available, disallowed for the channel, not usable on the network, proxy DNS strategy).
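The benefit of structured fields over a single detail string can be sketched with a small model. This is an illustration in Python and not the C++ DNSQueryMarker definition: the field names follow the description above, while the field types, the sample values and the filter_markers helper are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsQueryMarker:
    """Hypothetical stand-in for the structured marker payload."""
    host: str
    qtype: str      # e.g. "A", "AAAA", "HTTPS"
    outcome: str    # e.g. "cache hit", "lookup started"
    status: str     # result code as text (assumed representation)
    records: int    # number of records returned
    detail: str = ""


def filter_markers(markers, **fields):
    """Keep markers whose named fields all equal the given values."""
    return [
        m for m in markers
        if all(getattr(m, key) == value for key, value in fields.items())
    ]


# Made-up sample markers, for illustration only.
MARKERS = [
    DnsQueryMarker("example.com", "A", "cache hit", "NS_OK", 2),
    DnsQueryMarker("example.com", "AAAA", "cache hit", "NS_OK", 1),
    DnsQueryMarker("example.com", "HTTPS", "lookup started", "NS_OK", 0),
]
```

A query such as filter_markers(MARKERS, host="example.com", outcome="cache hit") then answers directly whether the address results came from the cache, without parsing a free-form string.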
The happy eyeballs core deliberately issues its HTTPS query before the A and AAAA queries (see send_dns_request in the happy-eyeballs crate). The DNS prefetch sites did the opposite: both nsHttpChannel's channel-level prefetch and the HTML element / link-hint prefetch issued the address query first and the HTTPS RR query second.

The HTTPS RR lookup is the slower of the two on common stub resolver setups (it takes the raw res_nquery/UDP path while address lookups go through getaddrinfo) and it is the lookup the connection race is most sensitive to: if it arrives more than the resolution delay after the addresses, HTTP/3 discovery via HTTPS RR is lost. Enqueue it first so it never waits behind the address lookup, either on the DNS resolver thread pool or at the resolver.

For the HTML prefetch this also means the HTTPS query is now issued even if the address query would fail; the dominant failure mode there is network.dns.disablePrefetch, which drops both queries equally.
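The effect of the enqueue order can be sketched with a deliberately simplified model: one worker serving a FIFO queue, which corresponds to the worst case where both lookups end up waiting on the same resolver thread. The function name and the durations are assumptions for illustration; the real resolver uses a thread pool.

```python
def completion_times(queue):
    """Return {lookup_name: completion_time_ms} for a FIFO queue.

    queue: list of (lookup_name, duration_ms), served one at a time
    by a single worker starting at t = 0.
    """
    now = 0
    done = {}
    for name, duration in queue:
        now += duration
        done[name] = now
    return done


# Illustrative durations in ms (assumed, not measured).
ADDRESS_FIRST = completion_times([("A/AAAA", 5), ("HTTPS", 52)])
HTTPS_FIRST = completion_times([("HTTPS", 52), ("A/AAAA", 5)])
```

In the address-first order the HTTPS result trails the addresses by its full duration, whereas in the HTTPS-first order it is already available when the addresses arrive.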
cdc3c66 to 67ca139
Context
On https://happy-eyeballs.net/tests/http3-availability/ Firefox does not connect via HTTP/3 even though the HTTPS RR advertising h3 arrives on the wire within ~10ms. Profiling with the first iteration of these markers showed the failure mode precisely: addresses resolve fast (warm), the HTTPS RR pays ~52ms through res_nquery, the happy eyeballs resolution delay (addresses + 50ms) expires ~1ms before the RR is delivered, the TCP/H2 attempt wins, and the core records h3_discovery: https_rr_only without ever launching an H3 attempt. Whether the channel-level DNS prefetch runs (gated by network.dns.disablePrefetch, flipped by uBlock Origin's default settings) decides whether the RR gets a head start and therefore whether h3 is used at all.

This PR adds the profiler markers that make all of the above directly visible. Native DNS only; TRR/DoH left for a follow-up.
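The timing arithmetic behind this failure mode can be written out as a small sketch. The 50 ms resolution delay and the ~52 ms HTTPS RR latency are taken from the description above; the function name, the parameter names, the 1 ms address latency in the usage note and the choice to count an exact tie as "in time" are assumptions for illustration, not the happy-eyeballs crate's logic.

```python
def https_rr_in_time(addresses_ms, https_rr_ms, resolution_delay_ms=50.0):
    """Return (in_time, margin_ms) for the HTTPS RR versus the deadline.

    The deadline is the moment the addresses arrive plus the resolution
    delay. A negative margin means the RR arrived after the deadline,
    so the TCP attempt has already started.
    """
    deadline_ms = addresses_ms + resolution_delay_ms
    margin_ms = deadline_ms - https_rr_ms
    return https_rr_ms <= deadline_ms, margin_ms
```

For example, https_rr_in_time(1.0, 52.0) models warm addresses at 1 ms and the RR at 52 ms: the deadline is 51 ms and the RR misses it by 1 ms. If a prefetch gives the RR a head start so that it is available at, say, 12 ms, the same call with 12.0 reports it in time.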
Markers (second iteration)
All markers share one payload type (DNSQueryMarker, netwerk/dns/DNSProfilerMarkers.h) with structured fields: host, qtype (A / AAAA / A+AAAA / HTTPS / TXT), outcome, status, records, detail. Fields are individually visible and searchable in the profiler UI.

- HE DNS request
- DNS ResolveHost
- DNS queue
- DNS native resolve
- DNS OS query: res_nquery / DnsQuery_A / DNSServiceQueryRecord / android_res_nquery
- DNS notify listener
- DNS OnLookupComplete dispatch
- DNS prefetch / DNS HTTPSSVC fetch
- DNS prefetch blocked: network.dns.disablePrefetch
- Channel DNS prefetch skipped: MaybeStartDNSPrefetch did not issue the HTTPS RR prefetch (pref off, record available, disallowed, network, proxy strategy)

Usage
Profile with the Networking preset (socket thread + DNS Resolver threads) and filter the marker chart/table for DNS, Happy Eyeballs, HE. The new markers nest under the glue's Happy Eyeballs DNS intervals per host.

Validation
- netwerk/dns, netwerk/base and netwerk/protocol/http compile cleanly in a Linux objdir (including the non-unified PlatformDNSUnix.cpp / GetAddrInfo.cpp); the Windows/Mac/Android platform files follow the same pattern but were not compiled here.
- ./mach lint -l clang-format clean on all touched files.
- nsHttpChannel::MaybeStartDNSPrefetch preserves the original short-circuit evaluation order exactly; it only names the failing guard for the marker.
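The last point, naming the failing guard without changing the evaluation order, can be illustrated with a generic sketch. This is not the MaybeStartDNSPrefetch code: the helper names are hypothetical and the guard labels are borrowed from the skip reasons listed earlier. Each guard returns True when it passes, and its label is the reason reported when it fails.

```python
def first_failing_guard(guards):
    """Return the label of the first guard that fails, or None.

    guards: ordered list of (label, zero-argument callable). Guards are
    evaluated lazily and in order, so guards after the first failure
    are never called, matching a chain of short-circuiting `and`s.
    """
    for label, check in guards:
        if not check():
            return label
    return None


evaluated = []  # records which guards actually ran


def _guard(label, passes):
    """Build a demo guard that logs its evaluation (illustration only)."""
    def check():
        evaluated.append(label)
        return passes
    return label, check


GUARDS = [
    _guard("pref disabled", True),
    _guard("record already available", False),
    _guard("disallowed for the channel", True),
]
SKIP_REASON = first_failing_guard(GUARDS)
```

Here the second guard fails, so its label becomes the skip reason and the third guard is never evaluated.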