fix: DoH SHOULD and DoQ MUST use a zero query ID#19
Conversation
RFC9250 Sect. 4.2.1 says: ``` When sending queries over a QUIC connection, the DNS Message ID MUST be set to 0. The stream mapping for DoQ allows for unambiguous correlation of queries and responses, so the Message ID field is not required. This has implications for proxying DoQ messages to and from other transports. For example, proxies may have to manage the fact that DoQ can support a larger number of outstanding queries on a single connection than, for example, DNS over TCP, because DoQ is not limited by the Message ID space. This issue already exists for DoH, where a Message ID of 0 is recommended. ``` RFC 8484 Sect. 4.1 says: ``` In order to maximize HTTP cache friendliness, DoH clients using media formats that include the ID field from the DNS message header, such as "application/dns-message", SHOULD use a DNS ID of 0 in every DNS request. HTTP correlates the request and response, thus eliminating the need for the ID in a media type such as "application/dns- message". The use of a varying DNS ID can cause semantically equivalent DNS queries to be cached separately. ``` We noticed this issue in #18, where DoQ queries consistently failed with `dns.alidns.com` when not using a zero DNS query ID. This diff aims at addressing the issue by adding support for generating a protocol-specific query by default. We do this by adding a new constructor: NewQueryWithServerAddr. From the provided ServerAddr, we obtain the protocol, which, in turn determines whether we should use a zero query ID. The existing NewQuery protocol is deprecated and becomes a wrapper around the new NewQueryWithServerAddr function. Because we recognise the value of customising the actual query ID beyond what the RFC says, we also introduce a new QueryOption called QueryOptionID that allows setting an arbitrary ID. We also update tests to ensure full coverage. We also update the `internal` testing commands accordingly.
89cadd6 to
6ba742d
Compare
| // Otherwise, the default is using [dns.Id] for all protocols | ||
| // except DNS-over-HTTPS and DNS-over-QUIC, where we use | ||
| // zero, thus following RFC 9250 Sect 4.2.1. | ||
| func QueryOptionID(id uint16) QueryOption { |
There was a problem hiding this comment.
Are we using this function anywhere except in test cases?
There was a problem hiding this comment.
Ok, this is used to set an arbitrary value. We don't have that use case for now.
There was a problem hiding this comment.
Correct, and thank you for raising this point, which I think is very valid!
Under the YAGNI principle, we could not have implemented this. However, my thinking is that the functionality is small and self contained, and orthogonal with the rest of the codebase.
Additionally, we have seen that upstream DoQ servers behave differently with respect to the query ID strictly being zero. Therefore, it seems potentially useful for fingerprinting.
All in all, this seems to me to vouch in favour of adding it, since the cost is small and there is a potential benefit and use case (i.e., testing for compliance with the RFC).
Adding query option to rbmk dig to set this would be good.
query.go
Outdated
| // require a nonzero queryID to be set. | ||
| switch serverAddr.Protocol { | ||
| case ProtocolDoH: | ||
| // the zero ID MUST be used |
There was a problem hiding this comment.
For DoH and DoQ cases, we can write query.id = 0 right?
There was a problem hiding this comment.
Correct! I felt it was OK to do nothing since the query is zero-initialised by new(dns.Msg). I will slightly update the comment to acknowledge that the ID is already zero, to make the comment less ambiguous.
|
LGTM |
|
Thank you for your review, @roopeshsn! 🙌 I'm going to merge this PR now! |
RFC9250 Sect. 4.2.1 says:
RFC 8484 Sect. 4.1 says:
We noticed this issue in #18,
where DoQ queries consistently failed with
dns.alidns.comwhennot using a zero DNS query ID.
This diff aims at addressing the issue by adding support for generating a protocol-specific query by default.
We do this by adding a new constructor: NewQueryWithServerAddr.
From the provided ServerAddr, we obtain the protocol, which, in turn determines whether we should use a zero query ID.
The existing NewQuery protocol is deprecated and becomes a wrapper around the new NewQueryWithServerAddr function.
Because we recognise the value of customising the actual query ID beyond what the RFC says, we also introduce a new QueryOption called QueryOptionID that allows setting an arbitrary ID.
We also update tests to ensure full coverage.
We also update the
internaltesting commands accordingly.While there, add convenience aliases for DNS protocol names (I found myself wanting this three times, so...)