feat: implement DNS over QUIC#18
Conversation
|
Hi @bassosimone! I made changes as per RFC 9250. Could you test once? One more change is pending. The edns-tcp-keepalive option (section 5.5.2). This option should not be set. If we receive this option as an argument in NewQuery function we need to handle it. |
|
Nice, it seems the code is now working as intended with |
|
More specifically, I think there are two criteria we should meet for merging an MVP of DNS over QUIC:
I will allocate some time to study the RFC and review the code over the weekend! Thank you for working on DoQ, @roopeshsn! 🙏 ✨ |
|
I tried to resolve the following domain, I got the following error, I'll look into this. |
Thanks to you! |
From response.go file, I came to know that if a domain is not valid, then we'll get an error with a suffix "no such host". This is expected, correct? But in this case, roopeshsn.com is valid. So dns0.eu is not able to resolve this domain correct? |
Let me try and run the same command you were running with other protocols. So, I tried using DNS-over-TLS first: It's curious I get an Then I tried using DNS-over-UDP: This result seems to suggest (See how the answer is
So, However, Anyway, back onto the DoQ topic, I think the DoQ code is working as intended, since it shows However, I am still a bit confused by the following: I would have expected |
|
Hi @bassosimone! Thanks for taking the time to review my PR! I'll go through your comments and make the changes this weekend. |
Changed queryStream's method signature to dnsStream to accept both net.Conn and quic.Stream. Created a wrapper for quic.Stream to add two additional methods, LocalAddr and RemoteAddr.
|
Hi @bassosimone! Apologies for the delay. Due to some personal matters, I couldn't work on this earlier. I've resumed now. |
|
Hi @roopeshsn, no worries! I am taking a look at your changes now! 💪 |
|
@roopeshsn I am starting to wonder whether I tried using I also tried with I also tried with I will continue investigating more resolvers from this list: https://adguard-dns.io/kb/general/dns-providers/. I will also need to re-read the RFC but I think we're now doing things correctly. I will also see whether I am able to install DoQ enabled servers ~locally to test against them. |
|
Previously, I wrote:
But, after a bit I had an a-ha moment. The current diff does not correctly close the stream. We need this instead: diff --git a/dotcp.go b/dotcp.go
index 7d84afa..8d5a3c7 100644
--- a/dotcp.go
+++ b/dotcp.go
@@ -17,7 +17,6 @@ import (
"math"
"github.com/miekg/dns"
- "github.com/quic-go/quic-go"
)
// queryTCP implements [*Transport.Query] for DNS over TCP.
@@ -101,7 +100,7 @@ func (t *Transport) queryStream(ctx context.Context,
// The client MUST send the DNS query over the selected stream and MUST
// indicate through the STREAM FIN mechanism that no further data will
// be sent on that stream.
- if _, ok := conn.(quic.Stream); ok {
+ if _, ok := conn.(*quicStreamWrapper); ok {
_ = conn.Close()
}
The issue is the following: the With this change applied, we have: What's more, if you check the timings I posted in #18 (comment), you would see (with many log messages snipped): See how we receive the response at Now we send the query at Based on all of this, here's my analysis of what is going on:
We might want to abstract the check slightly more (i.e., check for an interface rather than checking for I am now going to test with all the DoQ servers I can find and report back in a subsequent message. However, I think this finding is already good enough to warrant some success emojis: 🥳 🥳 🥳 🥳 |
|
So, here's the results of testing with all the DoQ servers in this list: https://adguard-dns.io/kb/general/dns-providers/. I am also going to try and test additional DoQ servers that I may find or guess. I am going to just report back whether we did receive a response or not using emojis. I will add logs only if necessary. The command I am using is
The error with For With @roopeshsn Would you mind applying the patch suggested above and repeat my test to see if you see the same results? |
|
OK, @roopeshsn, I also figured out what's going on with diff --git a/doquic.go b/doquic.go
index c683161..9e3f760 100644
--- a/doquic.go
+++ b/doquic.go
@@ -20,9 +20,8 @@ import (
"github.com/quic-go/quic-go"
)
-func (t *Transport) createQUICStream(ctx context.Context, addr *ServerAddr,
- query *dns.Msg) (stream *quicStreamWrapper, err error) {
-
+func (t *Transport) createQUICStream(
+ ctx context.Context, addr *ServerAddr) (stream *quicStreamWrapper, err error) {
udpAddr, err := net.ResolveUDPAddr("udp", addr.Address)
if err != nil {
return
@@ -56,12 +55,6 @@ func (t *Transport) createQUICStream(ctx context.Context, addr *ServerAddr,
_ = udpConn.SetDeadline(deadline)
}
- // RFC 9250
- // 4.2.1. DNS Message IDs
- // When sending queries over a QUIC connection, the DNS Message ID MUST
- // be set to 0.
- query.Id = 0
-
quicConn, err := tr.Dial(ctx, udpAddr, tlsConfig, quicConfig)
if err != nil {
return
@@ -89,7 +82,7 @@ func (t *Transport) queryQUIC(ctx context.Context, addr *ServerAddr, query *dns.
}
// Send the query and log the query if needed.
- stream, err := t.createQUICStream(ctx, addr, query)
+ stream, err := t.createQUICStream(ctx, addr)
if err != nil {
return nil, err
}
diff --git a/dotcp.go b/dotcp.go
index 7d84afa..8d5a3c7 100644
--- a/dotcp.go
+++ b/dotcp.go
@@ -17,7 +17,6 @@ import (
"math"
"github.com/miekg/dns"
- "github.com/quic-go/quic-go"
)
// queryTCP implements [*Transport.Query] for DNS over TCP.
@@ -101,7 +100,7 @@ func (t *Transport) queryStream(ctx context.Context,
// The client MUST send the DNS query over the selected stream and MUST
// indicate through the STREAM FIN mechanism that no further data will
// be sent on that stream.
- if _, ok := conn.(quic.Stream); ok {
+ if _, ok := conn.(*quicStreamWrapper); ok {
_ = conn.Close()
}
diff --git a/go.mod b/go.mod
index 97eada1..7cfbcf9 100644
--- a/go.mod
+++ b/go.mod
@@ -4,6 +4,7 @@ go 1.23.3
require (
github.com/miekg/dns v1.1.62
+ github.com/quic-go/quic-go v0.48.2
github.com/rbmk-project/common v0.16.0
github.com/stretchr/testify v1.10.0
golang.org/x/net v0.33.0
@@ -15,7 +16,6 @@ require (
github.com/google/pprof v0.0.0-20210407192527-94a9f03dee38 // indirect
github.com/onsi/ginkgo/v2 v2.9.5 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
- github.com/quic-go/quic-go v0.48.2 // indirect
go.uber.org/mock v0.4.0 // indirect
golang.org/x/crypto v0.31.0 // indirect
golang.org/x/exp v0.0.0-20240506185415-9bf2ced13842 // indirect
diff --git a/go.sum b/go.sum
index 0a8f2a8..e0b614a 100644
--- a/go.sum
+++ b/go.sum
@@ -4,8 +4,12 @@ github.com/chzyer/test v0.0.0-20180213035817-a1ea475d72b1/go.mod h1:Q3SI9o4m/ZMn
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
+github.com/go-logr/logr v1.2.4 h1:g01GSCwiDw2xSZfjJ2/T9M+S6pFdcNtFYsp+Y43HYDQ=
+github.com/go-logr/logr v1.2.4/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A=
github.com/go-task/slim-sprig v0.0.0-20230315185526-52ccab3ef572 h1:tfuBGBXKqDEevZMzYi5KSi8KkcZtzBcTgAUUtapy0OI=
github.com/go-task/slim-sprig v0.0.0-20230315185526-52ccab3ef572/go.mod h1:9Pwr4B2jHnOSGXyyzV8ROjYa2ojvAY6HCGYYfMoC3Ls=
+github.com/golang/protobuf v1.5.3 h1:KhyjKVUg7Usr/dYsdSqoFveMYd5ko72D+zANwlG1mmg=
+github.com/golang/protobuf v1.5.3/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/pprof v0.0.0-20210407192527-94a9f03dee38 h1:yAJXTCF9TqKcTiHJAE8dj7HMvPfh66eeA2JYW7eFpSE=
@@ -15,6 +19,8 @@ github.com/miekg/dns v1.1.62 h1:cN8OuEF1/x5Rq6Np+h1epln8OiyPWV+lROx9LxcGgIQ=
github.com/miekg/dns v1.1.62/go.mod h1:mvDlcItzm+br7MToIKqkglaGhlFMHJ9DTNNWONWXbNQ=
github.com/onsi/ginkgo/v2 v2.9.5 h1:+6Hr4uxzP4XIUyAkg61dWBw8lb/gc4/X5luuxN/EC+Q=
github.com/onsi/ginkgo/v2 v2.9.5/go.mod h1:tvAoo1QUJwNEU2ITftXTpR7R1RbCzoZUOs3RonqW57k=
+github.com/onsi/gomega v1.27.6 h1:ENqfyGeS5AX/rlXDd/ETokDz93u0YufY1Pgxuy/PvWE=
+github.com/onsi/gomega v1.27.6/go.mod h1:PIQNjfQwkP3aQAH7lf7j87O/5FiNr+ZR8+ipb+qQlhg=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/quic-go/quic-go v0.48.2 h1:wsKXZPeGWpMpCGSWqOcqpW2wZYic/8T3aqiOID0/KWE=
@@ -42,8 +48,12 @@ golang.org/x/sys v0.28.0 h1:Fksou7UEQUWlKvIdsqzJmUmCX3cZuD2+P3XyyzwMhlA=
golang.org/x/sys v0.28.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/text v0.21.0 h1:zyQAAkrwaneQ066sspRyJaG9VNi/YJ1NfzcGB3hZ/qo=
golang.org/x/text v0.21.0/go.mod h1:4IBbMaMmOPCJ8SecivzSH54+73PCFmPWxNTLm+vZkEQ=
+golang.org/x/time v0.5.0 h1:o7cqy6amK/52YcAKIPlM3a+Fpj35zvRj2TP+e1xFSfk=
+golang.org/x/time v0.5.0/go.mod h1:3BpzKBy/shNhVucY/MWOyx10tF3SFh9QdLuxbVysPQM=
golang.org/x/tools v0.28.0 h1:WuB6qZ4RPCQo5aP3WdKZS7i595EdWqWR8vqJTlwTVK8=
golang.org/x/tools v0.28.0/go.mod h1:dcIOrVd3mfQKTgrDVQHqCPMWy6lnhfhtX3hLXYVLfRw=
+google.golang.org/protobuf v1.33.0 h1:uNO2rsAINq/JlFpSdYEKIZ0uKD/R9cpdv0T+yoGwGmI=
+google.golang.org/protobuf v1.33.0/go.mod h1:c6P6GXX6sHbq/GpV6MGZEdwhWPcYBgnhAHhKbcUYpos=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
diff --git a/internal/cmd/transport/main.go b/internal/cmd/transport/main.go
index 7d7c8af..24e5a59 100644
--- a/internal/cmd/transport/main.go
+++ b/internal/cmd/transport/main.go
@@ -52,7 +52,7 @@ func main() {
server := dnscore.NewServerAddr(dnscore.Protocol(*protocol), *serverAddr)
flags := 0
maxlength := uint16(dnscore.EDNS0SuggestedMaxResponseSizeUDP)
- if *protocol == string(dnscore.ProtocolDoT) || *protocol == string(dnscore.ProtocolDoH) {
+ if *protocol == string(dnscore.ProtocolDoT) || *protocol == string(dnscore.ProtocolDoH) || *protocol == string(dnscore.ProtocolDoQ) {
flags |= dnscore.EDNS0FlagDO | dnscore.EDNS0FlagBlockLengthPadding
}
if *protocol != string(dnscore.ProtocolUDP) {
@@ -62,6 +62,12 @@ func main() {
// Create the DNS query
optEDNS0 := dnscore.QueryOptionEDNS0(maxlength, flags)
query := runtimex.Try1(dnscore.NewQuery(*domain, dnsType, optEDNS0))
+ // TODO(bassosimone): this is a temporary hack to ensure the query
+ // actually successfully validates. We should instead ensure that we
+ // specify the protocol for which we are creating the query.
+ if server.Protocol == dnscore.ProtocolDoH || server.Protocol == dnscore.ProtocolDoQ {
+ query.Id = 0
+ }
fmt.Printf(";; Query:\n%s\n", query.String())
// Perform the DNS queryWith this diff applied: So, with this change, I think the patch really starts to look good. I'll wait for your testing of the same domains and to hear if you have any design comment regarding what I mentioned above. If it's all green, then we can move towards thinking about actually merging this diff, which will boil down to, mostly, polishing the diff a bit, apply my changes or similar changes, and documenting what remains to be done next with |
bassosimone
left a comment
There was a problem hiding this comment.
Thank you for moving this forward! I have investigated the issue you requested me to investigate and tried to explain my current understanding. I think it's all looking good in this regard and we could soon move towards polishing and preparing for merging, which will mostly boil down to cosmetic changes, tweaks, and noting TODOs. Before this, please, make sure to double check my understanding and investigation and let me know if you have any specific comments regarding my analysis! 🙏
|
Hi @bassosimone! Thank you very much for spending your time on this! 🙏 You did a lot of testing and also gave me a lot of insights. Sure, I'll apply your patch and retest. |
|
Thank you, @roopeshsn! 🙏 |
We can also pass |
|
I retested the doq endpoints. As you said I am also getting the same errors as you mentioned for For For |
|
Cloudflare DNS doesn't support DoQ (cloudflare-dns.com:853) right? I tried to telnet with port 853 to check if that port is listening or not. But I didn't get any ICMP echo reply. Might be firewalls dropping the packets. |
Thank you!
The domain is now available again (the previous error I noticed had something to do with the authoritative DNS server not being able to serve the query, but this seems to have been resolved). That said, neither port
Interesting! In any case, I think we have already tested with enough servers.
Yeah, I think it does not.
I am not sure this kind of test is conclusive, though. As far as I know, the Anyways, I am marking this pull request as ready for review and I'm going to do a review and explain what is missing to get this diff merged, which mainly boils down to minor changes and adaptations. Thank you! |
RFC9250 Sect. 4.2.1 says: ``` When sending queries over a QUIC connection, the DNS Message ID MUST be set to 0. The stream mapping for DoQ allows for unambiguous correlation of queries and responses, so the Message ID field is not required. This has implications for proxying DoQ messages to and from other transports. For example, proxies may have to manage the fact that DoQ can support a larger number of outstanding queries on a single connection than, for example, DNS over TCP, because DoQ is not limited by the Message ID space. This issue already exists for DoH, where a Message ID of 0 is recommended. ``` We noticed this issue in #18. This diff aims at addressing the issue by adding support for generating a protocol-specific query by default. We do this by adding a new constructor: NewQueryWithServerAddr. From the provided ServerAddr, we obtain the protocol, which, in turn determines whether we should use a zero query ID. The existing NewQuery protocol is deprecated and becomes a wrapper around the new NewQueryWithServerAddr function. Because we recognise the value of customising the actual query ID beyond what the RFC says, we also introduce a new QueryOption called QueryOptionID that allows setting an arbitrary ID. We also update tests to ensure full coverage. We also update the `internal` testing commands accordingly.
RFC9250 Sect. 4.2.1 says: ``` When sending queries over a QUIC connection, the DNS Message ID MUST be set to 0. The stream mapping for DoQ allows for unambiguous correlation of queries and responses, so the Message ID field is not required. This has implications for proxying DoQ messages to and from other transports. For example, proxies may have to manage the fact that DoQ can support a larger number of outstanding queries on a single connection than, for example, DNS over TCP, because DoQ is not limited by the Message ID space. This issue already exists for DoH, where a Message ID of 0 is recommended. ``` RFC 8484 Sect. 4.1 says: ``` In order to maximize HTTP cache friendliness, DoH clients using media formats that include the ID field from the DNS message header, such as "application/dns-message", SHOULD use a DNS ID of 0 in every DNS request. HTTP correlates the request and response, thus eliminating the need for the ID in a media type such as "application/dns- message". The use of a varying DNS ID can cause semantically equivalent DNS queries to be cached separately. ``` We noticed this issue in #18, where DoQ queries consistently failed with `dns.alidns.com` when not using a zero DNS query ID. This diff aims at addressing the issue by adding support for generating a protocol-specific query by default. We do this by adding a new constructor: NewQueryWithServerAddr. From the provided ServerAddr, we obtain the protocol, which, in turn determines whether we should use a zero query ID. The existing NewQuery protocol is deprecated and becomes a wrapper around the new NewQueryWithServerAddr function. Because we recognise the value of customising the actual query ID beyond what the RFC says, we also introduce a new QueryOption called QueryOptionID that allows setting an arbitrary ID. We also update tests to ensure full coverage. We also update the `internal` testing commands accordingly.
Related to rbmk-project/dnscore#18, where we introduce DoQ support. While there, ensure `tlsconfig.go` has good test coverage.
TLSConn is required to write tests for the TLS handshake inside the rbmk-project/x/netcore package. PacketConn will be required to write tests for DoQ once we've merged rbmk-project/dnscore#18.
TLSConn is required to write tests for the TLS handshake inside the rbmk-project/x/netcore package. PacketConn will be required to write tests for DoQ once we've merged rbmk-project/dnscore#18.
|
@roopeshsn we should also investigate why tests are failing 🤔 |
Noted. |
|
I ran Are you referring to the workflow or action checks in github? |
|
If yes then from the build logs, it seems that we need to add |
|
I added the following test case for DoQ in query_test.go by modifying the DoH's test case. Is this fine or do we need one more test case? |
|
Apart from cosmetic changes we have below things left correct?
Feel free to add if you have any? Question: |
|
Dear @roopeshsn, thanks for following up with this!
Yes
Yes, in the sense of making sure the CI is green for a
Nothing comes to my mind. (As you mentioned, we have cosmetics but those are easy.)
I could see how this could be beneficial but, at the same time, I think we are not going to need this for now, therefore, I think we can defer doing this to a slightly later time when we'll add testing. (My intention for moving forward is to add support for |
|
To clarify on this:
I think this PR has been running for quite some time and we don't need to add unit tests as part of it, even though, for sure, this is something we could be doing at a later time, as a follow up. Also, the unit test case you added for the ID being zero looks fine. |
|
All done! Let me know if we need to do any other changes. |
Thank you! I will take a look! |
It seems my previous attempt at doing this slightly broke the expectations of go1.24, so let's repair this.
Codecov ReportAttention: Patch coverage is
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
This branch now looks good to me! I have pushed some editorial changes on top of the branch, under the reasoning that it was mostly cosmetic changes. If you have time, please take a look at the overall diff to double check the end result. I am otherwise going to merge this over the weekend. Thank you for helping out with DoQ!!! it has been fun, it has been great, and I am happy that I have learned new things about the internet! 🥳 🥳 🥳 🥳
|
We're going to merge this branch, which has been running on for quite some time, and which now looks good! There is obviously a coverage hit, because there are no unit tests. We will need to work on them on a follow up PR. |
This diff is based on rbmk-project/dnscore@2bb1c48. This diff is based on rbmk-project/common@443e41e. This diff includes code written by @roopeshsn as part of rbmk-project/dnscore#18. The overall purpose for this diff is to simplify the development environment and reduce chores for me. My original setup was OK for a professionally maintained project, however, it is overkill for something I maintain in my free time and as an hobby.
The overall purpose for this diff is to simplify the development environment and reduce chores for me. My original setup was OK for a professionally maintained project. However, it is overkill for a hobby project. This diff is based on rbmk-project/dnscore@2bb1c48. This diff is based on rbmk-project/common@443e41e. This diff includes code written by @roopeshsn as part of rbmk-project/dnscore#18.
Implemented DNS over QUIC.
Part of https://github.com/rbmk-project/issues/issues/3