-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(kad): implement automatic client mode #3877
Conversation
Co-authored-by: Max Inden <[email protected]>
@mxinden All comments addressed. |
@mxinden Feel free to merge if this is ready in your eyes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ready to go apart from the smaller suggestion below.
@rkuhn in https://github.com/Actyx/:
Feel free to ignore if |
Commit updating kademlia-exporter project: mxinden/kademlia-exporter@b958a02 |
Overall results look promising @thomaseizinger: |
Cool! So essentially, the improvement we are seeing here by just deploying it to one node is that we are learning about the other node supporting kademlia via identify and add it to our routing table immediately instead of waiting for an inbound/outbound request. Consequently, we have a better local routing table and can answer queries more quickly. Is that analysis correct? |
Was |
Because we didn't want to place the burden on configuring this on the user.
Yes, that is correct.
If you don't have an external address, how would your node be reachable and thus of use to the kademlia network?
Why would you want to do that? |
More under the assumption that the address that is listened on would be its external such as a public ip as some nodes may not have autonat in place or want to add/remove external addresses manually on each change at runtime, although I did see the commit removing the scoring system so that make sense on that.
Some nodes may only want to receive information from DHT without being queried themselves. Eg, a node may want to query the DHT for records without the node being a server itself and receiving request while allowing it to change to a server later on in its runtime, while the same can also be said on if the node wish to switch from being a server to a client later on if it wishes not to handle anymore queries. Under the implementation here, the node dont have a choice but to be a client until its reachable with no option to set it back later on unless it remove its external address (assuming it does switch back to client), but then that may trigger other behaviours to act on those events too if they rely on an external address event. |
All this sounds very constructed to me. Do you have a concrete usecase for which the current implementation does not work? I understand that we are making some assumptions here with how things are now implemented. I don't mind if those get invalidated and we have to adapt the implementation. But the vision I want to work towards is that things work out of the box and people don't have to become experts at the library to make use of its (advanced) features. |
I do maintain a fork of rust-ipfs and one case would be that nodes may wish to only find what peer is providing what over dht, or may wish to fetch records (eg resolving ipns records). Such nodes may not wish to operate as a dht server at that moment especially if the intention is to locate contents or records provided by peers, and may later want to switch over and operate as a server. Another case would be in embedded/mobile environments. Under these environments, they may be reachable by other nodes directly (ip being public or upnp is enabled), or via relay, however the nodes may not have the resources to operate as a dht server. This could be memory related, network restrictions, etc. Client mode would be suitable under those environments while under the current implementation the nodes would have no choice but to operate as a server upon having an external address or dont have kad enabled at all.
I dont believe having the option to set the mode to client or server would be overly complex internally. Maybe have a flag in the configuration to rely on external addresses to determine its mode or to be determined manually by setting the mode to client or server (with the default being client) and send an event to the handlers when the mode is changed at runtime, either as it does in |
I don't think that is the reason. Instead I think that the routing table is significantly more healthy now. I.e. previously we would add go-libp2p based Kademlia client nodes to our routing table once they did an inbound request to us. Given that they are behind a firewall or NAT (thus Kademlia clients), reaching out to them for requests would fail and thus delay outbound query durations. Now we no longer add them to our routing table and thus, for outbound requests, only reach out to the Kademlia server nodes, thus significantly reducing outbound request latency. |
I don't think that is true, we never add inbound connections to our routing table. |
You are right. I have to dig deeper. Your suggestion makes the most sense.
|
Do I understand correctly, that a remote peer can basically force my peer to switch from Client mode to Server mode by triggering On that note, what do you think about allowing to configure mode manually via configuration parameters? For example, as a part of I think that makes a lot of sense, because if I know my peers are high uptime and on powerful intances, I want to be sure that they operate in Server mode, controlling it directly, without having to think about calling Controlling non-direct side effects in a big codebases is hard. |
To clarify. I have So administrator of the backend is free to set And that's totally OK, right? If a peer doesn't know its own external address – they can still accept connections, other peers still can see their socket addresses and try to use them. But with this change, peer now behaves entirely different based on the user configuration. Until Identify comes, which happens unpredictable. I want to have more control over how Kademlia behaves, at least I think I do :) |
@@ -368,6 +365,9 @@ pub enum KademliaHandlerIn { | |||
/// for the query on the remote. | |||
Reset(KademliaRequestId), | |||
|
|||
/// Change the connection to the specified mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the effect of that? Is there a network message sent all my connections when I learn about external_addresses?
I think it makes sense to elaborate on that in the comment. Describe effects of other peers receiving this message.
Also, isn't that a lot of overhead? What if my addresses go on and off often? Not sure why that might happen, but now how do I make sure that doesn't happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nobody receives any messages. Client vs server mode essentially just means, "do we accept inbound kademlia streams" which translates to rust-libp2p
as "do we offer kademlia via ConnectionHandler::listen_protocols
".
.map_upgrade(Either::Left) | ||
} else { | ||
SubstreamProtocol::new(Either::Right(upgrade::DeniedUpgrade), ()) | ||
match self.mode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add some comments on what happens here?
That sounds like a rather impactful thing. Now Peer can allow and disallow accepting connections on its own, and I as a developer have no control over that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to outline my understanding of the effects of this change. I'm certainly can be wrong here, so forgive me if I am :)
So, previously if I had allow_listening = true
, my peer would accept incoming connections even before Identify
happened.
Now Kademlia handshake behaviour depends on whether Identify
already happened or not. So instead of 1 handshake on Kademlia protocol and 1 handshake on Identify, we now have (depending on handshakes order) several scenarios:
PeerA connects to PeerB. PeerA is in Server mode, PeerB has not yet learned her external addresses, so it is in Client mode.
Scenario 1: failed connection
- PeerA tries to connect to PeerB
- PeerA attempts Kademlia handhsake with PeerB, PeerB rejects because Mode::Client => upgrade::DeniedUpgrade
- PeerA performs a succesful handshake via Identify with PeerB
- PeerB learns its external address, switches to Server mode
- PeerB sends
ReconfigureMode
to all its kademlia connections, but PeerA is not among them - Kademlia handhsake between A and B has failed
- Connection failed
If PeerA tries again, connection will succeed, even though nothing has changed in the setup. Technically, nothing prevents PeerB of accepting PeerA's connection the first time. But now PeerA has to retry if she's sure she brought external_addresses to PeerB.
Scenario 2: succesful connection
- PeerA tries to connect to PeerB
- PeerA performs a succesful handshake via Identify with PeerB
- PeerB switches to Server mode
- PeerA attempts Kademlia handshake with PeerB
- PeerB accepts
- Connection succesful
If I understand correctly, there's no strict order in which handshakes happen, right? So it is basically a race condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PeerA attempts Kademlia handhsake with PeerB, PeerB rejects because Mode::Client => upgrade::DeniedUpgrade
Can you elaborate on how that handshake is triggered? At least rust-libp2p
does not attempt to speak the kademlia protocol to a peer that did not advertise it first. This is the whole point of this PR: By filtering peers based on whether the advertise kademlia, we get healthier routing tables and as a result, kademlia queries finish a lot quicker. See #3877 (comment).
This was resolved in #4132 to allow setting the mode |
That is not the way to think about it. Whether or not your node operates in server or client mode solely depends on whether you have an external address. Using other nodes to learn you external address is one way to do that although identify by itself does not result in a verified external address, only a candidate. It is your own node's responsibility to turn such candidates in the verified external addresses and only that will trigger a switch from client to server mode. |
It is a trade-off, right? I appreciate you sharing your view! We have received quite a bit of feedback on this actually but I do still think that this is the correct design. In the past,
Perhaps your configuration system is too powerful? Have you considered to changing your config file to mean additional external addresses? With protocols like identify and AutoNAT, discovering and verifying an external address should work flawlessly. Let's assume we are a talking about an always-online bootnode and other nodes are hard-wired to connect to that one on startup. We definitely want that node to speak kademlia, right? Without the above design, we'd essentially have to rely on other nodes "probing" for kademlia. They would have to just try whether the node they connect to speaks kademlia. Given that the kademlia protocol itself doesn't really know where a connection came from, we'd have to probe every connection for kademlia support. With this design, we can instead make a very good educated guesses that a node supports kademlia by relying on its (self-reported) supported protocols. A bootnode that is configured with a static, external IP address would advertise kademlia from the moment it boots and any new peer that connects would immediately put it in its routing table once identify reported the supported protocols (which also happens immediately one every new connection). I still believe this is a much cleaner design because it is deterministic in behaviour which makes it easier to understand despite the many moving parts compared to random of protocols. That being said, I am open to alternative designs if you have any ideas! |
Previously, a `NetworkBehaviour` could report an `AddressScore` for an external address. This score was a `u32` and addresses would be ranked amongst those. In reality, an address is either confirmed to be publicly reachable (via a protocol such as AutoNAT) or merely represents a candidate that might be an external address. In a way, addresses are guilty (private) until proven innocent (publicly reachable). When a `NetworkBehaviour` reports an address candidate, we perform address translation on it to potentially correct for ephemeral ports of TCP. These candidates are then injected back into the `NetworkBehaviour`. Protocols such as AutoNAT can use these addresses as a source for probing their NAT status. Once confirmed, they can emit a `ToSwarm::ExternalAddrConfirmed` event which again will be passed to all `NetworkBehaviour`s. This simplified approach will allow us implement Kademlia's client-mode (libp2p#2032) without additional configuration options: As soon as an address is reported as publicly reachable, we can activate server-mode for that connection. Related: libp2p#3877. Related: libp2p#3953. Related: libp2p#2032. Related: libp2p/go-libp2p#2229. Co-authored-by: Max Inden <[email protected]> Pull-Request: libp2p#3954.
Description
Currently, the kademlia behaviour can only learn that the remote node supports kademlia on a particular connection if we successfully negotiate a stream to them.
Using the newly introduced abstractions from #3651, we don't have to attempt to establish a stream to the remote to learn whether they support kademlia on a connection but we can directly learn it from the
ConnectionEvent::RemoteProtocolsChange
event. This happens directly once a connection is established which should overall benefit the DHT.Clients do not advertise the kademlia protocol and thus we will immediately learn that a given connection is not suitable for kadmelia requests. We may receive inbound messages from it but this does not affect the routing table.
Resolves: #2032.
Notes & open questions
Draft because it isstacked on top of:ConnectionHandler
#3651LocalProtocolsChange
event #3979IsMode::Server
a good default? Should it beMode::Client
to be "safer"?Should we even have an overall mode? With how we are now thinking of supported protocols, we could be a server on one connection and a client on another.
Depends-On: #3954.
Change checklist