Node fallback for Put requests #187

Open
gavin-norman-sociomantic opened this issue Nov 13, 2018 · 3 comments

@gavin-norman-sociomantic

Resilience to node outage for Put requests is actually quite simple.

On the client side:

  • If a Put request fails due to a no_node error or a connection error, the client picks another node (using some deterministic algorithm; see the sketch after this list) and sends the record there. (Possibly repeating, if multiple nodes are out.)
  • Ditto for Get requests.
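A minimal sketch of the client-side fallback, in Python for illustration only; `put_to_node`, `NoNodeError`, and the `responsible` argument are hypothetical stand-ins for the real client API and its normal hash-range lookup:

```python
import hashlib

class NoNodeError(Exception):
    """Stands in for the client's no_node error (hypothetical name)."""

def candidate_nodes(key: bytes, nodes: list, responsible: str) -> list:
    # Responsible node (by hash range) first, then the remaining nodes in
    # a deterministic per-key order, so every client agrees on the same
    # fallback sequence for a given record.
    rest = sorted((n for n in nodes if n != responsible),
                  key=lambda n: hashlib.sha1(key + n.encode()).digest())
    return [responsible] + rest

def put_with_fallback(key, value, nodes, responsible, put_to_node):
    # Try each candidate in turn, repeating while nodes are out.
    for node in candidate_nodes(key, nodes, responsible):
        try:
            put_to_node(node, key, value)   # the real client's Put request
            return node
        except (NoNodeError, ConnectionError):
            continue                        # pick the next node
    raise RuntimeError("no node accepted the record")
```

A Get that misses on the responsible node would walk the same deterministic sequence, which is what lets it find records that were Put to a fallback node.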

On the node side:

  • The node has a full DHT client, and loads the standard .nodes file at startup, connecting to all other nodes.
  • The node keeps a separate store of "orphan records" (i.e. records that do not fall under its normal hash range and are being temporarily kept for another node).
  • Any record received via a Put request whose key is outside the node's hash range is placed in the store of orphaned records.
  • Get requests for keys outside the node's hash range are looked up in the store of orphaned records.
  • Periodically, the node iterates over the store of orphaned records and tries to send each record to the correct node, using normal Put requests. On success, a record is removed from the store of orphaned records. On failure, it stays there until the next forwarding period. (A sketch of this store follows the list.)
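A sketch of the node-side behaviour described above; `in_hash_range`, `dht_put` (a Put via the node's embedded DHT client), and the `local_put`/`local_get` callbacks are hypothetical stand-ins for the node's real internals:

```python
import threading, time

class OrphanStore:
    """Temporary store for records outside this node's hash range."""

    def __init__(self, in_hash_range, dht_put, forward_period=60.0):
        self.records = {}                   # (channel, key) -> value
        self.in_hash_range = in_hash_range  # key -> bool
        self.dht_put = dht_put              # Put via the node's DHT client
        self.forward_period = forward_period
        self.lock = threading.Lock()

    def handle_put(self, channel, key, value, local_put):
        # Out-of-range Puts are kept as orphans instead of being rejected.
        if self.in_hash_range(key):
            local_put(channel, key, value)
        else:
            with self.lock:
                self.records[(channel, key)] = value

    def handle_get(self, channel, key, local_get):
        # Out-of-range Gets fall back to the orphan store.
        if self.in_hash_range(key):
            return local_get(channel, key)
        with self.lock:
            return self.records.get((channel, key))

    def forward_loop(self):
        # Periodically try to forward each orphan to its correct node;
        # remove it on success, keep it until the next period on failure.
        while True:
            time.sleep(self.forward_period)
            with self.lock:
                pending = list(self.records.items())
            for (channel, key), value in pending:
                try:
                    self.dht_put(channel, key, value)
                except ConnectionError:
                    continue                # still unreachable; retry later
                with self.lock:
                    self.records.pop((channel, key), None)
```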
@nemanja-boric-sociomantic

Note that we don't need the periodic iteration over the orphaned records: we can use the node's client's connection notifier to sync them when the connection to the lost node is re-established.
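A sketch of that variant, reusing the `OrphanStore` above; `owns_key(node, key)` and the notifier hook are hypothetical stand-ins for the client's connection notifier machinery:

```python
def on_connection_established(store, node_address, owns_key):
    # Invoked via the client's connection notifier when `node_address`
    # comes back up; flush only the orphans belonging to that node.
    with store.lock:
        pending = [(ck, v) for ck, v in store.records.items()
                   if owns_key(node_address, ck[1])]
    for (channel, key), value in pending:
        try:
            store.dht_put(channel, key, value)
        except ConnectionError:
            return  # dropped again; the next notification will retry
        with store.lock:
            store.records.pop((channel, key), None)
```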

One thing to worry about here (and maybe it's not worth pursuing at this level) is the network partition problem: if you have two nodes A and B which can each talk only to a subset of clients, and which can't talk to each other, both will receive updates for their own records and for the records belonging to the node that's not available. Then, on network recovery, both node A and node B will have new (and possibly conflicting) state for B's records.
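A toy illustration of that split-brain scenario (plain dicts standing in for the stores): key `k` is owned by node B, but during the partition clients on A's side can only reach A:

```python
a_orphans = {}  # node A's orphan store
b_store = {}    # node B's primary store

# During the partition, both sides accept writes for B's key:
a_orphans["k"] = "value-from-A-side-client"  # Put via fallback to A
b_store["k"] = "value-from-B-side-client"    # direct Put to B

# On recovery, A forwards its orphan with a normal Put, blindly
# overwriting whatever B accepted during the partition:
b_store["k"] = a_orphans.pop("k")

assert b_store["k"] == "value-from-A-side-client"  # B-side write is lost
```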

@gavin-norman-sociomantic

The redistribution system could probably be reworked to use this too: when the hash range of the node changes, it iterates over its channels, puts out-of-range records into the orphaned records store, and lets the magic do the rest.
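A sketch of that idea, reusing the `OrphanStore` above; `channels` (channel name -> key/value mapping) is a hypothetical stand-in for the node's storage engine:

```python
def on_hash_range_changed(store, channels, new_in_hash_range):
    # Swap in the new range check, then move every record that is now
    # out of range into the orphan store; the normal forwarding logic
    # (periodic or notifier-driven) takes care of the rest.
    store.in_hash_range = new_in_hash_range
    for channel, records in channels.items():
        for key in [k for k in records if not new_in_hash_range(k)]:
            with store.lock:
                store.records[(channel, key)] = records.pop(key)
```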

@nemanja-boric-sociomantic

Ah, yes, that's a nice consequence.
