liveness durable log
The [Paxos protocol](http://en.wikipedia.org/wiki/Paxos_(computer_science)#Safety_and_liveness_properties) states the following requirement:
If value C has been proposed, then eventually learner L will learn some value (if sufficient processors remain non-faulty).
In other words: once a value has been proposed, no matter what state the learner is currently in, eventually learner L will learn of that value. This means we need to implement a snapshot and catch-up mechanism. This page describes the implementation.
We will implement a durable catch-up mechanism in the following way:
1. each Follower always communicates its last accepted proposal id to all other nodes with every command;
2. each Follower keeps a durable history of logs of recently proposed values, keyed by proposal id;
3. each Follower is required to keep the history of logs since the lowest last accepted proposal id of all Followers currently known;
4. a single proposal id is guaranteed to be proposed only once during the lifetime of a quorum, including complete shutdown & restart; only complete removal of the entire history of logs at all Followers will reset the proposal id;
5. when a Leader sends an Accept to a Follower, it sends the values of all proposals since that Follower's highest accepted proposal id (usually just one);
6. if a Follower's last accepted proposal id is not at least as high as that of all the other nodes, it can never be elected as Leader;
7. the proposal id is communicated by the Followers to the Processor.
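The log-related rules above (a history keyed by proposal id, retention down to the lowest id any known Follower reports, and replay of everything a lagging node has missed) can be sketched as follows. This is only an illustration: `FollowerLog` and its methods are hypothetical names, the dictionary stands in for durable storage, and the sketch assumes that proposal ids are strictly increasing integers.

```python
from dataclasses import dataclass, field


@dataclass
class FollowerLog:
    """In-memory stand-in for one Follower's durable history (hypothetical)."""

    entries: dict = field(default_factory=dict)  # proposal id -> value
    last_accepted: int = 0  # highest proposal id accepted so far

    def append(self, proposal_id, value):
        # Assumption: ids only grow, so each id is used at most once.
        if proposal_id <= self.last_accepted:
            raise ValueError(f"proposal id {proposal_id} was already used")
        self.entries[proposal_id] = value
        self.last_accepted = proposal_id

    def since(self, proposal_id):
        # Entries that a node whose last accepted id is `proposal_id` still lacks.
        return [
            (pid, self.entries[pid])
            for pid in sorted(self.entries)
            if pid > proposal_id
        ]

    def truncate(self, peers_last_accepted):
        # Keep everything newer than the lowest id reported by any known Follower.
        lowest = min(peers_last_accepted.values(), default=0)
        self.entries = {
            pid: value for pid, value in self.entries.items() if pid > lowest
        }
```

In this sketch `last_accepted` survives truncation, so an id that has already been dropped from the history still cannot be reused.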
There are several scenarios this implementation will take care of, which we will discuss here.
A Follower is temporarily unreachable. This means none of the Followers will remove the history of logs recorded since the connection to that Follower was lost (implementation rule 3), and upon reconnection the Leader will send the Follower all values that were proposed while it was disconnected (implementation rule 5).
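A minimal sketch of this reconnect scenario, assuming logs are plain dictionaries keyed by integer proposal id; `missing_values` and `apply_catch_up` are hypothetical helper names, not part of the library:

```python
def missing_values(leader_log, follower_last_accepted):
    """Pairs (proposal id, value) the Follower has not accepted yet, oldest first."""
    return [
        (pid, leader_log[pid])
        for pid in sorted(leader_log)
        if pid > follower_last_accepted
    ]


def apply_catch_up(follower_log, follower_last_accepted, batch):
    """Store a batch on the Follower and return its new last accepted id."""
    for pid, value in batch:
        if pid <= follower_last_accepted:
            continue  # already accepted before the disconnect
        follower_log[pid] = value
        follower_last_accepted = pid
    return follower_last_accepted


leader_log = {1: "a", 2: "b", 3: "c", 4: "d"}
follower_log = {1: "a", 2: "b"}  # the Follower dropped off after id 2

batch = missing_values(leader_log, 2)
new_last = apply_catch_up(follower_log, 2, batch)
```

Because the Follower reports its last accepted proposal id with every command, the Leader needs no extra round trip to work out which values to include.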
A Leader disconnects from the quorum. This will result in a new Leader being elected. The new Leader will keep storing all logs, since the highest accepted proposal id of the disconnected Leader does not advance while it is disconnected (implementation rule 3). Once the disconnected Leader reconnects, it will not immediately be re-elected as the new Leader, since its highest accepted proposal id is not as high as that of the rest of the quorum (implementation rule 6). Once it has caught up after the next value is proposed, it will consider itself the Leader again, and will be elected as such by the rest of the quorum.
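The eligibility check behind this scenario could look roughly like the sketch below. `may_become_leader` is a hypothetical name, and the sketch assumes every node knows the last accepted proposal id of each peer as an integer:

```python
def may_become_leader(own_last_accepted, peers_last_accepted):
    """A node is eligible only if no known peer has accepted a higher proposal id."""
    return all(own_last_accepted >= pid for pid in peers_last_accepted.values())


peers = {"b": 9, "c": 9}  # the rest of the quorum moved on to id 9

before_catch_up = may_become_leader(7, peers)  # reconnected Leader, still at id 7
after_catch_up = may_become_leader(9, peers)   # same node once it has caught up
```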
A new follower is added to the quorum and needs to synchronize its state with the rest of the followers.
The new follower will be initialized with proposal id #0 at all other Followers. This means that the history will not be truncated until the new follower has caught up (implementation rule 3). The new follower replicates its working data (that is, the state that is altered as a result of the accepted values, and outside the scope of our library) and launches. It will connect to the quorum and be sent all the recent history (implementation rule 5). It is up to the new follower to decide whether a value was proposed before or after the snapshot, based on the proposal id it receives with the values (implementation rule 7).
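On the new follower's side, that decision might be sketched as below. This assumes the application records the proposal id at which its snapshot was taken; `values_to_apply` and `snapshot_proposal_id` are hypothetical names used for illustration only:

```python
def values_to_apply(snapshot_proposal_id, received):
    """Keep only the values proposed after the snapshot was taken."""
    return [
        (pid, value) for pid, value in received if pid > snapshot_proposal_id
    ]


# History replayed by the quorum, as (proposal id, value) pairs.
received = [(40, "x"), (41, "y"), (42, "z")]

# The snapshot already reflects everything up to and including id 41.
fresh = values_to_apply(41, received)
```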