feat(driver): do not schedule the same timeout twice#1501
feat(driver): do not schedule the same timeout twice#1501cason wants to merge 7 commits intocirclefin:mainfrom
Conversation
652ed50 to
4d031cf
Compare
This comment was marked as outdated.
This comment was marked as outdated.
| if timeout.round < self.round() || self.scheduled_timeouts.contains(&timeout) { | ||
| return; | ||
| } | ||
| // XXX: test if the driver produces **non**-consensus timeouts |
There was a problem hiding this comment.
Should I change this to a warning or something like that? Apparently, we don't panic - at least in the test that have ran.
There was a problem hiding this comment.
I would perhaps change this to a debug_assert!
| /// The certificate that justifies moving to the `enter_round` specified in the `EnterRoundCertificate. | ||
| pub round_certificate: Option<EnterRoundCertificate<Ctx>>, | ||
|
|
||
| scheduled_timeouts: Vec<Timeout>, |
There was a problem hiding this comment.
Let's add a doc comment here to explain this field's purpose
| if timeout.round < self.round() || self.scheduled_timeouts.contains(&timeout) { | ||
| return; | ||
| } | ||
| // XXX: test if the driver produces **non**-consensus timeouts |
There was a problem hiding this comment.
I would perhaps change this to a debug_assert!
Co-authored-by: Romain Ruetschi <github@romac.me>
nenadmilosevic95
left a comment
There was a problem hiding this comment.
Looks good! Left two minor comments. In general, I like the driver-level dedup — simple and effective. IMHO the state machine check might be even cleaner, since the Tendermint pseudo-code states "for the first time" and the state machine should align with it, but this works well too.
| } | ||
|
|
||
| fn lift_timeout_output(&mut self, timeout: Timeout, outputs: &mut Vec<Output<Ctx>>) { | ||
| if timeout.round < self.round() || self.scheduled_timeouts.contains(&timeout) { |
There was a problem hiding this comment.
Minor: Can this ever be true timeout.round < self.round()? Timeouts should only be for the current round. If purely defensive, I'd add a warn! here so we notice if it ever triggers.
|
|
||
| // Remove useless timeouts from previous rounds | ||
| self.scheduled_timeouts | ||
| .retain(|timeout| timeout.round >= round); |
There was a problem hiding this comment.
Similar here, could this just be clear()? It is minor ofc, I just want to understand if we expect to see any future rounds here, since I don't see how
There was a problem hiding this comment.
Also a good point. But the concern here is to limit the growth of this set.
There was a problem hiding this comment.
I wonder if we can schedule timeouts for future rounds - otherwise, clear() is the way to go.
I will try to implement this version. |
|
Closes: #1500
As per title.
This can happen to several reasons, but mostly because
TimeoutPrecommitis re-scheduled at every round step change.The solution consists on creating a
scheduled_timeoutsvector at the driver, and only adding a new timeout to it if not already present. The vector is cleaned upon a new height, and old entries are removed upon a new height.