https://github.com/tidb-incubator/tinykv/blob/f050d8c1bde1dd210fdcd6e0dbb0739af5687669/kv/test_raftstore/scheduler.go#L335-L372
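While implementing project3b, I occasionally hit a panic at L371 of the code above. The log that leads up to the panic is shown below (two peers are added to the quorum and then the same two are removed).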
2022/03/30 09:44:40.848845 /home/wangqi/workplace/tinykv/kv/test_raftstore/peer_msg_handler.go:109: [info] [wq] Node [region 1] 7, add: 8 Region.ConfVer: 15
2022/03/30 09:44:40.848958 /home/wangqi/workplace/tinykv/kv/test_raftstore/peer_msg_handler.go:109: [info] [wq] Node [region 1] 7, add: 9 Region.ConfVer: 16
2022/03/30 09:44:40.849307 /home/wangqi/workplace/tinykv/kv/test_raftstore/peer_msg_handler.go:124: [info] [wq] Node [region 1] 7, remove: 8 Region.ConfVer: 17
2022/03/30 09:44:40.849519 /home/wangqi/workplace/tinykv/kv/test_raftstore/peer_msg_handler.go:124: [info] [wq] Node [region 1] 7, remove: 9 Region.ConfVer: 18
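The cause of the panic: the quorum membership is the same at ConfVer 15 and at ConfVer 19, but the scheduler sees the region's ConfVer jump by more than 1 between two heartbeats and panics.

I then tracked down where region heartbeats are triggered and found two places:

(1) https://github.com/tidb-incubator/tinykv/blob/course/kv/raftstore/peer_msg_handler.go#L511-L518
(2) https://github.com/tidb-incubator/tinykv/blob/course/kv/raftstore/peer_msg_handler.go#L202-L204

Site (1) is triggered by the timer.
Site (2) is triggered when the node being added (the pending node) catches up to the leader's truncated index.

Now consider this case: the leader and a majority of the other nodes are already in sync, so they can ignore the pending node and apply the log entries (the addnode and removenode requests) directly. At the current version, (2) is then never triggered, and a region heartbeat can only be sent when the timer in (1) fires. By that time several conf changes may have been applied, so the scheduler can see the region's ConfVer jump by more than 1 and panic.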
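The scenario above can be reproduced with a small stand-alone simulation. This is only a sketch under stated assumptions, not the tinykv code: `region`, `confChange`, `checkHeartbeat`, `apply`, and `simulate` are hypothetical names, the initial peers 5, 6, 7 are made up, and the check is reduced to "ConfVer may advance by at most 1 between two heartbeats", which is how the panic is described here. It contrasts a single timer-driven heartbeat after all four changes with a heartbeat after every applied change (one possible mitigation, not something the original code is known to do).

```go
package main

import "fmt"

// region is a simplified, hypothetical stand-in for the region metadata:
// only the conf version and the peer IDs matter for this sketch.
type region struct {
	confVer uint64
	peers   []uint64
}

// confChange is a hypothetical add/remove request for one peer.
type confChange struct {
	add  bool
	peer uint64
}

// checkHeartbeat is a reduced version of the check described in the issue:
// the scheduler rejects a heartbeat whose ConfVer is more than one step
// ahead of the last ConfVer it has seen.
func checkHeartbeat(known, reported region) error {
	if reported.confVer > known.confVer+1 {
		return fmt.Errorf("ConfVer jumped from %d to %d", known.confVer, reported.confVer)
	}
	return nil
}

// apply returns a new region with the change applied and ConfVer bumped by 1.
// It copies the peer list so the caller's region is not modified.
func apply(r region, c confChange) region {
	next := region{confVer: r.confVer + 1}
	for _, p := range r.peers {
		if !c.add && p == c.peer {
			continue // drop the removed peer
		}
		next.peers = append(next.peers, p)
	}
	if c.add {
		next.peers = append(next.peers, c.peer)
	}
	return next
}

// simulate applies all changes. If heartbeatEveryChange is true, the scheduler
// is told after each change; otherwise it only hears one heartbeat at the end,
// which models the timer-only case.
func simulate(start region, changes []confChange, heartbeatEveryChange bool) error {
	known, cur := start, start
	for _, c := range changes {
		cur = apply(cur, c)
		if heartbeatEveryChange {
			if err := checkHeartbeat(known, cur); err != nil {
				return err
			}
			known = cur
		}
	}
	return checkHeartbeat(known, cur)
}

func main() {
	start := region{confVer: 15, peers: []uint64{5, 6, 7}} // peers are made up
	changes := []confChange{{true, 8}, {true, 9}, {false, 8}, {false, 9}}

	fmt.Println("timer-only heartbeat:", simulate(start, changes, false))
	fmt.Println("heartbeat per change:", simulate(start, changes, true))
}
```

In the timer-only run the scheduler's last known ConfVer is 15 and the next one it sees is 19, so the check fails even though the peer set is unchanged. With a heartbeat after every applied change, each step is a single increment and the check passes.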