Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

orderer node crash down after recieving fuzzed messages #279

Open
sardChen opened this issue Feb 18, 2022 · 1 comment
Open

orderer node crash down after recieving fuzzed messages #279

sardChen opened this issue Feb 18, 2022 · 1 comment

Comments

@sardChen
Copy link

System information:
version: fabric 1.4
consensus type: smartbft
OS: ubuntu 20.04

Experiments setup:
Using first-network in https://github.com/SmartBFT-Go/fabric-samples, set up a 12-nodes(10 orderers and 2 peers) network locally, node1, node3, node5 are fuzzer nodes that continuously send fuzzed messages to others nodes, and the rest nodes are normal nodes.

Expected behavior:
Normal nodes run as usual.

Actual behaviour:
Orderer2, Orderer9, Orderer10 crash down, and can not be recovered.

Log info:

2022-02-15 03:44:16.985 UTC [orderer.common.cluster.puller] fetchLastBlockSeq -> ERRO 41706 Failed receiving the latest block from orderer9.example.com:7050: didn't receive a response within 5s
2022-02-15 03:44:16.985 UTC [orderer.common.cluster.puller] func1 -> WARN 41707 Received error of type 'didn't receive a response within 5s' from {orderer9.example.com:7050 [-----BEGIN CERTIFICATE-----
MIICQjCCAemgAwIBAgIQKCHhqBmKvD+fD405cvjAbzAKBggqhkjOPQQDAjBsMQsw
CQYDVQQGEwJVUzETMBEGA1UECBMKQ2FsaWZvcm5pYTEWMBQGA1UEBxMNU2FuIEZy
YW5jaXNjbzEUMBIGA1UEChMLZXhhbXBsZS5jb20xGjAYBgNVBAMTEXRsc2NhLmV4
YW1wbGUuY29tMB4XDTIyMDIxNTAyNDIwMFoXDTMyMDIxMzAyNDIwMFowbDELMAkG
A1UEBhMCVVMxEzARBgNVBAgTCkNhbGlmb3JuaWExFjAUBgNVBAcTDVNhbiBGcmFu
Y2lzY28xFDASBgNVBAoTC2V4YW1wbGUuY29tMRowGAYDVQQDExF0bHNjYS5leGFt
cGxlLmNvbTBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABKXjWWrjsNU8jNYdezcJ
MDRBgEfyUkjbg5UJ1Ee1+ADs81lcn+s1AlUS1NspR1KHFjl39qHDZidQYs+nQeiP
+fOjbTBrMA4GA1UdDwEB/wQEAwIBpjAdBgNVHSUEFjAUBggrBgEFBQcDAgYIKwYB
BQUHAwEwDwYDVR0TAQH/BAUwAwEB/zApBgNVHQ4EIgQgw1ZJbXtGAHI3Di/DmX5C
3MDhCeaNjqvHHXm9IvGgCz0wCgYIKoZIzj0EAwIDRwAwRAIgMa2KU38U/uw4ntoU
EXrE+MoEN0PVmkmfm5ii6WYktF4CIAGbaJ5pTS8fYO4onZLHzwLcNZ/21Z1ekHGO
2sMV/aAG
-----END CERTIFICATE-----
]}
2022-02-15 03:44:16.985 UTC [orderer.common.cluster.puller] HeightsByEndpoints -> INFO 41708 Returning the heights of OSNs mapped by endpoints map[orderer.example.com:7050:20 orderer10.example.com:7050:20 orderer3.example.com:7050:20 orderer5.example.com:7050:20 orderer6.example.com:7050:20 orderer7.example.com:7050:20 orderer8.example.com:7050:20]
2022-02-15 03:44:16.985 UTC [orderer.consensus.smartbft.chain] synchronize -> INFO 41709 HeightsByEndpoints: map[orderer.example.com:7050:20 orderer10.example.com:7050:20 orderer3.example.com:7050:20 orderer5.example.com:7050:20 orderer6.example.com:7050:20 orderer7.example.com:7050:20 orderer8.example.com:7050:20] channel=mychannel
2022-02-15 03:44:16.985 UTC [comm.grpc.server] 1 -> INFO 4170a streaming call completed grpc.service=orderer.AtomicBroadcast grpc.method=Deliver grpc.peer_address=192.168.64.3:33536 grpc.peer_subject="CN=orderer9.example.com,L=San Francisco,ST=California,C=US" error="context finished before block retrieved: context canceled" grpc.code=Unknown grpc.call_duration=5.000194448s
2022-02-15 03:44:16.985 UTC [orderer.consensus.smartbft.chain] Sync -> WARN 4170b Could not synchronize with remote peers due to already at height of 20, returning state from local ledger channel=mychannel
2022-02-15 03:44:16.985 UTC [comm.grpc.server] 1 -> INFO 4170c streaming call completed grpc.service=orderer.AtomicBroadcast grpc.method=Deliver grpc.peer_address=192.168.64.6:38972 grpc.peer_subject="CN=orderer8.example.com,L=San Francisco,ST=California,C=US" error="context finished before block retrieved: context canceled" grpc.code=Unknown grpc.call_duration=5.000095189s
2022-02-15 03:44:16.985 UTC [comm.grpc.server] 1 -> INFO 4170d streaming call completed grpc.service=orderer.AtomicBroadcast grpc.method=Deliver grpc.peer_address=192.168.64.8:38054 grpc.peer_subject="CN=orderer10.example.com,L=San Francisco,ST=California,C=US" error="context finished before block retrieved: context canceled" grpc.code=Unknown grpc.call_duration=4.999234388s
2022-02-15 03:44:16.986 UTC [comm.grpc.server] 1 -> INFO 4170e streaming call completed grpc.service=orderer.AtomicBroadcast grpc.method=Deliver grpc.peer_address=192.168.64.9:54276 grpc.peer_subject="CN=orderer7.example.com,L=San Francisco,ST=California,C=US" error="context finished before block retrieved: context canceled" grpc.code=Unknown grpc.call_duration=4.999535957s
2022-02-15 03:44:16.990 UTC [orderer.consensus.smartbft.consensus] processMsg -> WARN 4170f Node 9 got viewChange message view_change:<next_view:8915686874921810032 reason:"\306\247\346\264\201\351\266\243\304\230`\304\210\305\267" >  from 1 with view 8915686874921810032, expected view 204 channel=mychannel
2022-02-15 03:44:16.996 UTC [orderer.consensus.smartbft.consensus] processMsg -> WARN 41710 Node 9 got viewChange message view_change:<next_view:137 >  from 6 with view 137, expected view 204 channel=mychannel
2022-02-15 03:44:16.996 UTC [orderer.consensus.smartbft.consensus] processMsg -> WARN 41711 Node 9 got viewChange message view_change:<next_view:137 >  from 3 with view 137, expected view 204 channel=mychannel
2022-02-15 03:44:17.053 UTC [orderer.consensus.smartbft.consensus] Submit -> ERRO 41712 request {4f81da8ba7018b54c793b6657e8e0cf17a522a352f0c33c51811856011b9282b 75f2c2ba5abf72554b8b64c496c59988c0ea82e03fb6dbe1f5a5a73234683dad} already exists in the pool channel=mychannel
2022-02-15 03:44:17.053 UTC [orderer.consensus.smartbft.consensus] addRequest -> INFO 41713 Request {4f81da8ba7018b54c793b6657e8e0cf17a522a352f0c33c51811856011b9282b 75f2c2ba5abf72554b8b64c496c59988c0ea82e03fb6dbe1f5a5a73234683dad} was not submitted, error: request {4f81da8ba7018b54c793b6657e8e0cf17a522a352f0c33c51811856011b9282b 75f2c2ba5abf72554b8b64c496c59988c0ea82e03fb6dbe1f5a5a73234683dad} already exists in the pool channel=mychannel
2022-02-15 03:44:18.003 UTC [comm.grpc.server] 1 -> INFO 41714 streaming call completed grpc.service=orderer.AtomicBroadcast grpc.method=Deliver grpc.peer_address=192.168.64.10:51136 grpc.peer_subject="CN=orderer6.example.com,L=San Francisco,ST=California,C=US" error="context finished before block retrieved: context canceled" grpc.code=Unknown grpc.call_duration=5.000721148s
2022-02-15 03:44:18.353 UTC [orderer.consensus.smartbft.consensus] processMsg -> WARN 41715 Node 9 got newView message < NewView with <ViewData signed by 3 with NextView: 32>, <ViewData signed by 8 with NextView: 32>, <ViewData signed by 5 with NextView: 32>, <ViewData signed by 7 with NextView: 32>, <ViewData signed by 9 with NextView: 32>, <ViewData signed by 1 with NextView: 32>, <ViewData signed by 2 with NextView: 32>> from 3, expected sender to be 4 the next leader channel=mychannel
2022-02-15 03:44:18.354 UTC [orderer.consensus.smartbft.consensus] processMsg -> WARN 41716 Node 9 got newView message < NewView with <ViewData signed by 1 with NextView: 10>, <ViewData signed by 5 with NextView: 10>, <ViewData signed by 8 with NextView: 10>, <ViewData signed by 2 with NextView: 10>, <ViewData signed by 7 with NextView: 10>, <ViewData signed by 4 with NextView: 10>, <ViewData signed by 10 with NextView: 10>> from 1, expected sender to be 4 the next leader channel=mychannel
2022-02-15 03:44:18.633 UTC [orderer.consensus.smartbft.consensus] Submit -> ERRO 41717 request {f35cd99687d0409362a271e1ac288eef1e5b070c0fed61b175b7322aafe98417 c35563b550c6690fdea6fa92fb5d04da689a4d64df32c756424f8fdb5ea36add} already exists in the pool channel=mychannel
2022-02-15 03:44:18.634 UTC [orderer.consensus.smartbft.consensus] addRequest -> INFO 41718 Request {f35cd99687d0409362a271e1ac288eef1e5b070c0fed61b175b7322aafe98417 c35563b550c6690fdea6fa92fb5d04da689a4d64df32c756424f8fdb5ea36add} was not submitted, error: request {f35cd99687d0409362a271e1ac288eef1e5b070c0fed61b175b7322aafe98417 c35563b550c6690fdea6fa92fb5d04da689a4d64df32c756424f8fdb5ea36add} already exists in the pool channel=mychannel
2022-02-15 03:44:19.517 UTC [orderer.consensus.smartbft.consensus] Submit -> ERRO 41719 request {f35cd99687d0409362a271e1ac288eef1e5b070c0fed61b175b7322aafe98417 83f07bd154926ec01f25a9f57cdbbf2b1094603f09d1c86d4df4e1126166238a} already exists in the pool channel=byfn-sys-channel
2022-02-15 03:44:19.517 UTC [orderer.consensus.smartbft.consensus] addRequest -> INFO 4171a Request {f35cd99687d0409362a271e1ac288eef1e5b070c0fed61b175b7322aafe98417 83f07bd154926ec01f25a9f57cdbbf2b1094603f09d1c86d4df4e1126166238a} was not submitted, error: request {f35cd99687d0409362a271e1ac288eef1e5b070c0fed61b175b7322aafe98417 83f07bd154926ec01f25a9f57cdbbf2b1094603f09d1c86d4df4e1126166238a} already exists in the pool channel=byfn-sys-channel
2022-02-15 03:44:19.997 UTC [orderer.consensus.smartbft.consensus] processMsg -> WARN 4171b Node 9 got viewChange message view_change:<next_view:137 >  from 8 with view 137, expected view 204 channel=mychannel
2022-02-15 03:44:20.596 UTC [orderer.consensus.smartbft.egress] SendConsensus -> WARN 4171c Failed sending to 2: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup orderer2.example.com: no such host" channel=mychannel
2022-02-15 03:44:20.596 UTC [orderer.consensus.smartbft.egress] SendConsensus -> WARN 4171d Failed sending to 4: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup orderer4.example.com: Temporary failure in name resolution" channel=mychannel
2022-02-15 03:44:20.767 UTC [orderer.consensus.smartbft.consensus] getInFlight -> PANI 4171e Node 9's in flight proposal sequence is 18 while its last decision sequence is 19 channel=mychannel
panic: Node 9's in flight proposal sequence is 18 while its last decision sequence is 19

goroutine 325466 [running]:
github.com/hyperledger/fabric/vendor/go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc001d962c0, 0x0, 0x0, 0x0)
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/go.uber.org/zap/zapcore/entry.go:229 +0x545
github.com/hyperledger/fabric/vendor/go.uber.org/zap.(*SugaredLogger).log(0xc000010320, 0x4, 0x168dc85, 0x52, 0xc0030a37a0, 0x3, 0x3, 0x0, 0x0, 0x0)
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/go.uber.org/zap/sugar.go:234 +0x100
github.com/hyperledger/fabric/vendor/go.uber.org/zap.(*SugaredLogger).Panicf(...)
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/go.uber.org/zap/sugar.go:159
github.com/hyperledger/fabric/common/flogging.(*FabricLogger).Panicf(0xc000010328, 0x168dc85, 0x52, 0xc0030a37a0, 0x3, 0x3)
  /opt/gopath/src/github.com/hyperledger/fabric/common/flogging/zap.go:74 +0x7c
github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft.(*ViewChanger).getInFlight(0xc0000f6fc0, 0xc00049b9d0, 0x49)
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft/viewchanger.go:446 +0x377
github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft.(*ViewChanger).prepareViewDataMsg(0xc0000f6fc0, 0xa)
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft/viewchanger.go:389 +0x124
github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft.(*ViewChanger).processViewChangeMsg(0xc0000f6fc0, 0x0)
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft/viewchanger.go:376 +0x21d
github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft.(*ViewChanger).processMsg(0xc0000f6fc0, 0x3, 0xc0030a3650)
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft/viewchanger.go:262 +0xa2f
github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft.(*ViewChanger).run(0xc0000f6fc0)
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft/viewchanger.go:199 +0x2ba
github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft.(*ViewChanger).Start.func1(0xc0000f6fc0)
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft/viewchanger.go:137 +0x70
created by github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft.(*ViewChanger).Start
  /opt/gopath/src/github.com/hyperledger/fabric/vendor/github.com/SmartBFT-Go/consensus/internal/bft/viewchanger.go:134 +0x2c2
@yacovm
Copy link

yacovm commented Mar 19, 2022

I think I fixed this here, but we just didn't port it to Fabric yet.
@muzykantov do you intend to port the latest fixes in the library to Fabric?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants