Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic: runtime error: invalid memory address or nil pointer dereference #1308

Closed
antonydenyer opened this issue Jan 14, 2022 · 41 comments · Fixed by #1331
Closed

panic: runtime error: invalid memory address or nil pointer dereference #1308

antonydenyer opened this issue Jan 14, 2022 · 41 comments · Fixed by #1331
Assignees

Comments

@antonydenyer
Copy link
Contributor

antonydenyer commented Jan 14, 2022

System information

Geth version: geth version
Geth
Version: 1.9.25-stable
Git Commit: 919800f
Quorum Version: 21.10.1
Architecture: amd64
Protocol Versions: [65 64 63]
Network Id: 1337
Go Version: go1.15.6
Operating System: linux
GOPATH=
GOROOT=go

OS & Version: Windows/Linux/OSX
Running on Google CLoud Platform GKE Kubernetes
/ # cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.14.2
PRETTY_NAME="Alpine Linux v3.14"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://bugs.alpinelinux.org/"

Expected behaviour

GoQuorum should gracefully shut down and restart.

Actual behaviour

Private state db is corrupted

INFO [01-13|10:08:51.915] Initialising Ethereum protocol           name=eth     versions="[65 64 63]" network=1000 dbversion=8
INFO [01-13|10:08:51.915] Initialising Quorum consensus protocol   name=istanbul versions="[100 99 64]" network=1000 dbversion=8
WARN [01-13|10:08:51.918] DefaultPrivateStateRepository is nil when calling DefaultState 
WARN [01-13|10:08:51.918] Head private state missing, resetting chain number=30216 hash="caaab9…a70075"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xc6ecd4]

goroutine 1 [running]:
github.com/ethereum/go-ethereum/core/types.(*Block).NumberU64(...)
	github.com/ethereum/go-ethereum/core/types/block.go:348
github.com/ethereum/go-ethereum/core.(*BlockChain).SetHeadBeyondRoot(0xc000139900, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/core/blockchain.go:687 +0x1f4
github.com/ethereum/go-ethereum/core.(*BlockChain).SetHead(0xc000139900, 0x0, 0xc0000e8d20, 0x0)
	github.com/ethereum/go-ethereum/core/blockchain.go:566 +0x45
github.com/ethereum/go-ethereum/core.(*BlockChain).ResetWithGenesisBlock(0xc000139900, 0xc0000de750, 0x0, 0x0)
	github.com/ethereum/go-ethereum/core/blockchain.go:838 +0x65
github.com/ethereum/go-ethereum/core.(*BlockChain).Reset(0xc000139900, 0x1752c84, 0x2b)
	github.com/ethereum/go-ethereum/core/blockchain.go:831 +0x37
github.com/ethereum/go-ethereum/core.(*BlockChain).loadLastState(0xc000139900, 0x16de000, 0x0)
	github.com/ethereum/go-ethereum/core/blockchain.go:518 +0x14ff
github.com/ethereum/go-ethereum/core.NewBlockChain(0x19d6720, 0xc00042e520, 0xc0004ff260, 0xc00013ef00, 0x19d4ba0, 0xc00014c500, 0x0, 0x0, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/core/blockchain.go:313 +0x9ab
github.com/ethereum/go-ethereum/eth.New(0xc0003ef1e0, 0xc000348a00, 0x3, 0x0, 0x0)
	github.com/ethereum/go-ethereum/eth/backend.go:228 +0xd39
github.com/ethereum/go-ethereum/cmd/utils.RegisterEthService(0xc0003ef1e0, 0xc000348a00, 0x0, 0x3e8, 0x0)
	github.com/ethereum/go-ethereum/cmd/utils/flags.go:2017 +0xf1
main.makeFullNode(0xc0005a6840, 0x2549, 0x7, 0xc0006719a0)
	github.com/ethereum/go-ethereum/cmd/geth/config.go:180 +0x12f
main.geth(0xc0005a6840, 0x0, 0x0)
	github.com/ethereum/go-ethereum/cmd/geth/main.go:394 +0xf4
gopkg.in/urfave/cli%2ev1.HandleAction(0x149ec20, 0x1823898, 0xc0005a6840, 0xc00065d140, 0x0)
	gopkg.in/urfave/[email protected]/app.go:490 +0x82
gopkg.in/urfave/cli%2ev1.(*App).Run(0xc0003d5ba0, 0xc000132000, 0x4e, 0x50, 0x0, 0x0)
	gopkg.in/urfave/[email protected]/app.go:264 +0x5f5
main.main()

	github.com/ethereum/go-ethereum/cmd/geth/main.go:312 +0x55

Steps to reproduce the behaviour

Under heavy load with private contracts restart the server

The issue has been highlighted by investigations from #1287 into performance.

Thanks to @frankie-lim-sweeho

@antonydenyer
Copy link
Contributor Author

@frankie-lim-sweeho can you try something out for us?

In your container configuration can you add the following:

    lifecycle:
      preStop:
        exec:
          command: [ "/bin/sleep", "20" ]

I suspect this will give the process enough time to finish writing to the db.

@frankie-lim-partior
Copy link

@antonydenyer will test. Thanks

@frankie-lim-partior
Copy link

frankie-lim-partior commented Jan 16, 2022

@antonydenyer I have tested and I can still reproduce the issue
Step to reproduce:

  1. Start stateful Quorum pod with image: ghcr.io/baptiste-b-pegasys/quorum:fix21.10-3 (provided by @baptiste-b-pegasys) and preStop lifecycle:
    lifecycle:
      preStop:
        exec:
          command: [ "/bin/sleep", "20" ]
  1. Submit some private transactions
  2. Stop the stateful Quorum pod
  3. Start the stateful Quorum pod again
  4. Pod will crash with following error. Pod stuck in restart loop, cannot recover until clear persistent storage. Log attached
INFO [01-16|03:57:57.863] Target Private Tx Manager                name=Tessera distributionVersion=21.10.0
INFO [01-16|03:57:57.863] Allocated trie memory caches             clean=165.00MiB dirty=165.00MiB
INFO [01-16|03:57:57.863] Allocated cache and file handles         database=/etc/quorum/qdata/dd/geth/chaindata cache=496.00MiB handles=524288
DEBUG[01-16|03:57:57.881] Chain freezer table opened               database=/etc/quorum/qdata/dd/geth/chaindata/ancient table=hashes items=0 size=0.00B
DEBUG[01-16|03:57:57.882] Chain freezer table opened               database=/etc/quorum/qdata/dd/geth/chaindata/ancient table=bodies items=0 size=0.00B
DEBUG[01-16|03:57:57.882] Chain freezer table opened               database=/etc/quorum/qdata/dd/geth/chaindata/ancient table=receipts items=0 size=0.00B
DEBUG[01-16|03:57:57.882] Chain freezer table opened               database=/etc/quorum/qdata/dd/geth/chaindata/ancient table=diffs    items=0 size=0.00B
DEBUG[01-16|03:57:57.883] Chain freezer table opened               database=/etc/quorum/qdata/dd/geth/chaindata/ancient table=headers  items=0 size=0.00B
INFO [01-16|03:57:57.883] Opened ancient database                  database=/etc/quorum/qdata/dd/geth/chaindata/ancient
DEBUG[01-16|03:57:57.883] Current full block not old enough        number=2824 hash="fb1589…551923" delay=3162240
DEBUG[01-16|03:57:57.884] Account Extra Data root                  hash="000000…000000"
INFO [01-16|03:57:57.884] Initialised chain configuration          config="{ChainID: 1000 Homestead: 0 DAO: <nil> DAOSupport: false EIP150: 0 EIP155: 0 EIP158: 0 Byzantium: 0 IsQuorum: true Constantinople: 0 TransactionSizeLimit: 64 MaxCodeSize: 0 Petersburg: 0 Istanbul: 0, Muir Glacier: <nil> YOLO v2: <nil> PrivacyEnhancements: <nil> PrivacyPrecompile: <nil> Engine: istanbul}"
INFO [01-16|03:57:57.885] Initialising Ethereum protocol           name=eth     versions="[65 64 63]" network=1000 dbversion=8
INFO [01-16|03:57:57.885] Initialising Quorum consensus protocol   name=istanbul versions="[100 99 64]" network=1000 dbversion=8
WARN [01-16|03:57:57.886] DefaultPrivateStateRepository is nil when calling DefaultState 
WARN [01-16|03:57:57.886] Head private state missing, resetting chain number=2824 hash="fb1589…551923"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xc6ecd4]

goroutine 1 [running]:
github.com/ethereum/go-ethereum/core/types.(*Block).NumberU64(...)
	github.com/ethereum/go-ethereum/core/types/block.go:348
github.com/ethereum/go-ethereum/core.(*BlockChain).SetHeadBeyondRoot(0xc000139900, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/core/blockchain.go:687 +0x1f4
github.com/ethereum/go-ethereum/core.(*BlockChain).SetHead(0xc000139900, 0x0, 0xc000194d20, 0x0)
	github.com/ethereum/go-ethereum/core/blockchain.go:566 +0x45
github.com/ethereum/go-ethereum/core.(*BlockChain).ResetWithGenesisBlock(0xc000139900, 0xc000489a70, 0x0, 0x0)
	github.com/ethereum/go-ethereum/core/blockchain.go:838 +0x65
github.com/ethereum/go-ethereum/core.(*BlockChain).Reset(0xc000139900, 0x1752c84, 0x2b)
	github.com/ethereum/go-ethereum/core/blockchain.go:831 +0x37
github.com/ethereum/go-ethereum/core.(*BlockChain).loadLastState(0xc000139900, 0x16de000, 0x0)
	github.com/ethereum/go-ethereum/core/blockchain.go:518 +0x14ff
github.com/ethereum/go-ethereum/core.NewBlockChain(0x19d6720, 0xc0005f81c0, 0xc000131c20, 0xc00013e800, 0x19d4ba0, 0xc00014c3c0, 0x0, 0x0, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/core/blockchain.go:313 +0x9ab
github.com/ethereum/go-ethereum/eth.New(0xc000117c70, 0xc0005ed100, 0x3, 0x0, 0x0)
	github.com/ethereum/go-ethereum/eth/backend.go:228 +0xd39
github.com/ethereum/go-ethereum/cmd/utils.RegisterEthService(0xc000117c70, 0xc0005ed100, 0x0, 0x3e8, 0x0)
	github.com/ethereum/go-ethereum/cmd/utils/flags.go:2017 +0xf1
main.makeFullNode(0xc0006a2160, 0x2549, 0x7, 0xc00069b6a0)
	github.com/ethereum/go-ethereum/cmd/geth/config.go:180 +0x12f
main.geth(0xc0006a2160, 0x0, 0x0)
	github.com/ethereum/go-ethereum/cmd/geth/main.go:394 +0xf4
gopkg.in/urfave/cli%2ev1.HandleAction(0x149ec20, 0x1823898, 0xc0006a2160, 0xc00067e060, 0x0)
	gopkg.in/urfave/[email protected]/app.go:490 +0x82
gopkg.in/urfave/cli%2ev1.(*App).Run(0xc000411a00, 0xc000132000, 0x4e, 0x50, 0x0, 0x0)
	gopkg.in/urfave/[email protected]/app.go:264 +0x5f5
main.main()
	github.com/ethereum/go-ethereum/cmd/geth/main.go:312 +0x55

image
head-private-missing-crash-20s-sleep.log

@frankie-lim-partior
Copy link

@antonydenyer and @baptiste-b-pegasys, with version 2 of the fix (image: ghcr.io/baptiste-b-pegasys/quorum:fix21.10-2), I got a difference panic error, but atleast this panic in version 2 do not stuck in crash loop forever, will recover after restart. So it must be the different code change between version 2 (image: ghcr.io/baptiste-b-pegasys/quorum:fix21.10-2) and version 3 (image: ghcr.io/baptiste-b-pegasys/quorum:fix21.10-3) that cause the infinite/unrecoverable crash.

@baptiste-b-pegasys mentioned version 3 will cause a reset of the storage when it happens. possible this reset of storage cause the unrecoverable crash? please check the code.

version 2 (image: ghcr.io/baptiste-b-pegasys/quorum:fix21.10-2) crash with full log attached

DEBUG[01-16|04:22:44.755] Account Extra Data root                  hash="000000…000000"
DEBUG[01-16|04:22:44.755] Rewound to block with state              number=0    hash="66d413…a06c1f"
DEBUG[01-16|04:22:44.784] Account Extra Data root                  hash="000000…000000"
INFO [01-16|04:22:44.784] Loaded most recent local header          number=0    hash="66d413…a06c1f" td=1    age=52y9mo2w
INFO [01-16|04:22:44.784] Loaded most recent local full block      number=0    hash="66d413…a06c1f" td=1    age=52y9mo2w
INFO [01-16|04:22:44.784] Loaded most recent local fast block      number=0    hash="66d413…a06c1f" td=1    age=52y9mo2w
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xc86046]

goroutine 1 [running]:
github.com/ethereum/go-ethereum/core.(*BlockChain).SubscribeChainHeadEvent(0x0, 0xc0043b99e0, 0xc0043b99e0, 0x19)
	github.com/ethereum/go-ethereum/core/blockchain.go:2838 +0x26
github.com/ethereum/go-ethereum/core.(*ChainIndexer).Start(0xc0001963c0, 0x19a80c0, 0x0)
	github.com/ethereum/go-ethereum/core/chain_indexer.go:150 +0x5d
github.com/ethereum/go-ethereum/eth.New(0xc0003fb380, 0xc0001d8a00, 0x3, 0x0, 0x0)
	github.com/ethereum/go-ethereum/eth/backend.go:239 +0xdbb
github.com/ethereum/go-ethereum/cmd/utils.RegisterEthService(0xc0003fb380, 0xc0001d8a00, 0x0, 0x3e8, 0x0)
	github.com/ethereum/go-ethereum/cmd/utils/flags.go:2017 +0xf1
main.makeFullNode(0xc000154580, 0x2549, 0x7, 0xc00066c160)
	github.com/ethereum/go-ethereum/cmd/geth/config.go:180 +0x12f
main.geth(0xc000154580, 0x0, 0x0)
	github.com/ethereum/go-ethereum/cmd/geth/main.go:394 +0xf4
gopkg.in/urfave/cli%2ev1.HandleAction(0x149ec20, 0x1823890, 0xc000154580, 0xc0006464e0, 0x0)
	gopkg.in/urfave/[email protected]/app.go:490 +0x82
gopkg.in/urfave/cli%2ev1.(*App).Run(0xc000118000, 0xc000132000, 0x4e, 0x50, 0x0, 0x0)
	gopkg.in/urfave/[email protected]/app.go:264 +0x5f5
main.main()
	github.com/ethereum/go-ethereum/cmd/geth/main.go:312 +0x55

quorum-fix21.10-2-crash-on-load.log.zip

@frankie-lim-partior
Copy link

frankie-lim-partior commented Jan 16, 2022

@baptiste-b-pegasys @antonydenyer , finally I am able to capture the actual quorum crash log while sending high TPS transaction to Quorum using image: ghcr.io/baptiste-b-pegasys/quorum:fix21.10-3, which has the fix for original 'fatal error: concurrent map iteration and map write' issue.

  1. This new 'fatal error: concurrent map iteration and map write' ( with quorum:fix21.10-3) is much harder to reproduce and rare, compare to the original "fatal error: concurrent map iteration and map write" issue in official release v21.10.0.
  2. Once this fatal error occur, pod will restart, and "panic: runtime error: invalid memory address or nil pointer dereference" will happen causing the pod to crash and restart indefinitely.
    3. In other word, PR for "Fix errors when there is high load" Fix errors when there is high load #1302 (that implement the fix used in the test image quorum:fix21.10-3) may introduce an undesirable panic issue that will corrupt the private store.
DEBUG[01-16|05:27:44.379] Executing EVM call finished              runtime="352.829µs"
TRACE[01-16|05:27:44.379] Sent transactions                        id=523d1fd8dc3b8af0 conn=staticdial           count=1
TRACE[01-16|05:27:44.380] Sent transactions                        id=520db3598a65f1c7 conn=staticdial           count=1
DEBUG[01-16|05:27:44.380] Executing EVM call finished              runtime="283.345µs"
fatal error: concurrent map iteration and map write
INFO [01-16|05:27:44.380] QUORUM-CHECKPOINT                        name=TX-COMPLETED            tx=0x03b1db5d89b1169992a2493a7e48344026c6e854bf85070b2abe575e3c8d5f92 time="880.132µs"

goroutine 207759 [running]:
runtime.throw(0x174918a, 0x26)
	runtime/panic.go:1116 +0x72 fp=0xc010d15b60 sp=0xc010d15b30 pc=0x450c52
runtime.mapiternext(0xc010d15eb8)
	runtime/map.go:853 +0x554 fp=0xc010d15be0 sp=0xc010d15b60 pc=0x429f14
runtime.mapiterinit(0x14f9680, 0xc0087a5680, 0xc010d15eb8)
	runtime/map.go:843 +0x1c5 fp=0xc010d15c00 sp=0xc010d15be0 pc=0x4298c5
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc008392f00, 0xc003f93080)
	github.com/ethereum/go-ethereum/core/state/statedb.go:774 +0x3d5 fp=0xc010d15f28 sp=0xc010d15c00 pc=0x95e695
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc0005b4900, 0x1712e4d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc010d15ff0 sp=0xc010d15f28 pc=0xe9e9d0
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc000655a10, 0x19be5a0, 0xc00985c240, 0xfffffffffffffffe, 0xc002c828c0, 0x0, 0x0, 0x3c, 0x21eb8fe)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc010d160e8 sp=0xc010d15ff0 pc=0xf01085
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc000655a10, 0x19be5a0, 0xc00985c240, 0xc007cb5518, 0x0, 0x0, 0x0, 0xc009e40fc0, 0xc002cdea80, 0x1821028, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc010d16260 sp=0xc010d160e8 pc=0xf018f8
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be5a0, 0xc00985c240, 0x19dd860, 0xc000655a10, 0xc008c421b8, 0xc0096cf740, 0xc007cb5678, 0xc004538560, 0xc0056c19e0, 0xc004538540, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc010d16648 sp=0xc010d16260 pc=0xdb8306
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0x5376ca6, 0x19be500, 0xc00c6438f0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc010d16fc8 sp=0xc010d16648 pc=0xdd0413
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be5a0, 0xc00985c240, 0x19dd860, 0xc000655a10, 0xc008c421b8, 0xc0096cf740, 0xc007cb5678, 0xc004538560, 0xc0056c19e0, 0xc004538540, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc010d17268 sp=0xc010d16fc8 pc=0xdb9a3a
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc008c42160, 0x19be5a0, 0xc00985c240, 0x19dd860, 0xc000655a10, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc010d17380 sp=0xc010d17268 pc=0xdc0c19
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc0000e9420, 0x19be5a0, 0xc00985c240, 0x0, 0x0, 0xc00c1c0140, 0x1, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc010d17828 sp=0xc010d17380 pc=0xdc3e37
runtime.call256(0xc00c4c22d0, 0xc000bf46c0, 0xc008a3d770, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc010d17938 sp=0xc010d17828 pc=0x48821c
reflect.Value.call(0xc0001a5280, 0xc000bf46c0, 0x13, 0x170e4f4, 0x4, 0xc0067d5040, 0x3, 0x3, 0xc0067d5070, 0xc004538500, ...)
	reflect/value.go:476 +0x8c7 fp=0xc010d17b50 sp=0xc010d17938 pc=0x4bcc87
reflect.Value.Call(0xc0001a5280, 0xc000bf46c0, 0x13, 0xc0067d5040, 0x3, 0x3, 0x16, 0x7fd6f3f87620, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc010d17bd0 sp=0xc010d17b50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc000c10120, 0x19be5a0, 0xc00985c240, 0xc0096cf6e0, 0x13, 0xc004538500, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc010d17cd8 sp=0xc010d17bd0 pc=0xbbd125
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc0072c7440, 0x19be5a0, 0xc00985c1e0, 0xc0083c7ea0, 0xc000c10120, 0xc004538500, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc010d17d48 sp=0xc010d17cd8 pc=0xbb4719
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc0072c7440, 0xc00985c180, 0xc0083c7ea0, 0x203002)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc010d17e30 sp=0xc010d17d48 pc=0xbb3605
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc0072c7440, 0xc00985c180, 0xc0083c7ea0, 0x19a7501)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc010d17ef8 sp=0xc010d17e30 pc=0xbb2b5e
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc00985c180)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc010d17f68 sp=0xc010d17ef8 pc=0xbc27c6
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc0072c7440, 0xc0045384c0)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc010d17fd0 sp=0xc010d17f68 pc=0xbc2992
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc010d17fd8 sp=0xc010d17fd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

goroutine 1 [chan receive, 14 minutes]:
github.com/ethereum/go-ethereum/node.(*Node).Wait(...)
	github.com/ethereum/go-ethereum/node/node.go:445
main.geth(0xc000154f20, 0x0, 0x0)
	github.com/ethereum/go-ethereum/cmd/geth/main.go:399 +0x157
gopkg.in/urfave/cli%2ev1.HandleAction(0x149ec20, 0x1823898, 0xc000154f20, 0xc0001b97a0, 0x0)
	gopkg.in/urfave/[email protected]/app.go:490 +0x82
gopkg.in/urfave/cli%2ev1.(*App).Run(0xc000119520, 0xc000132000, 0x4e, 0x50, 0x0, 0x0)
	gopkg.in/urfave/[email protected]/app.go:264 +0x5f5
main.main()
	github.com/ethereum/go-ethereum/cmd/geth/main.go:312 +0x55

Full log
fix3-crash-2.log.zip

@baptiste-b-pegasys
Copy link
Contributor

Can I have the tools you have in order to reproduce that. Thank you

@baptiste-b-pegasys
Copy link
Contributor

You can try two images:
docker pull ghcr.io/baptiste-b-pegasys/quorum:fix21.10-4 : add a new lock
docker pull ghcr.io/baptiste-b-pegasys/quorum:fix21.10-5 : with the lock + same periodical write of the journals

@frankie-lim-partior
Copy link

@baptiste-b-pegasys thanks! will test .

I will also document as detail as possible how to compile and setup Hyperledger Caliper for Quorum SUT.

@frankie-lim-partior
Copy link

Test Result
Test Target:
ghcr.io/baptiste-b-pegasys/quorum:fix21.10-4 : add a new lock

Result:

  1. FAILED - "fatal error: concurrent map iteration and map write" while sending high TPS private transaction
  2. Full log attached
  3. Post fatal error, Quorum pod restart and works without corrupting the persistent storage
  4. Tested the scenario where we stop then restart the pod. the Quorum pod still working fine without corrupting the persistent storage.
  5. NOT reproducible in this build - "panic: runtime error: invalid memory address or nil pointer dereference"
TRACE[01-18|06:32:36.100] Sent transactions                        id=6130c51a3fefe057 conn=staticdial           count=1
DEBUG[01-18|06:32:36.100] Served eth_sendTransaction               reqid=14723     t=5.079143806s
DEBUG[01-18|06:32:36.100] Executing EVM call finished              runtime=2.050055ms
fatal error: concurrent map iteration and map write

goroutine 897871 [running]:
runtime.throw(0x174913b, 0x26)
	runtime/panic.go:1116 +0x72 fp=0xc009cc7b60 sp=0xc009cc7b30 pc=0x450c52
runtime.mapiternext(0xc009cc7eb8)
	runtime/map.go:853 +0x554 fp=0xc009cc7be0 sp=0xc009cc7b60 pc=0x429f14
runtime.mapiterinit(0x14f9620, 0xc009f542d0, 0xc009cc7eb8)
	runtime/map.go:843 +0x1c5 fp=0xc009cc7c00 sp=0xc009cc7be0 pc=0x4298c5
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc00a011800, 0xc00f1e8a80)
	github.com/ethereum/go-ethereum/core/state/statedb.go:786 +0x3d5 fp=0xc009cc7f28 sp=0xc009cc7c00 pc=0x95eab5
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc0003d6000, 0x1712e2d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc009cc7ff0 sp=0xc009cc7f28 pc=0xe9e390
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc0006ccdb0, 0x19be3a0, 0xc009b95710, 0xfffffffffffffffe, 0xc00da116c0, 0x0, 0x0, 0x3c, 0x21eb816)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc009cc80e8 sp=0xc009cc7ff0 pc=0xf00a45
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc0006ccdb0, 0x19be3a0, 0xc009b95710, 0xc00b8d98d8, 0x0, 0x0, 0x0, 0xc0084ab200, 0xc00f91aa80, 0x1820ee0, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc009cc8260 sp=0xc009cc80e8 pc=0xf012b8
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be3a0, 0xc009b95710, 0x19dd640, 0xc0006ccdb0, 0xc00977d188, 0xc009a24be0, 0xc00b8d9c18, 0xc009b7bb60, 0xc00872fae0, 0xc009b7bb40, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc009cc8648 sp=0xc009cc8260 pc=0xdb7cc6
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0x14e18ae, 0x19be300, 0xc007b75470, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc009cc8fc8 sp=0xc009cc8648 pc=0xdcfdd3
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be3a0, 0xc009b95710, 0x19dd640, 0xc0006ccdb0, 0xc00977d188, 0xc009a24be0, 0xc00b8d9c18, 0xc009b7bb60, 0xc00872fae0, 0xc009b7bb40, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc009cc9268 sp=0xc009cc8fc8 pc=0xdb93fa
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc00977d130, 0x19be3a0, 0xc009b95710, 0x19dd640, 0xc0006ccdb0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc009cc9380 sp=0xc009cc9268 pc=0xdc05d9
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc00000f900, 0x19be3a0, 0xc009b95710, 0x0, 0x0, 0xc00a8f9b80, 0x1, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc009cc9828 sp=0xc009cc9380 pc=0xdc37f7
runtime.call256(0xc0111199e0, 0xc000c6a4f8, 0xc009bd03c0, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc009cc9938 sp=0xc009cc9828 pc=0x48821c
reflect.Value.call(0xc000656c80, 0xc000c6a4f8, 0x13, 0x170e4d4, 0x4, 0xc009b2f400, 0x3, 0x3, 0xc009b2f430, 0xc009b7bb00, ...)
	reflect/value.go:476 +0x8c7 fp=0xc009cc9b50 sp=0xc009cc9938 pc=0x4bcc87
reflect.Value.Call(0xc000656c80, 0xc000c6a4f8, 0x13, 0xc009b2f400, 0x3, 0x3, 0x16, 0x7fef0bd0cee8, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc009cc9bd0 sp=0xc009cc9b50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc000c7b800, 0x19be3a0, 0xc009b95710, 0xc009a24b80, 0x13, 0xc009b7bb00, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc009cc9cd8 sp=0xc009cc9bd0 pc=0xbbd825
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc0095b1e60, 0x19be3a0, 0xc009b956b0, 0xc009bfa000, 0xc000c7b800, 0xc009b7bb00, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc009cc9d48 sp=0xc009cc9cd8 pc=0xbb4e19
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc0095b1e60, 0xc009b95650, 0xc009bfa000, 0x203002)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc009cc9e30 sp=0xc009cc9d48 pc=0xbb3d05
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc0095b1e60, 0xc009b95650, 0xc009bfa000, 0x19a7301)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc009cc9ef8 sp=0xc009cc9e30 pc=0xbb325e
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc009b95650)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc009cc9f68 sp=0xc009cc9ef8 pc=0xbc2ec6
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc0095b1e60, 0xc009b7bac0)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc009cc9fd0 sp=0xc009cc9f68 pc=0xbc3092
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc009cc9fd8 sp=0xc009cc9fd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

fix4-3.log.zip

@frankie-lim-partior
Copy link

frankie-lim-partior commented Jan 18, 2022

Test Result
Test Target:
ghcr.io/baptiste-b-pegasys/quorum:fix21.10-5 : with the lock + same periodical write of the journals

Result:

  1. FAILED - "fatal error: concurrent map iteration and map write" while sending high TPS private transaction
  2. Full log attached
  3. Post fatal error, Quorum pod restart and works without corrupting the persistent storage
  4. Tested the scenario where we stop then restart the pod. the Quorum pod still working fine without corrupting the persistent storage.
  5. NOT reproducible in this build - "panic: runtime error: invalid memory address or nil pointer dereference"
fatal error: concurrent map iteration and map write
goroutine 321521 [running]:
runtime.throw(0x1748f3b, 0x26)
	runtime/panic.go:1116 +0x72 fp=0xc00efdbb60 sp=0xc00efdbb30 pc=0x450c52
runtime.mapiternext(0xc00efdbe58)
	runtime/map.go:853 +0x554 fp=0xc00efdbbe0 sp=0xc00efdbb60 pc=0x429f14
runtime.mapiterinit(0x14f9560, 0xc00e43af00, 0xc00efdbe58)
	runtime/map.go:843 +0x1c5 fp=0xc00efdbc00 sp=0xc00efdbbe0 pc=0x4298c5
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc00cddb380, 0xc00a855c80)
	github.com/ethereum/go-ethereum/core/state/statedb.go:810 +0x815 fp=0xc00efdbf28 sp=0xc00efdbc00 pc=0x95edf5
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc007d3afc0, 0x1712c2d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc00efdbff0 sp=0xc00efdbf28 pc=0xe9e270
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc00b545140, 0x19be120, 0xc00a936330, 0xfffffffffffffffe, 0x0, 0x0, 0x0, 0x0, 0x181fa40)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc00efdc0e8 sp=0xc00efdbff0 pc=0xf00925
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc00b545140, 0x19be120, 0xc00a936330, 0xc01d8f2348, 0x0, 0x8bc8ded9cb6fd200, 0xb46701450e30c1bb, 0x823eb60aabb4e4aa, 0x9471a35516e29b8b, 0x2afb2eb4d2ac8795, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc00efdc260 sp=0xc00efdc0e8 pc=0xf01198
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be120, 0xc00a936330, 0x19dd3c0, 0xc00b545140, 0xc00c462268, 0xc0079013a0, 0xc01d8f2358, 0xc00af5c820, 0xc00df95140, 0xc00af5c800, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc00efdc648 sp=0xc00efdc260 pc=0xdb7ba6
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0x14dcbc83, 0x19be120, 0xc00a936330, 0xc01d8f2348, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc00efdcfc8 sp=0xc00efdc648 pc=0xdcfcb3
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be120, 0xc00a936330, 0x19dd3c0, 0xc00b545140, 0xc00c462268, 0xc0079013a0, 0xc01d8f2358, 0xc00af5c820, 0xc00df95140, 0xc00af5c800, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc00efdd268 sp=0xc00efdcfc8 pc=0xdb92da
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc00c462210, 0x19be120, 0xc00a936330, 0x19dd3c0, 0xc00b545140, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc00efdd380 sp=0xc00efdd268 pc=0xdc04b9
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc00b5fa060, 0x19be120, 0xc00a936330, 0x0, 0x0, 0xc01cd2a540, 0x1, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc00efdd828 sp=0xc00efdd380 pc=0xdc36d7
runtime.call256(0xc00609e1e0, 0xc00086b5a8, 0xc00c2105a0, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc00efdd938 sp=0xc00efdd828 pc=0x48821c
reflect.Value.call(0xc0008b4a00, 0xc00086b5a8, 0x13, 0x170e2d4, 0x4, 0xc01cc14aa0, 0x3, 0x3, 0xc01cc14ad0, 0xc00af5c7c0, ...)
	reflect/value.go:476 +0x8c7 fp=0xc00efddb50 sp=0xc00efdd938 pc=0x4bcc87
reflect.Value.Call(0xc0008b4a00, 0xc00086b5a8, 0x13, 0xc01cc14aa0, 0x3, 0x3, 0x16, 0x7f3bfc251538, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc00efddbd0 sp=0xc00efddb50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc0009297a0, 0x19be120, 0xc00a936330, 0xc007901180, 0x13, 0xc00af5c7c0, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc00efddcd8 sp=0xc00efddbd0 pc=0xbbd725
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc0098126c0, 0x19be120, 0xc00a9362d0, 0xc00e1a3f80, 0xc0009297a0, 0xc00af5c7c0, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc00efddd48 sp=0xc00efddcd8 pc=0xbb4d19
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc0098126c0, 0xc00a936270, 0xc00e1a3f80, 0x203002)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc00efdde30 sp=0xc00efddd48 pc=0xbb3c05
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc0098126c0, 0xc00a936270, 0xc00e1a3f80, 0x19a7001)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc00efddef8 sp=0xc00efdde30 pc=0xbb315e
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc00a936270)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc00efddf68 sp=0xc00efddef8 pc=0xbc2dc6
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc0098126c0, 0xc00af5c680)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc00efddfd0 sp=0xc00efddf68 pc=0xbc2f92
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc00efddfd8 sp=0xc00efddfd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

fix5-3.log.zip

@baptiste-b-pegasys
Copy link
Contributor

sorry my mistake, I forgot one lock at the Copy place

ghcr.io/baptiste-b-pegasys/quorum:fix21.10-6

@frankie-lim-partior
Copy link

cool. Let me try baptiste-b-pegasys/quorum:fix21.10-6 now!

@frankie-lim-partior
Copy link

@baptiste-b-pegasys fix baptiste-b-pegasys/quorum:fix21.10-6 failed immediately at first attempt with a new error "fatal error: sync: unlock of unlocked mutex"

Full log attached.

fatal error: sync: unlock of unlocked mutex

goroutine 28450 [running]:
runtime.throw(0x17393b8, 0x1e)
	runtime/panic.go:1116 +0x72 fp=0xc01137bb80 sp=0xc01137bb50 pc=0x450c52
sync.throw(0x17393b8, 0x1e)
	runtime/panic.go:1102 +0x35 fp=0xc01137bba0 sp=0xc01137bb80 pc=0x4843f5
sync.(*Mutex).unlockSlow(0xc0078dc9e0, 0xffffffff)
	sync/mutex.go:196 +0xd8 fp=0xc01137bbc8 sp=0xc01137bba0 pc=0x4a33d8
sync.(*Mutex).Unlock(0xc0078dc9e0)
	sync/mutex.go:190 +0x48 fp=0xc01137bbe8 sp=0xc01137bbc8 pc=0x4a32e8
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc013322900, 0xc00d8a8780)
	github.com/ethereum/go-ethereum/core/state/statedb.go:835 +0xe45 fp=0xc01137bf28 sp=0xc01137bbe8 pc=0x95f425
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc0003bd680, 0x1712c2d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc01137bff0 sp=0xc01137bf28 pc=0xe9e310
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc001408510, 0x19be180, 0xc009278930, 0xfffffffffffffffe, 0xc00d632700, 0x0, 0x0, 0x3c, 0x21eb56e)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc01137c0e8 sp=0xc01137bff0 pc=0xf009c5
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc001408510, 0x19be180, 0xc009278930, 0xc014ff74c0, 0x0, 0x0, 0x0, 0xc00d894b40, 0xc00d914a80, 0x1820cd8, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc01137c260 sp=0xc01137c0e8 pc=0xf01238
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be180, 0xc009278930, 0x19dd420, 0xc001408510, 0xc0005d6c08, 0xc00962ede0, 0xc014ff7db0, 0xc010695e60, 0xc00d6286a0, 0xc010695e40, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc01137c648 sp=0xc01137c260 pc=0xdb7c46
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0xac03c, 0x19be100, 0xc00833b710, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc01137cfc8 sp=0xc01137c648 pc=0xdcfd53
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be180, 0xc009278930, 0x19dd420, 0xc001408510, 0xc0005d6c08, 0xc00962ede0, 0xc014ff7db0, 0xc010695e60, 0xc00d6286a0, 0xc010695e40, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc01137d268 sp=0xc01137cfc8 pc=0xdb937a
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc0005d6bb0, 0x19be180, 0xc009278930, 0x19dd420, 0xc001408510, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc01137d380 sp=0xc01137d268 pc=0xdc0559
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc004d3eb60, 0x19be180, 0xc009278930, 0x0, 0x0, 0xc002e19940, 0x1, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc01137d828 sp=0xc01137d380 pc=0xdc3777
runtime.call256(0xc00ce0b740, 0xc0032c6258, 0xc0027e8f00, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc01137d938 sp=0xc01137d828 pc=0x48821c
reflect.Value.call(0xc010551780, 0xc0032c6258, 0x13, 0x170e2d4, 0x4, 0xc00b243e50, 0x3, 0x3, 0xc00b243e80, 0xc010695e00, ...)
	reflect/value.go:476 +0x8c7 fp=0xc01137db50 sp=0xc01137d938 pc=0x4bcc87
reflect.Value.Call(0xc010551780, 0xc0032c6258, 0x13, 0xc00b243e50, 0x3, 0x3, 0x16, 0x7fb525bb09a8, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc01137dbd0 sp=0xc01137db50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc00c0768a0, 0x19be180, 0xc009278930, 0xc00962eca0, 0x13, 0xc010695e00, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc01137dcd8 sp=0xc01137dbd0 pc=0xbbd7c5
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc00fad0120, 0x19be180, 0xc0092788d0, 0xc00eb87570, 0xc00c0768a0, 0xc010695e00, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc01137dd48 sp=0xc01137dcd8 pc=0xbb4db9
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc00fad0120, 0xc009278870, 0xc00eb87570, 0x203002)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc01137de30 sp=0xc01137dd48 pc=0xbb3ca5
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc00fad0120, 0xc009278870, 0xc00eb87570, 0x19a7101)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc01137def8 sp=0xc01137de30 pc=0xbb31fe
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc009278870)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc01137df68 sp=0xc01137def8 pc=0xbc2e66
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc00fad0120, 0xc010695d00)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc01137dfd0 sp=0xc01137df68 pc=0xbc3032
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc01137dfd8 sp=0xc01137dfd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

fix6-1.log.zip

@baptiste-b-pegasys baptiste-b-pegasys self-assigned this Jan 18, 2022
@baptiste-b-pegasys
Copy link
Contributor

ok inner call
please check: ghcr.io/baptiste-b-pegasys/quorum:fix21.10-7

@frankie-lim-partior
Copy link

@baptiste-b-pegasys

ghcr.io/baptiste-b-pegasys/quorum:fix21.10-7 failed and crash at first attempt. full log attached.

sync: unlock of unlocked mutex

goroutine 25843 [running]:
runtime.throw(0x17393b8, 0x1e)
	runtime/panic.go:1116 +0x72 fp=0xc004b0dbb8 sp=0xc004b0db88 pc=0x450c52
sync.throw(0x17393b8, 0x1e)
	runtime/panic.go:1102 +0x35 fp=0xc004b0dbd8 sp=0xc004b0dbb8 pc=0x4843f5
sync.(*Mutex).unlockSlow(0xc008d8ba60, 0xc0ffffffff)
	sync/mutex.go:196 +0xd8 fp=0xc004b0dc00 sp=0xc004b0dbd8 pc=0x4a33d8
sync.(*Mutex).Unlock(...)
	sync/mutex.go:190
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc012ee0900, 0xc013b43500)
	github.com/ethereum/go-ethereum/core/state/statedb.go:802 +0xe45 fp=0xc004b0df28 sp=0xc004b0dc00 pc=0x95f425
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc0079a4900, 0x1712c2d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc004b0dff0 sp=0xc004b0df28 pc=0xe9e2f0
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc00798d770, 0x19be160, 0xc00715e4e0, 0xfffffffffffffffe, 0xc0120090a0, 0x0, 0x0, 0x3c, 0x21eb56e)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc004b0e0e8 sp=0xc004b0dff0 pc=0xf009a5
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc00798d770, 0x19be160, 0xc00715e4e0, 0xc007016a58, 0x0, 0x0, 0x0, 0xc011fc1b00, 0xc0120b6a80, 0x1820cd8, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc004b0e260 sp=0xc004b0e0e8 pc=0xf01218
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be160, 0xc00715e4e0, 0x19dd400, 0xc00798d770, 0xc00713d398, 0xc02491eb60, 0xc007016b50, 0xc0071602e0, 0xc0122c6a80, 0xc0071602c0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc004b0e648 sp=0xc004b0e260 pc=0xdb7c26
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0xa6e8745, 0x19be100, 0xc011c78630, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc004b0efc8 sp=0xc004b0e648 pc=0xdcfd33
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be160, 0xc00715e4e0, 0x19dd400, 0xc00798d770, 0xc00713d398, 0xc02491eb60, 0xc007016b50, 0xc0071602e0, 0xc0122c6a80, 0xc0071602c0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc004b0f268 sp=0xc004b0efc8 pc=0xdb935a
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc00713d340, 0x19be160, 0xc00715e4e0, 0x19dd400, 0xc00798d770, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc004b0f380 sp=0xc004b0f268 pc=0xdc0539
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc0079a8940, 0x19be160, 0xc00715e4e0, 0x0, 0x0, 0xc005aed100, 0x1, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc004b0f828 sp=0xc004b0f380 pc=0xdc3757
runtime.call256(0xc0061828a0, 0xc00064e5f0, 0xc0071565a0, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc004b0f938 sp=0xc004b0f828 pc=0x48821c
reflect.Value.call(0xc0000f1100, 0xc00064e5f0, 0x13, 0x170e2d4, 0x4, 0xc007136370, 0x3, 0x3, 0xc0071363a0, 0xc007160280, ...)
	reflect/value.go:476 +0x8c7 fp=0xc004b0fb50 sp=0xc004b0f938 pc=0x4bcc87
reflect.Value.Call(0xc0000f1100, 0xc00064e5f0, 0x13, 0xc007136370, 0x3, 0x3, 0x16, 0x7f94378d1b60, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc004b0fbd0 sp=0xc004b0fb50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc000e02300, 0x19be160, 0xc00715e4e0, 0xc02491eb00, 0x13, 0xc007160280, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc004b0fcd8 sp=0xc004b0fbd0 pc=0xbbd7a5
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc006451950, 0x19be160, 0xc00715e480, 0xc007027340, 0xc000e02300, 0xc007160280, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc004b0fd48 sp=0xc004b0fcd8 pc=0xbb4d99
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc006451950, 0xc00715e420, 0xc007027340, 0x203001)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc004b0fe30 sp=0xc004b0fd48 pc=0xbb3c85
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc006451950, 0xc00715e420, 0xc007027340, 0x19a7001)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc004b0fef8 sp=0xc004b0fe30 pc=0xbb31de
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc00715e420)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc004b0ff68 sp=0xc004b0fef8 pc=0xbc2e46
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc006451950, 0xc007160240)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc004b0ffd0 sp=0xc004b0ff68 pc=0xbc3012
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc004b0ffd8 sp=0xc004b0ffd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

fix7-1.log.zip

@baptiste-b-pegasys
Copy link
Contributor

hello, another try
ghcr.io/baptiste-b-pegasys/quorum:fix21.10-8

whould be nice to have the tool for reproducing
thanks

@frankie-lim-partior
Copy link

@baptiste-b-pegasys thanks! will try.
I will try to find time this week to details down how exactly you can compile the Hyperledger support for Quorum SUT. Thanks for your support!

@frankie-lim-partior
Copy link

@baptiste-b-pegasys

ghcr.io/baptiste-b-pegasys/quorum:fix21.10-8 failed and crash at first attempt. full log attached.

sync: unlock of unlocked mutex

goroutine 13075 [running]:
runtime.throw(0x17393b8, 0x1e)
	runtime/panic.go:1116 +0x72 fp=0xc010a01b50 sp=0xc010a01b20 pc=0x450c52
sync.throw(0x17393b8, 0x1e)
	runtime/panic.go:1102 +0x35 fp=0xc010a01b70 sp=0xc010a01b50 pc=0x4843f5
sync.(*Mutex).unlockSlow(0xc0080fe440, 0xc0ffffffff)
	sync/mutex.go:196 +0xd8 fp=0xc010a01b98 sp=0xc010a01b70 pc=0x4a33d8
sync.(*Mutex).Unlock(...)
	sync/mutex.go:190
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc00ee20600, 0xc004331680)
	github.com/ethereum/go-ethereum/core/state/statedb.go:790 +0x1005 fp=0xc010a01f28 sp=0xc010a01b98 pc=0x95f5e5
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc00710b8c0, 0x1712c2d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc010a01ff0 sp=0xc010a01f28 pc=0xe9e4d0
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc006b9bb30, 0x19be1a0, 0xc007ec0ae0, 0xfffffffffffffffe, 0xc0042ecd20, 0x0, 0x0, 0x3c, 0x21eb5e6)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc010a020e8 sp=0xc010a01ff0 pc=0xf00b85
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc006b9bb30, 0x19be1a0, 0xc007ec0ae0, 0xc00a822ed8, 0x0, 0x0, 0x0, 0xc00cc54d80, 0xc004518a80, 0x1820cd8, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc010a02260 sp=0xc010a020e8 pc=0xf013f8
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be1a0, 0xc007ec0ae0, 0x19dd440, 0xc006b9bb30, 0xc007eec108, 0xc00f2d75e0, 0xc00a823678, 0xc007e49820, 0xc00bb0b4a0, 0xc007e49800, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc010a02648 sp=0xc010a02260 pc=0xdb7e06
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0x58921, 0x19be100, 0xc007f2bc20, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc010a02fc8 sp=0xc010a02648 pc=0xdcff13
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be1a0, 0xc007ec0ae0, 0x19dd440, 0xc006b9bb30, 0xc007eec108, 0xc00f2d75e0, 0xc00a823678, 0xc007e49820, 0xc00bb0b4a0, 0xc007e49800, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc010a03268 sp=0xc010a02fc8 pc=0xdb953a
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc007eec0b0, 0x19be1a0, 0xc007ec0ae0, 0x19dd440, 0xc006b9bb30, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc010a03380 sp=0xc010a03268 pc=0xdc0719
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc007122560, 0x19be1a0, 0xc007ec0ae0, 0x0, 0x0, 0xc00a95fe00, 0x1, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc010a03828 sp=0xc010a03380 pc=0xdc3937
runtime.call256(0xc0060fa780, 0xc003bc4eb8, 0xc007ebc780, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc010a03938 sp=0xc010a03828 pc=0x48821c
reflect.Value.call(0xc007c99100, 0xc003bc4eb8, 0x13, 0x170e2d4, 0x4, 0xc00fd3fd10, 0x3, 0x3, 0xc00fd3fd40, 0xc007e497c0, ...)
	reflect/value.go:476 +0x8c7 fp=0xc010a03b50 sp=0xc010a03938 pc=0x4bcc87
reflect.Value.Call(0xc007c99100, 0xc003bc4eb8, 0x13, 0xc00fd3fd10, 0x3, 0x3, 0x16, 0x7f29f6980820, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc010a03bd0 sp=0xc010a03b50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc007d9ea80, 0x19be1a0, 0xc007ec0ae0, 0xc00f2d7580, 0x13, 0xc007e497c0, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc010a03cd8 sp=0xc010a03bd0 pc=0xbbd985
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc006b93170, 0x19be1a0, 0xc007ec0a80, 0xc007ed4310, 0xc007d9ea80, 0xc007e497c0, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc010a03d48 sp=0xc010a03cd8 pc=0xbb4f79
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc006b93170, 0xc007ec0a20, 0xc007ed4310, 0x203001)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc010a03e30 sp=0xc010a03d48 pc=0xbb3e65
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc006b93170, 0xc007ec0a20, 0xc007ed4310, 0x19a7101)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc010a03ef8 sp=0xc010a03e30 pc=0xbb33be
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc007ec0a20)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc010a03f68 sp=0xc010a03ef8 pc=0xbc3026
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc006b93170, 0xc007e49780)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc010a03fd0 sp=0xc010a03f68 pc=0xbc31f2
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc010a03fd8 sp=0xc010a03fd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

fix8-1.log.zip

@baptiste-b-pegasys
Copy link
Contributor

OK I am waiting for your tool.

@sweeho
Copy link

sweeho commented Jan 20, 2022

@baptiste-b-pegasys

Please get the Caliper version that support Quorum SUT here. I have configured it to use WebSocket instead of RPC so that we dont have the TCP connections limitation.
https://github.com/sweeho/caliper-ws.git

The step by step instruction here. This is still very experimental, so its abit manual steps here:
https://github.com/sweeho/caliper-ws/blob/main/README.md

Hopefully you can reproduce the issue. Meanwhile, anything else I can test?

@baptiste-b-pegasys
Copy link
Contributor

no unfortunately, the "-8" version looks like this:

func (s *StateDB) Copy() *StateDB {
	// Copy all the basic fields, initialize the memory ones
	state := &StateDB{
		db:   s.db,
		trie: s.db.CopyTrie(s.trie),
		// Quorum - Privacy Enhancements
		accountExtraDataTrie: s.db.CopyTrie(s.accountExtraDataTrie),
		stateObjects:         make(map[common.Address]*stateObject, len(s.journal.dirties)),
		stateObjectsPending:  make(map[common.Address]struct{}, len(s.stateObjectsPending)),
		stateObjectsDirty:    make(map[common.Address]struct{}, len(s.journal.dirties)),
		refund:               s.refund,
		logs:                 make(map[common.Hash][]*types.Log, len(s.logs)),
		logSize:              s.logSize,
		preimages:            make(map[common.Hash][]byte, len(s.preimages)),
		journal:              newJournal(),
	}
	s.journal.mutex.Lock()
	addresses := make([]common.Address, 0, len(s.journal.dirties))
	for addr := range s.journal.dirties {
		addresses = append(addresses, addr)
	}
	s.journal.mutex.Unlock() // HERE IS WHERE IT FAILS
	// Copy the dirty states, logs, and preimages
	for _, addr := range addresses {

How s.journal.mutex.Unlock() // HERE IS WHERE IT FAILS has double unlock?

@frankie-lim-partior
Copy link

@baptiste-b-pegasys are you able to deploy Caliper based on the steps I have provided? Are you able to reproduce the issue?

@antonydenyer
Copy link
Contributor Author

I think we've identified part of the issue. What do you have as your entrypoint? Is it geth or are you running geth wrapped in a shell script? My concern is that GoQuorum isn't receiving the SIGTERM signal from docker because it's going to the shell script.

Consequently, GoQuorum has no chance to be gracefully shut down.

@frankie-lim-partior
Copy link

@antonydenyer

  1. @baptiste-b-pegasys has made some changes in test build ghcr.io/baptiste-b-pegasys/quorum:fix21.10-4 to ghcr.io/baptiste-b-pegasys/quorum:fix21.10-8, post which "panic: runtime error: invalid memory address or nil pointer dereference" no longer reproducible in these test build. Please check with him what has change.
  2. The current issue is "sync: unlock of unlocked mutex" which is reproducible in test build quorum:fix21.10-6, quorum:fix21.10-7 and quorum:fix21.10-8.

@baptiste-b-pegasys
Copy link
Contributor

Have you tried the latest version 21.10.2 ?

@frankie-lim-partior
Copy link

@baptiste-b-pegasys I have not. I can try. Do you think the changes in this version fixed the issue?

@frankie-lim-partior
Copy link

@baptiste-b-pegasys with official version 21.10.2, "fatal error: concurrent map iteration and map write" is back after just two tests attempt.

Fill log attached.
quorum-21.10.2-crash.log.zip

fatal error: concurrent map iteration and map write

goroutine 626916 [running]:
runtime.throw(0x174911b, 0x26)
	runtime/panic.go:1116 +0x72 fp=0xc0096edb80 sp=0xc0096edb50 pc=0x450c52
runtime.mapiternext(0xc0096ede58)
	runtime/map.go:853 +0x554 fp=0xc0096edc00 sp=0xc0096edb80 pc=0x429f14
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc009bdbb00, 0xc00a6bc000)
	github.com/ethereum/go-ethereum/core/state/statedb.go:792 +0x65d fp=0xc0096edf28 sp=0xc0096edc00 pc=0x95e91d
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc00312c6c0, 0x1712e0d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc0096edff0 sp=0xc0096edf28 pc=0xe9ddb0
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc003131440, 0x19be200, 0xc00d5bb1a0, 0xfffffffffffffffe, 0xc00a572ee0, 0x0, 0x0, 0x3c, 0x21eb3e6)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc0096ee0e8 sp=0xc0096edff0 pc=0xf00465
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc003131440, 0x19be200, 0xc00d5bb1a0, 0xc01aa58fe0, 0x0, 0x0, 0x0, 0xc011e96900, 0xc00a6a4a80, 0x1820ec0, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc0096ee260 sp=0xc0096ee0e8 pc=0xf00cd8
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be200, 0xc00d5bb1a0, 0x19dd4a0, 0xc003131440, 0xc00b05d188, 0xc00f507920, 0xc01aa59dd8, 0xc008dd4c40, 0xc009475640, 0xc008dd4c20, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc0096ee648 sp=0xc0096ee260 pc=0xdb76e6
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0x30751, 0x19be200, 0xc0100f3c80, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc0096eefc8 sp=0xc0096ee648 pc=0xdcf7f3
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be200, 0xc00d5bb1a0, 0x19dd4a0, 0xc003131440, 0xc00b05d188, 0xc00f507920, 0xc01aa59dd8, 0xc008dd4c40, 0xc009475640, 0xc008dd4c20, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc0096ef268 sp=0xc0096eefc8 pc=0xdb8e1a
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc00b05d130, 0x19be200, 0xc00d5bb1a0, 0x19dd4a0, 0xc003131440, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc0096ef380 sp=0xc0096ef268 pc=0xdbfff9
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc00319cb80, 0x19be200, 0xc00d5bb1a0, 0x0, 0x0, 0xc00ad8c5c0, 0x4, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc0096ef828 sp=0xc0096ef380 pc=0xdc3217
runtime.call256(0xc008e0a300, 0xc00524cbc0, 0xc00ae49b30, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc0096ef938 sp=0xc0096ef828 pc=0x48821c
reflect.Value.call(0xc0000f5500, 0xc00524cbc0, 0x13, 0x170e4b4, 0x4, 0xc005e47cc0, 0x3, 0x3, 0xc005e47cf0, 0xc008dd4be0, ...)
	reflect/value.go:476 +0x8c7 fp=0xc0096efb50 sp=0xc0096ef938 pc=0x4bcc87
reflect.Value.Call(0xc0000f5500, 0xc00524cbc0, 0x13, 0xc005e47cc0, 0x3, 0x3, 0x16, 0x7f549efb0f40, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc0096efbd0 sp=0xc0096efb50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc0053456e0, 0x19be200, 0xc00d5bb1a0, 0xc00f5078c0, 0x13, 0xc008dd4be0, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc0096efcd8 sp=0xc0096efbd0 pc=0xbbd225
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc00bac8480, 0x19be200, 0xc00d5bb140, 0xc0089fc000, 0xc0053456e0, 0xc008dd4be0, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc0096efd48 sp=0xc0096efcd8 pc=0xbb4819
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc00bac8480, 0xc00d5bb0e0, 0xc0089fc000, 0x203003)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc0096efe30 sp=0xc0096efd48 pc=0xbb3705
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc00bac8480, 0xc00d5bb0e0, 0xc0089fc000, 0x19a7101)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc0096efef8 sp=0xc0096efe30 pc=0xbb2c5e
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc00d5bb0e0)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc0096eff68 sp=0xc0096efef8 pc=0xbc28c6
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc00bac8480, 0xc008dd4ba0)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc0096effd0 sp=0xc0096eff68 pc=0xbc2a92
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc0096effd8 sp=0xc0096effd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

@baptiste-b-pegasys
Copy link
Contributor

The concurrent map modification is on a different one, I add a lock on it.

ghcr.io/baptiste-b-pegasys/quorum:fix21.10.2

@frankie-lim-partior
Copy link

thanks. testing.

@frankie-lim-partior
Copy link

@baptiste-b-pegasys

Test: ghcr.io/baptiste-b-pegasys/quorum:fix21.10.2

Result: Issue Reproducible - fatal error: concurrent map iteration and map write

Full log attached
crash-21.20.2-fix-1.log.zip

goroutine 2295832 [running]:
runtime.throw(0x174913b, 0x26)
	runtime/panic.go:1116 +0x72 fp=0xc00055db58 sp=0xc00055db28 pc=0x450c52
runtime.mapiternext(0xc00055deb8)
	runtime/map.go:853 +0x554 fp=0xc00055dbd8 sp=0xc00055db58 pc=0x429f14
runtime.mapiterinit(0x14f9620, 0xc006daa240, 0xc00055deb8)
	runtime/map.go:843 +0x1c5 fp=0xc00055dbf8 sp=0xc00055dbd8 pc=0x4298c5
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc00bc65380, 0xc00ccd4900)
	github.com/ethereum/go-ethereum/core/state/statedb.go:779 +0x3d5 fp=0xc00055df28 sp=0xc00055dbf8 pc=0x95e735
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc0021dc900, 0x1712e2d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc00055dff0 sp=0xc00055df28 pc=0xe9dfd0
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc0021c1920, 0x19be2e0, 0xc005d9d8c0, 0xfffffffffffffffe, 0xc00800db20, 0x0, 0x0, 0x3c, 0x21eb64e)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc00055e0e8 sp=0xc00055dff0 pc=0xf00685
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc0021c1920, 0x19be2e0, 0xc005d9d8c0, 0xc00cdc5718, 0x0, 0x0, 0x0, 0xc0030a4480, 0xc0082b8000, 0x1820ee0, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc00055e260 sp=0xc00055e0e8 pc=0xf00ef8
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be2e0, 0xc005d9d8c0, 0x19dd580, 0xc0021c1920, 0xc00a8c3448, 0xc008a89ec0, 0xc0178a6580, 0xc004d4e300, 0xc00b34dee0, 0xc004d4e2e0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc00055e648 sp=0xc00055e260 pc=0xdb7906
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0x8953, 0x19be200, 0xc005f136b0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc00055efc8 sp=0xc00055e648 pc=0xdcfa13
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be2e0, 0xc005d9d8c0, 0x19dd580, 0xc0021c1920, 0xc00a8c3448, 0xc008a89ec0, 0xc0178a6580, 0xc004d4e300, 0xc00b34dee0, 0xc004d4e2e0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc00055f268 sp=0xc00055efc8 pc=0xdb903a
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc00a8c33f0, 0x19be2e0, 0xc005d9d8c0, 0x19dd580, 0xc0021c1920, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc00055f380 sp=0xc00055f268 pc=0xdc0219
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc0021e2920, 0x19be2e0, 0xc005d9d8c0, 0x0, 0x0, 0xc00f667950, 0x8, 0x9, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc00055f828 sp=0xc00055f380 pc=0xdc3437
runtime.call256(0xc004e34000, 0xc002c063f8, 0xc0088611d0, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc00055f938 sp=0xc00055f828 pc=0x48821c
reflect.Value.call(0xc00029fe00, 0xc002c063f8, 0x13, 0x170e4d4, 0x4, 0xc00cd09630, 0x3, 0x3, 0xc00cd09660, 0xc004d4e0e0, ...)
	reflect/value.go:476 +0x8c7 fp=0xc00055fb50 sp=0xc00055f938 pc=0x4bcc87
reflect.Value.Call(0xc00029fe00, 0xc002c063f8, 0x13, 0xc00cd09630, 0x3, 0x3, 0x16, 0x7fbeb2d6b1d8, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc00055fbd0 sp=0xc00055fb50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc002c16480, 0x19be2e0, 0xc005d9d8c0, 0xc008a89d80, 0x13, 0xc004d4e0e0, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc00055fcd8 sp=0xc00055fbd0 pc=0xbbd445
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc00fc422d0, 0x19be2e0, 0xc005d9d860, 0xc00a05e540, 0xc002c16480, 0xc004d4e0e0, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc00055fd48 sp=0xc00055fcd8 pc=0xbb4a39
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc00fc422d0, 0xc005d9d770, 0xc00a05e540, 0x203001)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc00055fe30 sp=0xc00055fd48 pc=0xbb3925
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc00fc422d0, 0xc005d9d770, 0xc00a05e540, 0x19a7201)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc00055fef8 sp=0xc00055fe30 pc=0xbb2e7e
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc005d9d770)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc00055ff68 sp=0xc00055fef8 pc=0xbc2ae6
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc00fc422d0, 0xc00ceaffe0)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc00055ffd0 sp=0xc00055ff68 pc=0xbc2cb2
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc00055ffd8 sp=0xc00055ffd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

@frankie-lim-partior
Copy link

@baptiste-b-pegasys any luck? Where you able to reproduce with the Hyperledger caliper setup step I have provided?

I am able to reproduce easily when I run a private transaction .

@baptiste-b-pegasys
Copy link
Contributor

I ran the caliper 1 node and 4 nodes, I am not able to reproduce yet. I think I am not reaching the point it happens.

To be sure: Can I have the arguments and genesis file?

I made another version of the fix, you can try: ghcr.io/baptiste-b-pegasys/quorum:lock

@frankie-lim-partior
Copy link

command

/usr/local/bin/geth --datadir /etc/quorum/qdata/dd --nousb '--http.vhosts=*' '--http.corsdomain=*' --permissioned --nodiscover '--nat=none' --verbosity 5 --networkid 2003 --unlock 0 --emitcheckpoints --http --http.addr 0.0.0.0 --http.port 8545 --http.api admin,db,eth,debug,miner,net,shh,txpool,personal,web3,quorum,istanbul,quorumExtension,quorumPermission --ws --ws.api admin,db,eth,debug,miner,net,shh,txpool,personal,web3,quorum,istanbul,quorumExtension,quorumPermission --ws.addr 0.0.0.0 --ws.port 8546 --ws.origins '*' --port 30303 --rpc.gascap 0 --gcmode full --cache 4096 --cache.database 75 --istanbul.blockperiod 1 --syncmode full --mine --miner.threads 0 --miner.gastarget 100000000000 --miner.gaslimit 90000000000 --maxpeers 100 --allow-insecure-unlock --txpool.accountslots 5000 --txpool.accountqueue 5000 --txpool.globalslots 20000 --txpool.globalqueue 20000 --ptm.url http://10.60.24.4:9081 --ptm.timeout 20 --ptm.dialtimeout 10 --ptm.http.writebuffersize 4096 --ptm.http.readbuffersize 4096 --metrics --pprof --pprof.addr 0.0.0.0 --pprof.port+  9545 --pprof.memprofileratetee 524288 -a --pprof.blockprofilerate /etc/quorum/qdata/logs/quorum.log 0 --password
 /etc/quorum/qdata/dd/password.txt

genesis file

{
      "alloc": {
          "0x92D6d1c6A56116EEEf5b6085a9F2Fc2E07383C7a": {
            "balance": "1000000000000000000000000000"
          }, 
          "0x5ef4bdf5FDdA7E1491E67f528108FF523EcCaC2e": {
            "balance": "1000000000000000000000000000"
          }, 
          "0x7Ad43eFcfAa2BD60F00f7D8D49a2dD2C4ad13437": {
            "balance": "1000000000000000000000000000"
          }, 
          "0x39C43d3c375c1CFa5a1F4607B967330e6F8Eb822": {
            "balance": "1000000000000000000000000000"
          }
      
        },
        "coinbase": "0x0000000000000000000000000000000000000000",
        "config": {
          "homesteadBlock": 0,
          "byzantiumBlock": 0,
          "constantinopleBlock": 0,
          "petersburgBlock": 0,
          "istanbulBlock": 0,
          "eip150Block": 0,
          "eip150Hash": "0x0000000000000000000000000000000000000000000000000000000000000000",
          "eip155Block": 0,
          "eip158Block": 0,
          "maxCodeSizeConfig": [
            {
              "block": 0,
              "size": 64
            }
          ],
          "chainId": 2001,
          "isQuorum": true,
          "istanbul": {
            "epoch": 30000,
            "policy": 0,
            "ceil2Nby3Block": 0
          }
        },
      
      
        "difficulty": "0x1",
        "extraData": "0x0000000000000000000000000000000000000000000000000000000000000000f89af854941c3695740662a5ccd6f100ba5e283a0649e85860942876ba680064a82a716e52522c51766a0afab9de94b843a245e3de83378e648296db0660490a0ed661946f75d26c4de3bd6db791ec191102f16c43939832b8410000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000c0",
        "gasLimit": "0x1312D00",
        "mixHash": "0x63746963616c2062797a616e74696e65206661756c7420746f6c6572616e6365",
        "nonce": "0x0",
        "parentHash": "0x0000000000000000000000000000000000000000000000000000000000000000",
        "timestamp": "0x00"
      }

@frankie-lim-partior
Copy link

Test : ghcr.io/baptiste-b-pegasys/quorum:lock

Panic with a different error now. Full log attached

quorum-lock-crashlog.log.zip

goroutine 8530 [running]:
runtime.throw(0x17395f9, 0x1e)
	runtime/panic.go:1116 +0x72 fp=0xc00cfd1b48 sp=0xc00cfd1b18 pc=0x450c52
sync.throw(0x17395f9, 0x1e)
	runtime/panic.go:1102 +0x35 fp=0xc00cfd1b68 sp=0xc00cfd1b48 pc=0x4843f5
sync.(*Mutex).unlockSlow(0xc0089bc0e0, 0xc0ffffffff)
	sync/mutex.go:196 +0xd8 fp=0xc00cfd1b90 sp=0xc00cfd1b68 pc=0x4a33d8
sync.(*Mutex).Unlock(...)
	sync/mutex.go:190
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc002d91200, 0xc00375e300)
	github.com/ethereum/go-ethereum/core/state/statedb.go:769 +0x1110 fp=0xc00cfd1f28 sp=0xc00cfd1b90 pc=0x95f650
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc0001f0240, 0x1712e4d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc00cfd1ff0 sp=0xc00cfd1f28 pc=0xe9e430
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc0008a1470, 0x19be4e0, 0xc007cbb0b0, 0xfffffffffffffffe, 0xc003330380, 0x0, 0x0, 0x3c, 0x21ebb66)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc00cfd20e8 sp=0xc00cfd1ff0 pc=0xf00ae5
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc0008a1470, 0x19be4e0, 0xc007cbb0b0, 0xc004d4d190, 0x0, 0x0, 0x0, 0xc00299b680, 0xc001e06a80, 0x1820f20, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc00cfd2260 sp=0xc00cfd20e8 pc=0xf01358
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be4e0, 0xc007cbb0b0, 0x19dd780, 0xc0008a1470, 0xc007ceb188, 0xc007bef0e0, 0xc000297cd0, 0xc0058e9fa0, 0xc000fdc460, 0xc0058e9f80, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc00cfd2648 sp=0xc00cfd2260 pc=0xdb7d66
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0xa6e8745, 0x19be400, 0xc0118201b0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc00cfd2fc8 sp=0xc00cfd2648 pc=0xdcfe73
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be4e0, 0xc007cbb0b0, 0x19dd780, 0xc0008a1470, 0xc007ceb188, 0xc007bef0e0, 0xc000297cd0, 0xc0058e9fa0, 0xc000fdc460, 0xc0058e9f80, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc00cfd3268 sp=0xc00cfd2fc8 pc=0xdb949a
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc007ceb130, 0x19be4e0, 0xc007cbb0b0, 0x19dd780, 0xc0008a1470, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc00cfd3380 sp=0xc00cfd3268 pc=0xdc0679
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc000349c20, 0x19be4e0, 0xc007cbb0b0, 0x0, 0x0, 0xc007ce6440, 0x4, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc00cfd3828 sp=0xc00cfd3380 pc=0xdc3897
runtime.call256(0xc0128cbcb0, 0xc000fbc830, 0xc0088024b0, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc00cfd3938 sp=0xc00cfd3828 pc=0x48821c
reflect.Value.call(0xc000a80080, 0xc000fbc830, 0x13, 0x170e4f4, 0x4, 0xc007cb25a0, 0x3, 0x3, 0xc007cb25d0, 0xc007ba5fc0, ...)
	reflect/value.go:476 +0x8c7 fp=0xc00cfd3b50 sp=0xc00cfd3938 pc=0x4bcc87
reflect.Value.Call(0xc000a80080, 0xc000fbc830, 0x13, 0xc007cb25a0, 0x3, 0x3, 0x16, 0x7ff7102c9270, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc00cfd3bd0 sp=0xc00cfd3b50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc000fe3920, 0x19be4e0, 0xc007cbb0b0, 0xc007bef080, 0x13, 0xc007ba5fc0, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc00cfd3cd8 sp=0xc00cfd3bd0 pc=0xbbd8a5
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc00285ad80, 0x19be4e0, 0xc007cbb050, 0xc003c3b9d0, 0xc000fe3920, 0xc007ba5fc0, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc00cfd3d48 sp=0xc00cfd3cd8 pc=0xbb4e99
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc00285ad80, 0xc007cbaff0, 0xc003c3b9d0, 0x203001)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc00cfd3e30 sp=0xc00cfd3d48 pc=0xbb3d85
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc00285ad80, 0xc007cbaff0, 0xc003c3b9d0, 0x19a7401)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc00cfd3ef8 sp=0xc00cfd3e30 pc=0xbb32de
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc007cbaff0)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc00cfd3f68 sp=0xc00cfd3ef8 pc=0xbc2f46
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc00285ad80, 0xc007ba5f80)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc00cfd3fd0 sp=0xc00cfd3f68 pc=0xbc3112
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc00cfd3fd8 sp=0xc00cfd3fd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

@baptiste-b-pegasys
Copy link
Contributor

Hi, still trying to reproduce the issue, I think I need more power to make this test.

Meanwhile, can you try this version: ghcr.io/baptiste-b-pegasys/quorum:lock-2

@frankie-lim-partior
Copy link

frankie-lim-partior commented Feb 18, 2022

test image: ghcr.io/baptiste-b-pegasys/quorum:lock-2

@baptiste-b-pegasys I think you are nearly there with the fix!! with this test build, it improved drastically! I am able to run 15 consecutive stress tests on the private contract before it crash. Before this, I cannot even repeat a second time without crashing.
The other observation is that once a node crash, when it came back up, it logs more than 10,000 blocks, and it took time for it to resync it. Is that expected?

Here is the latest log attached. Hopefully u can nail down this final one :-)
21.10.2-lock-2-fix-crash.log.zip


fatal error: concurrent map iteration and map write

goroutine 3221552 [running]:
runtime.throw(0x174919c, 0x26)
	runtime/panic.go:1116 +0x72 fp=0xc00ec7dad8 sp=0xc00ec7daa8 pc=0x450c52
runtime.mapiternext(0xc00ec7de58)
	runtime/map.go:853 +0x554 fp=0xc00ec7db58 sp=0xc00ec7dad8 pc=0x429f14
runtime.mapiterinit(0x14f96a0, 0xc011fceff0, 0xc00ec7de58)
	runtime/map.go:843 +0x1c5 fp=0xc00ec7db78 sp=0xc00ec7db58 pc=0x4298c5
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc00d22a4e0, 0xc005042680)
	github.com/ethereum/go-ethereum/core/state/statedb.go:819 +0x905 fp=0xc00ec7df28 sp=0xc00ec7db78 pc=0x95edc5
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc000014000, 0x1712e6d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc00ec7dff0 sp=0xc00ec7df28 pc=0xe9e510
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc0008db950, 0x19be540, 0xc00c8d5830, 0xfffffffffffffffe, 0xc007c7a620, 0x0, 0x0, 0x3c, 0x21ebc8e)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc00ec7e0e8 sp=0xc00ec7dff0 pc=0xf00bc5
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc0008db950, 0x19be540, 0xc00c8d5830, 0xc008c96480, 0x0, 0x0, 0x0, 0xc002b1a240, 0xc005133500, 0x1820f40, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc00ec7e260 sp=0xc00ec7e0e8 pc=0xf01438
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be540, 0xc00c8d5830, 0x19dd7e0, 0xc0008db950, 0xc0066a0948, 0xc0027dea60, 0xc00a5f0790, 0xc005cce100, 0xc006c3f480, 0xc005cce0e0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc00ec7e648 sp=0xc00ec7e260 pc=0xdb7e46
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0x8b64, 0x19be501, 0xc00e1a9110, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc00ec7efc8 sp=0xc00ec7e648 pc=0xdcff53
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be540, 0xc00c8d5830, 0x19dd7e0, 0xc0008db950, 0xc0066a0948, 0xc0027dea60, 0xc00a5f0790, 0xc005cce100, 0xc006c3f480, 0xc005cce0e0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc00ec7f268 sp=0xc00ec7efc8 pc=0xdb957a
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc0066a08f0, 0x19be540, 0xc00c8d5830, 0x19dd7e0, 0xc0008db950, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc00ec7f380 sp=0xc00ec7f268 pc=0xdc0759
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc0005595a0, 0x19be540, 0xc00c8d5830, 0x0, 0x0, 0xc00f61c900, 0x4, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc00ec7f828 sp=0xc00ec7f380 pc=0xdc3977
runtime.call256(0xc011c8e210, 0xc0002569b0, 0xc005bc1e00, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc00ec7f938 sp=0xc00ec7f828 pc=0x48821c
reflect.Value.call(0xc0008a7e80, 0xc0002569b0, 0x13, 0x170e514, 0x4, 0xc008b92a00, 0x3, 0x3, 0xc008b92a30, 0xc005cce0a0, ...)
	reflect/value.go:476 +0x8c7 fp=0xc00ec7fb50 sp=0xc00ec7f938 pc=0x4bcc87
reflect.Value.Call(0xc0008a7e80, 0xc0002569b0, 0x13, 0xc008b92a00, 0x3, 0x3, 0x16, 0x7f6ef5a3c6e8, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc00ec7fbd0 sp=0xc00ec7fb50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc000bdb200, 0x19be540, 0xc00c8d5830, 0xc0027de920, 0x13, 0xc005cce0a0, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc00ec7fcd8 sp=0xc00ec7fbd0 pc=0xbbd985
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc0081ccbd0, 0x19be540, 0xc00c8d57a0, 0xc004ea9c70, 0xc000bdb200, 0xc005cce0a0, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc00ec7fd48 sp=0xc00ec7fcd8 pc=0xbb4f79
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc0081ccbd0, 0xc00c8d5740, 0xc004ea9c70, 0x203003)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc00ec7fe30 sp=0xc00ec7fd48 pc=0xbb3e65
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc0081ccbd0, 0xc00c8d5740, 0xc004ea9c70, 0x19a7401)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc00ec7fef8 sp=0xc00ec7fe30 pc=0xbb33be
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc00c8d5740)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc00ec7ff68 sp=0xc00ec7fef8 pc=0xbc3026
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc0081ccbd0, 0xc005cce060)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc00ec7ffd0 sp=0xc00ec7ff68 pc=0xbc31f2
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc00ec7ffd8 sp=0xc00ec7ffd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

@baptiste-b-pegasys
Copy link
Contributor

Can you check what is the value of miner.threads ? I see 0 from the args I gave me, which means it's disabled.
We set it to 1, I am not sure we test different value than this.

Can you test this version, where a new lock is added for dirties map: ghcr.io/baptiste-b-pegasys/quorum:lock-3
Thanks

@frankie-lim-partior
Copy link

Thanks. will test the latest build.

On the miner thread, default is 0 according to the documentation. Anyway, no miner running on the nodes that are crashing. I have the validator node separate out from the transaction node.
--miner.threads value Number of CPU threads to use for mining (default: 0)

@frankie-lim-partior
Copy link

@baptiste-b-pegasys Another significant improvement! With the same stress test case, I only manage to reproduce after 25 times. Full logs attached. hopefully a final push to fix the residual :-)
fix-lock-3-crash.log.zip

test image: ghcr.io/baptiste-b-pegasys/quorum:lock-3

fatal error: concurrent map iteration and map write
TRACE[02-21|06:46:47.043] Promoting queued transaction             hash="c740f9…347b23"

goroutine 3769079 [running]:
runtime.throw(0x174919c, 0x26)
	runtime/panic.go:1116 +0x72 fp=0xc010ba9af8 sp=0xc010ba9ac8 pc=0x450c52
runtime.mapiternext(0xc010ba9e58)
	runtime/map.go:853 +0x554 fp=0xc010ba9b78 sp=0xc010ba9af8 pc=0x429f14
github.com/ethereum/go-ethereum/core/state.(*StateDB).Copy(0xc00ffec000, 0xc00f62eb60)
	github.com/ethereum/go-ethereum/core/state/statedb.go:819 +0x96d fp=0xc010ba9f28 sp=0xc010ba9b78 pc=0x95ee2d
github.com/ethereum/go-ethereum/miner.(*worker).pending(0xc0000f8480, 0x1712e6d, 0x7, 0x0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/miner/worker.go:294 +0x290 fp=0xc010ba9ff0 sp=0xc010ba9f28 pc=0xe9e510
github.com/ethereum/go-ethereum/miner.(*Miner).Pending(...)
	github.com/ethereum/go-ethereum/miner/miner.go:187
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumber(0xc00090b680, 0x19be540, 0xc00c164060, 0xfffffffffffffffe, 0xc00467e000, 0x0, 0x0, 0x3c, 0x21ebc8e)
	github.com/ethereum/go-ethereum/eth/api_backend.go:174 +0x2a5 fp=0xc010baa0e8 sp=0xc010ba9ff0 pc=0xf00bc5
github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).StateAndHeaderByNumberOrHash(0xc00090b680, 0x19be540, 0xc00c164060, 0xc002c2fc38, 0x0, 0x0, 0x0, 0xc0029ce240, 0xc010388000, 0x1820f40, ...)
	github.com/ethereum/go-ethereum/eth/api_backend.go:192 +0x578 fp=0xc010baa260 sp=0xc010baa0e8 pc=0xf01438
github.com/ethereum/go-ethereum/internal/ethapi.DoCall(0x19be540, 0xc00c164060, 0x19dd7e0, 0xc00090b680, 0xc0093425d8, 0xc002fcaee0, 0xc00d7081c0, 0xc00eecc980, 0xc0033c8c40, 0xc00eecc960, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:944 +0x106 fp=0xc010baa648 sp=0xc010baa260 pc=0xdb7e46
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas.func1(0x58921, 0x19be500, 0xc0093a6de0, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1137 +0x193 fp=0xc010baafc8 sp=0xc010baa648 pc=0xdcff53
github.com/ethereum/go-ethereum/internal/ethapi.DoEstimateGas(0x19be540, 0xc00c164060, 0x19dd7e0, 0xc00090b680, 0xc0093425d8, 0xc002fcaee0, 0xc00d7081c0, 0xc00eecc980, 0xc0033c8c40, 0xc00eecc960, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1149 +0x1fa fp=0xc010bab268 sp=0xc010baafc8 pc=0xdb957a
github.com/ethereum/go-ethereum/internal/ethapi.(*SendTxArgs).setDefaults(0xc009342580, 0x19be540, 0xc00c164060, 0x19dd7e0, 0xc00090b680, 0x0, 0x0)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:1894 +0x299 fp=0xc010bab380 sp=0xc010bab268 pc=0xdc0759
github.com/ethereum/go-ethereum/internal/ethapi.(*PublicTransactionPoolAPI).SendTransaction(0xc0001a79e0, 0x19be540, 0xc00c164060, 0x0, 0x0, 0xc00ec56cc0, 0x4, 0x4, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/internal/ethapi/api.go:2075 +0x1d7 fp=0xc010bab828 sp=0xc010bab380 pc=0xdc3977
runtime.call256(0xc00090a720, 0xc00048a308, 0xc0063d0b40, 0xc0000000f0)
	runtime/asm_amd64.s:543 +0x5c fp=0xc010bab938 sp=0xc010bab828 pc=0x48821c
reflect.Value.call(0xc000403380, 0xc00048a308, 0x13, 0x170e514, 0x4, 0xc00bb74cd0, 0x3, 0x3, 0xc00bb74d00, 0xc00eecc920, ...)
	reflect/value.go:476 +0x8c7 fp=0xc010babb50 sp=0xc010bab938 pc=0x4bcc87
reflect.Value.Call(0xc000403380, 0xc00048a308, 0x13, 0xc00bb74cd0, 0x3, 0x3, 0x16, 0x7f2fe6397a18, 0x30)
	reflect/value.go:337 +0xb9 fp=0xc010babbd0 sp=0xc010babb50 pc=0x4bc179
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc00105f8c0, 0x19be540, 0xc00c164060, 0xc002fcae60, 0x13, 0xc00eecc920, 0x1, 0x1, 0x0, 0x0, ...)
	github.com/ethereum/go-ethereum/rpc/service.go:206 +0x2c5 fp=0xc010babcd8 sp=0xc010babbd0 pc=0xbbd985
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc00879c480, 0x19be540, 0xc00c164000, 0xc0093280e0, 0xc00105f8c0, 0xc00eecc920, 0x1, 0x1, 0x1)
	github.com/ethereum/go-ethereum/rpc/handler.go:416 +0xd9 fp=0xc010babd48 sp=0xc010babcd8 pc=0xbb4f79
github.com/ethereum/go-ethereum/rpc.(*handler).handleCall(0xc00879c480, 0xc00e9e9f80, 0xc0093280e0, 0x203003)
	github.com/ethereum/go-ethereum/rpc/handler.go:360 +0x345 fp=0xc010babe30 sp=0xc010babd48 pc=0xbb3e65
github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg(0xc00879c480, 0xc00e9e9f80, 0xc0093280e0, 0x19a7401)
	github.com/ethereum/go-ethereum/rpc/handler.go:298 +0x1be fp=0xc010babef8 sp=0xc010babe30 pc=0xbb33be
github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1(0xc00e9e9f80)
	github.com/ethereum/go-ethereum/rpc/handler.go:139 +0x46 fp=0xc010babf68 sp=0xc010babef8 pc=0xbc3026
github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1(0xc00879c480, 0xc00eecc8e0)
	github.com/ethereum/go-ethereum/rpc/handler.go:226 +0xd2 fp=0xc010babfd0 sp=0xc010babf68 pc=0xbc31f2
runtime.goexit()
	runtime/asm_amd64.s:1374 +0x1 fp=0xc010babfd8 sp=0xc010babfd0 pc=0x489a21
created by github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc
	github.com/ethereum/go-ethereum/rpc/handler.go:222 +0x66

@baptiste-b-pegasys
Copy link
Contributor

baptiste-b-pegasys commented Feb 21, 2022

Hello, here is a new version of the lock
ghcr.io/baptiste-b-pegasys/quorum:lock-4

Please tell me if it is OK for you.

@frankie-lim-partior
Copy link

@baptiste-b-pegasys Well done!! I am calling it! I have send 300,000 private transactions at sent rate of >2500 TPS, batches in 20,000, no crashes seen! I think we are good! ghcr.io/baptiste-b-pegasys/quorum:lock-4 is good! Hopefully this will get into official release soon. Thanks Again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants