Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StateProvider: System.InvalidOperationException: Operation is not valid due to the current state of the object. #4002

Closed
rickyrombo opened this issue May 3, 2022 · 15 comments · Fixed by #4055
Assignees
Labels
[Pri] High High prioriy

Comments

@rickyrombo
Copy link

Describe the bug
A clear and concise description of what the bug is.

Running an RPC node. Seems like a race condition(?) of StateProvider, only seeing when under heavy load. This happens intermittently/nondeterministicly

Looking into trying to reproduce more consistently and investigating what's putting the StateProvider in a bad state, will follow up.

To Reproduce
Steps to reproduce the behavior:

Don't have a good deterministic repro here.

Expected behavior
A clear and concise description of what you expected to happen.

No error on an eth_call

Additional context
Add any other context about the problem here.

Full log:

2022-05-03 19:33:42.2380|Error during method execution, request: ID 197, eth_call({
  "data": "0x69eea1d64400000000000000000000000000000000000000000000000000000000000000",
  "to": "0x981c44040cb6150a2b8a7f63fb182760505bf666"
},latest) System.InvalidOperationException: Operation is not valid due to the current state of the object.
   at Nethermind.State.StateProvider.get_StateRoot() in /src/Nethermind/Nethermind.State/StateProvider.cs:line 87
   at Nethermind.Facade.BlockchainBridge.CallAndRestore(BlockHeader blockHeader, UInt256& timestamp, Transaction transaction, Boolean treatBlockHeaderAsParentBlock, ITxTracer tracer) in /src/Nethermind/Nethermind.Facade/BlockchainBridge.cs:line 283
   at Nethermind.Facade.BlockchainBridge.TryCallAndRestore(BlockHeader blockHeader, UInt256& timestamp, Transaction transaction, Boolean treatBlockHeaderAsParentBlock, ITxTracer tracer) in /src/Nethermind/Nethermind.Facade/BlockchainBridge.cs:line 240
   at Nethermind.Facade.BlockchainBridge.Call(BlockHeader header, Transaction tx, CancellationToken cancellationToken) in /src/Nethermind/Nethermind.Facade/BlockchainBridge.cs:line 173
   at Nethermind.JsonRpc.Modules.Eth.EthRpcModule.CallTxExecutor.ExecuteTx(BlockHeader header, Transaction tx, CancellationToken token) in /src/Nethermind/Nethermind.JsonRpc/Modules/Eth/EthModule.TransactionExecutor.cs:line 93
   at Nethermind.JsonRpc.Modules.Eth.EthRpcModule.TxExecutor`1.ExecuteTx(TransactionForRpc transactionCall, BlockParameter blockParameter) in /src/Nethermind/Nethermind.JsonRpc/Modules/Eth/EthModule.TransactionExecutor.cs:line 69
2022-05-03 19:33:42.2380|Error when handling ID 197, eth_call({
  "data": "0x69eea1d64400000000000000000000000000000000000000000000000000000000000000",
  "to": "0x981c44040cb6150a2b8a7f63fb182760505bf666"
},latest) | {"jsonrpc":"2.0","error":{"code":-32603,"message":"Internal error","data":"System.InvalidOperationException: Operation is not valid due to the current state of the object.\n   at Nethermind.State.StateProvider.get_StateRoot() in /src/Nethermind/Nethermind.State/StateProvider.cs:line 87\n   at Nethermind.Facade.BlockchainBridge.CallAndRestore(BlockHeader blockHeader, UInt256& timestamp, Transaction transaction, Boolean treatBlockHeaderAsParentBlock, ITxTracer tracer) in /src/Nethermind/Nethermind.Facade/BlockchainBridge.cs:line 283\n   at Nethermind.Facade.BlockchainBridge.TryCallAndRestore(BlockHeader blockHeader, UInt256& timestamp, Transaction transaction, Boolean treatBlockHeaderAsParentBlock, ITxTracer tracer) in /src/Nethermind/Nethermind.Facade/BlockchainBridge.cs:line 240\n   at Nethermind.Facade.BlockchainBridge.Call(BlockHeader header, Transaction tx, CancellationToken cancellationToken) in /src/Nethermind/Nethermind.Facade/BlockchainBridge.cs:line 173\n   at Nethermind.JsonRpc.Modules.Eth.EthRpcModule.CallTxExecutor.ExecuteTx(BlockHeader header, Transaction tx, CancellationToken token) in /src/Nethermind/Nethermind.JsonRpc/Modules/Eth/EthModule.TransactionExecutor.cs:line 93\n   at Nethermind.JsonRpc.Modules.Eth.EthRpcModule.TxExecutor`1.ExecuteTx(TransactionForRpc transactionCall, BlockParameter blockParameter) in /src/Nethermind/Nethermind.JsonRpc/Modules/Eth/EthModule.TransactionExecutor.cs:line 69"},"id":197} 
@dmanjunath
Copy link

Not very knowledgeable about this issue, but running into this problem as well so doing a little digging. From looking at the file from the stack trace, could it be a race condition where one tx sets _needsStateRootUpdate = true and before RecalculateStateRoot is called, another tx tries to interact with the same object?

@rickyrombo
Copy link
Author

@LukaszRozmej would you be able to shed some light on this maybe?

@LukaszRozmej
Copy link
Member

Hi we will prioritize that after we are merge ready.

@LukaszRozmej LukaszRozmej added the [Pri] High High prioriy label May 10, 2022
@jamescarter-le
Copy link

+1 With this happening to me on simple eth_Call queries, just getting some State from a contract.
Odd I've not seen it before, but just started happening today, no change in application. Running 1.13.0

@raymondjacobson
Copy link

@jamescarter-le This is similar to what I've observed. This happens only when under a lot of RPC load, but it causes fairly large problems because we see responses return incorrect/empty data.

@jamescarter-le
Copy link

jamescarter-le commented May 24, 2022

@jamescarter-le This is similar to what I've observed. This happens only when under a lot of RPC load, but it causes fairly large problems because we see responses return incorrect/empty data.

I have this happening under reasonably light load, covering live UniV2 and V3 swaps, nothing I would consider intense. I'll get some counter data to share.

I had run master branch over this DB set, and before that had not experienced this issue. I am resyncing a new DB that should be finished tomorrow for me to check if that displays the same issue.

I have implemented a retry, but main issue for me is eth_estimateGas failing. I am syncing Geth also to compare.

@LukaszRozmej
Copy link
Member

I might have found the root cause, but would be grateful for any feedback if it works for you. Packages are here: https://github.com/NethermindEth/nethermind/actions/runs/2381291605 I can build a docker image if needed.

@raymondjacobson
Copy link

I might have found the root cause, but would be grateful for any feedback if it works for you. Packages are here: https://github.com/NethermindEth/nethermind/actions/runs/2381291605 I can build a docker image if needed.

Thank you @LukaszRozmej ! Definitely happy to test. If you can build a docker image, that would be fantastic.

@jamescarter-le
Copy link

This looks like it has fixed my issue, will continue to monitor.

@LukaszRozmej
Copy link
Member

@jamescarter-le keep me posted we would include it in next release if it works

@jamescarter-le
Copy link

I've been running this version since this morning and had only 1 instance of this being thrown, but this was from my own Tracer plugin that uses BlockchainBridge, likely unrelated.

Been able to throw many RPC requests and hundreds of eth_estimateGas calls through with no issue.

@rickyrombo
Copy link
Author

Thank you so much @LukaszRozmej ! This seems super promising. Will test the docker image

@csjiang
Copy link

csjiang commented May 27, 2022

Hey @LukaszRozmej - thanks for the fix!

We (Audius) were able to test this w one of our Nethermind nodes and rotate into load - it works! No InvalidOperationException so far in the 2~ hours it's been running. Will upgrade rest of nodes w this today and add them to load early next wk. I'll follow up here at that time if we encounter this issue again.

@LukaszRozmej
Copy link
Member

This was release on 1.13.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Pri] High High prioriy
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants