Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure pending state is ready for tx execution #1858

Merged
merged 2 commits into from
Feb 28, 2022

Conversation

piersy
Copy link
Contributor

@piersy piersy commented Feb 25, 2022

Fixes #1856

Before introducing this change gas estimations against the pending block on non validating nodes with the espresso fork enabled would sometimes be lower than they should be.

What this PR does is better described here

Its still not clear to me exactly how the lower gas estimates were being calculated but ensuring that the pending block is finalized and assembled seems to resolve this. Validating nodes were already finalizing and assembling their pending blocks and looking back it seems that prior to #1545 the pending block was always finalized and assembled (see permalink below). So this seems like this change is correcting a long running problem.

w.updateSnapshot()

This change also changes the behaviour of the debug tracing apis when used run against the pending block.

Since #1545 caused the pending block to not be finalized, the state root in the pending block was zero. Strangely, looking up the zero root in the state trie doesn't fail but the state returned is essentially empty, meaning that all transactions would be traced as a simple send ( because that is how a contract transaction executes if the contract does not exist).

With this change, calls to the tracing apis with the pending block will return the error required historical state unavailable.

This is because the pending block will have been finalized and will therefore have a state root, and the tracing apis first look up the block and then try to get the state for it using Ethereum.StateAtBlock which doesn't take into account the pending state and therefore the state cannot be found.

Tested

I have verified that the tests linked in #1856 produce reliable gas estimates running against a local node connected to a 1 validator network run with mycelo and also on a fullnode deployed in alfajores.

Backwards compatibility

The results of the rpc apis are chainging.

@piersy
Copy link
Contributor Author

piersy commented Feb 25, 2022

Coverage from tests in ./e2e_test/... for ./consensus/istanbul/... at commit fefe91b

coverage: 51.9% of statements across all listed packages
coverage:  65.4% of statements in consensus/istanbul
coverage:  43.1% of statements in consensus/istanbul/announce
coverage:  55.9% of statements in consensus/istanbul/backend
coverage:   0.0% of statements in consensus/istanbul/backend/backendtest
coverage:  24.3% of statements in consensus/istanbul/backend/internal/replica
coverage:  71.0% of statements in consensus/istanbul/core
coverage:  50.0% of statements in consensus/istanbul/db
coverage:   0.0% of statements in consensus/istanbul/proxy
coverage:  75.3% of statements in consensus/istanbul/uptime
coverage: 100.0% of statements in consensus/istanbul/uptime/store
coverage:  51.8% of statements in consensus/istanbul/validator
coverage:  79.2% of statements in consensus/istanbul/validator/random
CommentID: cc27fd229a

@piersy piersy requested a review from nategraf February 25, 2022 23:20
@nategraf
Copy link
Contributor

Nice detective work so far!

I don't think calling finalizeAndAssemble before the last transaction is applied results in well-defined behavior. It ends up mutating the state DB held by the working in a number of ways, such as distributing epoch rewards, and those transactions are intended to be the last step in assembling a block. Applying additional transactions on top of that will result in behavior that is divergent from what would happen during on-chain execution of a block with the same contents (e.g. a transaction running after the GPM has changed by be valid while the real execution will end up rejecting the transaction because the GPM changes at the end of the block). More critically, this code seems to result in calling finalization multiple times on the same state DB, which is not intended.

func (sb *Backend) Finalize(chain consensus.ChainHeaderReader, header *types.Header, state *state.StateDB, txs []*types.Transaction) {

I'm guessing that the key in this actually the following line:

header.Root = state.IntermediateRoot(chain.Config().IsEIP158(header.Number))

Or possibly the call to state.Prepare:

state.Prepare(common.Hash{}, len(txs))

Both of these mutate the state DB, and the first of the two modifies the header as well.

@piersy
Copy link
Contributor Author

piersy commented Feb 28, 2022

Thank you @nategraf.

I decided to look at the TraceBlock api, because trace block is tracing each tx in the block and has to prepare some intermediate state for each tx such that the tx traces correctly. And that is I think what we need to achieve here.

Below is the code block that prepares a state for tracing a tx.

for i, tx := range txs {
// Send the trace task over for execution
jobs <- &txTraceTask{statedb: statedb.Copy(), index: i}
// Generate the next state snapshot fast without tracing
msg, _ := tx.AsMessage(signer, nil)
statedb.Prepare(tx.Hash(), i)
vmenv := vm.NewEVM(blockCtx, core.NewEVMTxContext(msg), statedb, api.backend.ChainConfig(), vm.Config{})
vmRunner := api.backend.VmRunnerAtHeader(block.Header(), statedb)
if _, err := core.ApplyMessage(vmenv, msg, new(core.GasPool).AddGas(msg.Gas()), vmRunner, sysCtx); err != nil {
failed = err
break
}
// Finalize the state so any modifications are written to the trie
// Only delete empty objects if EIP158/161 (a.k.a Spurious Dragon) is in effect
statedb.Finalise(vmenv.ChainConfig().IsEIP158(block.Number()))

There seem to be 2 things that are done to the state in preparation for tracing.

  • A call to statedb.Prepare(txHash, txIndex)
  • A call to statedb.Finalise(isEIP158)

The prepare implementation is below, and we can see that it creates an accessLIst which was added as part of EIP-2929 (gas cost increases for state access opcodes). The summary for this eip is:

Increases gas cost for SLOAD, CALL, BALANCE, EXT and SELFDESTRUCT when used for the first time in a transaction.

So if Prepare is not called between transaction executions you would expect to see the second transaction pay less gas than it should need to, if it accessed some of the same state as the prior transaction. This seems very much like the problem we are facing.

func (s *StateDB) Prepare(thash common.Hash, ti int) {
s.thash = thash
s.txIndex = ti
s.accessList = newAccessList()
}

The Finalise implementation is below and it is significantly more complicated than Prepare it seems to be doing some sort of internal maintenance on the state representation, but the last line seems significant, the comment states:

Invalidate journal because reverting across transactions is not allowed.

Based on that it would seem that failing to call Finalise between transactions could result in a revert reverting the changes from prior transactions, so something to bear in mind if executing multiple transactions across a state instance.

// Finalise finalises the state by removing the s destructed objects and clears
// the journal as well as the refunds. Finalise, however, will not push any updates
// into the tries just yet. Only IntermediateRoot or Commit will do that.
func (s *StateDB) Finalise(deleteEmptyObjects bool) {
addressesToPrefetch := make([][]byte, 0, len(s.journal.dirties))
for addr := range s.journal.dirties {
obj, exist := s.stateObjects[addr]
if !exist {
// ripeMD is 'touched' at block 1714175, in tx 0x1237f737031e40bcde4a8b7e717b2d15e3ecadfe49bb1bbc71ee9deb09c6fcf2
// That tx goes out of gas, and although the notion of 'touched' does not exist there, the
// touch-event will still be recorded in the journal. Since ripeMD is a special snowflake,
// it will persist in the journal even though the journal is reverted. In this special circumstance,
// it may exist in `s.journal.dirties` but not in `s.stateObjects`.
// Thus, we can safely ignore it here
continue
}
if obj.suicided || (deleteEmptyObjects && obj.empty()) {
obj.deleted = true
// If state snapshotting is active, also mark the destruction there.
// Note, we can't do this only at the end of a block because multiple
// transactions within the same block might self destruct and then
// ressurrect an account; but the snapshotter needs both events.
if s.snap != nil {
s.snapDestructs[obj.addrHash] = struct{}{} // We need to maintain account deletions explicitly (will remain set indefinitely)
delete(s.snapAccounts, obj.addrHash) // Clear out any previously updated account data (may be recreated via a ressurrect)
delete(s.snapStorage, obj.addrHash) // Clear out any previously updated storage data (may be recreated via a ressurrect)
}
} else {
obj.finalise(true) // Prefetch slots in the background
}
s.stateObjectsPending[addr] = struct{}{}
s.stateObjectsDirty[addr] = struct{}{}
// At this point, also ship the address off to the precacher. The precacher
// will start loading tries, and when the change is eventually committed,
// the commit-phase will be a lot faster
addressesToPrefetch = append(addressesToPrefetch, common.CopyBytes(addr[:])) // Copy needed for closure
}
if s.prefetcher != nil && len(addressesToPrefetch) > 0 {
s.prefetcher.prefetch(s.originalRoot, addressesToPrefetch)
}
// Invalidate journal because reverting across transactions is not allowed.
s.clearJournalAndRefund()
}

Looking into the miner code that manages the pending block I can see that both Prepare and Finalise are being called, Prepare before executing a tx and Finalise after. So I think all that is required for us is to ensure that we call prepare on the state returned from the worker's pending method before beginning gas estimation. We shouldn't need to worry about calling finalise since the worker returns a copy of the state and estimate gas executes just one transaction against it.

@piersy piersy changed the title Ensure pending block is finalized and assembled Ensure pending state is ready for tx execution Feb 28, 2022
@piersy piersy marked this pull request as ready for review February 28, 2022 14:53
@piersy piersy requested a review from a team as a code owner February 28, 2022 14:53
@piersy piersy requested review from gastonponti and removed request for a team February 28, 2022 14:53
This commit ensures that StateDB.Prepare is called on the pending state
before it is returned from the miner/worker. This prevents access logs
from previous transaction executions from interfering with the gas cost
calculations of the subsequently executed transaction.

The problem this solves was introdued by this upstream PR
ethereum/go-ethereum#21509 which added an
access list to the state which was used to reduce gas costs for repeated
access of the same state locations, this resulted in the pending block
having the access list of the last executed transaction, which could
cause gas estimates to be wrong when the estimated transaction accessed
some of the same state as the prior transaction.
@piersy piersy force-pushed the piersy/fix-espresso-low-gas-estimates branch from 94bdc9f to fefe91b Compare February 28, 2022 19:08
@piersy piersy changed the base branch from release/1.5.x to master February 28, 2022 19:08
@codecov
Copy link

codecov bot commented Feb 28, 2022

Codecov Report

Merging #1858 (fefe91b) into master (503f156) will increase coverage by 0.00%.
The diff coverage is 0.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1858   +/-   ##
=======================================
  Coverage   54.23%   54.23%           
=======================================
  Files         673      673           
  Lines       88991    88993    +2     
=======================================
+ Hits        48262    48268    +6     
+ Misses      37090    37077   -13     
- Partials     3639     3648    +9     
Impacted Files Coverage Δ
miner/worker.go 68.07% <0.00%> (-0.65%) ⬇️
p2p/discover/ntp.go 66.66% <0.00%> (-9.53%) ⬇️
eth/downloader/statesync.go 57.14% <0.00%> (-5.11%) ⬇️
les/vflux/client/wrsiterator.go 92.59% <0.00%> (-3.71%) ⬇️
core/state/trie_prefetcher.go 76.00% <0.00%> (-3.34%) ⬇️
les/distributor.go 78.62% <0.00%> (-1.38%) ⬇️
les/odr.go 89.79% <0.00%> (-1.03%) ⬇️
trie/proof.go 75.57% <0.00%> (-1.00%) ⬇️
les/vflux/server/prioritypool.go 82.68% <0.00%> (-0.84%) ⬇️
eth/fetcher/block_fetcher.go 83.37% <0.00%> (-0.50%) ⬇️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 503f156...fefe91b. Read the comment docs.

@nategraf nategraf merged commit d67d222 into master Feb 28, 2022
@nategraf nategraf deleted the piersy/fix-espresso-low-gas-estimates branch February 28, 2022 20:09
nategraf pushed a commit that referenced this pull request Feb 28, 2022
* Enusre pending state is ready for tx execution

This commit ensures that StateDB.Prepare is called on the pending state
before it is returned from the miner/worker. This prevents access logs
from previous transaction executions from interfering with the gas cost
calculations of the subsequently executed transaction.

The problem this solves was introdued by this upstream PR
ethereum/go-ethereum#21509 which added an
access list to the state which was used to reduce gas costs for repeated
access of the same state locations, this resulted in the pending block
having the access list of the last executed transaction, which could
cause gas estimates to be wrong when the estimated transaction accessed
some of the same state as the prior transaction.

* Add more in depth comment on call to Prepare
nategraf pushed a commit that referenced this pull request Feb 28, 2022
* Ensure pending state is ready for tx execution (#1858)

* Enusre pending state is ready for tx execution

This commit ensures that StateDB.Prepare is called on the pending state
before it is returned from the miner/worker. This prevents access logs
from previous transaction executions from interfering with the gas cost
calculations of the subsequently executed transaction.

The problem this solves was introdued by this upstream PR
ethereum/go-ethereum#21509 which added an
access list to the state which was used to reduce gas costs for repeated
access of the same state locations, this resulted in the pending block
having the access list of the last executed transaction, which could
cause gas estimates to be wrong when the estimated transaction accessed
some of the same state as the prior transaction.

* Add more in depth comment on call to Prepare

* set version to 1.5.4-stable

Co-authored-by: piersy <[email protected]>
huuhait pushed a commit to zsmartex/celo-blockchain that referenced this pull request Dec 12, 2022
* Enusre pending state is ready for tx execution

This commit ensures that StateDB.Prepare is called on the pending state
before it is returned from the miner/worker. This prevents access logs
from previous transaction executions from interfering with the gas cost
calculations of the subsequently executed transaction.

The problem this solves was introdued by this upstream PR
ethereum/go-ethereum#21509 which added an
access list to the state which was used to reduce gas costs for repeated
access of the same state locations, this resulted in the pending block
having the access list of the last executed transaction, which could
cause gas estimates to be wrong when the estimated transaction accessed
some of the same state as the prior transaction.

* Add more in depth comment on call to Prepare
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gas estimation is unreliable in release v1.5.3 with espresso enabled
2 participants