Ensure pending state is ready for tx execution #1858

piersy · 2022-02-25T14:27:36Z

Before introducing this change gas estimations against the pending block on non validating nodes with the espresso fork enabled would sometimes be lower than they should be.

What this PR does is better described here

Its still not clear to me exactly how the lower gas estimates were being calculated but ensuring that the pending block is finalized and assembled seems to resolve this. Validating nodes were already finalizing and assembling their pending blocks and looking back it seems that prior to #1545 the pending block was always finalized and assembled (see permalink below). So this seems like this change is correcting a long running problem.

celo-blockchain/miner/worker.go

Line 969 in ab78658

w.updateSnapshot()

~~This change also changes the behaviour of the debug tracing apis when used run against the pending block.~~

Since #1545 caused the pending block to not be finalized, the state root in the pending block was zero. Strangely, looking up the zero root in the state trie doesn't fail but the state returned is essentially empty, meaning that all transactions would be traced as a simple send ( because that is how a contract transaction executes if the contract does not exist).

~~With this change, calls to the tracing apis with the pending block will return the error required historical state unavailable.~~

This is because the pending block will have been finalized and will therefore have a state root, and the tracing apis first look up the block and then try to get the state for it using Ethereum.StateAtBlock which doesn't take into account the pending state and therefore the state cannot be found.

Tested

I have verified that the tests linked in #1856 produce reliable gas estimates running against a local node connected to a 1 validator network run with mycelo and also on a fullnode deployed in alfajores.

Backwards compatibility

The results of the rpc apis are chainging.

piersy · 2022-02-25T14:33:13Z

Coverage from tests in ./e2e_test/... for ./consensus/istanbul/... at commit fefe91b

coverage: 51.9% of statements across all listed packages

coverage:  65.4% of statements in consensus/istanbul
coverage:  43.1% of statements in consensus/istanbul/announce
coverage:  55.9% of statements in consensus/istanbul/backend
coverage:   0.0% of statements in consensus/istanbul/backend/backendtest
coverage:  24.3% of statements in consensus/istanbul/backend/internal/replica
coverage:  71.0% of statements in consensus/istanbul/core
coverage:  50.0% of statements in consensus/istanbul/db
coverage:   0.0% of statements in consensus/istanbul/proxy
coverage:  75.3% of statements in consensus/istanbul/uptime
coverage: 100.0% of statements in consensus/istanbul/uptime/store
coverage:  51.8% of statements in consensus/istanbul/validator
coverage:  79.2% of statements in consensus/istanbul/validator/random

CommentID: cc27fd229a

nategraf · 2022-02-26T00:51:40Z

Nice detective work so far!

I don't think calling finalizeAndAssemble before the last transaction is applied results in well-defined behavior. It ends up mutating the state DB held by the working in a number of ways, such as distributing epoch rewards, and those transactions are intended to be the last step in assembling a block. Applying additional transactions on top of that will result in behavior that is divergent from what would happen during on-chain execution of a block with the same contents (e.g. a transaction running after the GPM has changed by be valid while the real execution will end up rejecting the transaction because the GPM changes at the end of the block). More critically, this code seems to result in calling finalization multiple times on the same state DB, which is not intended.

celo-blockchain/consensus/istanbul/backend/engine.go

Line 491 in 6ad68f2

    
           func (sb *Backend) Finalize(chain consensus.ChainHeaderReader, header *types.Header, state *state.StateDB, txs []*types.Transaction) {

I'm guessing that the key in this actually the following line:

celo-blockchain/consensus/istanbul/backend/engine.go

Line 529 in 6ad68f2

header.Root = state.IntermediateRoot(chain.Config().IsEIP158(header.Number))

Or possibly the call to state.Prepare:

celo-blockchain/consensus/istanbul/backend/engine.go

Line 503 in 6ad68f2

state.Prepare(common.Hash{}, len(txs))

Both of these mutate the state DB, and the first of the two modifies the header as well.

piersy · 2022-02-28T11:26:34Z

Thank you @nategraf.

I decided to look at the TraceBlock api, because trace block is tracing each tx in the block and has to prepare some intermediate state for each tx such that the tx traces correctly. And that is I think what we need to achieve here.

Below is the code block that prepares a state for tracing a tx.

celo-blockchain/eth/tracers/api.go

Lines 561 to 576 in 6ad68f2

    
           for i, tx := range txs { 
        
           	// Send the trace task over for execution 
        
           	jobs <- &txTraceTask{statedb: statedb.Copy(), index: i} 
        
           	// Generate the next state snapshot fast without tracing 
        
           	msg, _ := tx.AsMessage(signer, nil) 
        
           	statedb.Prepare(tx.Hash(), i) 
        
           	vmenv := vm.NewEVM(blockCtx, core.NewEVMTxContext(msg), statedb, api.backend.ChainConfig(), vm.Config{}) 
        
           	vmRunner := api.backend.VmRunnerAtHeader(block.Header(), statedb) 
        
           	if _, err := core.ApplyMessage(vmenv, msg, new(core.GasPool).AddGas(msg.Gas()), vmRunner, sysCtx); err != nil { 
        
           		failed = err 
        
           		break 
        
           	} 
        
           	// Finalize the state so any modifications are written to the trie 
        
           	// Only delete empty objects if EIP158/161 (a.k.a Spurious Dragon) is in effect 
        
           	statedb.Finalise(vmenv.ChainConfig().IsEIP158(block.Number()))

There seem to be 2 things that are done to the state in preparation for tracing.

A call to statedb.Prepare(txHash, txIndex)
A call to statedb.Finalise(isEIP158)

The prepare implementation is below, and we can see that it creates an accessLIst which was added as part of EIP-2929 (gas cost increases for state access opcodes). The summary for this eip is:

Increases gas cost for SLOAD, CALL, BALANCE, EXT and SELFDESTRUCT when used for the first time in a transaction.

So if Prepare is not called between transaction executions you would expect to see the second transaction pay less gas than it should need to, if it accessed some of the same state as the prior transaction. This seems very much like the problem we are facing.

celo-blockchain/core/state/statedb.go

Lines 884 to 888 in 6ad68f2

    
           func (s *StateDB) Prepare(thash common.Hash, ti int) { 
        
           	s.thash = thash 
        
           	s.txIndex = ti 
        
           	s.accessList = newAccessList() 
        
           }

The Finalise implementation is below and it is significantly more complicated than Prepare it seems to be doing some sort of internal maintenance on the state representation, but the last line seems significant, the comment states:

Invalidate journal because reverting across transactions is not allowed.

Based on that it would seem that failing to call Finalise between transactions could result in a revert reverting the changes from prior transactions, so something to bear in mind if executing multiple transactions across a state instance.

celo-blockchain/core/state/statedb.go

Lines 775 to 819 in 6ad68f2

    
           // Finalise finalises the state by removing the s destructed objects and clears 
        
           // the journal as well as the refunds. Finalise, however, will not push any updates 
        
           // into the tries just yet. Only IntermediateRoot or Commit will do that. 
        
           func (s *StateDB) Finalise(deleteEmptyObjects bool) { 
        
           	addressesToPrefetch := make([][]byte, 0, len(s.journal.dirties)) 
        
           	for addr := range s.journal.dirties { 
        
           		obj, exist := s.stateObjects[addr] 
        
           		if !exist { 
        
           			// ripeMD is 'touched' at block 1714175, in tx 0x1237f737031e40bcde4a8b7e717b2d15e3ecadfe49bb1bbc71ee9deb09c6fcf2 
        
           			// That tx goes out of gas, and although the notion of 'touched' does not exist there, the 
        
           			// touch-event will still be recorded in the journal. Since ripeMD is a special snowflake, 
        
           			// it will persist in the journal even though the journal is reverted. In this special circumstance, 
        
           			// it may exist in `s.journal.dirties` but not in `s.stateObjects`. 
        
           			// Thus, we can safely ignore it here 
        
           			continue 
        
           		} 
        
           		if obj.suicided || (deleteEmptyObjects && obj.empty()) { 
        
           			obj.deleted = true 
        
           			// If state snapshotting is active, also mark the destruction there. 
        
           			// Note, we can't do this only at the end of a block because multiple 
        
           			// transactions within the same block might self destruct and then 
        
           			// ressurrect an account; but the snapshotter needs both events. 
        
           			if s.snap != nil { 
        
           				s.snapDestructs[obj.addrHash] = struct{}{} // We need to maintain account deletions explicitly (will remain set indefinitely) 
        
           				delete(s.snapAccounts, obj.addrHash)       // Clear out any previously updated account data (may be recreated via a ressurrect) 
        
           				delete(s.snapStorage, obj.addrHash)        // Clear out any previously updated storage data (may be recreated via a ressurrect) 
        
           			} 
        
           		} else { 
        
           			obj.finalise(true) // Prefetch slots in the background 
        
           		} 
        
           		s.stateObjectsPending[addr] = struct{}{} 
        
           		s.stateObjectsDirty[addr] = struct{}{} 
        
           		// At this point, also ship the address off to the precacher. The precacher 
        
           		// will start loading tries, and when the change is eventually committed, 
        
           		// the commit-phase will be a lot faster 
        
           		addressesToPrefetch = append(addressesToPrefetch, common.CopyBytes(addr[:])) // Copy needed for closure 
        
           	} 
        
           	if s.prefetcher != nil && len(addressesToPrefetch) > 0 { 
        
           		s.prefetcher.prefetch(s.originalRoot, addressesToPrefetch) 
        
           	} 
        
           	// Invalidate journal because reverting across transactions is not allowed. 
        
           	s.clearJournalAndRefund() 
        
           }

Looking into the miner code that manages the pending block I can see that both Prepare and Finalise are being called, Prepare before executing a tx and Finalise after. So I think all that is required for us is to ensure that we call prepare on the state returned from the worker's pending method before beginning gas estimation. We shouldn't need to worry about calling finalise since the worker returns a copy of the state and estimate gas executes just one transaction against it.

This commit ensures that StateDB.Prepare is called on the pending state before it is returned from the miner/worker. This prevents access logs from previous transaction executions from interfering with the gas cost calculations of the subsequently executed transaction. The problem this solves was introdued by this upstream PR ethereum/go-ethereum#21509 which added an access list to the state which was used to reduce gas costs for repeated access of the same state locations, this resulted in the pending block having the access list of the last executed transaction, which could cause gas estimates to be wrong when the estimated transaction accessed some of the same state as the prior transaction.

codecov · 2022-02-28T20:09:36Z

Codecov Report

Merging #1858 (fefe91b) into master (503f156) will increase coverage by 0.00%.
The diff coverage is 0.00%.

@@           Coverage Diff           @@
##           master    #1858   +/-   ##
=======================================
  Coverage   54.23%   54.23%           
=======================================
  Files         673      673           
  Lines       88991    88993    +2     
=======================================
+ Hits        48262    48268    +6     
+ Misses      37090    37077   -13     
- Partials     3639     3648    +9

Impacted Files	Coverage Δ
miner/worker.go	`68.07% <0.00%> (-0.65%)`	⬇️
p2p/discover/ntp.go	`66.66% <0.00%> (-9.53%)`	⬇️
eth/downloader/statesync.go	`57.14% <0.00%> (-5.11%)`	⬇️
les/vflux/client/wrsiterator.go	`92.59% <0.00%> (-3.71%)`	⬇️
core/state/trie_prefetcher.go	`76.00% <0.00%> (-3.34%)`	⬇️
les/distributor.go	`78.62% <0.00%> (-1.38%)`	⬇️
les/odr.go	`89.79% <0.00%> (-1.03%)`	⬇️
trie/proof.go	`75.57% <0.00%> (-1.00%)`	⬇️
les/vflux/server/prioritypool.go	`82.68% <0.00%> (-0.84%)`	⬇️
eth/fetcher/block_fetcher.go	`83.37% <0.00%> (-0.50%)`	⬇️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 503f156...fefe91b. Read the comment docs.

* Enusre pending state is ready for tx execution This commit ensures that StateDB.Prepare is called on the pending state before it is returned from the miner/worker. This prevents access logs from previous transaction executions from interfering with the gas cost calculations of the subsequently executed transaction. The problem this solves was introdued by this upstream PR ethereum/go-ethereum#21509 which added an access list to the state which was used to reduce gas costs for repeated access of the same state locations, this resulted in the pending block having the access list of the last executed transaction, which could cause gas estimates to be wrong when the estimated transaction accessed some of the same state as the prior transaction. * Add more in depth comment on call to Prepare

* Ensure pending state is ready for tx execution (#1858) * Enusre pending state is ready for tx execution This commit ensures that StateDB.Prepare is called on the pending state before it is returned from the miner/worker. This prevents access logs from previous transaction executions from interfering with the gas cost calculations of the subsequently executed transaction. The problem this solves was introdued by this upstream PR ethereum/go-ethereum#21509 which added an access list to the state which was used to reduce gas costs for repeated access of the same state locations, this resulted in the pending block having the access list of the last executed transaction, which could cause gas estimates to be wrong when the estimated transaction accessed some of the same state as the prior transaction. * Add more in depth comment on call to Prepare * set version to 1.5.4-stable Co-authored-by: piersy <[email protected]>

* Enusre pending state is ready for tx execution This commit ensures that StateDB.Prepare is called on the pending state before it is returned from the miner/worker. This prevents access logs from previous transaction executions from interfering with the gas cost calculations of the subsequently executed transaction. The problem this solves was introdued by this upstream PR ethereum/go-ethereum#21509 which added an access list to the state which was used to reduce gas costs for repeated access of the same state locations, this resulted in the pending block having the access list of the last executed transaction, which could cause gas estimates to be wrong when the estimated transaction accessed some of the same state as the prior transaction. * Add more in depth comment on call to Prepare

piersy requested a review from nategraf February 25, 2022 23:20

piersy changed the title ~~Ensure pending block is finalized and assembled~~ Ensure pending state is ready for tx execution Feb 28, 2022

piersy marked this pull request as ready for review February 28, 2022 14:53

piersy requested a review from a team as a code owner February 28, 2022 14:53

piersy requested review from gastonponti and removed request for a team February 28, 2022 14:53

piersy added 2 commits February 28, 2022 19:07

Add more in depth comment on call to Prepare

fefe91b

piersy force-pushed the piersy/fix-espresso-low-gas-estimates branch from 94bdc9f to fefe91b Compare February 28, 2022 19:08

piersy changed the base branch from release/1.5.x to master February 28, 2022 19:08

nategraf approved these changes Feb 28, 2022

View reviewed changes

nategraf merged commit d67d222 into master Feb 28, 2022

nategraf deleted the piersy/fix-espresso-low-gas-estimates branch February 28, 2022 20:09

nategraf mentioned this pull request Feb 28, 2022

Cherry pick #1858 to release/1.5.x #1861

Merged

piersy mentioned this pull request Mar 2, 2022

miner: Ensure pending state ready for tx execution ethereum/go-ethereum#24492

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ensure pending state is ready for tx execution #1858

Ensure pending state is ready for tx execution #1858

piersy commented Feb 25, 2022 •

edited

Loading

piersy commented Feb 25, 2022 •

edited

Loading

nategraf commented Feb 26, 2022

piersy commented Feb 28, 2022 •

edited

Loading

codecov bot commented Feb 28, 2022

Ensure pending state is ready for tx execution #1858

Ensure pending state is ready for tx execution #1858

Conversation

piersy commented Feb 25, 2022 • edited Loading

Tested

Backwards compatibility

piersy commented Feb 25, 2022 • edited Loading

nategraf commented Feb 26, 2022

piersy commented Feb 28, 2022 • edited Loading

codecov bot commented Feb 28, 2022

Codecov Report

piersy commented Feb 25, 2022 •

edited

Loading

piersy commented Feb 25, 2022 •

edited

Loading

piersy commented Feb 28, 2022 •

edited

Loading