Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make X-chain mempool safe for concurrent use #2520

Merged
merged 4 commits into from
Dec 20, 2023

Conversation

StephenButtolph
Copy link
Contributor

Why this should be merged

The X-chain mempool currently requires the context lock to be held. This removes that requirement to allow easier integration with the p2p SDK.

How this works

A number of the structures used by the mempool are already safe for concurrent use. This PR adds locking where required to provide a consistent interface along with preventing racy behavior for existing structures that are not safe for concurrent use.

How this was tested

  • CI

@StephenButtolph StephenButtolph added enhancement New feature or request vm This involves virtual machines Durango durango fork labels Dec 20, 2023
@StephenButtolph StephenButtolph added this to the v1.10.18 milestone Dec 20, 2023
@StephenButtolph StephenButtolph self-assigned this Dec 20, 2023
Comment on lines -47 to -48
Has(txID ids.ID) bool
Get(txID ids.ID) *txs.Tx
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I think having Has and Get on the mempool isn't very async friendly because the mempool is now expected to change concurrently with interactions with it.

Comment on lines +53 to +55
// Iterate over transactions from oldest to newest until the function
// returns false or there are no more transactions.
Iterate(f func(tx *txs.Tx) bool)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to support pull gossip

Comment on lines +68 to +71
lock sync.RWMutex
unissuedTxs linkedhashmap.LinkedHashmap[ids.ID, *txs.Tx]
consumedUTXOs set.Set[ids.ID]
bytesAvailable int
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I restructured the order of the fields so that all the fields protected by the lock could be cleanly put under it.

tx, _ := m.unissuedTxs.Get(txID)
return tx
func (m *mempool) Get(txID ids.ID) (*txs.Tx, bool) {
tx, ok := m.unissuedTxs.Get(txID)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LinkedHashMaps are already async safe.

Comment on lines +191 to +192
m.lock.RLock()
defer m.lock.RUnlock()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock is needed here to honor the Iter invariant that the linked hash map isn't arbitrarily modified while iterating.

}
}

func createTestTxs(count int) []*txs.Tx {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function must have been a copy paste from somewhere... because it is massively overkill for the tests in this file

Comment on lines +217 to 221
if _, ok := m.unissuedTxs.Get(txID); ok {
return
}

m.droppedTxIDs.Put(txID, reason)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt like it made the most sense to enforce that a tx wasn't considered dropped if it was in the mempool. I feel like this was the original idea (notice how we evict from droppedTxIDs in Add) but we probably just forgot to have the corresponding check here as well.

Interesting, previously we would insert errDuplicateTx here... which is especially weird.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can/should we clean this up as a result of this change?

	// We need to grab the context lock here to avoid racy behavior with
	// transaction verification + mempool modifications.
	//
	// Invariant: tx should not be referenced again without the context lock
	// held to avoid any data races.
	n.ctx.Lock.Lock()
	err = n.issueTx(tx)
	n.ctx.Lock.Unlock()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmmm I suppose we may be able to reduce the scope of the lock... We still need the lock held when verifying the tx.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll leave any larger changes to the network struct to when we are introducing the p2p SDK.

})
if err := registerer.Register(numTxsMetric); err != nil {
return nil, err
m := &mempool{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an aesthetically pleasing cleanup- nicely done

@StephenButtolph StephenButtolph added this pull request to the merge queue Dec 20, 2023
Merged via the queue into dev with commit 0d0ac62 Dec 20, 2023
17 checks passed
@StephenButtolph StephenButtolph deleted the concurrent-x-chain-mempool branch December 20, 2023 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Durango durango fork enhancement New feature or request vm This involves virtual machines
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants