Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eth/fetcher: make blob transaction announcements skip the waitlist #30118

Conversation

roberto-bayardo
Copy link
Contributor

@roberto-bayardo roberto-bayardo commented Jul 4, 2024

Blob transactions are announce-only and never broadcast, so make the transaction fetcher immediately schedule blob transactions for fetching instead of waitlisting them. This should help blob transactions more quickly reach all nodes.

Testing:

  • extended unit tests to exercise these cases
  • running this PR on my home node, thus far without issue

@roberto-bayardo
Copy link
Contributor Author

I think this issue may explain why OP stack batching transactions sometimes fail to propagate when using blobs, particularly under high load. Alone this issue might seem to simply mean blob transactions would propagate a bit more slowly than otherwise. But consider also that (1) geth's mempool will reject blob transactions with "gapped nonces", and (2) the order in which multiple blob transactions from the same peer are scheduled to be fetched is non-deterministic.
Consider then the impact of a batcher sending, say, 5 backlogged batcher blob transactions to the network. Even if a node gets all 5 announcements right away, it sits on them for half a second, then randomly tries to fetch one. Only 1/5 chance I'll pick the one with the "correct" nonce and continue to propagate the transaction along the network.

Does this sound right?

Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good finding, but the solution in this PR seems not quite as we'd want it. Instead of special-casing blob txs, we should make all tx handling respect the time.

So, when waitTrigger triggers, we iterate the fetches in f.waittime, but we place it into a map and thus lose all ordering. It would be better to turn it into an order-respecting type, (like, slice), and order the fetches by instance. Then we'd keep the correct order, and keep waittime as it is.

@roberto-bayardo
Copy link
Contributor Author

This is a good finding, but the solution in this PR seems not quite as we'd want it. Instead of special-casing blob txs, we should make all tx handling respect the time.
...
It would be better to turn it into an order-respecting type, (like, slice), and order the fetches by instance. Then we'd keep the correct order, and keep waittime as it is.

I think there are two issues, the loss of ordering (which as you note I didn't address directly) and the fact that blob transactions are special since they are never broadcast. The latter is why I special-cased. By enforcing waiting on a blob tx, a node N hops from the original RPC node will not see the tx until at least N*0.5 seconds. I'd gather N is typically not that big, but this still feels problematic. Or is this working as intended?

I just noticed that geth also won't broadcast transactions above 4096 bytes, though if such a transaction is seen to be broadcast, it is not treated as an error.

@karalabe
Copy link
Member

karalabe commented Jul 4, 2024

By enforcing waiting on a blob tx, a node N hops from the original RPC node will not see the tx until at least N*0.5 seconds. I'd gather N is typically not that big, but this still feels problematic.

Good point. I light of that we should indeed not "wait" on it, though possibly it could be more elegantly done by inserting them with a -500ms arrival time (so we don't have two codepaths scheduling stuff).

@roberto-bayardo
Copy link
Contributor Author

Yeah I also considered "time travel" to implement this but had a slight preference for this approach which avoids some overhead and makes what's going on more explicit. Will think a bit about how enforcing a first-come-first-serve ordering might work and how it affects things, then circle back.

@roberto-bayardo
Copy link
Contributor Author

Attempt # 2 based on time-travel & announce-order here: #30125

LMK if that looks preferrable and I can close this one out.

// be in the wrong, but we don't know yet who.
// be in the wrong, but we don't know yet who. Note that the blob
// announcement gets waitlisted since its tx hash was seen earlier
// as non-blob type.
doTxNotify{peer: "D", hashes: []common.Hash{{0x01}, {0x02}}, types: []byte{types.LegacyTxType, types.BlobTxType}, sizes: []uint32{999, 222}},
isWaitingWithMeta(map[string][]announce{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I do string ?

@roberto-bayardo roberto-bayardo force-pushed the fetch-blob-txs-immediately branch from 558d95d to accbd61 Compare July 22, 2024 21:04
@roberto-bayardo
Copy link
Contributor Author

replaced with #30125

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants