Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry custody requests after peer metadata updates #6975

Open
wants to merge 3 commits into
base: unstable
Choose a base branch
from

Conversation

dapplion
Copy link
Collaborator

Issue Addressed

Closes #6895

We need sync to retry custody requests when a peer CGC updates. A higher CGC can result in a data column subnet peer count increasing from 0 to 1, allowing requests to happen.

Proposed Changes

Add new sync event SyncMessage::UpdatedPeerCgc. It's sent by the router when a metadata response updates the known CGC

@dapplion dapplion requested a review from jxs as a code owner February 11, 2025 02:48
@dapplion dapplion requested a review from jimmygchen February 11, 2025 02:48
@dapplion dapplion added ready-for-review The code is ready for review das Data Availability Sampling labels Feb 11, 2025
Copy link
Member

@jimmygchen jimmygchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me! I think we need to attempt to make progress on range sync as well?

beacon_node/lighthouse_network/src/peer_manager/mod.rs Outdated Show resolved Hide resolved
@@ -483,6 +486,13 @@ impl<T: BeaconChainTypes> SyncManager<T> {
}
}

fn updated_peer_cgc(&mut self, _peer_id: PeerId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to resume by range request as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, I resume range sync aswell

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I'll run some tests today to confirm this fixes the issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't look like this fixes the issue. I still don't see range requests until a few minutes later until we add a new finalized chain.

Right after startup, waiting for custody peers

Feb 12 04:33:07.181 DEBG Waiting for peers to be available on sampling column subnets, chain: 1, service: range_sync, service: sync, module: network::sync::range_sync::chain:1057

Got peer metadata response after 15s

Feb 12 04:33:22.271 DEBG Obtained peer's metadata, new_seq_no: 6, peer_id: 16Uiu2HAmJyaVGkRGR9ACqompkoge8T2x4KFH4KbzDkh7zz6uN2JX, service: libp2p, module: lighthouse_network::peer_manager:732

No range requests until ~5 mins later

Feb 12 04:38:07.303 DEBG Finalization sync peer joined, peer_id: 16Uiu2HAmJyaVGkRGR9ACqompkoge8T2x4KFH4KbzDkh7zz6uN2JX, service: range_sync, service: sync, module: network::sync::range_sync::range:143
Feb 12 04:38:07.305 DEBG New chain added to sync, id: 2, from: 38, to: 1071, end_root: 0x3be00d7ce6e52f7938fd588d909055f72469d0f09ce545d7b23077f2d6b40e8a, current_target: 38, batches: 0, peers: 1, state: Stopped, sync_type: Finalized, peer_id: 16Uiu2HAmJyaVGkRGR9ACqompk
oge8T2x4KFH4KbzDkh7zz6uN2JX, service: range_sync, service: sync, module: network::sync::range_sync::chain_collection:506
Feb 12 04:38:07.306 DEBG Sync RPC request sent, id: 4/3/RangeSync/39/1, peer: 16Uiu2HAmAAZ5wP6fvpe1b9tWNmgA2Wn8MsrNYVhkw5WohcAoaHKR, epoch: 39, slots: 32, method: BlocksByRange, service: sync, module: network::sync::network_context:788
Feb 12 04:38:07.307 DEBG Sync RPC request sent, id: 5/3/RangeSync/39/1, peer: 16Uiu2HAm9PijSZpm5QUphXRoBtkhUZPkGJ4Rgxk4Bny91oZPYZLG, columns: [74, 30, 39, 19, 63, 41, 52, 47, 58], epoch: 39, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you saw this log? Received updated peer CGC message

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall seeing this. I think I was using the right locally-built image, but can be worth re-testing to confirm if you have time.

@jimmygchen jimmygchen added waiting-on-author The reviewer has suggested changes and awaits thier implementation. peerdas-devnet-4 and removed ready-for-review The code is ready for review labels Feb 11, 2025
@dapplion dapplion added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Feb 11, 2025
@dapplion dapplion requested a review from jimmygchen February 11, 2025 17:00
@jimmygchen jimmygchen added under-review A reviewer has only partially completed a review. and removed ready-for-review The code is ready for review labels Feb 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
das Data Availability Sampling peerdas-devnet-4 under-review A reviewer has only partially completed a review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants