sync: ensure mesh data are sync'ed by comparing states #3987

Open
countvonzero opened this issue Jan 22, 2023 · 1 comment

Description

currently, when a node syncs data from peers, it does the following:

  • sync all ATXs from genesis
  • starting from the latest layer (the latest ballot recorded in the database), sync ballots and blocks from peers

syncing works by first fetching all the ID hashes. for every ID hash that is not present in the local database, the node then fetches the data blob referenced by that hash from peers, in batches.
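
The hash-then-blob flow described above can be sketched roughly as follows. This is only an illustration, not the node's actual code: `missing_hashes`, `fetch_in_batches`, and the `fetch_batch` callback are hypothetical names, and the batch size is an arbitrary assumption.

```python
from typing import Callable, Dict, Iterable, List, Set


def missing_hashes(peer_hashes: Iterable[bytes], local: Set[bytes]) -> List[bytes]:
    """Return peer-advertised hashes that are absent locally (order kept, duplicates dropped)."""
    seen: Set[bytes] = set()
    out: List[bytes] = []
    for h in peer_hashes:
        if h not in local and h not in seen:
            seen.add(h)
            out.append(h)
    return out


def fetch_in_batches(
    hashes: List[bytes],
    fetch_batch: Callable[[List[bytes]], Dict[bytes, bytes]],  # hypothetical peer request
    batch_size: int = 100,  # assumed value, not taken from the issue
) -> Dict[bytes, bytes]:
    """Request the blobs for `hashes` from a peer, `batch_size` hashes per request."""
    blobs: Dict[bytes, bytes] = {}
    for i in range(0, len(hashes), batch_size):
        blobs.update(fetch_batch(hashes[i:i + batch_size]))
    return blobs
```

Note that the cost of the first step grows with the total number of hashes the peer advertises, not with the number of hashes that are actually missing.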

once the node is synced, it starts listening to gossiped data (ATXs and ballots). the sync module then acts as a fallback and syncs data from the prior layer.

there are a few problems with this approach

  • it is inefficient: a node that comes back online after being offline for a while should not need to sync from genesis
  • it is not precise: if a node somehow misses or drops gossiped data, and the fallback sync does not get that data from its direct peers, then the node will end up with a different total weight from other nodes.

with #3915, syncing MalfeasanceProofs suffers from the same issues.
worse, because they correspond to identities, there is no clear boundary to limit the number of MalfeasanceProofs to download from peers.

ideas

from @tal-m

I think this should be solved at the P2P layer. For example, you could add the total ballot weight to the sync;
this way, if your total weight is less than your peer's, you know you're missing ballots. You can also add a hash
of the ballots from old layers (which you expect won't change often), and use that to make absolutely sure
you're synced with your peer about ballots (we need to be careful not to make this an avenue for DoS, though,
since malicious parties can create new, backdated ballots just to mess with sync)
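
A minimal sketch of the weight-plus-hash comparison suggested above might look like this. The `(ballot ID, weight)` pairs, the `LayerSummary` type, and the returned labels are all assumptions made for illustration; the issue does not specify a wire format or a hash construction.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, Tuple

Ballot = Tuple[bytes, int]  # hypothetical shape: (ballot ID, weight)


@dataclass(frozen=True)
class LayerSummary:
    total_weight: int
    ballots_hash: bytes


def summarize(ballots: Iterable[Ballot]) -> LayerSummary:
    """Summarize a set of ballots as (total weight, order-independent hash of IDs)."""
    by_id = dict(ballots)  # drop duplicate IDs
    digest = hashlib.sha256()
    for ballot_id in sorted(by_id):
        # Length-prefix each ID so concatenation is unambiguous.
        digest.update(len(ballot_id).to_bytes(4, "big"))
        digest.update(ballot_id)
    return LayerSummary(sum(by_id.values()), digest.digest())


def compare(local: LayerSummary, peer: LayerSummary) -> str:
    """Classify the local state relative to a peer's summary."""
    if local.ballots_hash == peer.ballots_hash:
        return "in sync"
    if local.total_weight < peer.total_weight:
        return "missing ballots"
    return "differs"  # the sets differ, but the weight alone does not say who is behind
```

The summary only says whether two peers disagree; it does not say which ballots are missing, so a second mechanism is still needed to locate the difference.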

from @dshulyak

^^ this approach makes sense to me for syncing malicious identities. i don't understand the idea of
downloading identities in every sync iteration; maybe it works as a simplification for genesis

for example, this is how Cassandra syncs state with replicas, using a Merkle tree to locate missing data
efficiently:
https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsRepairNodesManualRepair.html#Howdoesanti-entropyrepairwork
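
To illustrate the Merkle-tree idea, here is a small sketch that hashes items into a fixed number of buckets, builds a tree over the bucket hashes, and walks two trees from the root to find the buckets that differ. It is not Cassandra's implementation; the bucket count, the bucketing rule, and all function names are assumptions.

```python
import hashlib
from typing import Iterable, List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(item: bytes, leaves: int) -> int:
    """Assign an item to a leaf bucket by its hash (assumed bucketing rule)."""
    return int.from_bytes(_h(item)[:4], "big") % leaves


def build_tree(items: Iterable[bytes], leaves: int = 8) -> List[List[bytes]]:
    """Return the tree as a list of levels, root first. `leaves` must be a power of two."""
    buckets: List[List[bytes]] = [[] for _ in range(leaves)]
    for item in set(items):
        buckets[bucket_of(item, leaves)].append(item)
    # Leaf hash covers the sorted item hashes of the bucket.
    level = [_h(b"".join(_h(i) for i in sorted(bucket))) for bucket in buckets]
    levels = [level]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels[::-1]


def differing_buckets(a: List[List[bytes]], b: List[List[bytes]]) -> List[int]:
    """Descend only into subtrees whose hashes differ; return differing leaf indices."""
    suspects = [0]
    for depth in range(len(a)):
        suspects = [i for i in suspects if a[depth][i] != b[depth][i]]
        if depth + 1 < len(a):
            suspects = [child for i in suspects for child in (2 * i, 2 * i + 1)]
    return suspects
```

Only the items in the differing buckets then need to be exchanged, so the work is proportional to the amount of disagreement rather than to the size of the whole data set.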


bors bot pushed a commit that referenced this issue Jan 25, 2023
## Motivation
Closes #3920

## Changes
the syncer syncs MalfeasanceProofs after ATXs are synced. they have to be synced after ATXs to avoid being spammed by unknown identities.

they are synced as follows:
- poll peers for all the malicious NodeIDs
- if those NodeIDs exist, fetch the malfeasance proofs associated with each NodeID hash, in batches

this implementation is sub-optimal (see #3987) and is intended to be a genesis compromise
bors bot pushed a commit that referenced this issue Jan 25, 2023
bors bot pushed a commit that referenced this issue Jan 26, 2023
bors bot pushed a commit that referenced this issue Jan 26, 2023
bors bot pushed a commit that referenced this issue Jan 27, 2023
bors bot pushed a commit that referenced this issue Jan 27, 2023
dshulyak removed the sync label Sep 26, 2023
dshulyak moved this to 📋 Backlog in Dev team kanban Nov 20, 2023

ivan4th commented Dec 21, 2023

Some sync approaches that might enable quick sync between pairs of peers (assuming the symmetric difference between their sets of ATXs, malfeasance proofs, etc. is small) while also ensuring good propagation of the state across the network:

  • Graphene: IBLT + Bloom filters. Several implementations exist, but mostly in either unmaintained projects or unmerged PRs (I may be wrong). From what I see, it requires some careful selection of IBLT parameters
  • A rate-compatible solution to the set reconciliation problem: a newer approach using multi-edge-type (MET) IBLTs that doesn't require estimating the size of the symmetric difference between two sets. Didn't find any implementations so far
  • Range-based Set Reconciliation (also a simpler explanation): a promising technique requiring fewer computational resources, but also a number of communication rounds logarithmic in the size of the symmetric difference between sets. Has several working implementations in actively maintained projects
  • SREP: Out-Of-Band Sync of Transaction Pools for Large-Scale Blockchains: discusses a network-wide sync algorithm for transaction pools (works on top of any pairwise peer sync algorithm) which might be useful
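
As a rough illustration of the range-based approach from the list above, the sketch below simulates both peers in one process: it compares fingerprints of a key range, and only when they differ does it split the range and recurse, exchanging items directly once a range is small. The XOR-of-hashes fingerprint, the `leaf_size` threshold, and all names are assumptions for this sketch; in particular, a plain XOR fingerprint is not safe against adversarially chosen items.

```python
import hashlib
from bisect import bisect_left
from typing import List, Optional, Set, Tuple


def fingerprint(items: List[bytes]) -> bytes:
    """XOR of SHA-256 digests: order-independent (assumes items are unique)."""
    acc = 0
    for item in items:
        acc ^= int.from_bytes(hashlib.sha256(item).digest(), "big")
    return acc.to_bytes(32, "big")


def in_range(items: List[bytes], lo: Optional[bytes], hi: Optional[bytes]) -> List[bytes]:
    """Items x of a sorted list with lo <= x < hi; None means unbounded."""
    start = 0 if lo is None else bisect_left(items, lo)
    end = len(items) if hi is None else bisect_left(items, hi)
    return items[start:end]


def reconcile(
    a: List[bytes],
    b: List[bytes],
    lo: Optional[bytes] = None,
    hi: Optional[bytes] = None,
    leaf_size: int = 2,
) -> Tuple[Set[bytes], Set[bytes]]:
    """Return (items only in a, items only in b). Both lists must be sorted and unique."""
    xs, ys = in_range(a, lo, hi), in_range(b, lo, hi)
    if len(xs) == len(ys) and fingerprint(xs) == fingerprint(ys):
        return set(), set()  # this range matches; nothing to transfer
    if len(xs) <= leaf_size or len(ys) <= leaf_size:
        return set(xs) - set(ys), set(ys) - set(xs)  # small range: send items directly
    mid = xs[len(xs) // 2]  # split at the median of a's items in this range
    left_a, left_b = reconcile(a, b, lo, mid, leaf_size)
    right_a, right_b = reconcile(a, b, mid, hi, leaf_size)
    return left_a | right_a, left_b | right_b
```

In a real protocol each level of recursion corresponds to a message exchange between the two peers, which is where the round count comes from.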
