adds feature to enable chained Merkle shreds #34916

Merged

behzadnouri
Contributor

Problem

During a cluster upgrade when only half of the cluster can ingest the new shred variant, sending shreds of the new variant can cause nodes to diverge.

Summary of Changes

Added feature to enable chained Merkle shreds explicitly.
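A minimal sketch of what gating the new variant behind a feature could look like, under stated assumptions: `ShredVariant` and `select_variant` are hypothetical names used only for illustration and are not taken from the actual shredder code. The boolean is assumed to come from a feature-activation check performed elsewhere.

```rust
/// Hypothetical, simplified stand-in for the two shred variants discussed in
/// this pull request. Not the real type from the ledger crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ShredVariant {
    /// Variant that every node in the cluster is assumed to understand.
    Merkle,
    /// New variant that only upgraded nodes are assumed to ingest.
    ChainedMerkle,
}

/// Picks the variant to produce for outgoing shreds.
///
/// `chained_feature_active` is assumed to be the result of a feature check;
/// until it is true, only the older variant is produced, so nodes that cannot
/// ingest the new variant are not sent shreds they would reject.
fn select_variant(chained_feature_active: bool) -> ShredVariant {
    if chained_feature_active {
        ShredVariant::ChainedMerkle
    } else {
        ShredVariant::Merkle
    }
}
```

With this shape, calling `select_variant(false)` keeps producing the older variant during a partial upgrade, and the switch happens only once the feature is reported as active.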


codecov bot commented Jan 23, 2024

Codecov Report

Attention: 46 lines in your changes are missing coverage. Please review.

Comparison is base (5da06c5) 81.6% compared to head (1e69ef5) 81.6%.
Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master   #34916     +/-   ##
=========================================
- Coverage    81.6%    81.6%   -0.1%     
=========================================
  Files         830      830             
  Lines      224746   224901    +155     
=========================================
+ Hits       183512   183609     +97     
- Misses      41234    41292     +58     

Contributor

@AshwinSekar AshwinSekar left a comment

just a couple nits

@@ -960,6 +964,7 @@ lazy_static! {
(enable_zk_proof_from_account::id(), "Enable zk token proof program to read proof from accounts instead of instruction data #34750"),
(curve25519_restrict_msm_length::id(), "restrict curve25519 multiscalar multiplication vector lengths #34763"),
(cost_model_requested_write_lock_cost::id(), "cost model uses number of requested write locks #34819"),
(enable_chained_merkle_shreds::id(), "Enable chained Merkle shreds #"),
Contributor

nit: pr number

Contributor Author

done

  #[must_use]
- fn should_drop_legacy_shreds(
+ fn check_feature_activation(
Contributor

could merge with cluster_nodes::check_feature_activation

Contributor Author

That one takes a root_bank, which is less verbose.
This one takes feature_set and epoch_schedule because holding onto the root bank here causes the issues described in #33078.

We can't make the cluster_nodes one call this one because that would add a dependency on the core crate, which we want to avoid.

Contributor

could we make the cluster_nodes one also use feature_set and epoch_schedule so that both callers can use it?
this code is also duplicated in duplicate_shred_handler, but because cluster_nodes is in solana-turbine it would have added a circular dependency.
perhaps the best thing would be to move this into feature_set so that everyone can use it

Contributor Author

> could we make the cluster_nodes one also use feature_set and epoch_schedule so that both callers can use it?

The expanded form here is an unfortunate consequence of #33078 and is both more verbose and less self-contained; I think the cluster_nodes one, which takes the root bank as its argument, is already the better code.

> perhaps the best thing would be to move this into feature_set so that everyone can use it

I wouldn't suggest doing that, because this one-epoch lag is only relevant when working with raw shreds and we don't want to encourage using it elsewhere.

Either way, this commit is only renaming existing code and adding a single argument. We can address the code duplication (which pre-exists this commit) separately.
We also need to keep the code change small, because if we are targeting v1.18 for these patches, then this needs to be backported.
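For readers following the thread, here is a rough sketch of a feature check that takes a feature set and an epoch schedule and applies a one-epoch lag. It is an illustration under assumptions, not the code from this commit: `FeatureSet` and `EpochSchedule` below are simplified, hypothetical stand-ins (a map from feature name to activation slot, and fixed-length epochs), and the real signature and types may differ.

```rust
use std::collections::HashMap;

type Slot = u64;
type Epoch = u64;

/// Hypothetical, simplified epoch schedule: every epoch has the same length.
/// `slots_per_epoch` is assumed to be non-zero.
struct EpochSchedule {
    slots_per_epoch: u64,
}

impl EpochSchedule {
    /// Epoch that contains the given slot.
    fn get_epoch(&self, slot: Slot) -> Epoch {
        slot / self.slots_per_epoch
    }
}

/// Hypothetical, simplified feature set: feature name -> activation slot.
struct FeatureSet {
    active: HashMap<&'static str, Slot>,
}

impl FeatureSet {
    /// Slot at which the feature was activated, if it has been activated.
    fn activated_slot(&self, feature: &str) -> Option<Slot> {
        self.active.get(feature).copied()
    }
}

/// Returns true only if the feature was activated in an epoch strictly
/// earlier than the epoch of the shred's slot (the one-epoch lag).
fn check_feature_activation(
    feature: &str,
    shred_slot: Slot,
    feature_set: &FeatureSet,
    epoch_schedule: &EpochSchedule,
) -> bool {
    match feature_set.activated_slot(feature) {
        None => false,
        Some(feature_slot) => {
            let feature_epoch = epoch_schedule.get_epoch(feature_slot);
            let shred_epoch = epoch_schedule.get_epoch(shred_slot);
            feature_epoch < shred_epoch
        }
    }
}
```

In this sketch, a feature activated at slot 250 with 100 slots per epoch is treated as inactive for shreds in slots 250 through 299 and as active from slot 300 onward. Passing the two pieces of state explicitly avoids holding a root bank, at the cost of a longer argument list, which is the trade-off discussed above.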

@behzadnouri behzadnouri force-pushed the enable-chained-merkle-shreds branch from f379050 to 1e69ef5 Compare January 26, 2024 16:46
@behzadnouri behzadnouri merged commit d4fdcd9 into solana-labs:master Jan 27, 2024
45 checks passed
@behzadnouri behzadnouri deleted the enable-chained-merkle-shreds branch January 27, 2024 15:03
@behzadnouri behzadnouri added the v1.18 PRs that should be backported to v1.18 label Feb 5, 2024
Contributor

mergify bot commented Feb 5, 2024

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

mergify bot pushed a commit that referenced this pull request Feb 5, 2024
During a cluster upgrade when only half of the cluster can ingest the new shred
variant, sending shreds of the new variant can cause nodes to diverge.
The commit adds a feature to enable chained Merkle shreds explicitly.

(cherry picked from commit d4fdcd9)
mergify bot added a commit that referenced this pull request Feb 5, 2024
…) (#35083)

adds feature to enable chained Merkle shreds (#34916)

During a cluster upgrade when only half of the cluster can ingest the new shred
variant, sending shreds of the new variant can cause nodes to diverge.
The commit adds a feature to enable chained Merkle shreds explicitly.

(cherry picked from commit d4fdcd9)

Co-authored-by: behzad nouri <[email protected]>