Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(state): Fix the deduplication of note commitment trees #7379

Merged
merged 21 commits into from
Aug 28, 2023
Merged

Conversation

upbqdn
Copy link
Member

@upbqdn upbqdn commented Aug 24, 2023

Motivation & Solution

  • Refactor the deduplication.
  • Include a test that checks that there are no duplicates.

This PR merges #7365.

Reviewer Checklist

  • Will the PR name make sense to users?
    • Does it need extra CHANGELOG info? (new features, breaking changes, large changes)
  • Are the PR labels correct?
  • Does the code do what the ticket and PR says?
    • Does it change concurrent code, unsafe code, or consensus rules?
  • How do you know it works? Does it have tests?

@upbqdn upbqdn added C-bug Category: This is a bug A-rust Area: Updates to Rust code P-High 🔥 A-state Area: State / database changes labels Aug 24, 2023
@upbqdn upbqdn requested a review from a team as a code owner August 24, 2023 14:28
@upbqdn upbqdn self-assigned this Aug 24, 2023
@upbqdn upbqdn requested review from arya2 and teor2345 and removed request for a team August 24, 2023 14:28
teor2345

This comment was marked as resolved.

@teor2345

This comment was marked as resolved.

@teor2345

This comment was marked as resolved.

@teor2345
Copy link
Contributor

With this PR, #7385, and #7386, it takes about 35 seconds on my machine to upgrade a fully synced mainnet state and do the runtime tests.

The sync is launched concurrently, but the network hasn't even fully started by then, so I think that performance is acceptable.

On testnet it takes about 10 seconds.

@teor2345
Copy link
Contributor

When I run using all 3 PRs on an old mainnet state that was synced to 1719530, the deduplication takes about 7 seconds.

So I think we're fine for performance here.

upbqdn added 2 commits August 25, 2023 17:42
These checks are superseded by `check_for_duplicate_trees`
@upbqdn
Copy link
Member Author

upbqdn commented Aug 25, 2023

Arya told me there were some performance issues in the initial version of this PR.

The issue was that I was iterating over the trees using next_back for each tree. I never ran the upgrade for more than 12 hours, but it would take way longer than that.

previous state versions didn't cache/serialize the root for empty trees? Or was that just the genesis tree?

That was just the genesis tree. The rest of the trees were stored with their cached roots.

@teor2345
Copy link
Contributor

Arya told me there were some performance issues in the initial version of this PR.

The issue was that I was iterating over the trees using next_back for each tree. I never ran the upgrade for more than 12 hours, but it would take way longer than that.

previous state versions didn't cache/serialize the root for empty trees? Or was that just the genesis tree?

That was just the genesis tree. The rest of the trees were stored with their cached roots.

Did we update the genesis tree to have a cached root as part of this upgrade or the previous one?

I don't know how often the genesis root is accessed, but it seems like a quick fix for a significant user-triggerable performance/security issue. (It should probably go in a separate PR so we can get this one merged.)

@upbqdn
Copy link
Member Author

upbqdn commented Aug 28, 2023

Did we update the genesis tree to have a cached root as part of this upgrade or the previous one?

We didn't.

Copy link
Contributor

@arya2 arya2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me once we merge in #7391.

What's this test failure:

2023-08-28T02:53:02.693713Z ERROR zebra_state::service::finalized_state::disk_format::upgrade: found duplicate sapling trees after running de-duplicate tree upgrade height=Height(25900) prev_height=Height(25899) tree_root=Root("fbc2f4300c01f0b7820d00e3347c8da4ee614674376cbc45359daa54f9b5493e")
https://github.com/ZcashFoundation/zebra/actions/runs/5994564430/job/16257493713#step:8:26063

(This is from the suggestion PR, I think it's not re-running the upgrade because the disk version was updated earlier.)

upbqdn and others added 2 commits August 28, 2023 20:47
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
@upbqdn

This comment was marked as resolved.

@upbqdn

This comment was marked as resolved.

@upbqdn
Copy link
Member Author

upbqdn commented Aug 28, 2023

What's this test failure

That was due to the bug described in PR #7385. Since the cached state hasn't been upgraded, Zebra incorrectly determined it was a new state, and labeled it as 25.1.1, which was the version in the code. Zebra then skipped the pruning since it's for lower versions, but the test that checks for duplicates runs at each start, so it reported duplicates starting from height 1. This should be resolved by merging PR #7385.

@upbqdn
Copy link
Member Author

upbqdn commented Aug 28, 2023

@Mergifyio update.

@mergify
Copy link
Contributor

mergify bot commented Aug 28, 2023

update .

✅ Branch has been successfully updated

@mergify mergify bot merged commit 2ea994a into main Aug 28, 2023
@mergify mergify bot deleted the fix-tree-dedup branch August 28, 2023 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rust Area: Updates to Rust code A-state Area: State / database changes C-bug Category: This is a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants