Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for building merkle multiproofs #16

Merged
merged 1 commit into from
Jun 23, 2022
Merged

Add support for building merkle multiproofs #16

merged 1 commit into from
Jun 23, 2022

Conversation

etan-status
Copy link
Contributor

@etan-status etan-status commented Dec 6, 2021

This adds functionality for building merkle multiproofs in the same form
that the Ethereum consensus specs suggest, i.e., in descending order of
helper indices. Merkle multiproofs are useful for the light client sync.
https://github.com/ethereum/consensus-specs/blob/v1.1.6/ssz/merkle-proofs.md#merkle-multiproofs

Copy link
Contributor

@zah zah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come there are no tests here?

ssz_serialization/merkleization.nim Outdated Show resolved Hide resolved
ssz_serialization/merkleization.nim Outdated Show resolved Hide resolved
ssz_serialization/merkleization.nim Outdated Show resolved Hide resolved
@etan-status
Copy link
Contributor Author

Added tests.

@etan-status
Copy link
Contributor Author

etan-status commented Dec 8, 2021

New test coverage:

rm -rf coverage && mkdir -p coverage && \
    nim c -d:PREFER_BLST_SHA256=false --nimcache:coverage/nimcache \
        --passC:-fprofile-arcs --passC:-ftest-coverage --passL:-fprofile-arcs --passL:-ftest-coverage \
        -r vendor/nim-ssz-serialization/tests/test_merkleization_types.nim && \
    lcov --capture --directory coverage/nimcache --output-file coverage/coverage.info && \
    lcov --extract coverage/coverage.info "*/vendor/nim-ssz-serialization/*" \
        --output-file coverage/coverage.f.info && \
    genhtml coverage/coverage.f.info --output-directory coverage/output
Overall coverage rate:
  lines......: 93.9% (939 of 1000 lines)
  functions..: 98.2% (161 of 164 functions)

@etan-status
Copy link
Contributor Author

Wait for #19 before merging this.

@etan-status etan-status marked this pull request as draft December 16, 2021 10:44
@etan-status etan-status marked this pull request as ready for review December 17, 2021 16:19
ssz_serialization/merkleization.nim Outdated Show resolved Hide resolved
ssz_serialization/merkleization.nim Outdated Show resolved Hide resolved
ssz_serialization/merkleization.nim Show resolved Hide resolved
ssz_serialization/merkleization.nim Show resolved Hide resolved
ssz_serialization/merkleization.nim Show resolved Hide resolved
ssz_serialization/merkleization.nim Show resolved Hide resolved
ssz_serialization/proofs.nim Show resolved Hide resolved
ssz_serialization/proofs.nim Show resolved Hide resolved
@etan-status
Copy link
Contributor Author

Thanks @zah for the review comments. I have addressed them in the latest push.

@etan-status
Copy link
Contributor Author

Nim devel checks are failing because of upstream bugs in the Nim compiler repo.

@tersec
Copy link
Contributor

tersec commented Jan 31, 2022

Nim devel checks are failing because of upstream bugs in the Nim compiler repo.

Is this an already-identified regression? One reason for having this Nim devel checks is so that they can be fixed upstream before Nimbus has to wait until Nim 1.x.2 or 1.x.3 before it even builds due to not having noticed these to be fixed in time for the Nim 1.x.0 or Nim 1.x.1 releases.

@stefantalpalaru
Copy link
Contributor

Is this an already-identified regression?

Yes, they fixed it here: nim-lang/Nim#19472

@etan-status
Copy link
Contributor Author

Re-ran CI.

@etan-status
Copy link
Contributor Author

Anything still holding this PR back?

This adds functionality for building merkle multiproofs in the same form
that the Ethereum consensus specs suggest, i.e., in descending order of
helper indices. Merkle multiproofs are useful for the light client sync.
https://github.com/ethereum/consensus-specs/blob/v1.1.6/ssz/merkle-proofs.md#merkle-multiproofs

Tested from `nimbus-eth2` root using:
```
nim c -r vendor/nim-ssz-serialization/tests/test_all.nim
```
@etan-status
Copy link
Contributor Author

  • Add std/ prefix to imports.

Copy link
Collaborator

@kdeme kdeme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me to be merged imo.

I do have a similar remark as @zah regarding the naming of hash_tree_root procs which are there to return the proofs rather than the final hash tree root.
I get it that each piece of the proof(s) is in its own a hash tree root, but based on the name, one could think that the final hash_tree_root call would then return the hash tree root, instead of the proof(s).

Initially I thought that the naming was fine, but then I noticed that the proof versions of these procs are actually all separate due to the indexing logic that needs to be done, which causes of no real code re-use with the original versions in there.
Which is great from the point of view of leaving that code untouched and making merging this less impactful (Existing code touched is actually rather limited).
But then I do think that for a lot of the code related to this indexing it would be more fitting to live in proofs.nim.

But perhaps I am missing some crucial details (I did rather skim fast over the indexing part and the new hashTreeRootAux proc). Anyhow, not necessarily something blocking.

func build_proof*(
anchor: auto,
indices: static openArray[GeneralizedIndex],
proof: var openArray[Digest]): Result[void, string] =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there still a reason for having the function signatures with proof: var openArray[Digest] and a returned void Result (and the same counts for the root_tree_hash counterparts)?

When there is already the need for Result[void, string] (or boolean for that matter), perhaps that type of return should be only provided, considering it is a safer practice (caller must verify the result to access the proofs).

Copy link
Contributor Author

@etan-status etan-status Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it could save some copying because the result can be directly put into the destination, but semantically these two are the same.

var foo: SomeObject
? state.build_proof(5.GeneralizedIndex, foo.proof)
foo.proof = ? state.build_proof(5.GeneralizedIndex)

@etan-status
Copy link
Contributor Author

I do have a similar remark as @zah regarding the naming of hash_tree_root procs which are there to return the proofs rather than the final hash tree root. I get it that each piece of the proof(s) is in its own a hash tree root, but based on the name, one could think that the final hash_tree_root call would then return the hash tree root, instead of the proof(s).

hash_tree_root does not compute a proof. It returns the root at index 1 by default, or the roots at custom given indices (like, if you pass 13, 42, the result will contain the root at index 13 and the root at index 42).
Note that the terminology of root for intermediate hashes, while it seems a bit backward, seems to be also used by the EF, e.g., see ethereum/consensus-specs#2629 (comment) -- I used to have this same concern as well.

Alternative names could be hash_tree_root_at_indices, hash_tree_subroot (but it can also do index 1), hash_tree_leaves (but it can also do intermediate hashes above the leaves). Or maybe there are better ideas?

Initially I thought that the naming was fine, but then I noticed that the proof versions of these procs are actually all separate due to the indexing logic that needs to be done, which causes of no real code re-use with the original versions in there. Which is great from the point of view of leaving that code untouched and making merging this less impactful (Existing code touched is actually rather limited). But then I do think that for a lot of the code related to this indexing it would be more fitting to live in proofs.nim.

Yes, this is just an implementation detail though. The regular hash_tree_root is equivalent to hash_tree_root(1.GeneralizedIndex), so it could also use the new implementation. However, due to the top root being requested much more frequently, I think having the 1 case implemented separately allows for more optimization. The cost is that for types where non-1 indices are requested that two copies of the hashing logic are emitted (one for top root, and one for any other index), but it's not many types that we use like that.

@kdeme
Copy link
Collaborator

kdeme commented Jun 21, 2022

hash_tree_root does not compute a proof. It returns the root at index 1 by default, or the roots at custom given indices (like, if you pass 13, 42, the result will contain the root at index 13 and the root at index 42).

OK, fair enough. I guess what I was trying to say is that you can make it provided a proof by giving it the right indexes as parameter.

Note that the terminology of root for intermediate hashes, while it seems a bit backward, seems to be also used by the EF, e.g., see ethereum/consensus-specs#2629 (comment) -- I used to have this same concern as well.

I was not aware of that terminology. I saw hash_tree_root solely as "compute the hash of the root of the tree". But as it appears to be an already established terminology in the eth2 specs for any intermediate being called a root, it seems (even more) fine to leave the naming as is.

Yes, this is just an implementation detail though. The regular hash_tree_root is equivalent to hash_tree_root(1.GeneralizedIndex), so it could also use the new implementation.

Right, that's a good point and argument for keeping the code where it is.

@kdeme kdeme merged commit da3c08c into status-im:master Jun 23, 2022
@etan-status etan-status deleted the merkle-multiproof branch June 23, 2022 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants