feat(storage): support vnode hint in storage table #3628

Merged
merged 13 commits into from
Jul 5, 2022

@BugenZhao BugenZhao commented Jul 4, 2022

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR adds support for a vnode hint in the storage table. Before iterating, we always try to compute the vnode from the pk prefix, so that the scan does not have to cover all vnodes.
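
As a rough illustration of the idea (not the code in this PR), the hint can be pictured as follows: if the pk prefix covers every distribution key column, hash those columns to get a single vnode; otherwise fall back to scanning every vnode. The names `try_compute_vnode_by_pk_prefix`, `vnodes_to_scan` and `VNODE_COUNT`, the `i64` column type, and the use of `DefaultHasher` are all assumptions made for this sketch.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed number of virtual nodes; the real value is defined by the system.
const VNODE_COUNT: u16 = 256;

/// Hypothetical helper: returns `Some(vnode)` only when every distribution key
/// column (given as positions inside the pk) is present in `pk_prefix`.
fn try_compute_vnode_by_pk_prefix(pk_prefix: &[i64], dist_key_in_pk: &[usize]) -> Option<u16> {
    let mut hasher = DefaultHasher::new();
    for &idx in dist_key_in_pk {
        // A missing column means the prefix is too short to determine the vnode.
        pk_prefix.get(idx)?.hash(&mut hasher);
    }
    Some((hasher.finish() % VNODE_COUNT as u64) as u16)
}

/// With a hint, only one vnode is scanned; without one, all of them are.
fn vnodes_to_scan(hint: Option<u16>) -> Vec<u16> {
    match hint {
        Some(vnode) => vec![vnode],
        None => (0..VNODE_COUNT).collect(),
    }
}
```

With a prefix that is too short, `try_compute_vnode_by_pk_prefix` returns `None` and `vnodes_to_scan` yields every vnode, which is the behavior the hint is meant to avoid whenever possible.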

Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Refer to a related PR or issue link (optional)

@BugenZhao BugenZhao requested review from xxchan, kwannoel and wcy-fdu July 4, 2022 09:45
@@ -603,7 +569,7 @@ impl<S: StateStore, E: Encoding, const T: AccessType> StorageTableBase<S, E, T>
} else {
// Should use excluded next key for end bound.
// Otherwise keys starting with the bound is not included.
- Excluded(next_key(&serialized_key))
+ end_bound_of_prefix(&serialized_key)
Member Author

This also fixes an edge case: if serialized_key is \xff, we should return Unbounded here instead of Excluded(b""). cc @xxchan
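
To make the edge case concrete, here is a minimal sketch of what a function like `end_bound_of_prefix` could look like. Only the function name and the `\xff` behavior come from the diff and the comment above; the body is an assumption, not the merged implementation.

```rust
use std::ops::Bound;

/// Sketch: the smallest exclusive upper bound that still covers every key
/// starting with `prefix`.
fn end_bound_of_prefix(prefix: &[u8]) -> Bound<Vec<u8>> {
    // Find the last byte that can be incremented without overflowing.
    match prefix.iter().rposition(|&b| b != 0xff) {
        Some(pos) => {
            // Drop the trailing 0xff bytes and bump the byte before them.
            let mut end = prefix[..=pos].to_vec();
            end[pos] += 1;
            Bound::Excluded(end)
        }
        // Empty or all-0xff prefix: no key sorts after the prefixed keys,
        // so the range has to stay open instead of ending at an empty key.
        None => Bound::Unbounded,
    }
}
```

Under these assumptions, a prefix of `\xff` yields `Unbounded`, whereas an `Excluded` bound on the empty key would exclude every key.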

pub(super) async fn batch_iter_with_encoded_key_range<R, B>(
/// Iterates on the table with the given prefix of the pk in `pk_prefix` and the range bounds of
/// the next primary key column in `next_col_bounds`.
// TODO: support multiple datums or `Row` for `next_col_bounds`.
Member Author

Should we support multiple datums in next_col_bounds here? So that all cases can reuse this function? 🤣 cc @xxchan

Member

I think the other cases are simple. Reusing this function brings no benefit to them 🥸

Member

But you already removed iter_with_pk_prefix... 🤔

Member

So when will next_col_bounds for multiple columns be needed?

Member Author

If next_col_bounds can be multiple columns, then all of the scan patterns can be formalized by this method, including arbitrary pk bound scan.
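
For the single-column case discussed here, the mapping from a pk prefix plus bounds on the next column to an encoded key range might be sketched like this. It works on already-serialized bytes, and `encoded_key_range` and its signature are hypothetical; `end_bound_of_prefix` is re-sketched only so the example is self-contained. A return value of `None` means the range is provably empty.

```rust
use std::ops::Bound::{self, Excluded, Included, Unbounded};

/// Assumed helper: exclusive upper bound covering all keys with this prefix.
fn end_bound_of_prefix(prefix: &[u8]) -> Bound<Vec<u8>> {
    match prefix.iter().rposition(|&b| b != 0xff) {
        Some(pos) => {
            let mut end = prefix[..=pos].to_vec();
            end[pos] += 1;
            Excluded(end)
        }
        None => Unbounded,
    }
}

/// Hypothetical sketch: turns an encoded pk prefix and bounds on the next
/// encoded column into a byte range over full keys.
fn encoded_key_range(
    pk_prefix: &[u8],
    next_col_bounds: (Bound<&[u8]>, Bound<&[u8]>),
) -> Option<(Bound<Vec<u8>>, Bound<Vec<u8>>)> {
    let key = |col: &[u8]| [pk_prefix, col].concat();
    let start = match next_col_bounds.0 {
        Included(k) => Included(key(k)),
        // Skip every key that starts with prefix + k, not just that exact key.
        Excluded(k) => match end_bound_of_prefix(&key(k)) {
            Excluded(next) => Included(next),
            _ => return None,
        },
        Unbounded if pk_prefix.is_empty() => Unbounded,
        Unbounded => Included(pk_prefix.to_vec()),
    };
    let end = match next_col_bounds.1 {
        // Keys starting with prefix + k must stay inside the range.
        Included(k) => end_bound_of_prefix(&key(k)),
        Excluded(k) => Excluded(key(k)),
        Unbounded => end_bound_of_prefix(pk_prefix),
    };
    Some((start, end))
}
```

In this sketch a full scan is an empty prefix with two `Unbounded` bounds, a prefix scan is a non-empty prefix with two `Unbounded` bounds, and a point lookup on the next column uses the same value as an `Included` bound on both sides. Accepting several columns in `next_col_bounds`, as proposed above, would extend the same scheme to arbitrary pk bounds.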

@BugenZhao BugenZhao requested a review from xx01cyx July 4, 2022 09:53

codecov bot commented Jul 5, 2022

Codecov Report

Merging #3628 (b9a6adc) into main (3541fa1) will decrease coverage by 0.00%.
The diff coverage is 81.60%.

@@            Coverage Diff             @@
##             main    #3628      +/-   ##
==========================================
- Coverage   74.30%   74.30%   -0.01%     
==========================================
  Files         788      788              
  Lines      111653   111369     -284     
==========================================
- Hits        82967    82750     -217     
+ Misses      28686    28619      -67     
Flag Coverage Δ
rust 74.30% <81.60%> (-0.01%) ⬇️

Impacted Files Coverage Δ
src/batch/src/executor/row_seq_scan.rs 12.88% <0.00%> (+0.15%) ⬆️
src/storage/hummock_sdk/src/key.rs 86.30% <ø> (+0.68%) ⬆️
src/storage/src/table/test_relational_table.rs 97.81% <ø> (-0.33%) ⬇️
src/storage/src/table/storage_table.rs 85.83% <82.81%> (+13.89%) ⬆️
src/common/src/util/ordered/serde.rs 91.01% <83.33%> (-0.33%) ⬇️
src/storage/src/table/state_table.rs 97.85% <92.85%> (+0.81%) ⬆️
src/meta/src/model/barrier.rs 86.66% <0.00%> (-3.34%) ⬇️
src/meta/src/manager/id.rs 94.94% <0.00%> (-1.69%) ⬇️
src/common/src/array/data_chunk_iter.rs 79.74% <0.00%> (-0.95%) ⬇️
src/meta/src/barrier/mod.rs 81.35% <0.00%> (-0.20%) ⬇️
... and 9 more

Contributor

@xx01cyx xx01cyx left a comment

Generally LGTM!

src/storage/src/table/storage_table.rs (outdated, resolved)
@BugenZhao BugenZhao requested a review from skyzh July 5, 2022 11:45
@mergify mergify bot merged commit 085d256 into main Jul 5, 2022
@mergify mergify bot deleted the bz/vnode-in-key-part-8 branch July 5, 2022 14:21
nasnoisaac pushed a commit to nasnoisaac/risingwave that referenced this pull request Aug 9, 2022
* use row for cell based table iter interfaces

Signed-off-by: Bugen Zhao <[email protected]>

* compute vnode hint

Signed-off-by: Bugen Zhao <[email protected]>

* refine docs

Signed-off-by: Bugen Zhao <[email protected]>

* remove tests for state table pk bounds iter

Signed-off-by: Bugen Zhao <[email protected]>

* fix typo

Signed-off-by: Bugen Zhao <[email protected]>

* Update src/storage/src/table/storage_table.rs

* trigger ci

Signed-off-by: Bugen Zhao <[email protected]>

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

consistent-hash: support reading by single vnode for specific states
3 participants