feat: apply relational refactor for hash agg (max, min) #2999

Merged
BowenXiao1999 merged 10 commits into main from bw/apply-relational-refactor-for-hash-agg
Jun 20, 2022

Conversation

BowenXiao1999
Contributor

@BowenXiao1999 BowenXiao1999 commented Jun 6, 2022

What's changed and what's your intention?

Sorry for the huge PR, but this should be the minimum effort needed to introduce the relational refactor. The main change is to the AggState interface (mostly get_output): we now pass StateTable as & or &mut.

After this PR, all agg calls except StringAgg should use the relational table. (We currently have no e2e test for StringAgg, so I prefer not to include it here.)

  • Change the function signatures to take &StateTable or &mut StateTable.
    • get_output(epoch) -> get_output(epoch, &StateTable<S>). The states may need to fetch data from the remote store.
    • Same for row_count and mark_as_dirty.
  • For HashAgg, I wrap all state tables in an Arc<Mutex<>> because I do not want to solve difficult multi-threading problems in this PR. We discussed this and concluded that apply_batch (mostly mem_table writes) should not be parallelized, so the lock is not a significant problem.
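
The signature change listed above can be pictured with a minimal sketch. It is not the code from this PR: `StateTable` here is a hypothetical in-memory stand-in with no remote store, `MinState`, `apply`, `first` and `demo` are illustrative names, and the real methods are async and generic over the state store.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `StateTable<S>`: only an in-memory write
/// buffer (`mem_table`), with no remote store behind it.
#[derive(Default)]
struct StateTable {
    mem_table: BTreeMap<i64, ()>,
}

impl StateTable {
    fn insert(&mut self, value: i64) {
        self.mem_table.insert(value, ());
    }

    /// Smallest value currently in the table, if any.
    fn first(&self) -> Option<i64> {
        self.mem_table.keys().next().copied()
    }
}

/// Hypothetical `min` state that keeps only a cache and borrows the table
/// on each call instead of owning it.
#[derive(Default)]
struct MinState {
    cache: Option<i64>,
}

impl MinState {
    /// Write path: needs `&mut StateTable`.
    fn apply(&mut self, value: i64, table: &mut StateTable) {
        table.insert(value);
        self.cache = None; // invalidate; refilled lazily by `get_output`
    }

    /// Read path: `get_output(epoch)` becomes `get_output(epoch, &StateTable)`.
    fn get_output(&mut self, _epoch: u64, table: &StateTable) -> Option<i64> {
        if self.cache.is_none() {
            self.cache = table.first();
        }
        self.cache
    }
}

/// Wraps the tables in `Arc<Mutex<..>>`, as the description says HashAgg does.
fn demo() -> Option<i64> {
    let tables = Arc::new(Mutex::new(vec![StateTable::default()]));
    let mut state = MinState::default();
    {
        // Writes take the lock and hand out `&mut StateTable`.
        let mut guard = tables.lock().unwrap();
        state.apply(7, &mut guard[0]);
        state.apply(3, &mut guard[0]);
    }
    // Reads take the lock again and only need `&StateTable`.
    let guard = tables.lock().unwrap();
    let out = state.get_output(1, &guard[0]);
    out
}
```

In this sketch the state never owns the table, so the caller decides how the table is shared and locked.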

There are still a lot of TODOs to make the code simpler and clearer:
See #3235.

Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Refer to a related PR or issue link (optional)

@BowenXiao1999 BowenXiao1999 marked this pull request as draft June 6, 2022 08:17
@BowenXiao1999 BowenXiao1999 changed the title feat: apply relational refactor for hash agg (max, min) [WIP] feat: apply relational refactor for hash agg (max, min) Jun 6, 2022
@BowenXiao1999 BowenXiao1999 force-pushed the bw/apply-relational-refactor-for-hash-agg branch 2 times, most recently from c0a5902 to 2a7e515 Compare June 8, 2022 09:05
@BowenXiao1999 BowenXiao1999 force-pushed the bw/apply-relational-refactor-for-hash-agg branch 4 times, most recently from 4894a77 to e37abc6 Compare June 15, 2022 05:26
@codecov
codecov bot commented Jun 15, 2022

Codecov Report

Merging #2999 (d577da0) into main (bd48bba) will increase coverage by 0.07%.
The diff coverage is n/a.

❗ Current head d577da0 differs from pull request most recent head c052e9c. Consider uploading reports for the commit c052e9c to get more accurate results

@@            Coverage Diff             @@
##             main    #2999      +/-   ##
==========================================
+ Coverage   73.15%   73.23%   +0.07%     
==========================================
  Files         756      748       -8     
  Lines      102521   101766     -755     
==========================================
- Hits        75003    74525     -478     
+ Misses      27518    27241     -277     
Flag Coverage Δ
rust 73.23% <0.00%> (+0.07%) ⬆️

Impacted Files Coverage Δ
src/meta/src/barrier/command.rs 75.59% <0.00%> (-2.37%) ⬇️
src/stream/src/executor/managed_state/join/mod.rs 89.88% <0.00%> (-1.45%) ⬇️
src/meta/src/hummock/mock_hummock_meta_client.rs 41.48% <0.00%> (-1.07%) ⬇️
src/storage/src/monitor/state_store_metrics.rs 85.06% <0.00%> (-1.02%) ⬇️
src/connector/src/macros.rs 8.16% <0.00%> (-0.93%) ⬇️
src/meta/src/stream/stream_graph.rs 79.72% <0.00%> (-0.71%) ⬇️
src/storage/src/hummock/local_version_manager.rs 83.86% <0.00%> (-0.16%) ⬇️
src/frontend/src/test_utils.rs 92.54% <0.00%> (-0.13%) ⬇️
src/frontend/src/catalog/table_catalog.rs 98.13% <0.00%> (-0.13%) ⬇️
src/meta/src/stream/source_manager.rs 26.34% <0.00%> (-0.12%) ⬇️
... and 44 more

@BowenXiao1999 BowenXiao1999 force-pushed the bw/apply-relational-refactor-for-hash-agg branch 2 times, most recently from 4b43658 to 5c476f8 Compare June 15, 2022 07:26
@BowenXiao1999 BowenXiao1999 changed the title [WIP] feat: apply relational refactor for hash agg (max, min) feat: apply relational refactor for hash agg (max, min) Jun 15, 2022
@BowenXiao1999 BowenXiao1999 marked this pull request as ready for review June 15, 2022 07:33
@BowenXiao1999 BowenXiao1999 force-pushed the bw/apply-relational-refactor-for-hash-agg branch from 5c476f8 to e165024 Compare June 15, 2022 07:34
@@ -204,6 +204,10 @@ impl<S: StateStore> StateTable<S> {
self.iter_with_encoded_key_bounds(encoded_key_bounds, epoch)
.await
}

pub fn is_dirty(&self) -> bool {
!self.mem_table.buffer.is_empty()
Contributor

seems we can remove get_mem_table()?

Contributor Author

Indeed. NTFS

@BowenXiao1999
Contributor Author

There is a lot to be refactored/removed. Tracked in #3235.

@@ -317,42 +357,26 @@ where
}

/// Flush the internal state to a write batch.
fn flush_inner(&mut self, write_batch: &mut WriteBatch<S>) -> StreamExecutorResult<()> {
Contributor Author

The flush is now very clean, which is why I think we should remove flush from ManagedState in the future.

Member

Add a TODO here? The original doc seems confusing now. 😄

Contributor

@wcy-fdu wcy-fdu left a comment

Generally LGTM!

@BowenXiao1999 BowenXiao1999 force-pushed the bw/apply-relational-refactor-for-hash-agg branch from 0bc7a0e to d8b15d2 Compare June 16, 2022 03:23
@BowenXiao1999 BowenXiao1999 enabled auto-merge (squash) June 16, 2022 03:23
Member

@BugenZhao BugenZhao left a comment

Generally LGTM.

src/stream/src/executor/aggregation/agg_state.rs (outdated, resolved)
@@ -124,23 +130,22 @@ where
/// always be retained when flushing the managed state. Otherwise, we will only retain n entries
/// after each flush.
pub async fn new(
keyspace: Keyspace<S>,
_keyspace: Keyspace<S>,
Member

We may remove the generic from the struct and add it to every function that accepts the state table, so that there's no need for phantom data either.

.await?;
pin_mut!(all_data_iter);

for _ in 0..self.top_n_count.unwrap_or(usize::MAX) {
Member

If top_n_count is None, we should keep nothing in the cache.

Contributor Author

Maybe it should not be an Option... Let's change it in another PR.

Comment on lines +81 to 80
/// TODO: Remove this soon.
serializer: ExtremeSerializer<A::OwnedItem, EXTREME_TYPE>,
Member

Let's store the pk length and it seems we can remove this now. Then the whole module can be cleaned up as well.

@@ -279,23 +280,29 @@ impl<K: HashKey, S: StateStore> HashAggExecutor<K, S> {
input_pk_data_types.clone(),
epoch,
Some(hash_code),
state_tables,
&*state_tables.read().await,
Member

@BugenZhao BugenZhao Jun 16, 2022

With a RwLock, it seems the fetch process cannot be concurrent if some task is applying the chunk with a write guard?

Member

@BugenZhao BugenZhao Jun 16, 2022

We may move the apply part outside of the future, or introduce a new rwlock-wrapped state table so that there's no need to pass state tables as arguments everywhere. 😁
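
The locking concern raised in this thread can be reproduced in a small sketch. It uses `std::sync::RwLock` rather than the async lock the executor presumably uses (the diff shows `.read().await`), and `Tables` and `demo` are hypothetical names, so treat it as an illustration of the general read/write guard behaviour only.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for the shared state tables.
type Tables = RwLock<Vec<i64>>;

/// Returns (second_reader_ok, reader_ok_during_write, reader_ok_after_write).
fn demo() -> (bool, bool, bool) {
    let tables: Tables = RwLock::new(vec![1, 2, 3]);

    // Two read guards can be held at the same time.
    let r1 = tables.read().unwrap();
    let second_reader_ok = tables.try_read().is_ok();
    drop(r1);

    // While a write guard is held (the "apply" step), readers are locked out.
    let mut w = tables.write().unwrap();
    w.push(4);
    let reader_ok_during_write = tables.try_read().is_ok();
    drop(w);

    // Once the write step is finished and its guard dropped, reads proceed.
    let reader_ok_after_write = tables.try_read().is_ok();

    (second_reader_ok, reader_ok_during_write, reader_ok_after_write)
}
```

Under these assumptions, moving the apply step out of the per-group future would leave only read guards inside it, which is the condition under which the fetches can overlap.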

Contributor Author

+1

@BugenZhao BugenZhao disabled auto-merge June 16, 2022 03:33
@skyzh
Contributor

skyzh commented Jun 16, 2022

Let's hold this PR for a while, I'll need more time to review it.

@BowenXiao1999
Contributor Author

BowenXiao1999 commented Jun 16, 2022

I guess the main problem may come from the API. Should we pass &StateTable as a parameter, or store it as an Arc<Mutex<>> directly in the state?

Let me summarize some points:

  • Currently in this PR, because of the Arc<Mutex<>> around the state tables vec (apply_chunk), we cannot read from the remote store in parallel. We can enable this once we take apply_batch (which needs a mut ref) out of the futures. After that, there are only reads and no writes, so it can be parallelized. (Note: we do not want multi-threaded mem_table writes.)

  • Store StateTable directly in ManagedState. (Note that this brings no performance improvement over this PR once the improvement above is done.)
    Good: The API is cleaner.
    Bad: We must be careful with flush/commit, because different states might share one table. So we should commit only once for the states in the same agg_call.

I chose this design because I read this RFC: https://singularity-data.quip.com/buN6ASPdobmk/RFC-Split-Dirty-State-and-Cache. Also, we previously passed WriteBatch.

But I found the refactor cost is huge: a lot of trait methods are affected. So I don't feel strongly about this and am looking for suggestions.

Overall, the two are the same in theoretical performance and differ only in API.
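
The flush/commit hazard mentioned for the second option can be sketched as follows. Everything here is hypothetical: `StateTable` is reduced to a commit counter, and `ManagedState`, `commit_per_state` and `commit_once_per_table` are illustrative names rather than functions from the codebase. The deduplication by `Arc::ptr_eq` is one possible way to commit once per table, not necessarily what the project does.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `StateTable<S>`; counts commits so the
/// double-commit hazard is observable.
#[derive(Default)]
struct StateTable {
    commits: u32,
}

impl StateTable {
    fn commit(&mut self) {
        self.commits += 1;
    }
}

type SharedTable = Arc<Mutex<StateTable>>;

/// Each state stores a handle to its table instead of receiving it as a
/// parameter.
struct ManagedState {
    table: SharedTable,
}

/// Naive flush: every state commits its own table, so a shared table is
/// committed once per state.
fn commit_per_state(states: &[ManagedState]) {
    for s in states {
        s.table.lock().unwrap().commit();
    }
}

/// Commit each distinct table once, using `Arc::ptr_eq` to detect sharing.
fn commit_once_per_table(states: &[ManagedState]) {
    let mut seen: Vec<&SharedTable> = Vec::new();
    for s in states {
        if !seen.iter().any(|&t| Arc::ptr_eq(t, &s.table)) {
            s.table.lock().unwrap().commit();
            seen.push(&s.table);
        }
    }
}

/// Returns the commit counts for the naive and the deduplicated flush.
fn demo() -> (u32, u32) {
    let naive = SharedTable::default();
    let states = [
        ManagedState { table: naive.clone() },
        ManagedState { table: naive.clone() },
    ];
    commit_per_state(&states);

    let deduped = SharedTable::default();
    let states = [
        ManagedState { table: deduped.clone() },
        ManagedState { table: deduped.clone() },
    ];
    commit_once_per_table(&states);

    let a = naive.lock().unwrap().commits;
    let b = deduped.lock().unwrap().commits;
    (a, b)
}
```

When the table is passed as a parameter instead, the caller owns the tables and can commit each one directly, so this bookkeeping is not needed.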

@BowenXiao1999 BowenXiao1999 force-pushed the bw/apply-relational-refactor-for-hash-agg branch from d8b15d2 to 9e48c04 Compare June 16, 2022 09:58
Contributor

@wcy-fdu wcy-fdu left a comment

Hold this PR until CI passes.

@BowenXiao1999 BowenXiao1999 force-pushed the bw/apply-relational-refactor-for-hash-agg branch 2 times, most recently from 7075aff to f2e4773 Compare June 17, 2022 04:03
@BowenXiao1999 BowenXiao1999 force-pushed the bw/apply-relational-refactor-for-hash-agg branch from f2e4773 to c052e9c Compare June 20, 2022 03:29
@BowenXiao1999 BowenXiao1999 force-pushed the bw/apply-relational-refactor-for-hash-agg branch from c052e9c to d38f25a Compare June 20, 2022 03:38
@BowenXiao1999
Contributor Author

Let's hold this PR for a while, I'll need more time to review it.

Any update 🥰? There are some small tweaks that might be done in this PR, if no big change.

@BowenXiao1999 BowenXiao1999 merged commit 9169436 into main Jun 20, 2022
@BowenXiao1999 BowenXiao1999 deleted the bw/apply-relational-refactor-for-hash-agg branch June 20, 2022 05:44
Little-Wallace added a commit that referenced this pull request Jun 21, 2022
commit 309ce36
Author: Bowen <[email protected]>
Date:   Tue Jun 21 13:19:44 2022 +0800

    refactor(agg): clean up unused fields & refactor (#3339)

    * refactor(agg): clean up unused fields

    * delete file

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit e48dced
Author: Shmiwy <[email protected]>
Date:   Tue Jun 21 12:55:21 2022 +0800

    feat(storage): support compression setting per level (#3362)

    Signed-off-by: Shmiwy <[email protected]>

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 961936f
Author: Alex Chi <[email protected]>
Date:   Tue Jun 21 12:42:45 2022 +0800

    feat(test): parallelize sqlsmith test (#3360)

    * feat(test): parallelize sqlsmith test

    Signed-off-by: Alex Chi <[email protected]>

    * more tests

    Signed-off-by: Alex Chi <[email protected]>

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit fa90541
Author: TennyZhuang <[email protected]>
Date:   Tue Jun 21 12:25:33 2022 +0800

    chore(github): feature-request template should use the feature label (#3359)

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 02405e1
Author: Bowen <[email protected]>
Date:   Tue Jun 21 12:13:07 2022 +0800

    style: add more comments & refactor on pg-wire  (#3358)

    style: add more comments on pg-wire code

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 1c09432
Author: Li0k <[email protected]>
Date:   Tue Jun 21 12:00:37 2022 +0800

    fix(storage): fix slow unit-test in compactor_test (#3357)

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 19036a6
Author: xxchan <[email protected]>
Date:   Tue Jun 21 05:48:08 2022 +0200

    fix(binder): do not allow correlated subquery in join tables (#3352)

    * fix(binder): do not allow correlated subquery in join tables

    * clippy

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit c37c9c4
Author: TennyZhuang <[email protected]>
Date:   Tue Jun 21 11:24:40 2022 +0800

    refactor: remove unnecessary lazy_static (#3353)

    Signed-off-by: TennyZhuang <[email protected]>

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 68596ab
Author: Croxx <[email protected]>
Date:   Tue Jun 21 11:11:53 2022 +0800

    feat(cache): introduce LruCacheEventListener to subscribe erasure and eviction (#3334)

commit 5627f25
Author: Steven Chua <[email protected]>
Date:   Tue Jun 21 10:54:12 2022 +0800

    feat(ctl): Display SstableIdInfo and Block Metadata in sst-dump (#3338)

    * feat(ctl): Add sst-dump command to risectl

    * feat(ctl): Fix risectl compatibility and remove VNode info

    * feat(ctl): Add checksum and compression algo for each block

    * feat(ctl): Add SstableIdInfo data to sst-dump

    * feat(ctl): Fix compilation errors

    * feat(ctl): Fix compilation errors and bugs

commit 16ffd98
Author: Name1e5s <[email protected]>
Date:   Tue Jun 21 10:50:33 2022 +0800

    fix(expr): cast int16/int32/int64/float32 to float64 in floor/ceil/round (#3319)

    * fix(expr): cast int16/int32/int64/float32 to float64 in floor/ceil/round

    * fix plan

    Co-authored-by: TennyZhuang <[email protected]>

commit 964bb92
Author: TennyZhuang <[email protected]>
Date:   Tue Jun 21 10:23:56 2022 +0800

    ci(Mergify): configuration update (#3355)

    Signed-off-by: null <[email protected]>

commit dac904e
Author: jon-chuang <[email protected]>
Date:   Tue Jun 21 09:57:33 2022 +0800

    feat(executor): streaming hyperloglog improvements (#3315)

    * minor

    * rename tests

    * minor

    * remove option, const eval of param, better comments, succint tests

commit cd4f302
Author: TennyZhuang <[email protected]>
Date:   Tue Jun 21 09:34:54 2022 +0800

    ci(Mergify): configuration update (#3252)

    * ci(Mergify): configuration update

    Signed-off-by: null <[email protected]>

    * Update .mergify.yml

    * Update .mergify.yml

    * Update .mergify.yml

    Co-authored-by: xxchan <[email protected]>

commit 2aa7e8e
Author: Xinpeng Wei <[email protected]>
Date:   Mon Jun 20 22:11:51 2022 +0800

    feat(frontend): add InternalStateTable Catalog (#3139)

    * use TableMessage for internal table

    * fix risedev check

    * update planner test

    * fix unit test

    * fix misc check

    * fix ci

    * fix issues in PR comments

    * fix clippy

    * fix ci

    * update planner test

commit 1e190dd
Author: xxchan <[email protected]>
Date:   Mon Jun 20 14:16:12 2022 +0200

    fix(binder): do not allow correlated input ref in order by (#3346)

commit 263d770
Author: Tao Wu <[email protected]>
Date:   Mon Jun 20 18:54:25 2022 +0800

    fix: build failure caused by OptimzierContext::new (#3340)

commit 9f18401
Author: Tao Wu <[email protected]>
Date:   Mon Jun 20 17:47:22 2022 +0800

    feat: introduce the framework of sqlsmith (#3305)

commit 096a991
Author: Alex Chi <[email protected]>
Date:   Mon Jun 20 17:40:05 2022 +0800

    feat(ctl): add bench command (#3337)

    Signed-off-by: Alex Chi <[email protected]>

commit 075d596
Author: TennyZhuang <[email protected]>
Date:   Mon Jun 20 17:05:50 2022 +0800

    build: bump toolchain to 20220620 (#3324)

    * build: bump toolchain to 20220620

    Signed-off-by: TennyZhuang <[email protected]>

    * also update docker-compose

    Signed-off-by: TennyZhuang <[email protected]>

commit d864f30
Author: Wenzhuo Liu <[email protected]>
Date:   Mon Jun 20 16:38:21 2022 +0800

    feat: add output_indices to join executors (#3047)

commit 13b9d58
Author: StrikeW <[email protected]>
Date:   Mon Jun 20 16:29:56 2022 +0800

    feat(stream): enable append-only mv plan for kafka source (#3333)

commit ea386d3
Author: Liang <[email protected]>
Date:   Mon Jun 20 15:50:55 2022 +0800

    refactor(compaction): deprecate the HashStrategy for OverlapStrategy (#3331)

commit a9fba38
Author: Liang <[email protected]>
Date:   Mon Jun 20 15:34:25 2022 +0800

    fix(picker): fetch info from table_id field in sstableinfo (#3332)

commit 5d2bb42
Author: Bohan Zhang <[email protected]>
Date:   Mon Jun 20 14:55:59 2022 +0800

    test(stream): add ci for split change mutation in source (#3039)

    * stage

    Signed-off-by: tabVersion <[email protected]>

    * stage

    Signed-off-by: tabVersion <[email protected]>

    * add test

    Signed-off-by: tabVersion <[email protected]>

    * change e2e to datagen

    Signed-off-by: tabVersion <[email protected]>

    * stage

    Signed-off-by: tabVersion <[email protected]>

    * some bug to fix

    Signed-off-by: tabVersion <[email protected]>

    * fix async issue

    Signed-off-by: tabVersion <[email protected]>

    * add assert

    Signed-off-by: tabVersion <[email protected]>

commit 04fe6d6
Author: Liang <[email protected]>
Date:   Mon Jun 20 14:47:56 2022 +0800

    refactor(vnode bitmap): remove vnode bitmap in sst info (#3329)

commit c6d1288
Author: Li0k <[email protected]>
Date:   Mon Jun 20 14:29:56 2022 +0800

    feat(storage): add manual compaction picker for targeted compaction (#3288)

    * feat(storage): add ManualCompactionPicker

    * feat(storage): distinguish get_compaction_task for manual

    * feat(storage): meta client support more parameters for manual_compaction

    * chore(storage): add tracing and some notes

    * chore(storage): split manual_compaction_picker to independent file

    * feat(storage): fix target_input check pending and support manual_pick for dynamic_level_selector

    * fix(storage): internal_table_id include mv_id

    * fix(storage): fix picker check target_input_ssts pending

    * fix(storage): fix picker with total_file_size

commit d04954f
Author: Renjie Liu <[email protected]>
Date:   Mon Jun 20 14:27:01 2022 +0800

    fix(ci): Reduce log (#3330)

commit eca9239
Author: Bugen Zhao <[email protected]>
Date:   Mon Jun 20 13:59:56 2022 +0800

    refactor(storage): remove `Option` on pk serializer of cell-based table (#3328)

    * minor refactor

    Signed-off-by: Bugen Zhao <[email protected]>

    * remove option of pk serializer

    Signed-off-by: Bugen Zhao <[email protected]>

    * remove pk serializer in state table

    Signed-off-by: Bugen Zhao <[email protected]>

    * extract vnode compute

    Signed-off-by: Bugen Zhao <[email protected]>

    * remove into order types

    Signed-off-by: Bugen Zhao <[email protected]>

commit 9169436
Author: Bowen <[email protected]>
Date:   Mon Jun 20 13:44:10 2022 +0800

    feat: apply relational refactor for hash agg (max, min) (#2999)

    * feat: two closure can not get mut ref of same variable

    * use Arc::Mutex to wrap the state table

    * roll back string agg

    * add StateTable to get_output

    * finish basic coding (unit test failed)

    * finish basic coding

    * fix bug

    * show case

    * use empty Row for scan

    * tweak

commit 35bb16a
Author: Bugen Zhao <[email protected]>
Date:   Mon Jun 20 13:43:11 2022 +0800

    refactor: use packed bitmap struct for vnode bitmap (#3310)

    * use bitmap in streaming

    Signed-off-by: Bugen Zhao <[email protected]>

    * use bitmap in storage

    Signed-off-by: Bugen Zhao <[email protected]>

    * minor fix

    Signed-off-by: Bugen Zhao <[email protected]>

    * make bitmap optional

    Signed-off-by: Bugen Zhao <[email protected]>

commit be50f93
Author: Liang <[email protected]>
Date:   Mon Jun 20 13:27:37 2022 +0800

    feat(compaction): let compactor be unaware of vnode mapping (#3321)

commit 88258a6
Author: lmatz <[email protected]>
Date:   Sun Jun 19 21:31:01 2022 -0700

    doc: no need to manually check in PR from forks (#3325)

commit c590e18
Author: zwang28 <[email protected]>
Date:   Mon Jun 20 12:13:50 2022 +0800

    refactor(storage): split HummockVersion's levels by compaction group. (#3206)

commit f1c3298
Author: zwang28 <[email protected]>
Date:   Mon Jun 20 11:35:18 2022 +0800

    feat(meta): register source to compaction group manager (#3300)

commit bf08b54
Author: Zack <[email protected]>
Date:   Mon Jun 20 11:25:39 2022 +0800

    feat(frontend): Add sql string into context for debugging (#3312)

    * feat(frontend): Add sql string into context for debugging

    * Remove renaming

    * Refactor to use str

commit f784ba3
Author: zwang28 <[email protected]>
Date:   Mon Jun 20 11:22:12 2022 +0800

    feat(storage): shared buffer flush L0 by compaction group (#3200)

commit bd48bba
Author: Kexiang Wang <[email protected]>
Date:   Sun Jun 19 07:48:45 2022 -0400

    feat: modify interfaces to support specifying parallelism for each fr… (#3283)

    feat: modify interfaces to support specifying parallelism for each fragment

commit 8f0e0b2
Author: Steven Chua <[email protected]>
Date:   Sun Jun 19 12:56:51 2022 +0800

    feat(ctl): Support basic sst dump in risectl (#3309)

    * feat(ctl): Add sst-dump command to risectl

    * feat(ctl): Fix risectl compatibility and remove VNode info

commit daf9222
Author: Alex Chi <[email protected]>
Date:   Sat Jun 18 21:41:44 2022 +0800

    feat(risedev): generate risectl config (#3318)

    * feat(risedev): generate risectl config

    Signed-off-by: Alex Chi <[email protected]>

    * fix

    Signed-off-by: Alex Chi <[email protected]>

commit 86ff992
Author: Alex Chi <[email protected]>
Date:   Sat Jun 18 20:52:51 2022 +0800

    feat(ctl): support table scan (#3317)

    * feat(ctl): support table scan

    Signed-off-by: Alex Chi <[email protected]>

    * license header

    Signed-off-by: Alex Chi <[email protected]>

    * add docs

    Signed-off-by: Alex Chi <[email protected]>

commit 5ac5637
Author: Yikun Chen <[email protected]>
Date:   Sat Jun 18 08:03:06 2022 -0400

    feat: support interval comparison (#3222)

    1. fix timestamp substract timestamp.
    2. support interval comparison. From pgsql, 1 month equal to 30 days and 1 day equal to 86400000 ms.

Signed-off-by: Little-Wallace <[email protected]>
Little-Wallace added a commit to Little-Wallace/risingwave that referenced this pull request Jun 21, 2022
commit 309ce36
Author: Bowen <[email protected]>
Date:   Tue Jun 21 13:19:44 2022 +0800

    refactor(agg): clean up unused fields & refactor (risingwavelabs#3339)

    * refactor(agg): clean up unused fields

    * delete file

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit e48dced
Author: Shmiwy <[email protected]>
Date:   Tue Jun 21 12:55:21 2022 +0800

    feat(storage): support compression setting per level (risingwavelabs#3362)

    Signed-off-by: Shmiwy <[email protected]>

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 961936f
Author: Alex Chi <[email protected]>
Date:   Tue Jun 21 12:42:45 2022 +0800

    feat(test): parallelize sqlsmith test (risingwavelabs#3360)

    * feat(test): parallelize sqlsmith test

    Signed-off-by: Alex Chi <[email protected]>

    * more tests

    Signed-off-by: Alex Chi <[email protected]>

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit fa90541
Author: TennyZhuang <[email protected]>
Date:   Tue Jun 21 12:25:33 2022 +0800

    chore(github): feature-request template should use the feature label (risingwavelabs#3359)

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 02405e1
Author: Bowen <[email protected]>
Date:   Tue Jun 21 12:13:07 2022 +0800

    style: add more comments & refactor on pg-wire  (risingwavelabs#3358)

    style: add more comments on pg-wire code

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 1c09432
Author: Li0k <[email protected]>
Date:   Tue Jun 21 12:00:37 2022 +0800

    fix(storage): fix slow unit-test in compactor_test (risingwavelabs#3357)

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 19036a6
Author: xxchan <[email protected]>
Date:   Tue Jun 21 05:48:08 2022 +0200

    fix(binder): do not allow correlated subquery in join tables (risingwavelabs#3352)

    * fix(binder): do not allow correlated subquery in join tables

    * clippy

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit c37c9c4
Author: TennyZhuang <[email protected]>
Date:   Tue Jun 21 11:24:40 2022 +0800

    refactor: remove unnecessary lazy_static (risingwavelabs#3353)

    Signed-off-by: TennyZhuang <[email protected]>

    Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

commit 68596ab
Author: Croxx <[email protected]>
Date:   Tue Jun 21 11:11:53 2022 +0800

    feat(cache): introduce LruCacheEventListener to subscribe erasure and eviction (risingwavelabs#3334)

commit 5627f25
Author: Steven Chua <[email protected]>
Date:   Tue Jun 21 10:54:12 2022 +0800

    feat(ctl): Display SstableIdInfo and Block Metadata in sst-dump (risingwavelabs#3338)

    * feat(ctl): Add sst-dump command to risectl

    * feat(ctl): Fix risectl compatibility and remove VNode info

    * feat(ctl): Add checksum and compression algo for each block

    * feat(ctl): Add SstableIdInfo data to sst-dump

    * feat(ctl): Fix compilation errors

    * feat(ctl): Fix compilation errors and bugs

commit 16ffd98
Author: Name1e5s <[email protected]>
Date:   Tue Jun 21 10:50:33 2022 +0800

    fix(expr): cast int16/int32/int64/float32 to float64 in floor/ceil/round (risingwavelabs#3319)

    * fix(expr): cast int16/int32/int64/float32 to float64 in floor/ceil/round

    * fix plan

    Co-authored-by: TennyZhuang <[email protected]>

commit 964bb92
Author: TennyZhuang <[email protected]>
Date:   Tue Jun 21 10:23:56 2022 +0800

    ci(Mergify): configuration update (risingwavelabs#3355)

    Signed-off-by: null <[email protected]>

commit dac904e
Author: jon-chuang <[email protected]>
Date:   Tue Jun 21 09:57:33 2022 +0800

    feat(executor): streaming hyperloglog improvements (risingwavelabs#3315)

    * minor

    * rename tests

    * minor

    * remove option, const eval of param, better comments, succint tests

commit cd4f302
Author: TennyZhuang <[email protected]>
Date:   Tue Jun 21 09:34:54 2022 +0800

    ci(Mergify): configuration update (risingwavelabs#3252)

    * ci(Mergify): configuration update

    Signed-off-by: null <[email protected]>

    * Update .mergify.yml

    * Update .mergify.yml

    * Update .mergify.yml

    Co-authored-by: xxchan <[email protected]>

commit 2aa7e8e
Author: Xinpeng Wei <[email protected]>
Date:   Mon Jun 20 22:11:51 2022 +0800

    feat(frontend): add InternalStateTable Catalog (risingwavelabs#3139)

    * use TableMessage for internal table

    * fix risedev check

    * update planner test

    * fix unit test

    * fix misc check

    * fix ci

    * fix issues in PR comments

    * fix clippy

    * fix ci

    * update planner test

commit 1e190dd
Author: xxchan <[email protected]>
Date:   Mon Jun 20 14:16:12 2022 +0200

    fix(binder): do not allow correlated input ref in order by (risingwavelabs#3346)

commit 263d770
Author: Tao Wu <[email protected]>
Date:   Mon Jun 20 18:54:25 2022 +0800

    fix: build failure caused by OptimzierContext::new (risingwavelabs#3340)

commit 9f18401
Author: Tao Wu <[email protected]>
Date:   Mon Jun 20 17:47:22 2022 +0800

    feat: introduce the framework of sqlsmith (risingwavelabs#3305)

commit 096a991
Author: Alex Chi <[email protected]>
Date:   Mon Jun 20 17:40:05 2022 +0800

    feat(ctl): add bench command (risingwavelabs#3337)

    Signed-off-by: Alex Chi <[email protected]>

commit 075d596
Author: TennyZhuang <[email protected]>
Date:   Mon Jun 20 17:05:50 2022 +0800

    build: bump toolchain to 20220620 (risingwavelabs#3324)

    * build: bump toolchain to 20220620

    Signed-off-by: TennyZhuang <[email protected]>

    * also update docker-compose

    Signed-off-by: TennyZhuang <[email protected]>

commit d864f30
Author: Wenzhuo Liu <[email protected]>
Date:   Mon Jun 20 16:38:21 2022 +0800

    feat: add output_indices to join executors (risingwavelabs#3047)

commit 13b9d58
Author: StrikeW <[email protected]>
Date:   Mon Jun 20 16:29:56 2022 +0800

    feat(stream): enable append-only mv plan for kafka source (risingwavelabs#3333)

commit ea386d3
Author: Liang <[email protected]>
Date:   Mon Jun 20 15:50:55 2022 +0800

    refactor(compaction): deprecate the HashStrategy for OverlapStrategy (risingwavelabs#3331)

commit a9fba38
Author: Liang <[email protected]>
Date:   Mon Jun 20 15:34:25 2022 +0800

    fix(picker): fetch info from table_id field in sstableinfo (risingwavelabs#3332)

commit 5d2bb42
Author: Bohan Zhang <[email protected]>
Date:   Mon Jun 20 14:55:59 2022 +0800

    test(stream): add ci for split change mutation in source (risingwavelabs#3039)

    * stage

    Signed-off-by: tabVersion <[email protected]>

    * stage

    Signed-off-by: tabVersion <[email protected]>

    * add test

    Signed-off-by: tabVersion <[email protected]>

    * change e2e to datagen

    Signed-off-by: tabVersion <[email protected]>

    * stage

    Signed-off-by: tabVersion <[email protected]>

    * some bug to fix

    Signed-off-by: tabVersion <[email protected]>

    * fix async issue

    Signed-off-by: tabVersion <[email protected]>

    * add assert

    Signed-off-by: tabVersion <[email protected]>

commit 04fe6d6
Author: Liang <[email protected]>
Date:   Mon Jun 20 14:47:56 2022 +0800

    refactor(vnode bitmap): remove vnode bitmap in sst info (risingwavelabs#3329)

commit c6d1288
Author: Li0k <[email protected]>
Date:   Mon Jun 20 14:29:56 2022 +0800

    feat(storage): add manual compaction picker for targeted compaction (risingwavelabs#3288)

    * feat(storage): add ManualCompactionPicker

    * feat(storage): distinguish get_compaction_task for manual

    * feat(storage): meta client support more parameters for manual_compaction

    * chore(storage): add tracing and some notes

    * chore(storage): split manual_compaction_picker to independent file

    * feat(storage): fix target_input check pending and support manual_pick for dynamic_level_selector

    * fix(storage): internal_table_id include mv_id

    * fix(storage): fix picker check target_input_ssts pending

    * fix(storage): fix picker with total_file_size

commit d04954f
Author: Renjie Liu <[email protected]>
Date:   Mon Jun 20 14:27:01 2022 +0800

    fix(ci): Reduce log (risingwavelabs#3330)

commit eca9239
Author: Bugen Zhao <[email protected]>
Date:   Mon Jun 20 13:59:56 2022 +0800

    refactor(storage): remove `Option` on pk serializer of cell-based table (risingwavelabs#3328)

    * minor refactor

    Signed-off-by: Bugen Zhao <[email protected]>

    * remove option of pk serializer

    Signed-off-by: Bugen Zhao <[email protected]>

    * remove pk serializer in state table

    Signed-off-by: Bugen Zhao <[email protected]>

    * extract vnode compute

    Signed-off-by: Bugen Zhao <[email protected]>

    * remove into order types

    Signed-off-by: Bugen Zhao <[email protected]>

commit 9169436
Author: Bowen <[email protected]>
Date:   Mon Jun 20 13:44:10 2022 +0800

    feat: apply relational refactor for hash agg (max, min) (risingwavelabs#2999)

    * feat: two closures cannot get a mut ref of the same variable

    * use Arc<Mutex<>> to wrap the state table

    * roll back string agg

    * add StateTable to get_output

    * finish basic coding (unit test failed)

    * finish basic coding

    * fix bug

    * show case

    * use empty Row for scan

    * tweak
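
The first two bullets above describe a borrow-checker problem: two closures cannot both hold `&mut` to the same state table, so the table is wrapped in `Arc<Mutex<>>`. A minimal sketch of that pattern follows. `ToyStateTable` and `run_two_writers` are hypothetical names for illustration only and are not the real `StateTable<S>` API; the sketch also assumes `std::sync::Mutex`, which may differ from the lock type the PR actually uses.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a state table (NOT the real `StateTable<S>`).
#[derive(Default)]
struct ToyStateTable {
    rows: Vec<i64>,
}

impl ToyStateTable {
    /// Buffer one value, loosely analogous to a mem-table write.
    fn insert(&mut self, value: i64) {
        self.rows.push(value);
    }

    /// Largest buffered value, or `None` if the table is empty.
    fn max(&self) -> Option<i64> {
        self.rows.iter().copied().max()
    }
}

/// Two closures write to the same table. Each closure owns its own `Arc`
/// clone and takes the lock only while writing, so neither needs `&mut`
/// access to a shared local variable.
fn run_two_writers(a: &[i64], b: &[i64]) -> (usize, Option<i64>) {
    let table = Arc::new(Mutex::new(ToyStateTable::default()));

    let table_a = Arc::clone(&table);
    let write_a = move |vals: &[i64]| {
        let mut guard = table_a.lock().unwrap();
        for &v in vals {
            guard.insert(v);
        }
    };

    let table_b = Arc::clone(&table);
    let write_b = move |vals: &[i64]| {
        let mut guard = table_b.lock().unwrap();
        for &v in vals {
            guard.insert(v);
        }
    };

    // The writes run one after another here, so the lock is never contended.
    write_a(a);
    write_b(b);

    let guard = table.lock().unwrap();
    (guard.rows.len(), guard.max())
}
```

Calling `run_two_writers(&[1, 5], &[3])` returns the row count and the maximum over both batches. Because the writes are sequential, the lock mostly serves to satisfy the borrow checker, which matches the PR description's remark that `apply_batch` is not expected to be parallelized.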

commit 35bb16a
Author: Bugen Zhao <[email protected]>
Date:   Mon Jun 20 13:43:11 2022 +0800

    refactor: use packed bitmap struct for vnode bitmap (risingwavelabs#3310)

    * use bitmap in streaming

    Signed-off-by: Bugen Zhao <[email protected]>

    * use bitmap in storage

    Signed-off-by: Bugen Zhao <[email protected]>

    * minor fix

    Signed-off-by: Bugen Zhao <[email protected]>

    * make bitmap optional

    Signed-off-by: Bugen Zhao <[email protected]>

commit be50f93
Author: Liang <[email protected]>
Date:   Mon Jun 20 13:27:37 2022 +0800

    feat(compaction): let compactor be unaware of vnode mapping (risingwavelabs#3321)

commit 88258a6
Author: lmatz <[email protected]>
Date:   Sun Jun 19 21:31:01 2022 -0700

    doc: no need to manually check in PR from forks (risingwavelabs#3325)

commit c590e18
Author: zwang28 <[email protected]>
Date:   Mon Jun 20 12:13:50 2022 +0800

    refactor(storage): split HummockVersion's levels by compaction group. (risingwavelabs#3206)

commit f1c3298
Author: zwang28 <[email protected]>
Date:   Mon Jun 20 11:35:18 2022 +0800

    feat(meta): register source to compaction group manager (risingwavelabs#3300)

commit bf08b54
Author: Zack <[email protected]>
Date:   Mon Jun 20 11:25:39 2022 +0800

    feat(frontend): Add sql string into context for debugging (risingwavelabs#3312)

    * feat(frontend): Add sql string into context for debugging

    * Remove renaming

    * Refactor to use str

commit f784ba3
Author: zwang28 <[email protected]>
Date:   Mon Jun 20 11:22:12 2022 +0800

    feat(storage): shared buffer flush L0 by compaction group (risingwavelabs#3200)

commit bd48bba
Author: Kexiang Wang <[email protected]>
Date:   Sun Jun 19 07:48:45 2022 -0400

    feat: modify interfaces to support specifying parallelism for each fr… (risingwavelabs#3283)

    feat: modify interfaces to support specifying parallelism for each fragment

commit 8f0e0b2
Author: Steven Chua <[email protected]>
Date:   Sun Jun 19 12:56:51 2022 +0800

    feat(ctl): Support basic sst dump in risectl (risingwavelabs#3309)

    * feat(ctl): Add sst-dump command to risectl

    * feat(ctl): Fix risectl compatibility and remove VNode info

commit daf9222
Author: Alex Chi <[email protected]>
Date:   Sat Jun 18 21:41:44 2022 +0800

    feat(risedev): generate risectl config (risingwavelabs#3318)

    * feat(risedev): generate risectl config

    Signed-off-by: Alex Chi <[email protected]>

    * fix

    Signed-off-by: Alex Chi <[email protected]>

commit 86ff992
Author: Alex Chi <[email protected]>
Date:   Sat Jun 18 20:52:51 2022 +0800

    feat(ctl): support table scan (risingwavelabs#3317)

    * feat(ctl): support table scan

    Signed-off-by: Alex Chi <[email protected]>

    * license header

    Signed-off-by: Alex Chi <[email protected]>

    * add docs

    Signed-off-by: Alex Chi <[email protected]>

commit 5ac5637
Author: Yikun Chen <[email protected]>
Date:   Sat Jun 18 08:03:06 2022 -0400

    feat: support interval comparison (risingwavelabs#3222)

    1. fix timestamp subtract timestamp.
    2. support interval comparison. Following PostgreSQL, 1 month equals 30 days and 1 day equals 86400000 ms.
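
The comparison rule in item 2 can be sketched by normalizing each interval to a total number of milliseconds. `ToyInterval`, its field names, and `compare_intervals` are assumptions made for this illustration and are not the actual RisingWave interval type; only the two conversion factors (30 days per month, 86400000 ms per day) come from the commit message.

```rust
use std::cmp::Ordering;

// Conversion factors stated in the commit message.
const DAYS_PER_MONTH: i128 = 30;
const MS_PER_DAY: i128 = 86_400_000;

/// Hypothetical interval with separate month, day and millisecond parts.
#[derive(Debug, Clone, Copy)]
struct ToyInterval {
    months: i32,
    days: i32,
    ms: i64,
}

impl ToyInterval {
    /// Collapse all three parts into milliseconds. `i128` is used so the
    /// multiplication cannot overflow for any field values.
    fn total_ms(&self) -> i128 {
        (self.months as i128 * DAYS_PER_MONTH + self.days as i128) * MS_PER_DAY
            + self.ms as i128
    }
}

/// Order two intervals by their normalized length.
fn compare_intervals(a: &ToyInterval, b: &ToyInterval) -> Ordering {
    a.total_ms().cmp(&b.total_ms())
}
```

Under this rule an interval of 1 month compares equal to one of 30 days, and 1 day compares equal to 86400000 ms, even though their field representations differ.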

commit 722ff53
Author: Yuanxin Cao <[email protected]>
Date:   Fri Jun 17 16:47:22 2022 +0800

    feat(meta): inform frontend of mview data distribution (risingwavelabs#3304)

    * feat(meta): inform frontend of mview data distribution

    * fix ut

    * set vnode mapping for materialized source

    * move ParallelUnitId into common, move vnode related constants into common/types

commit 8571ff4
Author: Renjie Liu <[email protected]>
Date:   Fri Jun 17 16:46:17 2022 +0800

    feat(batch): All tests should run in both local and distributed mode (risingwavelabs#3306)

    * feat(batch): All tests should run in both local and distributed mode

Signed-off-by: Little-Wallace <[email protected]>
4 participants