refactor(batch): return Err when polling finished executor #1544

Enter-tainer · 2022-04-01T14:49:45Z

What's changed and what's your intention?

Summarize your change

When an executor is finished(i.e. calls to next() returns Ok(None) ), subsequent call to next will now return Err.

How does this PR work? Need a brief introduction for the changed logic (optional)

FuseExecutor is a wrapper on an original executor, it tracks the state of the underlying executor. Once the underlying executor returns Ok(None), FuseExecutor will return Err for all subsequent calls.

Describe clearly one logical change and avoid lazy messages (optional)
Describe any limitations of the current code (optional)

Checklist

I have written the necessary docs and comments
I have added necessary unit tests and integration tests

Refer to a related PR or issue link (optional)

this pr fix #890

skyzh · 2022-04-01T14:56:00Z

Maintaining such code is hard. You will never know whether there will be new executors in the future, and how they will act.

https://doc.rust-lang.org/stable/std/iter/struct.Fuse.html

One good approach is to have something like fuse for all executors. No matter how the executor is implemented, if we fuse(e: impl Executor) -> impl Executor, it will always return an executor that matches the behavior your defined.

skyzh · 2022-04-01T14:57:16Z

Also by convention, it would be better to always return Ok(None) instead of error.

codecov · 2022-04-01T15:00:49Z

Codecov Report

Merging #1544 (16071aa) into main (0c89d0b) will decrease coverage by 0.02%.
The diff coverage is 27.00%.

@@             Coverage Diff              @@
##               main    #1544      +/-   ##
============================================
- Coverage     69.73%   69.71%   -0.03%     
  Complexity     2766     2766              
============================================
  Files          1041     1042       +1     
  Lines         91493    91629     +136     
  Branches       1790     1790              
============================================
+ Hits          63799    63875      +76     
- Misses        26803    26863      +60     
  Partials        891      891

Flag	Coverage Δ
java	`61.01% <ø> (ø)`
rust	`71.53% <27.00%> (-0.03%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
rust/batch/src/executor/create_source.rs	`0.00% <0.00%> (ø)`
rust/batch/src/executor/create_table.rs	`52.54% <0.00%> (-2.82%)`	⬇️
rust/batch/src/executor/delete.rs	`80.89% <0.00%> (-1.58%)`	⬇️
rust/batch/src/executor/drop_stream.rs	`0.00% <0.00%> (ø)`
rust/batch/src/executor/drop_table.rs	`0.00% <0.00%> (ø)`
rust/batch/src/executor/filter.rs	`75.83% <0.00%> (-1.56%)`	⬇️
rust/batch/src/executor/generate_series.rs	`66.26% <0.00%> (-2.49%)`	⬇️
rust/batch/src/executor/generic_exchange.rs	`59.42% <0.00%> (-1.33%)`	⬇️
rust/batch/src/executor/insert.rs	`84.92% <0.00%> (-1.30%)`	⬇️
rust/batch/src/executor/join/hash_join.rs	`84.04% <0.00%> (-0.37%)`	⬇️
... and 15 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

Enter-tainer · 2022-04-01T15:05:23Z

Also by convention, it would be better to always return Ok(None) instead of error.

I also find that returning Err is quite strange 🤔 . I guess the further discussion is needed.

Enter-tainer · 2022-04-02T04:42:02Z

I use the method explained by @skyzh to wrap the original executor. I also add an example use of the wrapper on FilterExecutor.

FuseExecutor is a wrapper on an original executor, it tracks the state of the underlying executor. Once the underlying executor returns Ok(None), FuseExecutor will return Err for all subsequent calls.

Any suggestions? I will work on all other executors if this approach is desired.

skyzh

Generally LGTM.

The renaming of FilterExecutor -> FilterExecutorInner seems not necessary. We can simply refactor all new_boxed_executor to return fused executors.

We can then add a fuse function on executor trait (or have a new trait that implements for all executors), and make all new_boxed_executor to return Box::new(original_executor.fuse()).

rust/batch/src/executor/fuse.rs

lmatz · 2022-04-02T05:18:00Z

IIRC, there was one bug where one executor polls its child and gets Ok(None), and it somehow polls its child again and gets some values, which leads to more output than the truth. If I need to debug this case, I somehow need to know where and when a hypothetical Err occurs to locate the problematic executor(s).

I don't argue that the bug can only be exposed by returning Err, maybe there are other ways of achieving the same effect. I think it helps as it lets any two consecutive executors check each other in a more explicit way.

I feel the convention of returning Ok(None) after Ok(None) once is solely designed for the convenience of the caller, i.e. no special case needs to be handled and one propagates Ok(None) downstream easily. The idea of returning an Err is exactly the opposite, to let the one maintaining the code inconvenient, i.e. edge cases must be explicitly handled and cannot fail silently.

One argument against using Err may be that it is more likely to produce bugs that lead to less output than the truth, the opposite of using Ok(None).

skyzh · 2022-04-02T05:19:33Z

I would +1 for return Ok(None). In some parts of our program, there might be some logic that calls next of executors endlessly. (e.g., something like doing prefetch). If there's error, things would be strange...

liurenjie1024 · 2022-04-02T05:30:58Z

Maybe we can work on this? #1190

Enter-tainer · 2022-04-02T07:35:48Z

Maybe we can work on this? #1190

Refactoring all of these executors with futures_async_stream would solve issue #1174 in addition to issue #890. And that sounds great. But I don't consider this pr is in conflict with this refactoring. Because this pr is a refinement of the current implementation of executors.

In some parts of our program, there might be some logic that calls next of executors endlessly. If there's an error, things would be strange.

This pr is based on the assumption that after a batch executor emits an Ok(None), it will no longer produce any meaningful results. Current behavior looks like data0, data1, data2, ...., Ok(None), Ok(None), Ok(None), .... And this pr changes the behavior to data0, data1, data2, ...., Ok(None), Err(...), Err(...).

In the following example, the loop will terminate after it receives an Ok(None). And this pr will not break it.
https://github.com/singularity-data/risingwave/blob/482e064f49a10a87aa3281600ea3d930a4fc6fe6/rust/batch/src/task/task_.rs#L260-L270

I've checked the use of the batch executor, but I'm not sure if I'm missing something. Can you provide a code example (if there is any) where the implementation of this pr would break the behavior of the program?

skyzh

LGTM. Keep refactoring other executors, and we can get this merged. Hope this can be done today, as we have a big renaming tomorrow.

github-actions bot added the type/refactor label Apr 1, 2022

add fuse executor and example use on filter

112fee1

Enter-tainer force-pushed the return-err-on-finished-executor branch from 844704f to 112fee1 Compare April 2, 2022 04:26

fix typo in comment

9a28559

skyzh reviewed Apr 2, 2022

View reviewed changes

rust/batch/src/executor/fuse.rs Outdated Show resolved Hide resolved

Enter-tainer added 2 commits April 2, 2022 15:10

fix fuse new_boxed_executor

d5035d3

fix docs for FusedExecutor

8915319

remove unused BoxedExecutorBuilder for FusedExecutor

a603b0b

skyzh approved these changes Apr 2, 2022

View reviewed changes

Enter-tainer added 5 commits April 2, 2022 16:03

fuse all trivial executors

9700ad4

loosen trait bounds and fuse row_seq_scan

848e801

fuse hash related executors

e2504bc

add ut for fuse

25c4972

Merge branch 'main' into return-err-on-finished-executor-new

16071aa

Enter-tainer marked this pull request as ready for review April 2, 2022 08:44

skyzh enabled auto-merge (squash) April 2, 2022 08:48

skyzh merged commit 8573cfa into main Apr 2, 2022

skyzh deleted the return-err-on-finished-executor branch April 2, 2022 08:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(batch): return Err when polling finished executor #1544

refactor(batch): return Err when polling finished executor #1544

Enter-tainer commented Apr 1, 2022 •

edited

Loading

skyzh commented Apr 1, 2022

skyzh commented Apr 1, 2022

codecov bot commented Apr 1, 2022 •

edited

Loading

Enter-tainer commented Apr 1, 2022

Enter-tainer commented Apr 2, 2022

skyzh left a comment •

edited

Loading

lmatz commented Apr 2, 2022

skyzh commented Apr 2, 2022

liurenjie1024 commented Apr 2, 2022

Enter-tainer commented Apr 2, 2022 •

edited

Loading

skyzh left a comment •

edited

Loading

refactor(batch): return Err when polling finished executor #1544

refactor(batch): return Err when polling finished executor #1544

Conversation

Enter-tainer commented Apr 1, 2022 • edited Loading

What's changed and what's your intention?

Checklist

Refer to a related PR or issue link (optional)

skyzh commented Apr 1, 2022

skyzh commented Apr 1, 2022

codecov bot commented Apr 1, 2022 • edited Loading

Codecov Report

Enter-tainer commented Apr 1, 2022

Enter-tainer commented Apr 2, 2022

skyzh left a comment • edited Loading

Choose a reason for hiding this comment

lmatz commented Apr 2, 2022

skyzh commented Apr 2, 2022

liurenjie1024 commented Apr 2, 2022

Enter-tainer commented Apr 2, 2022 • edited Loading

skyzh left a comment • edited Loading

Choose a reason for hiding this comment

Enter-tainer commented Apr 1, 2022 •

edited

Loading

codecov bot commented Apr 1, 2022 •

edited

Loading

skyzh left a comment •

edited

Loading

Enter-tainer commented Apr 2, 2022 •

edited

Loading

skyzh left a comment •

edited

Loading