Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(batch): return Err when polling finished executor #1544

Merged
merged 10 commits into from
Apr 2, 2022

Conversation

Enter-tainer
Copy link
Contributor

@Enter-tainer Enter-tainer commented Apr 1, 2022

What's changed and what's your intention?

  • Summarize your change

When an executor is finished(i.e. calls to next() returns Ok(None) ), subsequent call to next will now return Err.

  • How does this PR work? Need a brief introduction for the changed logic (optional)

FuseExecutor is a wrapper on an original executor, it tracks the state of the underlying executor. Once the underlying executor returns Ok(None), FuseExecutor will return Err for all subsequent calls.

  • Describe clearly one logical change and avoid lazy messages (optional)

  • Describe any limitations of the current code (optional)

Checklist

  • I have written the necessary docs and comments
  • I have added necessary unit tests and integration tests

Refer to a related PR or issue link (optional)

this pr fix #890

@skyzh
Copy link
Contributor

skyzh commented Apr 1, 2022

Maintaining such code is hard. You will never know whether there will be new executors in the future, and how they will act.

https://doc.rust-lang.org/stable/std/iter/struct.Fuse.html

One good approach is to have something like fuse for all executors. No matter how the executor is implemented, if we fuse(e: impl Executor) -> impl Executor, it will always return an executor that matches the behavior your defined.

@skyzh
Copy link
Contributor

skyzh commented Apr 1, 2022

Also by convention, it would be better to always return Ok(None) instead of error.

@codecov
Copy link

codecov bot commented Apr 1, 2022

Codecov Report

Merging #1544 (16071aa) into main (0c89d0b) will decrease coverage by 0.02%.
The diff coverage is 27.00%.

@@             Coverage Diff              @@
##               main    #1544      +/-   ##
============================================
- Coverage     69.73%   69.71%   -0.03%     
  Complexity     2766     2766              
============================================
  Files          1041     1042       +1     
  Lines         91493    91629     +136     
  Branches       1790     1790              
============================================
+ Hits          63799    63875      +76     
- Misses        26803    26863      +60     
  Partials        891      891              
Flag Coverage Δ
java 61.01% <ø> (ø)
rust 71.53% <27.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
rust/batch/src/executor/create_source.rs 0.00% <0.00%> (ø)
rust/batch/src/executor/create_table.rs 52.54% <0.00%> (-2.82%) ⬇️
rust/batch/src/executor/delete.rs 80.89% <0.00%> (-1.58%) ⬇️
rust/batch/src/executor/drop_stream.rs 0.00% <0.00%> (ø)
rust/batch/src/executor/drop_table.rs 0.00% <0.00%> (ø)
rust/batch/src/executor/filter.rs 75.83% <0.00%> (-1.56%) ⬇️
rust/batch/src/executor/generate_series.rs 66.26% <0.00%> (-2.49%) ⬇️
rust/batch/src/executor/generic_exchange.rs 59.42% <0.00%> (-1.33%) ⬇️
rust/batch/src/executor/insert.rs 84.92% <0.00%> (-1.30%) ⬇️
rust/batch/src/executor/join/hash_join.rs 84.04% <0.00%> (-0.37%) ⬇️
... and 15 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@Enter-tainer
Copy link
Contributor Author

Also by convention, it would be better to always return Ok(None) instead of error.

I also find that returning Err is quite strange 🤔 . I guess the further discussion is needed.

@Enter-tainer Enter-tainer force-pushed the return-err-on-finished-executor branch from 844704f to 112fee1 Compare April 2, 2022 04:26
@Enter-tainer
Copy link
Contributor Author

I use the method explained by @skyzh to wrap the original executor. I also add an example use of the wrapper on FilterExecutor.

FuseExecutor is a wrapper on an original executor, it tracks the state of the underlying executor. Once the underlying executor returns Ok(None), FuseExecutor will return Err for all subsequent calls.

Any suggestions? I will work on all other executors if this approach is desired.

Copy link
Contributor

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM.

The renaming of FilterExecutor -> FilterExecutorInner seems not necessary. We can simply refactor all new_boxed_executor to return fused executors.

We can then add a fuse function on executor trait (or have a new trait that implements for all executors), and make all new_boxed_executor to return Box::new(original_executor.fuse()).

rust/batch/src/executor/fuse.rs Outdated Show resolved Hide resolved
@lmatz
Copy link
Contributor

lmatz commented Apr 2, 2022

IIRC, there was one bug where one executor polls its child and gets Ok(None), and it somehow polls its child again and gets some values, which leads to more output than the truth. If I need to debug this case, I somehow need to know where and when a hypothetical Err occurs to locate the problematic executor(s).

I don't argue that the bug can only be exposed by returning Err, maybe there are other ways of achieving the same effect. I think it helps as it lets any two consecutive executors check each other in a more explicit way.

I feel the convention of returning Ok(None) after Ok(None) once is solely designed for the convenience of the caller, i.e. no special case needs to be handled and one propagates Ok(None) downstream easily. The idea of returning an Err is exactly the opposite, to let the one maintaining the code inconvenient, i.e. edge cases must be explicitly handled and cannot fail silently.

One argument against using Err may be that it is more likely to produce bugs that lead to less output than the truth, the opposite of using Ok(None).

@skyzh
Copy link
Contributor

skyzh commented Apr 2, 2022

I would +1 for return Ok(None). In some parts of our program, there might be some logic that calls next of executors endlessly. (e.g., something like doing prefetch). If there's error, things would be strange...

@liurenjie1024
Copy link
Contributor

Maybe we can work on this? #1190

@Enter-tainer
Copy link
Contributor Author

Enter-tainer commented Apr 2, 2022

Maybe we can work on this? #1190

Refactoring all of these executors with futures_async_stream would solve issue #1174 in addition to issue #890. And that sounds great. But I don't consider this pr is in conflict with this refactoring. Because this pr is a refinement of the current implementation of executors.

In some parts of our program, there might be some logic that calls next of executors endlessly. If there's an error, things would be strange.

This pr is based on the assumption that after a batch executor emits an Ok(None), it will no longer produce any meaningful results. Current behavior looks like data0, data1, data2, ...., Ok(None), Ok(None), Ok(None), .... And this pr changes the behavior to data0, data1, data2, ...., Ok(None), Err(...), Err(...).

In the following example, the loop will terminate after it receives an Ok(None). And this pr will not break it.
https://github.com/singularity-data/risingwave/blob/482e064f49a10a87aa3281600ea3d930a4fc6fe6/rust/batch/src/task/task_.rs#L260-L270

I've checked the use of the batch executor, but I'm not sure if I'm missing something. Can you provide a code example (if there is any) where the implementation of this pr would break the behavior of the program?

Copy link
Contributor

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Keep refactoring other executors, and we can get this merged. Hope this can be done today, as we have a big renaming tomorrow.

@Enter-tainer Enter-tainer marked this pull request as ready for review April 2, 2022 08:44
@skyzh skyzh enabled auto-merge (squash) April 2, 2022 08:48
@skyzh skyzh merged commit 8573cfa into main Apr 2, 2022
@skyzh skyzh deleted the return-err-on-finished-executor branch April 2, 2022 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Return err when an already finished executor get polled
4 participants