[BUG] A failed wf node leaves the other nodes (spark tasks) running until they finish #263
Closed
3 of 20 tasks
Comments
EngHabu added the bug (Something isn't working) and untriaged (This issue has not yet been looked at by the Maintainers) labels on Apr 13, 2020
@EngHabu does the abort not propagate?
eapolinario pushed a commit to eapolinario/flyte that referenced this issue on Dec 6, 2022
* Fix: register remaining tasks without default plugins
  Signed-off-by: Filipe Regadas <[email protected]>
* Add test
  Signed-off-by: Filipe Regadas <[email protected]>
* fixup! Add test
  Signed-off-by: Filipe Regadas <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue on Dec 6, 2022
* Migrate to golang-jwt/jwt/v4
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* go mod tidy
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* Move to go 1.17
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue on Dec 20, 2022
Signed-off-by: Yuvraj <[email protected]>
Co-authored-by: Yuvraj <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue on Dec 20, 2022
eapolinario pushed a commit to eapolinario/flyte that referenced this issue on Dec 20, 2022
Signed-off-by: Flyte-Bot <[email protected]>
Co-authored-by: pmahindrakar-oss <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue on Aug 9, 2023
* Fix: register remaining tasks without default plugins
  Signed-off-by: Filipe Regadas <[email protected]>
* Add test
  Signed-off-by: Filipe Regadas <[email protected]>
* fixup! Add test
  Signed-off-by: Filipe Regadas <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue on Aug 21, 2023
* Migrate to golang-jwt/jwt/v4
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* go mod tidy
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
* Move to go 1.17
  Signed-off-by: Haytham Abuelfutuh <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue on Aug 21, 2023
Signed-off-by: Yuvraj <[email protected]>
Co-authored-by: Yuvraj <[email protected]>
eapolinario pushed a commit to eapolinario/flyte that referenced this issue on Apr 30, 2024
Signed-off-by: Flyte-Bot <[email protected]>
Co-authored-by: pmahindrakar-oss <[email protected]>
austin362667 pushed a commit to austin362667/flyte that referenced this issue on May 7, 2024
Signed-off-by: Flyte-Bot <[email protected]>
Co-authored-by: pmahindrakar-oss <[email protected]>
robert-ulbrich-mercedes-benz pushed a commit to robert-ulbrich-mercedes-benz/flyte that referenced this issue on Jul 2, 2024
Signed-off-by: Flyte-Bot <[email protected]>
Co-authored-by: pmahindrakar-oss <[email protected]>
troychiu pushed a commit that referenced this issue on Jul 8, 2024

## Overview
This PR enables graceful aborts (rather than panics) when a fasttask times out waiting for worker availability.

## Test Plan
Tested locally.

## Rollout Plan (if applicable)
This can be rolled out along with any other changes.

## Upstream Changes
Should this change be upstreamed to OSS (flyteorg/flyte)? If so, please check this box for auditing. Note, this is the responsibility of each developer. See [this guide](https://unionai.atlassian.net/wiki/spaces/ENG/pages/447610883/Flyte+-+Union+Cloud+Development+Runbook/#When-are-versions-updated%3F).
- [ ] To be upstreamed

## Jira Issue
https://unionai.atlassian.net/browse/EXO-103

## Checklist
* [ ] Added tests
* [ ] Ran a deploy dry run and shared the terraform plan
* [ ] Added logging and metrics
* [ ] Updated [dashboards](https://unionai.grafana.net/dashboards) and [alerts](https://unionai.grafana.net/alerting/list)
* [ ] Updated documentation
Describe the bug
If a node in a workflow fails, the entire workflow fails. If other nodes are still running Spark tasks at that point, their underlying CRDs keep running until they finish, wasting resources.
Expected behavior
If the workflow shows as failed, the underlying CRDs it created should be cleaned up.
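To make the expected behavior concrete, here is a minimal Go sketch of the idea: once any node has failed, every node that is still running is aborted and the external resource it owns is deleted. This is not Flyte's actual code. `Phase`, `Node`, `ResourceDeleter`, `HandleFailure`, `AbortRunning`, and `recordingDeleter` are all hypothetical names invented for illustration, and the in-memory deleter stands in for whatever would really remove the Spark resource from the cluster.

```go
package main

import "fmt"

// Phase is a hypothetical, simplified node phase (not Flyte's real type).
type Phase int

const (
	Running Phase = iota
	Succeeded
	Failed
	Aborted
)

// Node is a hypothetical workflow node that may own an external resource,
// for example the resource backing a Spark task.
type Node struct {
	ID       string
	Phase    Phase
	Resource string // name of the owned resource; empty if the node owns none
}

// ResourceDeleter is a hypothetical abstraction over whatever deletes the
// external resource (assumed here; a real system would call the cluster API).
type ResourceDeleter interface {
	Delete(name string) error
}

// AbortRunning deletes the resource of every node that is still Running and
// marks that node Aborted. It returns the IDs of the aborted nodes and the
// first deletion error, if any. A node whose deletion fails stays Running so
// that a later retry can clean it up.
func AbortRunning(nodes []*Node, d ResourceDeleter) ([]string, error) {
	var aborted []string
	var firstErr error
	for _, n := range nodes {
		if n.Phase != Running {
			continue
		}
		if n.Resource != "" {
			if err := d.Delete(n.Resource); err != nil {
				if firstErr == nil {
					firstErr = fmt.Errorf("abort node %s: %w", n.ID, err)
				}
				continue
			}
		}
		n.Phase = Aborted
		aborted = append(aborted, n.ID)
	}
	return aborted, firstErr
}

// HandleFailure propagates an abort to all running nodes as soon as any node
// in the workflow is in the Failed phase. It does nothing otherwise.
func HandleFailure(nodes []*Node, d ResourceDeleter) ([]string, error) {
	for _, n := range nodes {
		if n.Phase == Failed {
			return AbortRunning(nodes, d)
		}
	}
	return nil, nil
}

// recordingDeleter is an in-memory fake that only records what it was asked
// to delete; it exists purely to exercise the sketch.
type recordingDeleter struct {
	deleted []string
}

func (r *recordingDeleter) Delete(name string) error {
	r.deleted = append(r.deleted, name)
	return nil
}
```

With one failed node, one running node that owns a resource, and one succeeded node, calling `HandleFailure` with a `recordingDeleter` should abort only the running node and delete only its resource. Completed nodes are left untouched in this sketch.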
Flyte component
Environment