This repository has been archived by the owner on Oct 9, 2023. It is now read-only.
Added IsFailurePermanent flag on DynamicTaskStatus #567
TL;DR
Currently, dynamic tasks treat all failures as retryable, which results in unintended behavior when subtasks are attempted multiple times. This PR adds an IsFailurePermanent field to the DynamicTaskStatus struct to indicate that a failure is permanent and therefore should not be retried.
Type
Are all requirements met?
Complete description
The choice to use a new flag is made entirely for backwards compatibility. Other options include:
(1) Using the ExecutionError in flyteplugins io rather than the flyteidl ExecutionError: This would make retrieval of an in-progress dynamic task status fail, because it expects a different error type on the field.
(2) Naming the flag IsRecoverable vs. isPermanent: This is difficult because intuition says that we should use IsRecoverable, which adheres more closely to the Flyte standard. The biggest issue is that dynamics currently treat all failures as retryable by default, so this would change the default to non-recoverable (i.e. false for IsRecoverable), which changes the default behavior. It is VERY difficult to ensure there will be no regressions.
Tracking Issue
fixes flyteorg/flyte#3606
Follow-up issue
NA
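For reviewers, the backwards-compatibility argument in the complete description can be illustrated with a minimal Go sketch. It is not the code in this PR: the dynamicStatus type, its JSON tags, and the shouldRetry and decode helpers are hypothetical stand-ins, and the sketch assumes the status is persisted as JSON. It only shows that a status stored before the flag existed decodes with the flag at its zero value (false), so existing failures remain retryable.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dynamicStatus is a hypothetical stand-in for the status struct discussed
// in this PR; the field set and JSON tags are assumptions for illustration.
type dynamicStatus struct {
	Reason             string `json:"reason,omitempty"`
	IsFailurePermanent bool   `json:"isFailurePermanent,omitempty"`
}

// shouldRetry (hypothetical helper) treats a failure as retryable unless it
// has been explicitly flagged as permanent.
func shouldRetry(s dynamicStatus) bool {
	return !s.IsFailurePermanent
}

// decode (hypothetical helper) parses a persisted status from JSON.
func decode(raw string) (dynamicStatus, error) {
	var s dynamicStatus
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

func main() {
	// A status written before the flag existed: the field is absent, so it
	// decodes to false and the old retry behavior is preserved.
	old, err := decode(`{"reason":"subtask failed"}`)
	if err != nil {
		panic(err)
	}

	// A status written with the flag set explicitly.
	flagged, err := decode(`{"reason":"bad input","isFailurePermanent":true}`)
	if err != nil {
		panic(err)
	}

	fmt.Println(shouldRetry(old), shouldRetry(flagged)) // prints "true false"
}
```

Under the same assumptions, a flag named IsRecoverable would also decode to false for older statuses, which would turn every previously retryable failure into a non-recoverable one. That is the default-behavior change described in option (2).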