[BUG] workflow without execution cannot be failed/aborted/finalized correctly after max retry #2577

Closed
2 tasks done
honnix opened this issue Jun 3, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@honnix
Member

honnix commented Jun 3, 2022

Describe the bug

For an unknown reason, we have a workflow execution that never managed to start, failing with the error:

ExecutionNotFound: The execution that the event belongs to does not exist,
    caused by [rpc error: code = NotFound desc = missing entity of type execution
    with identifier project...

As a result, every attempt to mutate this workflow fails, and eventually mutableW.Status.FailedAttempts > maxRetries.

When this happens, the abort process cannot handle it correctly.

Tracing back to the very first time retries were exhausted, there was an error when publishing the event, for the same reason: the execution did not exist. The transition to StatusFailed then failed because the event could not be published, and that error did not match this condition. Everything was then retried over and over again, as we have seen logs like RuntimeExecutionError: max number of system retry attempts [38676/50] exhausted. These lines also seem to be related.

This of course also prevents GC from cleaning the workflow up.

Expected behavior

The workflow can be terminated successfully even if it never started.

Additional context to reproduce

I'm not sure of the root cause of the missing execution, so I don't know how to reproduce it.

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@honnix honnix added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Jun 3, 2022
@hamersaw
Contributor

hamersaw commented Jun 6, 2022

@honnix is this a duplicate of #2275? Can we close this one?

@hamersaw hamersaw removed the untriaged This issues has not yet been looked at by the Maintainers label Jun 6, 2022
@honnix
Member Author

honnix commented Jun 6, 2022

@hamersaw Right. Sorry I missed that. Please feel free to close. We can move some details of this ticket to the other if that helps.

@hamersaw hamersaw closed this as completed Jun 6, 2022