Fix handling of CurrentWorkflowConditionFailedError when create wf #3349
@@ -496,6 +496,7 @@ Create_Loop:
switch err.(type) {
case *shared.WorkflowExecutionAlreadyStartedError,
*persistence.WorkflowExecutionAlreadyStartedError,
*persistence.CurrentWorkflowConditionFailedError,
If the range ID changes, will it throw this CurrentWorkflowConditionFailedError? If so, why not update the range ID?
CurrentWorkflowConditionFailedError in StartWorkflow only happens when the current workflow record is changed concurrently. RangeID changes will not lead to this error. That's my understanding based on reading https://github.com/uber/cadence/blob/master/common/persistence/cassandra/cassandraPersistence.go#L1229 - L1242
I think @yycptt makes a point. Should we retry on this error?
This error will be retried by the client; from the history service's point of view, it should currently be non-retriable.
What's the metric to check once this lands?
What changed?
Fix handling of CurrentWorkflowConditionFailedError when creating a workflow.
Why?
In create workflow, when CurrentWorkflowConditionFailedError happens, it doesn't make sense for the shard to renewRange: renewing does not help in this error case and is likely to cause more errors.
How did you test it?
Existing tests.
Bench test on staging.
Potential risks
Limited. In the worst case, create workflow will encounter an unexpected error.