
Fix handling of CurrentWorkflowConditionFailedError when creating a workflow #3349

Merged: 3 commits into master from fixconcurrent on Jun 24, 2020

Conversation

@vancexu (Contributor) commented on Jun 18, 2020

What changed?
Fix handling of CurrentWorkflowConditionFailedError when creating a workflow.

Why?
When CurrentWorkflowConditionFailedError happens during workflow creation, it doesn't make sense for the shard to renew its range (renewRange): renewal doesn't help this error case and is likely to cause more errors.

How did you test it?
Existing tests.
Bench test on staging.

Potential risks
Limited. In the worst case, create workflow will return an unexpected error.
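
To make the change concrete, here is a minimal, self-contained sketch of the behavior this PR encodes, assuming stub error types and an illustrative createWorkflowWithRetry helper (these are not the actual Cadence identifiers): known non-retriable errors, now including CurrentWorkflowConditionFailedError, are returned to the caller directly instead of triggering a shard range renewal.

```go
package main

import (
	"errors"
	"fmt"
)

// Stub error types standing in for shared.WorkflowExecutionAlreadyStartedError
// and persistence.CurrentWorkflowConditionFailedError in the real code.
type WorkflowExecutionAlreadyStartedError struct{}
type CurrentWorkflowConditionFailedError struct{}

func (*WorkflowExecutionAlreadyStartedError) Error() string { return "workflow already started" }
func (*CurrentWorkflowConditionFailedError) Error() string {
	return "current workflow condition failed"
}

// createWorkflowWithRetry sketches the shape of the create-workflow loop after
// this change: known non-retriable errors are returned to the caller directly,
// instead of falling through to the shard range renewal path.
func createWorkflowWithRetry(create func() error, renewRange func() error, maxAttempts int) error {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err := create()
		if err == nil {
			return nil
		}
		switch err.(type) {
		case *WorkflowExecutionAlreadyStartedError,
			*CurrentWorkflowConditionFailedError: // added by this PR: no range renewal for this error
			return err
		default:
			// Unrecognized error: renew the shard range and retry the create.
			if rErr := renewRange(); rErr != nil {
				return rErr
			}
		}
	}
	return errors.New("create workflow: max attempts exceeded")
}

func main() {
	err := createWorkflowWithRetry(
		func() error { return &CurrentWorkflowConditionFailedError{} },
		func() error { return nil },
		3,
	)
	fmt.Println(err) // returned immediately, without renewing the shard range
}
```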

@vancexu requested a review from yux0 on June 18, 2020
@@ -496,6 +496,7 @@ Create_Loop:
 	switch err.(type) {
 	case *shared.WorkflowExecutionAlreadyStartedError,
 		*persistence.WorkflowExecutionAlreadyStartedError,
+		*persistence.CurrentWorkflowConditionFailedError,
@yycptt (Contributor):

If the range ID changes, will it throw this CurrentWorkflowConditionFailedError? If that is the case, why not update the range ID?

@vancexu (Contributor, Author):

CurrentWorkflowConditionFailedError in StartWorkflow only happens when concurrent requests mess up the current-workflow record. A RangeID change will not lead to this error. That's my understanding based on reading https://github.com/uber/cadence/blob/master/common/persistence/cassandra/cassandraPersistence.go#L1229 - L1242

Contributor:

I think @yycptt makes a good point. Should we retry on this error?

@vancexu (Contributor, Author):

This error will be retried by the client; from the history service's point of view it should currently be non-retriable.
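
A minimal sketch of that division of responsibility, assuming an illustrative startWithClientRetry helper and a stub error value (this is not the Cadence client API): the history service surfaces the error as-is, and the caller retries it with a short backoff.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errCurrentWorkflowConditionFailed stands in for the persistence error that the
// history service now surfaces directly instead of renewing its shard range.
var errCurrentWorkflowConditionFailed = errors.New("current workflow condition failed")

// startWithClientRetry sketches caller-side behavior: the client retries the
// request with a short backoff when it sees this error, while any other error
// is returned immediately.
func startWithClientRetry(call func() error, maxAttempts int, backoff time.Duration) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = call(); err == nil {
			return nil
		}
		if !errors.Is(err, errCurrentWorkflowConditionFailed) {
			return err
		}
		time.Sleep(backoff)
	}
	return err
}

func main() {
	attempts := 0
	err := startWithClientRetry(func() error {
		attempts++
		if attempts < 3 {
			// Simulate losing the race against a concurrent start, then succeeding.
			return errCurrentWorkflowConditionFailed
		}
		return nil
	}, 5, 10*time.Millisecond)
	fmt.Printf("err=%v after %d attempts\n", err, attempts) // err=<nil> after 3 attempts
}
```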

@vancexu requested a review from yux0 on June 22, 2020
@coveralls (Coverage Status): Coverage increased (+0.1%) to 67.234% when pulling 190c01c on fixconcurrent into fa3155e on master.

@vancexu merged commit b5ce9c7 into master on Jun 24, 2020
@vancexu deleted the fixconcurrent branch on June 24, 2020
@mkolodezny self-requested a review on June 24, 2020
@mkolodezny (Contributor) left a comment:

What's the metric to check once this lands?

@vancexu (Contributor, Author) commented on Jun 24, 2020:

> What's the metric to check once this lands?

  1. The error log for CurrentWorkflowConditionFailedError should no longer show the same failing SignalWithStart request repeatedly retried with an increasing rangeID.
  2. History service SignalWithStart latency should drop.
