Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix workflow deletion #5793

Merged
merged 2 commits into from
Mar 18, 2024
Merged

Fix workflow deletion #5793

merged 2 commits into from
Mar 18, 2024

Conversation

Shaddoll
Copy link
Member

What changed?
Fix workflow deletion

Why?
To prevent garbage data from workflows which are supposed to be deleted.
Workflow execution should exist in order to execute the deletion task, so it must be the last one to be deleted.

How did you test it?
unit tests

Potential risks

Release notes

Documentation Changes

g.Go(func() (e error) {
defer func() { recoverPanic(recover(), &e) }()
_, e = m.db.DeleteFromExecutions(ctx, &sqlplugin.ExecutionsFilter{
return m.txExecute(ctx, dbShardID, "DeleteWorkflowExecution", func(tx sqlplugin.Tx) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not only changing the order but also making this operation a transaction.

  • Other transactional operations in this file are using txExecuteShardLockedFn. Should this also grab shard lock?
  • This transaction will perform deletions on multiple tables an will have to acquire corresponding locks at DB level. Compared to previous implementation the overall latency might be high which is OK for deletion case but do you think this has a chance to impact performance of other queries on those tables?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • It doesn't have to grab a shard lock, we're trying to delete a workflow passing the retention period. Whether it is deleted by the owner of the shard isn't important.
  • For innodb with repeatable read isolation level, the delete acquires a next-key lock on every record it encounters. The deletion are either point deletion or range deletion that deleting all records within a range. So the impact performance of other queries on those tables should be low. (Only 1 workflow next to the deleted workflow could be impacted.) For innodb with read-committed isolation level and myrocksdb, there is no next-key lock or gap lock, only the records being deleted are locked. So there is no impact to other workflows.

@coveralls
Copy link

coveralls commented Mar 18, 2024

Pull Request Test Coverage Report for Build 018e52a1-d0d6-4fee-958b-7552733219dd

Details

  • 27 of 45 (60.0%) changed or added relevant lines in 2 files are covered.
  • 18 unchanged lines in 6 files lost coverage.
  • Overall coverage increased (+0.04%) to 64.976%

Changes Missing Coverage Covered Lines Changed/Added Lines %
service/history/task/timer_task_executor_base.go 6 10 60.0%
common/persistence/sql/sql_execution_store.go 21 35 60.0%
Files with Coverage Reduction New Missed Lines %
service/history/queue/timer_queue_processor_base.go 1 77.62%
service/history/execution/mutable_state_task_refresher.go 1 71.2%
common/persistence/historyManager.go 2 66.67%
common/types/history.go 4 45.35%
service/history/engine/engineimpl/historyEngine.go 5 68.3%
common/task/fifo_task_scheduler.go 5 82.47%
Totals Coverage Status
Change from base Build 018e5285-f63a-479f-ba71-e44430d5acfc: 0.04%
Covered Lines: 94837
Relevant Lines: 145956

💛 - Coveralls

Copy link

codecov bot commented Mar 18, 2024

Codecov Report

Merging #5793 (5420c3c) into master (4ebaaa2) will decrease coverage by 0.03%.
The diff coverage is 37.14%.

Additional details and impacted files
Files Coverage Δ
service/history/task/timer_task_executor_base.go 57.14% <0.00%> (ø)
common/persistence/sql/sql_execution_store.go 85.16% <48.14%> (-1.55%) ⬇️

... and 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4ebaaa2...5420c3c. Read the comment docs.

@Shaddoll Shaddoll merged commit d43b582 into cadence-workflow:master Mar 18, 2024
20 checks passed
@Shaddoll Shaddoll deleted the fix-delete branch March 18, 2024 21:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants