Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix slice reuse in cassandra/domain.go #5937

Merged
merged 1 commit into from
Apr 24, 2024

Conversation

natemort
Copy link
Member

What changed?
Reset the fields for isolationGroups and asyncWFConfigData after each row.

While reading domains from Cassandra the isolationGroups and asncWFConfigData are not properly reset for each row. This can result in the same backing array being reused across multiple rows, resulting in an incorrect or corrupted value being read.

Why?
Fixes a bug where the domain cache contains the wrong async wf config data and is unable to trigger an async workflow despite it being enabled for that ddomain.

How did you test it?

  • Updated the persistence tests to include isolationGroups and asyncWFConfigData.
  • Added a secondary domain to the async wf integration test
  • Ran locally and started an async workflow via the java client

Potential risks

Release notes

Documentation Changes

Copy link

codecov bot commented Apr 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 62.53%. Comparing base (1a0c232) to head (4fef375).

❗ Current head 4fef375 differs from pull request most recent head 9671c7a. Consider uploading reports for the commit 9671c7a to get more accurate results

Additional details and impacted files
Files Coverage Δ
.../persistence/nosql/nosqlplugin/cassandra/domain.go 98.87% <100.00%> (+0.01%) ⬆️

... and 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1a0c232...9671c7a. Read the comment docs.

@natemort natemort force-pushed the async branch 2 times, most recently from b318d8e to 4fef375 Compare April 24, 2024 20:53
@coveralls
Copy link

coveralls commented Apr 24, 2024

Pull Request Test Coverage Report for Build 018f1233-8091-4e63-bf32-9f39ae5784bc

Details

  • 4 of 19 (21.05%) changed or added relevant lines in 2 files are covered.
  • 90 unchanged lines in 11 files lost coverage.
  • Overall coverage decreased (-0.04%) to 67.695%

Changes Missing Coverage Covered Lines Changed/Added Lines %
common/persistence/persistence-tests/metadataPersistenceV2Test.go 0 15 0.0%
Files with Coverage Reduction New Missed Lines %
tools/cli/admin_db_decode_thrift.go 1 71.79%
service/matching/taskReader.go 2 84.88%
common/task/fifo_task_scheduler.go 2 87.63%
service/frontend/api/handler.go 2 61.98%
service/history/execution/mutable_state_util.go 2 78.52%
service/history/queue/timer_gate.go 3 95.83%
service/history/task/transfer_standby_task_executor.go 4 87.04%
service/matching/taskListManager.go 5 79.9%
common/persistence/pinot_visibility_triple_manager.go 9 97.65%
service/history/task/task_util.go 24 69.43%
Totals Coverage Status
Change from base Build 018f11a6-4daf-468f-bb3c-f44e3604ef0e: -0.04%
Covered Lines: 99226
Relevant Lines: 146579

💛 - Coveralls

@@ -410,6 +410,10 @@ func (db *cdb) SelectAllDomains(
replicationClusters = []map[string]interface{}{}
badBinariesData = []byte("")
badBinariesDataEncoding = ""
isolationGroups = []byte("")
isolationGroupsEncoding = ""
asyncWFConfigData = []byte("")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

asyncWFConfigData = nil would also do the job. not sure why the byte arrays were reset like this before.

@natemort natemort enabled auto-merge (squash) April 24, 2024 21:47
While reading domains from Cassandra the isolationGroups and asncWFConfigData are not properly reset for each row. This can result in the same backing array being reused across multiple rows, resulting in an incorrect or corrupted value being read. Notably this is used for the domain cache, so any operation relying on it may be unable to get the correct values of these fields.
@natemort natemort merged commit 882796c into cadence-workflow:master Apr 24, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants