Releases: cadence-workflow/cadence
v0.18.1 Release
Bug Fix & Improvements
b6dfe37 - Move tcheck to go.mod, get rid of glide dependency (#3999)
7882518 - Provide background context to all scanners (#3965)
f30083a - cli v0.18.3 (#3959)
6bdf90a - CLI v0.18.2 (#3944)
c5643e7 - Update test to support multiple database concurrent updates (#3939)
v0.18.0 Release
Schema
- MySQL schema upgrades: default database upgrade to 0.4, visibility upgrade to 0.3
- PostgreSQL schema upgrades: visibility upgrade to 0.3
Notes
- Allow using Kafka TLS without cert ca and key (#3862)
  EnableHostVerification is now used for Kafka TLS. It defaults to false, which means InsecureSkipVerify is true for Kafka TLS; previously InsecureSkipVerify was false. To keep the previous behavior, update your config to set EnableHostVerification to true. Nothing breaks if you leave it unset, but skipping verification may be risky. This option is the inverse of InsecureSkipVerify. See http://golang.org/pkg/crypto/tls/ for more info.
- Support visibility query with close status represented in string (#3865)
  Advanced workflow visibility record query syntax now supports using a string as the workflow close status. Accepted values are COMPLETED, FAILED, CANCELED, TERMINATED, CONTINUED_AS_NEW and TIMED_OUT, case insensitive.
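The inverse relationship between EnableHostVerification and InsecureSkipVerify described in the Kafka TLS note can be sketched in Go as follows. This is only an illustration, not the server's actual code: the `kafkaTLSOptions` type and the `toTLSConfig` function are hypothetical names, and only the `InsecureSkipVerify` field of the standard `crypto/tls` package is taken as given.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// kafkaTLSOptions is a hypothetical stand-in for the Kafka TLS section of the
// server config. Only the option discussed in the release note is modeled.
type kafkaTLSOptions struct {
	EnableHostVerification bool
}

// toTLSConfig builds a crypto/tls config from the options.
// InsecureSkipVerify is set to the inverse of EnableHostVerification.
func toTLSConfig(o kafkaTLSOptions) *tls.Config {
	return &tls.Config{InsecureSkipVerify: !o.EnableHostVerification}
}

func main() {
	// Zero value: EnableHostVerification is false, so verification is skipped.
	byDefault := toTLSConfig(kafkaTLSOptions{})
	// Explicit opt-in restores the pre-0.18 behavior of verifying the host.
	strict := toTLSConfig(kafkaTLSOptions{EnableHostVerification: true})

	fmt.Println("default InsecureSkipVerify:", byDefault.InsecureSkipVerify)
	fmt.Println("strict InsecureSkipVerify:", strict.InsecureSkipVerify)
}
```

With the zero-value options the resulting config skips certificate verification, which is why the note recommends setting EnableHostVerification to true explicitly.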
New Features & Improvements
GRPC
0efe3b6 Remove proto dependency
c6b3060 Switch Health status endpoints to internal types (#3842)
435f1e8 Switch the remaining history component to internal types (#3843)
cb65b83 Switch common/ndc to internal types (#3841)
a5aaeb1 Switch history/query to internal types (#3836)
54559f4 Switch history related types to internal (#3835)
e336c77 Switch timeout type to internal (#3833)
d59208a Use internal types in persistence store parser (#3821)
2675fb9 Switch decision related types to internal (#3822)
ecdfd7c Switch history/replication to internal types (#3805)
85bb979 Switch integration tests to use internal types (#3815)
0f02810 Switch frontend handlers to use internal types (#3828)
b786d3c Switch admin handler to internal types (#3823)
cca6243 Switch WorkflowExecution type to internal within history service (#3816)
a6edb24 Switch domain stack to use internal types (#3754)
4318e53 Add thrift mapper from internal sql types to and from thrift (#3814)
c324af8 Switch common/testing to internal types (#3793)
f8dbcda Introduce internal types for sql blobs (#3810)
c361411 Regenerated internal types to include ActivitiesToDispatchLocally (#3812)
2069ee4 Convert PayloadSerializer to internal types (#3799)
f3e13b2 Switch CLI server clients to thrift wrappers (#3803)
fb144ff Switch parentclosepolicy worker to internal types (#3807)
bb71629 Switch worker/archiver to internal types (#3806)
2de9816 Switch common/archiver to internal types (#3788)
25ea280 Switch errors to internal types (#3759)
2b635fb Add json tags for internal types (#3769)
340eee7 Convert execution persistence types to use time.Time, time.Duration and internal types (#3775)
b0316fa Switch visibility related types to internal (#3770)
32b167b Switch shard related types to internal (#3768)
25281df Switch remaining task/matching types to internal (#3767)
94e61d8 Explicitly list frontend handler interface (#3763)
Task Processing
5bb1c71 Add task processing workflow busy metric (#3892)
aa61ba2 Fix activity lost metrics (#3889)
5928621 Transfer queue validator (#3875)
926e909 Task processing debug logs (#3877)
e855a6a Add logging/metrics for decision attempts (#3849)
c5a16ab Fix error handling when processing parent close policy (#3845)
005d7e8 Improve retry policy for operations in task processing (#3765)
Replication
79ce048 Handle data corruption error in replication (#3895)
66c4843 Add replication error logging and metrics (#3891)
8b63df7 Update read DLQ messages API to return raw task info (#3869)
0d737d4 Add domain name tag in failover metrics (#3882)
49b64fc Add admin CLI scan command for unsupported workflows (#3824)
0912eef Ignore workflow without version histories (#3773)
Scanner
ba9eecf Start enabled shardscanner fixers (#3906)
04044c6 Add timers shardscanner (#3846)
76a38b4 Set config for shardscanner fixer (#3844)
4dce7d4 Use shardscanner for execution scanners (#3772)
94c063a Run fixer workflow automatically (#3808)
8a6df38 Create domain filter for fixer workflow (#3771)
Workflow Reset
51ddd5b Fix workflow reset command (#3904)
94d510e Remove strict sanity check to allow reset (#3879)
b9f3cff reset workflow with no decision task complete (#3687)
4b75445 Patch a change to support reset 2dc workflow (#3778)
ElasticSearch
6e69fa1 Support visibility query with close status represented in string (#3865)
d58cf9d Fixed elastic config parsing and introduced optional disableHealthCheck (#3838)
66efd3f Fix lint error for esUtils package (#3813)
a201ed7 Fix ES client config NPE (#3794)
ea6ae3d Allow custom ES client options (#3790)
1b4c9a8 Allow disable Sniff for ElasticSearch (#3777)
Kafka
54abf09 Allow using TLS without cert ca and key (#3862)
df3c714 Rewrite Kafka client and support SASL/TLS
SQL
98cd0a5 Pick sql index changes (#3866)
ba840f3 Merge sql updates: Blob size increase (#3858)
2cc64c6 Support different SSL modes (#3787)
Others
955f41c break if adminClient returns error (#3887)
405b334 Fix dynamic config collection logValue function (#3880)
4044e85 break out when response is nil (#3886)
4b2ae3e Improve shard context timeout handling (#3881)
bc163ed Make max activity schedule to start timeout for retry configurable by domain (#3878)
6a8e9a3 replace string based logging with tagged logs (#3871)
eb161e6 Fix go-generate (#3864)
607b605 Handle matching task list conditional error (#3867)
2bb477b Delete unused dynamic configs that have no reference anymore (#3859)
9a34dcf error check before return the ActivityLocalDispatchInfo (#3853)
01028b6 Fix NPE in DescribeMutableState (#3850)
2ac93ae Switch to gocql interface (#3837)
9cb4052 Fix get raw history for transient decision (#3847)
a5a62d6 Add various default config to startup
11dc425 Add CLI command for signalWithStart workflow execution (#3819)
3e5100f Log short context timeout for long poll GetWorkflowExecutionHistory (#3818)
0d10cd3 History Error Injection Client (#3796)
c2e2e60 Add cadence_service tag for all metrics in rootScope
127a97a Fix workflow retry policy for childWorkflow (#3780)
a64c8c5 Gocql interface and open source implementation (#3774)
7128c26 Frontend and Admin Error Injection Client (#3764)
0993f6e Set default MaximumSignalsPerExecution to 10K (#3776)
5d7f934 Matching Error Injection Client (#3746)
Misc.
d0290a1 Update docker for 0.18.0 release
5f8e1f2 CLI 0.18.1 patch release (#3908)
13303a3 CLI 0.18.0 release (#3896)
0b5a851 Latest idl (#3888)
218de53 Add instructions to setup local MySQL and Postgres (#3868)
4f405e9 Downgrade golang tools version (#3876)
ea35743 Improve development documentations
d3aeabd Improve building tools and docs
e41832d Add compose file for prometheus and improve docs (#3782)
v0.16.1 Release
v0.17.0 Release
Release note: Upgrade Cadence server to the latest 0.16.x release prior to deploying this release.
0.17 has a change that is not compatible with releases before 0.15. Upgrading from <=0.15 releases directly to 0.17 would cause the StartWorkflowExecution and SignalWithStartWorkflowExecution APIs to return errors in workflow ID reuse scenarios during the server upgrade. Please consider upgrading the Cadence server to the latest 0.16.x release before deploying this release.
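As a rough illustration of the upgrade ordering above, a deployment script could check the currently running version before rolling out 0.17. The helper below is a hypothetical sketch, not part of Cadence; it assumes versions are written as `0.15.1` or `v0.16.0` and treats anything at or below 0.15 as needing the intermediate 0.16.x step.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// needsIntermediateUpgrade reports whether a server running `current` should
// first be upgraded to the latest 0.16.x release before deploying 0.17.
// Hypothetical helper: the version format ("0.15.1" or "v0.16.0") is an assumption.
func needsIntermediateUpgrade(current string) (bool, error) {
	parts := strings.Split(strings.TrimPrefix(current, "v"), ".")
	if len(parts) < 2 {
		return false, fmt.Errorf("unrecognized version %q", current)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("bad major version in %q: %w", current, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, fmt.Errorf("bad minor version in %q: %w", current, err)
	}
	// Releases <= 0.15 should go through 0.16.x first.
	return major == 0 && minor <= 15, nil
}

func main() {
	for _, v := range []string{"0.14.2", "0.15.1", "v0.16.0"} {
		needs, err := needsIntermediateUpgrade(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: upgrade to 0.16.x first = %t\n", v, needs)
	}
}
```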
New Features & Improvements
ElasticSearch
4588ce4 Allow custom ES client options (#3790)
35c788c Allow disable Sniff for ElasticSearch (#3777)
2ea3ca8 Add username/password support for ElasticSearch
1125076 Support ElasticSearch v7 (#3700)
242db1d Refactor ElasticSearch visibility code path for multi version support (#3666)
Multi-tenant Task Processing Improvements
157ed7e Improve retry policy for operations in task processing (#3765)
c3f691b Enforce queue max poll interval correctly (#3693)
2c08e6e Fix error handling for multi cursor timer queue look ahead (#3671)
Graceful Domain Failover and Replication related improvements
5e444be Add domain name tag in failover metrics (#3882)
7fda7c6 Refactor domain queue and add domain dlq size metrics (#3704)
fcfe972 Backfill workflow version histories when resetting (#3742)
995c4dc Adding nil checks for version histories (#3724)
8055ae8 Service busy error will not put replication task into DLQ (#3697)
97d824e Update domain replication retry policy
17e62aa Update workflow execution with version histories check (#3679)
51355ed Remove replication state (#3649)
3ea959f Add replication task end to end latency (#3659)
1001823 Fix if the replication state is nil (#3645)
118ef83 Emit different metrics on pending active task redispatch (#3641)
Scanner
b12060d Run fixer workflow automatically (#3808)
94155bf Create domain filter for fixer workflow (#3771)
59d8af5 Disable history scanner by default and tune the default RPS
6a1c6a0 [scanner] add generic shard scanner (#3638)
GRPC
4c4bdf9 Regenerated internal types from latest package (#3854)
bf8fd84 Regenerated internal types to include ActivitiesToDispatchLocally (#3812)
dd44d4c Convert execution persistence types to use time.Time, time.Duration and internal types (#3775)
5e697d0 Internal types Ptr() helper (#3752)
1acd51b Use internal types for visibility store (#3749)
60b68d3 Switch matching service to internal types (#3744)
1e3278a Keep original enum order for internal types (#3745)
436bb56 History client internal types (#3712)
f65640a Admin client internal types (#3710)
2cb47c1 Frontend client internal types (#3711)
f235d63 Switch ElasticSearch validator to internal-types (#3728)
96b9379 Domains manager and store serialization refactor (#3723)
13a2eb9 History persistence store manager types (#3730)
f3b9fdb Matching client internal types (#3713)
a933709 Revert datablob change from manager/store for Visibility (#3716)
e493f3c Revert datablob change from manager/store for metaData store (#3706)
88fd2c5 Move serialization of datablobs to manager for shardManager (#3709)
b9485f8 Thrift clients (#3695)
d2289ce Map internal errors to thrift in thrift handlers (#3694)
7e03347 Fix lint warnings for generated internal types (#3689)
2bb5d47 Add thrift to internal error conversion mapper (#3696)
aff5afa Separate thrift handler for frontend service (#3688)
81c1834 Separate admin thrift handler for frontend service (#3657)
b7ceac2 Separate thrift handler for history service (#3646)
ad06c81 Separate thrift handler for matching service (#3647)
85a2c9c Rename thrift visibility_timestamp to task_timestamp (#3680)
3f3f18a Add history types and update generation script (#3665)
22bf577 Convert visibilityStore/manager to use internal types (#3656)
39dfa44 Convert task store/manager to use internal types (#3663)
aa81104 Lean frontend Handler interface (#3658)
630357c Replicator internal types (#3660)
9815a9c Add comments to correct linter to internal type getters (#3654)
273159d Add getters to internal generated types (#3650)
Activity Local Dispatch
05f7678 error check before return the ActivityLocalDispatchInfo (#3853)
31b5918 Add task token to activityDispatchInfo for worker (#3672)
e533892 Populate activityDispatchInfo with timestamps needed for local activity dispatch by worker (#3669)
6890e42 Update idls to use ActivityLocalDispatchInfo (#3668)
Others
fe9c07f Remove strict sanity check to allow reset
7374645 Make max activity schedule to start timeout for retry configurable by domain (#3878)
cdb41e7 Fix DC redirect error overwrite (#3856)
a43c646 Fix NPE in DescribeMutableState (#3850)
afa221a Add logging/metrics for decision attempts (#3849)
66c2b12 Fix error handling when processing parent close policy (#3845)
d3c9b9d Fix get raw history for transient decision (#3847)
1e4da1b Fix ES client config NPE (#3794)
03d7dfe Allow batch delete for several mutable state fields (#3760)
c52c1c2 Fix update sticky task list in task store (#3761)
f9bb732 Improve logging for dynamic config
11fde99 Remove wrong extra sslParam in Postgres TLS
c4fe47b Enforce context in cassandra persistence implementation (#3751)
3eebd2a Various fixes and improvements for reset and retry error, add integration test for reset
f01f892 Increase default ListMaxQPS config
02da254 Add cassandra username and password config values to docker image
23d95d3 Improve logging when loading a dynamic config
4ffbfa6 Added cadence docs github repo to README (#3748)
048cac0 Allow dockerize to use config template from a different directory
94c05dd Persistence Error Injection Client (#3734)
52aa8af Refactor ES query validator to not modify requests (#3733)
94ecc07 Fix iterator context (#3743)
5679b15 Add warmup period to frontend service health reporting (#3731) (#3736)
1e7502a Dynamic config for max user provided task list name (#3732)
c11be59 Add ratelimiter to QueryWorkflow API (#3735)
11d9bd2 Update sqlx dependency to get rid of hacking fix (#3722)
7cd851d Improve reset CLI: add DecisionCompletedTime as resetType and also provide … (#3667)
0727b17 Don't retry when reset to a middle of a batch to save perf (#3721)
6e5f9ec Reset allows skipping signal reapply (#3715)
d47b0d1 Make id length dynamic config per domain (#3705)
923b7a7 automate creating docker images and upload to dockerhub on release (#3703)
291897c Improve Error Handling For Cassandra Persistence Implementation (#3699)
3f47f9c Fix panic for concrete execution checks (#3698)
14c153b Expose pending decision task info (#3691)
e0f0311 Use Now as timestamp for all transfer and replication tasks (#3684)
a061070 Fix and improve archival query parser error messages (#3681)
10a84f3 Notify new tasks when persistence operation timeout (#3678)
3e32b94 Add workflow type tag when emitting workflow completion metrics (#3670)
6eb918e Support TTL where needed for SQL plugin interface (#3664)
0568fa9 Only query on sticky task list when domain is active (#3661)
4281089 Enforce persistence context timeout in application layer: Part 5 (#3653)
3ab327b Refactoring Cassandra domain persistence manager for NoSQL support
d063f27 Refactor visibility Persistence API and Cassandra implementation
0ea68b8 implement postgres support for TLS (#3488)
ba0eacb Enforce persistence context timeout in application layer: Part 4 (#3643)
e1126ac Fix for unifying shard rangeID (#3651)
9a787f7 Unify shard rangeID column and field in Cass persistence (#3632)
da72f32 Enforce persistence context timeout in application layer: Part 3 (#3631)
Misc.
958b939 Update docker for 0.17.0 release
8f88cb5 Pin IDL submodule to commit from master branch (#3683)
daba348 Add correct license header to attribute to temporal and update license generation tool (#3674)
09267d6 Update CONTRIBUTING.md to include make fmt command
f4a9df2 CLI release 0.15.0
v0.16.0 Release
Breaking Change
This release contains a breaking change in workflow metadata. This change has been enabled since the 0.14 release. If your workflows could be open for 6+ months, or you are upgrading to this release from 0.13 or below, please follow the migration instructions.
Breaking Change on config for MySQL/Postgres
It's required to add
encodingType: "thriftrw"
decodingTypes: [ "thriftrw" ]
to the persistence configuration, like in this example.
Note that this requirement was removed later, in 0.18.
Schema Change
- Cassandra cadence keyspace update from v0.29 to v0.30
New Features
ff47f25 Add task token to activityDispatchInfo for worker (#3672)
12703dc Populate activityDispatchInfo with timestamps needed for local activity dispatch by worker (#3669)
5b455a3 Update idls to use ActivityLocalDispatchInfo (#3668)
Improvements
d4ec2e2 0.16.x compatible with future releases (#3839)
f56ea60 Add log/metrics for decision attempts and force schedule new decision (#3840)
40ab8cc Patch a change to support reset 2dc workflow (#3778)
7e54e71 Ignore workflow without version histories (#3773)
046ec4a Improve retry policy for operations in task processing (#3765)
09ad293 Backfill workflow version histories when resetting (#3742)
2c0c3a9 Dynamic config for max user provided task list name (#3732)
b8aeca9 Add warmup period to frontend service health reporting (#3731)
a5b5520 Make id length dynamic config per domain (#3705)
f440ca6 Service busy error will not put replication task into DLQ (#3697)
0664a6c Update domain replication retry policy
bb7ad07 Notify new tasks when persistence operation timeout (#3678)
6e20d59 Add workflow type tag when emitting workflow completion metrics (#3670)
efcbcc8 Only query on sticky task list when domain is active (#3661)
d3a1646 Revert "Convert metadataStore/manager to use internal types (#3615)"
e6ff5fd Revert "Pin gocql version"
Bug Fixes
0e363f8 Update nil check (#3791)
7c466a8 Fix iterator context (#3743)
08ecbdf Adding nil checks for version histories (#3724)
a175e44 Fix panic for concrete execution checks (#3698)
472509f Enforce queue max poll interval correctly (#3693)
f1ae5a7 Fix error handling for multi cursor timer queue look ahead (#3671)
v0.15.1 Release
Bug Fix & Improvements
ea865f7 Fix if the replication state is nil (#3645)
6960e9b Fix NPE for failover query
cd103de Add operator to admin failover
38e0451 Move batcher to new domain
62292ce Update failover default settings
295c897 Fix managed failover list
Misc.
ede38f0 Update docker for 0.15.1 patch release
77af12d Revert "Pin gocql version"
3a633f1 CLI release 0.15.0
v0.15.0 Release
Schema Change
- Cassandra cadence keyspace update from v0.28 to v0.29
New Features
Multi-tenant Task Processing Improvements
b24ea12 Enable priority task processor by default (#3571)
006db8b Persist and load multi-cursor processing queue states (#3480)
236b644 Multi-cursor Queue Improvements Part 2 (#3518)
57a20dc Multicursor Queue Processor Bug Fix (#3508)
4d4482e Add multicursor processing queue related metrics (#3510)
fb076fb Multicursor Queue Processor Improvements (#3509)
728d1a0 Start queue processor before failover callback registration (#3494)
1161839 Enable processing queue split policy by domainID (#3486)
fa57460 Introduce per domain metrics for task processing. (#3467)
65a686d Admin describe queue state command (#3462)
85e0cb1 Add queue action for getting processing queue states (#3436)
3e8f6a5 Fix multi-cursor queue polling logic (#3420)
d096317 Fix task attempt metrics in priority task processor (#3428)
50ca22b Add Reset Queue Command (#3414)
Graceful Domain Failover and Replication related improvements
fc856d5 Update shard info when adding failover marker (#3507)
f800142 Ignore reapplication if the domain is pending active (#3502)
638a620 Clean up kafka replication in worker (#3493)
0445e3e Allow DLQ cli use a range of shard ids (#3481)
33affc3 Enforce re-replication context timeout for standby tasks (#3473)
acb6b62 Add persistence layer for multi-cursor queue processing (updated) (#3468)
4ac545c Replication task generation delay (#3465)
6b51462 Add a default timeout for get replication messages API (#3459)
c4b824a Remove rpc replication migration features (#3461)
ed77ef1 Fix message for admin DLQ merge command (#3460)
e396236 Adding delay on replication task processing (#3458)
5d12110 Adding replication task processing metrics (#3452)
60c3463 Make the failover end time check compatible with original value (#3450)
01d2d1a Fix get dlq size NPE (#3422)
121ef1e Bug fixes for failover (#3415)
Managed Failover
554b8bc Add Managed Failover (#3558)
Scanner
ea62bcc Disable concrete execution scanner by default (#3438)
6b64e4c [Scanner] handle current execution to read from blobstore (#3435)
e284980 [Scanner] Fixed missing scan type param in activities (#3434)
42444dc [Scanner] Use CurrentExecution instead of base one (#3421)
6aca638 [Scanner] Register current execution scanner (#3418)
e179113 [Scanner] Use constant for current execution run id (#3419)
GRPC
c8b2b4c Finish proto definition for persistence impl (#3528)
5616dac Define proto blobs for executions, child executions and history tree (#3521)
7223ddf Define protoblob for activityInfo and other supporting types (#3520)
87cf4a8 Define domains protos (#3512)
Bug Fix & Improvements
e3f0b40 Add cluster name filter (#3561)
3a15e73 Emit metrics on customer ids being too long (#3529)
280dba2 Add failure metrics for failover (#3526)
8451d1f Use a limited context timeout to get current exec lock (#3515)
0c16ca1 LimitExceededError should be non-retryable (#3511)
a054dbe Split reconciliation/common (#3513)
673750a Emit active_cluster metric during domain cache update (#3517)
3b56ad7 Fix NDC resetter persistence bugs (#3500)
d3cab3e k8s: fix cassandra-tool env key conflict with k8s (#3505)
1efa69b move retryer to persistence package (#3497)
bb7046b Remove kafka replication from history and cli (#3503)
0dd015e Add domain tag to history query metrics (#3504)
b7b9008 [SQL]Fix upsert SQL template for Postgres plugin (#3498)
7acfe13 Do not extend activity expiration time (#3489)
007716d Integrate current execution check with replication resender (#3487)
6e0e0f1 Extend activity ScheduleToClose timeout to expiration time (#3485)
63e8bac Fix indexer to filter out invalid msg (#3479)
3a3ba20 Reduce log verbosity and add message to logs that didn't have one (#3478)
6e8cdee Cap activity ScheduleToStart timeout when large retry expiration time is specified (#3470)
570e684 reverted import groups (#3476)
dcfa0a3 Fix indexer retry on bulk commit failure (#3474)
5535d29 Add admin list domain command (#3472)
4f49f61 Integrate current execution check with replication resender (#3466)
15eed39 add tests for getRawHistory (#3469)
0325ee5 fix transient decision serialization for getRawHistory feature (#3463)
f5b91c1 Slow down decision tasks after failure (#3447)
07a995f Update metrics cache to be unblocking (#3443)
fc2e167 Fix shard ID filter (#3451)
cc2f567 Enforce timeout for getting workflow execution in task processing (#3442)
aa11088 Update the config to be able to set the value by shard (#3441)
92232ab Reduce context timeout for transfer tasks (#3437)
db444cf Fix sticky decision keeps failing on ConditionFailedError (#3423)
dee9745 Drop stuck close execution transfer task (#3240) (#3429)
Misc.
73df4f4 Update docker for 0.15.0 release
84631ab Pin gocql version
e813e26 Update slack invitation link
3c20691 Go mod tidy (#3523)
2fc957e Improve developer contribution guide for Postgres development (#3495)
0039b4b Fix lint errors (#3455)
7e8b7ef CLI 0.13.0 release (#3453)
68a0fe2 remove idl changes (#3444)
v0.14.2 Release
v0.14.0 Release
Schema Change
- Cassandra cadence keyspace update from v0.27 to v0.28
New features
Multi-tenant Task Processing Improvements
- 4f3374f Add filter for pending active task to redispatch (#3279)
- 34bac76 Priority Task Processor Improvements (#3284)
- da4cffa Transfer queue processor base V2 (#3278)
- c854872 Wire up multi-cursor transfer queue processor implementation (#3285)
- 2db63ea Enforce time resolution for timerMaxReadLevel (#3411)
- d9dc526 Task Redispatcher (#3406)
- b0b0bbb Add timers subcommand to admin CLI (#3404)
- 8420309 Refactor queue processor base implementation (#3380)
- 582d497 Improve processing queue split policy (#3287)
- 9ffa5e4 Timer queue processor base v2 (#3306)
- dd9cb49 Wire up multi-cursor timer queue implementation (#3318)
- 27dcc41 Wire up multi-cursor queue split policy (#3326)
- 034959e Add back pressure mechanism for multi-cursor queue (#3338)
DB Scanner
- a1a1483 Scanner impl (#3286)
- 11c17f9 Add invariant manager (#3263)
- b65e13f [Scanner] handle current execution to read from blobstore (#3435)
- 3f32331 [Scanner] Fixed missing scan type param in activities (#3434)
- 25ff793 [Scanner] Use CurrentExecution instead of base one (#3421)
- 49f30ca [Scanner] Use constant for current execution run id (#3419)
- 0e288fb [Scanner] Register current execution scanner (#3418)
- 78031ab [Scanner] Add persistence APIs for current execution (#3416)
- 8904368 [Scanner] Add concrete execution check for current execution (#3409)
- 293d5c2 Extend execution scanner framework with an entity type param (#3402)
- 6a238a7 Run concrete executions scanner always (#3370)
- 779b711 Scanner and fixer workflow implementations (#3307)
- af2c994 Remove dependencies between invariants (#3316)
- 754870f Size reduction of execution scanner workflow (#3313)
- 4aa034a Add determining invariant from invariant manager (#3320)
Replication
- e3f4248 Fix get dlq size NPE (#3422)
- 99e0a09 Failover marker persistence (#3274)
- dcb6909 Integrate current execution check with replication resender (#3487)
- 00c60df Replication task generation delay (#3465)
- 2e58738 Add a default timeout for get replication messages API (#3459)
- 54b4700 Adding delay on replication task processing (#3458)
- 7fcf1f7 Adding replication task processing metrics (#3452)
- 7dbbfd1 Notify failover marker api (#3296)
- 514bc73 Add processor to handle graceful failover timeout (#3277)
- 45096ab Add previous failover version in domain v2 table (#3308)
- dd168b1 Add config to enable/disable worker replication (#3368)
- 5d31725 Handle shard lose during inserting markers (#3328)
- 009d5c3 Add failover marker coordinator (#3288)
- fc543b8 Insert failover when domain moves from active to passive (#3290)
- 19a1ae8 Wire up update previous failover version (#3321)
- 0474c64 Redispatch task on passive processor during failover (#3333)
- 1e8db70 Persist failover marker in shard info from replication queue (#3304)
- 92d7230 Wire up DLQ ack level in shard info (#3366)
Domain tag
- f4f7ab2 Add domain tag to history query metrics (#3504)
- 8de8f99 Introduce per domain metrics for task processing. (#3467)
- a14011e Improve domain tagged metrics in priority task processor (#3379)
Bug Fix & Improvements
- aa6b8a9 Update docker for 0.14.0 release
- f3b841b Allow DLQ cli use a range of shard ids (#3481)
- 7376035 Fix 0.14 build error
- d444681 Fix indexer to filter out invalid msg (#3479)
- c747a85 Fix indexer retry on bulk commit failure (#3474)
- 36b19d8 fix transient decision serialization for getRawHistory feature (#3463)
- 3a315e4 Fix sticky decision keeps failing on ConditionFailedError (#3423)
- 8259958 Reduce log verbosity and add message to logs that didn't have one (#3478)
- 456ea42 Fix shard ID filter (#3451)
- e3f4248 Fix get dlq size NPE (#3422)
- cba5e5f Bug fixes for failover (#3415)
- 6c18bc6 show timers distribution on a histogram (#3413)
- 713a7a9 Fix failover marker update lock (#3412)
- 624862f Reduce standby task attempts (#3410)
- 18b7314 Adding metrics for failover marker latency (#3408)
- 8da382d Fix visibility archival canary metrics (#3407)
- 080db04 Fix returned error in update wf when shard lost (#3405)
- b21d93f kafka replication should be enabled by default (#3403)
- fe18264 Fix some minor error messages
- 078cdb4 Remove executable permission from JSON files (#483)
- 5839172 Fix describe shard issues
- 6b109f8 Create CLI command to describe shard by id (#370)
- d99fb20 Fix cli watching history times out after 2 minutes (#336)
- 2c2b484 Create utility methods for scanner shard struct (#3400)
- 9a04e00 Skip adding DLQ message if the shard is closed (#3401)
- a21d455 Add logs and metrics for multi-cursor queue (#3381)
- ae14d47 Call cancel on context to prevent memory leak (#438) (#3398)
- 5791f07 Move scanner common code to common folder (#3390)
- 891e5ed Add workflowID runID to frontend API error log (#3382) (#3391)
- 78168f2 Fix error import (#3396)
- a24a6c0 Increase timeout for workflow start to close on scanner activity (#3395)
- 13ed458 Fix Get DLQ if the workflow is archived (#3394)
- a1aad87 Emitting metrics if DLQ is not empty (#3389)
- de1ad6d Add retries to range renewal (#3388)
- 5aac215 Add warn logs for shard closures (#3387)
- 5cf711c Make read history branch page size configurable (#3385)
- c86d8da Fix task redispatch logic (#3384)
- 85cd226 Add shard distribution metrics and query (#3377)
- 598c507 Update replication path to use lazy retry on service busy error (#3376)
- 2911586 Emit metrics with domain tag in priority task processor (#3375)
- 19d1b49 Update worker replication config (#3373)
- 129b62e Fix search attribute validation error on bool/double type (#3372)
- f8c60c1 Fix start workflow execution expiration time (#3371)
- 686f812 Fix task aggressive retry with TwoPhaseRetryPolicy (#3369)
- 3dfddc5 Fix task processing retry policy (#3367)
- 56a510e Add region filter for cassandra config (#3363)
- abd21dc Fix history resender source cluster (#3365)
- 42b29a8 Update cadence-sql-tool README (#3362)
- 0fae61c Improve task latency metrics (#3364)
- 6fcee79 Replication metrics (#3361)
- b5ce9c7 Fix handling of CurrentWorkflowConditionFailedError when create wf (#3349)
- fa3155e Fix redispatch queue initialization (#3359)
- 9367197 fix cli argument for ndc workflow resend (#3357)
- 875e8ad Improve task redispatch timer (#3355)
- 45ac137 Update replication ack level if response has no task (#3356)
- 33e4a75 Removed unused metric scopes for event cache (#3354)
- b058449 Add ability to disable shard level worker pool (#3353)
- b78a29a Fix dynamic config map property conversion func (#3352)
- e984b00 Fix retry and metrics on context error (#3327)
- 30bd2b5 Add header options for starting workflow via CLI (#2862) (#3341)
- fb6e4e0 Update replication retry policy (#3346)
- 208c34f Move resetor to reset package (#3340)
- b7f42fa Fix history task replication DLQ retry policy (#3343)
- e050be3 Fix ListTaskListPartition command (#3342)
- e2401c4 Remove -i go build flag (#3128) (#3335)
- 5068665 Add retry options for starting workflow via CLI (#3289) (#3330)
- c1cf2ca Add host level task worker pool (#3331)
- efb2e7c Removed unused event cache API (#3337)
- f7683cb Remove safety check for event global cache (#3336)
- a9e4715 Pin previous failover idl (#3317)
- 4b8f50a Fix NPE for reset (#3309)
- 3c0597e Misc Renaming (#3310)
- 12c258a Added metricScope cache implementation (#3299)
- f333dff Fix fossa script (#3312)
- 6215290 Fix task processor shutdown logic (#3311)
- eca98ba Fix remote sync match error metrics (#3305)
- d787ccc Fix task loading for multi-cursor transfer queue processor (#3298)
- 526a96d Make event cache size based (#3294)
- f8c0e93 Move PR template (#3293)
- 950591c Add fixer impl (#3291)
- 55bf6da Implement simple RWMutex cache to be used in domainCache (#3273)
v0.13.0 Release
Schema Change
- Cassandra cadence keyspace update from v0.26 to v0.27
- Cassandra cadence-visibility keyspace update from v0.4 to v0.5
New features
Multi-tenant Task Processing Improvements
- d564e84 Fix task processing retry policy (#3367)
- 4f84809 Improve task redispatch timer (#3355)
- 28e4485 Add ability to disable shard level worker pool (#3353)
- d146aa3 Add host level task worker pool (#3331)
- 79606aa Priority Task Processor Improvements (#3284)
- 1bef70a Implement task processing queue collection (#3260)
- c43b6dc Implement Processing Queue Split Policy (#3232)
- fef64d7 Multi-cursor processing queue implementation (#3214)
- 0e2df8a Implement domain filter for multi-cursor queue (#3207)
- b365258 Add interfaces for multi-cursor queue implementation (#3194)
- ffc2f55 Add back pressure logic for task loading (#3160)
- 9b02fec Fix redispatch queue nil pointer exception (#3156)
- 97e0c83 Wire up priority task processor implementation (#3146)
- 4d80f0b Task redispatch queue (#3124)
Replication & Graceful Failover
- 514d765 Wire up DLQ ack level in shard info (#3366)
- 610b15d Replication metrics (#3361)
- 0a11569 Update replication ack level if response has no task (#3356)
- 82294e6 Update replication retry policy (#3346)
- 8e542c0 Adding failover marker replication task in thrift and persistence struct (#3270)
- 3aab131 Rpc replication ack level update (#3266)
- cd7ec20 Check on-going failover across clusters (#3206)
- 0f34983 Fix domain replication queue cleanup query (#3259)
- afcc51e Add flag for replication cleanup process (#3241)
- 06d551b Adding design doc for graceful failover (#3129)
- ae834c0 Update isDomainActive condition to honor pending active status (#3176)
- 99b5cb9 Support graceful failover in CLI (#3205)
- 6301549 Domain failover update (#3164)
- 5a1f8d3 Add failover end timeout in domain data (#3137)
DB Scanner
- 7efabca Run concrete executions scanner always
- ea5ba83 Add determining invariant from invariant manager (#3320)
- ec9fd41 Size reduction of execution scanner workflow (#3313)
- ff87065 Remove dependencies between invariants (#3316)
- f6cffc7 Scanner and fixer workflow implementations (#3307)
- a956023 Add fixer impl (#3291)
- e61348b Scanner impl (#3286)
- e40e857 Add invariant manager (#3263)
- bcc0d8f Add scanner invariants (#3257)
- 2bf7e39 Make output of file based blobstore human readable (#3246)
- c9d4837 Add iterators and writers for persistence and blobstore for scanner workflow (#3234)
- 6a6f209 add interfaces and types for scanner and fixer workflow (#3226)
- 848e34f Add iterator for buffered writer (#3224)
- 9f15885 Add blobstore buffered writer (#3219)
- 3b88ac6 Add blobstore interface to bootstrap params and implement filestore (#3210)
- b98988c Close iterator on ListConcreteExecutions (#3187)
- d2054fe DB scan admin command retry db operations (#3184)
- e034cb0 Check MS still exists before checking history scan invariant (#3178)
- 7f1339e Add admin db clean command (#3174)
- 7fe008e Db scanner additions (#3172)
- aefcfee Add database admin scan command (#3165)
Refactor
- cfb3693 Refactor mutablestate builder part 1 (#3238)
- d6cbc1a Move NDC related code to ndc package (#3213)
- 24aa94d Move replication related code to subfolder under history (#3204)
- 8f77569 Replace context in historyEngine (#3203)
- 8e15b08 Refactor start and signalwithstart logic (#3201)
- d83a2ba Decompose history service logic into separate packages (part 4) (#3197)
- 8022ac6 Refactor get history API (#3196)
- 9e6ce72 Refactor db scan checks and include in delete (#3193)
- 5ba2353 Refactor progress report and include metadata about open executions (#3189)
- 830974d Decompose history service logic into separate packages (part 3) (#3190)
- 90df514 Decompose history service logic into separate packages (part 2) (#3186)
- e62d76e Move history service logic into separate packages (part 1) (#3180)
- 0e8aecc Refactor frontend handler (#3142)
Bug Fix & Improvements
- 1439249 Update docker for 0.13.0 release
- b276ff4 Fix search attribute validation error on bool/double type (#3372)
- c27734c Fix start workflow execution expiration time (#3371)
- 04d2643 Fix legacy reference
- 70c2881 Fix task aggressive retry with TwoPhaseRetryPolicy (#3369)
- b066ed1 Fix history resender source cluster (#3365)
- 14e45bd Improve task latency metrics (#3364)
- 15f0e3a fix cli argument for ndc workflow resend (#3357)
- 9dc6c9e fix unit test failure introduced by cherry-pick
- ef13c65 Fix dynamic config map property conversion func (#3352)
- d57a7ce Fix history task replication DLQ retry policy (#3343)
- 6830be1 Fix task processor shutdown logic (#3311)
- 61126d1 Remove debugging log from add task APIs
- 1579105 Fix NPE for reset (#3300)
- 712a149 Add log in matching (#3297)
- 43fae73 Address cherry-pick errors
- ab20fce Drop stuck close execution transfer task (#3240)
- 21d0245 Remove domain tag in task processing
- 75c5d66 Pin gocql version
- 561e01e easy: Cleanup the go.mod/sum files (#3275)
- 1f3404b Fix shard instability (#3271)
- 74b40ed Make event cache global (#3265)
- 72f9705 Implement schema squashing (#3253)
- c4c485e Add issue report and feature request templates (#3268)
- 6d2cfab Update contributing instructions (#3242)
- af7583c Fix list task list partition issue in matching (#3256)
- 5321680 Fix reset reapply (#3252)
- ec36e6a Populate history response to fix NPE on client side (#3255)
- 83809d8 Better postgresql test defaults on OSX (#3244)
- df82425 address comments (#3248)
- 4e025cf Update license file and some small code cleanup (#3239)
- 4fb1e37 Update slack chat badge (#3245)
- 1ed8702 Add resend context timeout for ndc resender (#3247)
- 9313614 Add new dynamic config filter on shard level (#3243)
- 0209496 Generate and Resend history v2 replication task via CLI (#3233)
- 3deebda Prevent git from ignoring "cadence" directory (#3236)
- 7178f2a Add tag and dynamic config for authz (#3237)
- f48e723 Add roadmap page (#3200)
- e36e80b Remove unused default IDReusePolicy for SignalWithStart (#3229)
- 2082e05 Fix NPE for task when global domain disabled (#3228)
- f38cfde Support NDC raw history in message parser (#3227)
- fdc989c Add TerminateIfRunning to WorkflowIDReusePolicy (#3215)
- e236eb6 Fix nil pointer with struct pointer reference (#3222)
- 4a6daab Update CLI version to 0.12.0 (#3221)
- 72f3792 Fix metadata replication task with NDC (#3218)
- a8f7c49 Add feature flags for RPC replication migration (#3216)
- 63e8f5e Enable NDC by default (#3212)
- 6d1a17a Add JavaSDK client version to support raw history (#3209)
- 27af863 Fix batcher EntityNotExist check (#3170)
- ccd1fa0 Fixes start.sh to add connect attr if needed (#3158)
- cda6403 Check go client version for raw history query backwards compatibility (#3199)
- 8b703e4 Rate limit domain cache refresh (#3195)
- 5945b66 Fix panic on get history API (#3188)
- f4d7530 Start cadence-server as pid 1 in the auto-setup container (#3175)
- 45627d1 Allow empty postgres passwords (#3177)
- 2722cd6 Use version history to get branch token as fall back (#3185)
- 651a26b Add task list to CLI search workflow output (#3183)
- 91d9d86 matching: per task list metrics (#3155)
- c842f79 Add execution per shard stats (#3179)
- 58c0ccf Add tasklist to visibility (#3171)
- 62ef53d Fix canary domain creation (#3173)
- b5974d5 Add global ratelimiter for persistence (#3169)
- d098eac Add feature flag to not fail in flight decision (#3167)
- 315ce05 Use reader in admin kafka with 3mb buffer size (#3163)
- b4c1a8b Add global ratelimiter for frontend API (#3161)
- 0a7f74b Fossa integration (#3162)
- c23cd65 Improve filestore query parser for treating more values as valid close status (#3159)
- 0976e69 Fix admin remove task command for timer queue (#3157)
- a3e60fa Change queue message id to big integer in Cas (#3149)
- daa8998 Add admin failover for managed domains (#3154)
- 66e4d0d Add additional logging and metrics around size/count limits (#3147)
- 882f680 Add metrics to AccessControlledHandler (#3145)
- a933ae0 Log workflow info also when visibility data goes to DLQ (#3138)
- 9b46c2e Add autoforwarding of consistent queries (#3140)
- 12ca4cd Fix visibility archival config for docker (#3143)
- ca46e6b Validate search attribute value (#3135)
- a88b972 Gracefully handle polling history on standby (#3139)
- 9ddcb08 First pass at graceful shutdown handlers (#3134)
- a5111a6 shardController: improvements for graceful shutdown (#3136)
- 31d2619 ringpop: update hashring immediately on ring change (#3130)
- 8bcbb4f Update IDL submodule (#3133)
- c7ba846 ringpop: add method to selfEvict from ring (#3132)
- 39ce924 Add dedup logic for standby activity heartbeat timer creation (#3131)
- da43539 Adding json tags for ShardInfo struct to reuse for docstore (#3127)
- 74c2c69 Record error info for retried activities (#1873) (#3116)
- 1e52f96 Change error type on query before first decision task (#3121)
- 86ae10c Add context in history resend (#3122)
- ee3c672 Get raw history compatibility (#3111)
- a015671 Update sync shard interval to 5 mins (#3119)
- e3eab7f Fix archival handled requests not match pumped requests (#3117)