Releases: outbrain-inc/orchestrator
GA Release v1.4.543
GTID:
- Added "reset-master-gtid-remove-own-uuid" command
- ResetMasterGTIDOperation supports removal of self-UUID entries
- setGTIDPurged respects --noop
- Repoint() makes up a non-existent binlog file name when the log file is empty (can be caused when switching to/from GTID with error)
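The "removal of self-UUID entries" idea can be illustrated with a minimal sketch. This is not orchestrator's actual code: the function name removeUUIDEntries is hypothetical, and the sketch only assumes the standard MySQL GTID set text format (comma-separated "uuid:interval[:interval...]" entries).

```go
package main

import (
	"fmt"
	"strings"
)

// removeUUIDEntries is a hypothetical helper (not orchestrator's API).
// It returns gtidSet without the entries that belong to the given server UUID.
// A GTID set is a comma-separated list of "uuid:interval[:interval...]" entries.
func removeUUIDEntries(gtidSet, uuid string) string {
	kept := []string{}
	for _, entry := range strings.Split(gtidSet, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue
		}
		// The UUID is everything before the first colon.
		entryUUID := strings.SplitN(entry, ":", 2)[0]
		if strings.EqualFold(entryUUID, uuid) {
			continue // drop this server's own entries
		}
		kept = append(kept, entry)
	}
	return strings.Join(kept, ",")
}

func main() {
	// Made-up UUIDs, for illustration only.
	own := "00020192-1111-1111-1111-111111111111"
	executed := own + ":1-100,00020193-2222-2222-2222-222222222222:1-7:9-12"
	fmt.Println(removeUUIDEntries(executed, own))
}
```

The remaining set is what one would then feed to a purge/reset step; how orchestrator applies it is not shown here.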
GA Release v1.4.536
Fixed multiple-instances case where the input is empty
GA Release v1.4.532
Relaxed constraints on relocating up from a binlog server
GA Release v1.4.529
- processlist also filters by 'Binlog Dump GTID' when searching for slaves
- similarly, for excluding long-running queries
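As a rough sketch of the filter described above: MySQL reports replica dump threads in the processlist with the command "Binlog Dump" or, under GTID auto-positioning, "Binlog Dump GTID". The helper name isBinlogDumpCommand is an assumption for illustration, not orchestrator's code.

```go
package main

import (
	"fmt"
	"strings"
)

// isBinlogDumpCommand is a hypothetical helper. It reports whether a
// processlist COMMAND value belongs to a replica's binlog dump thread,
// covering both the classic and the GTID variant.
func isBinlogDumpCommand(command string) bool {
	switch strings.TrimSpace(command) {
	case "Binlog Dump", "Binlog Dump GTID":
		return true
	}
	return false
}

func main() {
	for _, c := range []string{"Query", "Binlog Dump", "Binlog Dump GTID", "Sleep"} {
		fmt.Printf("%-18s dump thread: %v\n", c, isBinlogDumpCommand(c))
	}
}
```

The same predicate would serve both uses mentioned above: finding slaves, and excluding dump threads from a long-running-queries listing.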
GA Release v1.4.515
MakeCoMaster: We allow breaking of an existing co-master replication. Here's the breakdown:
Ideally, this would not be allowed, and we would first require the user to RESET SLAVE on 'master' prior to making it participate as co-master with our 'instance'. However, there's the problem that upon RESET SLAVE we lose the replication's user/password info.
Thus, we come up with the following rule: If S replicates from M1, and M1<->M2 are co-masters, we allow S to become co-master of M1 (S<->M1) if:
- M1 is writeable
- M2 is read-only or is unreachable/invalid
- S is read-only
And so we will be replacing one read-only co-master with another.
(instance_topology.go), supported by the Web interface (cluster.js)
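The MakeCoMaster rule can be read as a simple predicate over three servers. The sketch below is illustrative only: the instanceState type and the canReplaceCoMaster function are hypothetical names, not taken from instance_topology.go, and "unreachable/invalid" is collapsed into a single Reachable flag.

```go
package main

import "fmt"

// instanceState is a hypothetical, simplified view of a MySQL server.
type instanceState struct {
	Reachable bool // false stands for "unreachable/invalid"
	ReadOnly  bool
}

// canReplaceCoMaster encodes the rule: with S replicating from M1 and
// M1<->M2 being co-masters, S may become co-master of M1 only if M1 is
// writeable, M2 is read-only or unreachable, and S is read-only.
func canReplaceCoMaster(s, m1, m2 instanceState) bool {
	m1Writeable := m1.Reachable && !m1.ReadOnly
	m2Passive := !m2.Reachable || m2.ReadOnly
	sReadOnly := s.Reachable && s.ReadOnly
	return m1Writeable && m2Passive && sReadOnly
}

func main() {
	s := instanceState{Reachable: true, ReadOnly: true}
	m1 := instanceState{Reachable: true, ReadOnly: false}
	deadM2 := instanceState{Reachable: false}
	activeM2 := instanceState{Reachable: true, ReadOnly: false}

	fmt.Println("M2 unreachable:", canReplaceCoMaster(s, m1, deadM2))
	fmt.Println("M2 writeable:  ", canReplaceCoMaster(s, m1, activeM2))
}
```

In both permitted cases a read-only (or dead) co-master is swapped for a read-only one, so the writeable side of the pair never changes.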
GA Release v1.4.500
Recovery:
- Clusters dashboard indicates, per chain, whether it has auto failover for master/intermediate master
- Audit-detection page provides extra info on the crash analysis.
  - This includes the analysis changelog for the failed instance
  - Also suggests the related recovery (if taken) for this detection
- Audit-recovery page provides extra info on recovery; also suggests the related discovery event
- Recognizing crashed recoveries (recoveries started by an orchestrator that crashed halfway through)
  - Automatically acknowledging such crashes
  - This may potentially cause an endless sequence of rolling recoveries, in the case where an internal bug causes orchestrator to consistently crash on a given scenario.
- Manually invoked recoveries always override any blocks
- ApplyMySQLPromotionAfterMasterFailover config will set promoted master as writable and issue a RESET SLAVE.
- Better detection of GTID-based recovery scenario
Visibility:
- Any running orchestrator (HTTP or CLI) identifies itself in the node_health table
  - including the type of invocation and the primary command it was executing
  - This allows for quick understanding of what's running, when and why
- node_health_history table keeps record (for a few days) of all past orchestrator invocations, from anywhere.
Other:
- Added "--skip-unresolve" command line flag
GA Release v1.4.476
Support for MySQL 5.7 as backend database
- implied by support for NO_ZERO_DATE,NO_ZERO_IN_DATE sql_mode, so shockingly ignored until now
- fixed schemas to never have zero in date
- ugly sql_mode hack as backwards-compatibility mechanism
orchestrator -c help does not initiate any db action
GA Release v1.4.475
node health:
- db: PK is on (hostname, token)
- aggressively written to, every 10 seconds
- aggressively purged (expire = insert interval * 2)
- written to also by CLI instances
recovery:
- added AcknowledgeCrashedRecoveries(): auto-acknowledge (mark as failed) recoveries owned by a now-dead process. This is detected using the above node health changes.
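A minimal sketch of how the health expiry and the crashed-recovery detection fit together, assuming only what the notes state: a 10-second write interval, expiry at twice that interval, and ownership keyed by (hostname, token). All type and function names below are hypothetical and do not reflect orchestrator's actual implementation of AcknowledgeCrashedRecoveries().

```go
package main

import (
	"fmt"
	"time"
)

const (
	healthInsertInterval = 10 * time.Second         // each node writes its health row this often
	healthExpiry         = 2 * healthInsertInterval // expire = insert interval * 2
)

// nodeKey mirrors the (hostname, token) primary key.
type nodeKey struct{ Hostname, Token string }

// healthAges maps each known node to the time elapsed since its last health write.
type healthAges map[nodeKey]time.Duration

// isExpired reports whether a health entry is older than the expiry window.
func isExpired(age time.Duration) bool { return age > healthExpiry }

// recovery is a simplified, unacknowledged recovery owned by some orchestrator process.
type recovery struct {
	ID    int
	Owner nodeKey
}

// crashedRecoveryIDs returns recoveries whose owner has no live health entry,
// either purged altogether or stale beyond the expiry window.
func crashedRecoveryIDs(active []recovery, ages healthAges) []int {
	var ids []int
	for _, r := range active {
		age, known := ages[r.Owner]
		if !known || isExpired(age) {
			ids = append(ids, r.ID)
		}
	}
	return ids
}

func main() {
	ages := healthAges{
		{"orc1", "tokenA"}: 4 * time.Second, // recently seen, so alive
	}
	active := []recovery{
		{ID: 1, Owner: nodeKey{"orc1", "tokenA"}},
		{ID: 2, Owner: nodeKey{"orc2", "tokenB"}}, // owner's health row was purged
	}
	fmt.Println("recoveries to auto-acknowledge as failed:", crashedRecoveryIDs(active, ages))
}
```

Keying on the token as well as the hostname means a restarted process on the same host counts as a different owner, so recoveries left behind by the previous process still register as orphaned.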
GA Release v1.4.474
clusters page (aka dashboard) indicates replication analysis problems
GA Release v1.4.472
- node that gets elected updates health status (this makes sure its token in health status is aligned with elected node entry)
- some minor renaming