Releases: outbrain-inc/orchestrator

GA Release v1.4.543

24 Nov 12:59
GTID:

- Added "reset-master-gtid-remove-own-uuid" command
- ResetMasterGTIDOperation supports removal of self-UUID entries
- setGTIDPurged respects --noop
- Repoint() makes up a non-existent binlog file name when the log file name is
empty (this can be caused by an error when switching to/from GTID)
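
The notes above do not spell out how self-UUID entries are removed. As a rough illustration only, assuming the operation amounts to filtering a GTID set of the form `uuid:interval[:interval...]` before it is applied as purged, a minimal sketch could look like this. `removeUUIDEntries` is a hypothetical helper, not orchestrator's own function:

```go
package main

import (
	"fmt"
	"strings"
)

// removeUUIDEntries drops every entry that belongs to uuid from a GTID set
// such as "uuid1:1-100,uuid2:1-50:60-70".
// Hypothetical helper for illustration; not taken from orchestrator.
func removeUUIDEntries(gtidSet, uuid string) string {
	kept := []string{}
	for _, entry := range strings.Split(gtidSet, ",") {
		// MySQL may print entries on separate lines; trim surrounding whitespace.
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue
		}
		// Each entry has the form "<uuid>:<interval>[:<interval>...]".
		entryUUID := strings.SplitN(entry, ":", 2)[0]
		if strings.EqualFold(entryUUID, uuid) {
			continue // this entry belongs to the server's own UUID
		}
		kept = append(kept, entry)
	}
	return strings.Join(kept, ",")
}

func main() {
	// Made-up UUIDs, for demonstration only.
	executed := "00020192-1111-1111-1111-111111111111:1-100,\n" +
		"00020193-2222-2222-2222-222222222222:1-50"
	own := "00020193-2222-2222-2222-222222222222"
	fmt.Println(removeUUIDEntries(executed, own))
}
```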

GA Release v1.4.536

23 Nov 15:28
Fixed multiple-instances case where the input is empty

GA Release v1.4.532

19 Nov 13:12
Relaxed constraints on relocating up from a binlog server

GA Release v1.4.529

16 Nov 15:17
- processlist also filters by 'Binlog Dump GTID' when searching for slaves
- similarly, when excluding long-running queries.
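
A minimal sketch of the idea, assuming processlist rows have already been read into a simple struct: a row counts as a replication dump thread when its command is either 'Binlog Dump' or 'Binlog Dump GTID', and such rows are skipped when looking for long-running queries. The `process` type and both function names are hypothetical, not orchestrator's own:

```go
package main

import "fmt"

// process is a hypothetical, trimmed-down processlist row.
type process struct {
	Host    string
	Command string
	Time    int // seconds the thread has spent in its current state
}

// isReplicationDump reports whether the command belongs to a thread that
// streams binary logs to a slave, with or without GTID.
func isReplicationDump(command string) bool {
	return command == "Binlog Dump" || command == "Binlog Dump GTID"
}

// slaveHosts returns the hosts of all threads serving binary logs.
func slaveHosts(processes []process) []string {
	hosts := []string{}
	for _, p := range processes {
		if isReplicationDump(p.Command) {
			hosts = append(hosts, p.Host)
		}
	}
	return hosts
}

// longRunningQueries returns threads running for at least minSeconds,
// excluding replication dump threads, which are long-lived by nature.
func longRunningQueries(processes []process, minSeconds int) []process {
	result := []process{}
	for _, p := range processes {
		if isReplicationDump(p.Command) || p.Time < minSeconds {
			continue
		}
		result = append(result, p)
	}
	return result
}

func main() {
	processes := []process{
		{Host: "replica1:41000", Command: "Binlog Dump", Time: 86400},
		{Host: "replica2:41001", Command: "Binlog Dump GTID", Time: 90000},
		{Host: "app1:52000", Command: "Query", Time: 120},
		{Host: "app2:52001", Command: "Query", Time: 1},
	}
	fmt.Println(slaveHosts(processes))
	fmt.Println(longRunningQueries(processes, 60))
}
```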

GA Release v1.4.515

12 Nov 12:51
MakeCoMaster:

We allow breaking of an existing co-master replication. Here's the
breakdown:
Ideally, this would not be allowed, and we would first require the user
to RESET SLAVE on 'master'
prior to making it participate as co-master with our 'instance'.
However, there's the problem that upon RESET SLAVE we lose the
replication's user/password info.
Thus, we come up with the following rule:
If S replicates from M1, and M1<->M2 are co-masters, we allow S to
become co-master of M1 (S<->M1) if:

- M1 is writeable
- M2 is read-only or is unreachable/invalid
- S  is read-only

And so we will be replacing one read-only co-master with another.
(instance_topology.go)

supported by Web interface (cluster.js)
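
The MakeCoMaster rule above can be sketched as a single check. This is an illustration under assumptions, not the code in instance_topology.go: the `instance` struct, its fields and `canReplaceCoMaster` are hypothetical, and `Valid` stands in for "reachable and successfully checked":

```go
package main

import (
	"errors"
	"fmt"
)

// instance is a hypothetical, trimmed-down view of a MySQL server.
type instance struct {
	Key       string // host:port
	MasterKey string // host:port of the server this one replicates from
	ReadOnly  bool
	Valid     bool // false when the server is unreachable or its last check failed
}

// canReplaceCoMaster returns nil when S may become co-master of M1,
// replacing M1's existing co-master M2.
func canReplaceCoMaster(s, m1, m2 instance) error {
	if s.MasterKey != m1.Key {
		return errors.New("S does not replicate from M1")
	}
	// Only M1's side of the M1<->M2 pair is verified here, since M2 may be
	// unreachable and its own replication state unknown.
	if m1.MasterKey != m2.Key {
		return errors.New("M1 does not replicate from M2")
	}
	if !m1.Valid || m1.ReadOnly {
		return errors.New("M1 must be writeable")
	}
	if m2.Valid && !m2.ReadOnly {
		return errors.New("M2 is reachable and writeable")
	}
	if !s.ReadOnly {
		return errors.New("S must be read-only")
	}
	return nil
}

func main() {
	m1 := instance{Key: "m1:3306", MasterKey: "m2:3306", Valid: true}
	m2 := instance{Key: "m2:3306", MasterKey: "m1:3306", ReadOnly: true, Valid: true}
	s := instance{Key: "s:3306", MasterKey: "m1:3306", ReadOnly: true, Valid: true}

	fmt.Println(canReplaceCoMaster(s, m1, m2)) // allowed: M2 is read-only

	m2.ReadOnly = false
	fmt.Println(canReplaceCoMaster(s, m1, m2)) // refused: M2 is writeable
}
```

Returning an error rather than a boolean lets a caller report which of the conditions failed.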

GA Release v1.4.500

11 Nov 10:10

Recovery:

  • Clusters dashboard indicates, per chain, whether it has auto failover for master/intermediate master
  • Audit-detection page provides extra info on the crash analysis.
    • This includes the analysis changelog for the failed instance
    • Also suggests the related recovery (if taken) for this detection
  • Audit-recovery page provides extra info on recovery; also suggests the related discovery event
  • Recognizing crashed recoveries (recoveries started by an orchestrator that crashed halfway through)
    • Automatically acknowledging such crashes
    • This may potentially cause an endless sequence of rolling recoveries, in the case where an internal bug causes orchestrator to consistently crash on a given scenario.
  • Manually invoked recoveries always override any blocks
  • ApplyMySQLPromotionAfterMasterFailover config will set promoted master as writable and issue a RESET SLAVE.
  • Better detection of GTID-based recovery scenario

Visibility:

  • Any running orchestrator (HTTP or CLI) identifies itself in the node_health table
    • including the type of invocation and the primary command it was executing
    • This allows for quick understanding of what's running, when and why
  • node_health_history table keeps record (for a few days) of all past orchestrator invocations, from anywhere.

Other:
Added "--skip-unresolve" command line flag

GA Release v1.4.476

03 Nov 11:16
Support for MySQL 5.7 as backend database

- this requires support for the NO_ZERO_DATE,NO_ZERO_IN_DATE sql_mode, which was
shockingly ignored until now.
- fixed schemas to never have zero in date
- ugly sql_mode hack as backwards-compatibility mechanism

orchestrator -c help does not initiate any db action

GA Release v1.4.475

02 Nov 14:39
node health:

- db: PK is on (hostname, token)
- aggressively written to, every 10 seconds
- aggressively purged (expire = insert interval * 2)
- written to also by CLI instances

recovery:
- added AcknowledgeCrashedRecoveries(): auto-acknowledge (mark as
failed) recoveries owned by a now-dead process. This is detected using
the above node health changes.

GA Release v1.4.474

30 Oct 12:19
clusters page (aka dashboard) indicates replication analysis problems

GA Release v1.4.472

30 Oct 09:31
- node that gets elected updates health status (this makes sure its
token in health status is aligned with elected node entry)
- some minor renaming