Software Updates
NOTE: Software update refers only to yum package updates and is not supposed to affect any software configurations. If there are configuration changes or schema changes, it qualifies as a Software Upgrade, not a Software Update.
- Remember the current last yum transaction IDs (to be able to roll back later if needed)
- Turn on hctl maintenance mode
- Update Provisioner
- If the salt-master config changed due to the Provisioner update, restart the salt-master service
- If the salt-minion configs changed as well, restart the salt-minion service
- Update Cortx components
- Turn off hctl maintenance mode
- Run the HA upgrade script:
  /opt/seagate/cortx/hare/libexec/build-ha-update /var/lib/hare/cluster.yaml /var/lib/hare/build-ha-args.yaml /var/lib/hare/build-ha-csm-args.yaml
In case of any error:
- Roll back all yum updates (using yum history)
- If the error happened during cluster maintenance enablement:
  - Call hctl to turn off maintenance mode in the background (do not wait for the result, since it may lead to node reboots)
- Otherwise:
  - If the configs for salt-master and/or salt-minion were changed back by the yum rollback, restart them
  - Turn off pacemaker maintenance mode
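The steps above can be sketched as a single script. This is a hedged sketch, not a tested procedure: the `hctl`, `provisioner`, and `build-ha-update` invocations are taken verbatim from this document, while the `run` helper and its `DRY_RUN` mode are additions for illustration, the conditional salt-master/salt-minion restarts are omitted, and error handling is reduced to a trap that prints a reminder to roll back.

```shell
#!/bin/sh
# Sketch of the update flow above. With DRY_RUN=1 (the default here) the
# commands are only printed, so the flow can be reviewed safely.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On any failure, remind the operator of the error-handling steps above.
trap 'echo "update failed: roll back yum transactions and leave maintenance mode"' EXIT
set -e

run salt '*' cmd.run 'yum history | grep ID -A 5'      # remember last txn IDs
run hctl node maintenance --all --timeout-sec=600      # enter maintenance mode
run provisioner sw_update                              # update Provisioner and Cortx components
                                                       # (salt service restarts omitted in this sketch)
run hctl node unmaintenance --all --timeout-sec=600    # leave maintenance mode
run /opt/seagate/cortx/hare/libexec/build-ha-update \
    /var/lib/hare/cluster.yaml \
    /var/lib/hare/build-ha-args.yaml \
    /var/lib/hare/build-ha-csm-args.yaml               # run the HA upgrade script

trap - EXIT
echo "update flow finished"
```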
- Check cluster status:
  [user@host ~]# pcs status
- Take a snapshot of installed rpms:
  [user@host ~]# (rpm -qa | grep cortx) | tee before_update.txt
- Set the update repo
  Command:
  [user@host ~]# provisioner set_swupdate_repo --source "<URL> or <Path to ISO file>" <Release_Tag>
  Example:
  [user@host ~]# provisioner set_swupdate_repo --source "http://<cortx_release_repo>/releases/cortx/integration/centos-7.7.1908/last_successful/" build_2382
  Verify the repo is set correctly:
  [user@host ~]# salt-call pillar.get release:update:repos
  local:
      ----------
      Cortx-1.0.0-11-rc6-interim:
          http://<cortx_release_repo>/releases/cortx/Cortx-1.0.0-11-rc6-interim/
- Trigger the update
  Command:
  [user@host ~]# provisioner sw_update
- Take a snapshot of the new rpms:
  [user@host ~]# (rpm -qa | grep cortx) | tee after_update.txt
- Check cluster status:
  [user@host ~]# pcs status
- Verify the update:
  [user@host ~]# diff before_update.txt after_update.txt
  1a2
  > cortx-core-1.0.0-366_git65ca4ad0e_3.10.0_1062.el7.x86_64
  4c5
  < cortx-prvsnr-1.0.0-309_gitd4fabec_el7.x86_64
  ---
  > cortx-hare-1.0.0-641_git3aa5c9d.el7.x86_64
  7d7
  < cortx-s3server-1.0.0-865_git83c3bc2e_el7.x86_64
  9,10d8
  < cortx-s3iamcli-1.0.0-865_git83c3bc2e.noarch
  < cortx-hare-1.0.0-639_git3aa5c9d.el7.x86_64
  11a10,11
  > cortx-prvsnr-1.0.0-310_gitb0273ad_el7.x86_64
  > cortx-s3iamcli-1.0.0-869_git44b1198a.noarch
  15c15
  < cortx-core-1.0.0-364_git932e52fb4_3.10.0_1062.el7.x86_64
  ---
  > cortx-s3server-1.0.0-869_git44b1198a_el7.x86_64
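Since raw `diff` output depends on line order, an alternative way to verify is to sort both snapshots and list the removed and added packages explicitly with `comm`. The following is a small sketch: the files and package names below are illustrative stand-ins, not the real snapshots, so they are written to separate `*_sample.txt` files to avoid clobbering the real ones.

```shell
# Build small stand-in snapshots (illustrative package names, not real ones).
printf '%s\n' cortx-core-364 cortx-hare-639 cortx-motr-100 | sort > before_sample.txt
printf '%s\n' cortx-core-366 cortx-hare-641 cortx-motr-100 | sort > after_sample.txt

# comm on sorted input: -23 keeps lines only in file1, -13 lines only in file2.
removed=$(comm -23 before_sample.txt after_sample.txt)   # gone after the update
added=$(comm -13 before_sample.txt after_sample.txt)     # new after the update
printf 'removed:\n%s\n' "$removed"
printf 'added:\n%s\n' "$added"
```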
NOTE: Rollback is designed to revert only the package changes made by the rpm update, using YUM rollback capabilities.
The steps below can be used to roll back the cluster to the state before the software update was triggered.
Note: all commands are to be run on the primary node.
Before the update, look up the last yum transaction IDs for each node and note them down (this is also possible later, but requires more exploration of the yum history):
salt '*' cmd.run "yum history | grep ID -A 5"
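To pick the newest transaction ID out of that output programmatically, something like the awk one-liner below can be used. The sample text is a stand-in for real `yum history` output, and the exact column layout varies between yum versions, so treat the field handling as an assumption to verify on your nodes.

```shell
# Stand-in for `yum history` output on one node (format assumed, verify locally).
yum_history_sample='ID     | Login user        | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
    42 | root <root>       | 2020-06-01 10:00 | Update         |   12
    41 | root <root>       | 2020-05-28 09:30 | Install        |    4'

# The first data row (line 3) holds the newest transaction; take its ID column.
last_txn_id=$(printf '%s\n' "$yum_history_sample" |
    awk -F'|' 'NR == 3 { gsub(/ /, "", $1); print $1; exit }')
echo "last yum transaction ID: $last_txn_id"
```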
-
Turn on maintenance mode:
hctl node [--verbose] maintenance --all --timeout-sec=600
-
Roll back yum to the stored IDs:
salt <node1> cmd.run "yum history rollback -y <ID>"
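With more than one node, the rollback command has to be issued per node with that node's recorded ID. A small sketch that builds the commands from `node:ID` pairs; the node names `srvnode-1`/`srvnode-2` and the IDs are illustrative placeholders, to be replaced with the values noted before the update.

```shell
# node:ID pairs recorded before the update (illustrative values).
rollback_cmds=$(
    for pair in srvnode-1:42 srvnode-2:57; do
        node=${pair%%:*}
        id=${pair##*:}
        printf 'salt %s cmd.run "yum history rollback -y %s"\n' "$node" "$id"
    done
)
# Review the generated commands, then run each on the primary node.
echo "$rollback_cmds"
```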
-
Apply the possibly changed salt-master configuration:
salt-run salt.cmd state.apply components.provisioner.salt_master.config
Check that salt-master is running and all minions are connected.
salt-master might have been restarted during the previous step, so give the minions time to reconnect, then check the list of connected minions:
salt-run manage.up
-
Apply the possibly changed salt-minion configuration:
salt '*' state.apply components.provisioner.salt_minion.config
salt-minions might be restarted during this step, so check that they have re-connected:
salt-run manage.up
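Because the minions may need a while to reconnect after a restart, a single `salt-run manage.up` check can race with the restart. A generic retry helper can poll until a check succeeds or gives up; this is a sketch, where the `salt-run manage.up` usage in the final comment comes from this document and everything else (helper name, attempt counts, minion id) is illustrative.

```shell
# retry <max_attempts> <command...>: rerun until success, sleeping 1s between tries.
retry() {
    max=$1
    shift
    n=0
    until "$@"; do
        n=$((n + 1))
        if [ "$n" -ge "$max" ]; then
            return 1
        fi
        sleep 1
    done
}

# On the cluster this could wrap the minion check, e.g. (not executed here):
#   retry 30 sh -c 'salt-run manage.up | grep -q srvnode-1'
```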
-
Turn off maintenance mode:
hctl node [--verbose] unmaintenance --all --timeout-sec=600