This repository has been archived by the owner on Dec 8, 2023. It is now read-only.

Reboot on takeover install #228

Merged 4 commits into rancher:master on Jan 13, 2020


@zimme
Contributor

zimme commented Oct 16, 2019

The current behaviour of powering off during takeover installs is unexpected. This PR changes that behaviour to reboot instead.

@dweomer
Contributor

dweomer commented Oct 16, 2019

This might be better addressed by something like #143 which I've been meaning to validate and include in a release.

@zimme
Contributor Author

zimme commented Oct 17, 2019

From what I can tell, #143 fixes another problem entirely: a reboot/poweroff problem when running the install.sh script. This fix is not for when you run install.sh but for after that first reboot, when k3os is actually installed. That is, after the first reboot it installs k3os and powers down instead of rebooting another time to start the new system.

@dweomer
Contributor

dweomer commented Oct 31, 2019

As discussed with @zimme in the rancher-users #k3os Slack channel, my current hold-up with this is an as-yet-unvalidated hunch that the AWS/GCP packer builds depend on the current behavior. Working on validating this Soon ™️

zimme force-pushed the reboot-on-takeover-install branch from 4d64da3 to adfea69 on November 8, 2019 10:24
@zimme
Contributor Author

zimme commented Nov 8, 2019

I've tested the AWS packer build now and it seems to still work. Running packer build ./template.json from the aws directory with the region changed to `eu-north-1` creates a packer build instance and terminates it after an AMI has been created, and I can launch an instance from that AMI.

I don't know if it's a problem that the installed k3os system probably boots during the packer build stage, so the AMI is generated from a running k3os instance?

Before this change, I guess the k3os instance installed during the build stage is never started, so the AMI is generated from a k3os system that has never been booted.

@zimme
Contributor Author

zimme commented Nov 8, 2019

So I just checked this again and it seems my suspicions were correct.

Looking at the output of kubectl get nodes -o wide on a freshly deployed instance from the AMI built via packer with this change, I'm seeing the following:

NAME                                           STATUS     ROLES    AGE   VERSION         INTERNAL-IP     EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION      CONTAINER-RUNTIME
ip-172-31-2-104.eu-north-1.compute.internal    NotReady   master   10h   v1.16.2-k3s.1   172.31.2.104    <none>        k3OS adfea69   4.15.0-66-generic   containerd://1.3.0+unknown
ip-172-31-13-221.eu-north-1.compute.internal   Ready      master   11m   v1.16.2-k3s.1   172.31.13.221   <none>        k3OS adfea69   4.15.0-66-generic   containerd://1.3.0+unknown
ip-172-31-13-221 [~]$

As you can see, there are 2 nodes present. The one that's not ready is the master node that was configured during the first boot while building the AMI. So we need to find a way to make the reboot on takeover install configurable in all stages of the takeover install, or find another way of building an AMI.
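For anyone wanting to check their own AMI for the same symptom, here is a minimal sketch of a filter that lists nodes whose STATUS is not Ready. The function name `not_ready_nodes` is made up for illustration; it only assumes the column layout shown in the output above (name in the first column, status in the second).

```shell
# Print the names of nodes whose STATUS column does not start with "Ready".
# Reads `kubectl get nodes` output on stdin, with or without the header row.
# Assumption: column 1 is NAME and column 2 is STATUS, as in the output above.
not_ready_nodes() {
    awk 'NF >= 2 && $1 != "NAME" && $2 !~ /^Ready/ { print $1 }'
}
```

It could be used as `kubectl get nodes -o wide | not_ready_nodes`. A leftover node from the image build would show up here on a freshly launched instance.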

zimme force-pushed the reboot-on-takeover-install branch from adfea69 to 4c10d13 on November 9, 2019 10:19
@zimme
Contributor Author

zimme commented Nov 9, 2019

Updated this to respect --poweroff instead of just blindly rebooting.

I used the updated check from #143 instead of the check that's currently used, so this is best merged in tandem with #143.

zimme added 2 commits November 9, 2019 19:35
The takeover install will always reboot once to install the system. This change will make it possible to not reboot into the installed system but poweroff instead.

This is necessary for being able to create AWS AMIs from a takeover install.
zimme force-pushed the reboot-on-takeover-install branch from 4c10d13 to e44b448 on November 9, 2019 19:57
@zimme
Contributor Author

zimme commented Nov 9, 2019

Fixed the check for k3os/system/poweroff and verified that it works on AWS.

I don't know if any other packer config needs updating too?
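To picture the behaviour described in this thread, the decision at the end of a takeover install might look roughly like the sketch below. This is not the code from this PR: the function name `post_install_action` and the root-directory argument are hypothetical, and only the relative path k3os/system/poweroff comes from the comment above. The sketch prints the chosen action instead of running it.

```shell
# Hypothetical sketch: decide what to do once the takeover install has finished.
# $1 is an assumed root directory under which the marker path is looked up.
# If the marker k3os/system/poweroff exists, power off; otherwise reboot
# into the newly installed system.
post_install_action() {
    root=${1:-}
    if [ -e "${root}/k3os/system/poweroff" ]; then
        echo poweroff
    else
        echo reboot
    fi
}
```

With a marker like this, an image build can ask for a poweroff so the installed system is never booted before the AMI is captured, while a normal takeover install reboots into the new system.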

@zimme
Contributor Author

zimme commented Dec 8, 2019

Is there any reason not to merge this and #143?

@thomasmhofmann

I would also like to see this merged.

dweomer merged commit 730df25 into rancher:master on Jan 13, 2020
@zimme
Contributor Author

zimme commented Jan 13, 2020

Just a note on this. Other packer configs might need updating to add --poweroff, as I haven't tested all of them.
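One rough way to spot configs that may still need the flag is to list the files that never mention --poweroff. This is only a sketch: the helper name `templates_missing_poweroff` is made up, and a plain text search cannot tell whether a given template actually performs a takeover install, so treat the result as a list of candidates to review.

```shell
# Print each given file that does not contain the string "--poweroff".
# Hypothetical helper; pass it whatever packer template paths apply.
templates_missing_poweroff() {
    for f in "$@"; do
        # -e keeps grep from treating the pattern as an option.
        grep -q -e '--poweroff' "$f" || printf '%s\n' "$f"
    done
}
```

For example, `templates_missing_poweroff */template.json` would check one template.json per provider directory, assuming the other providers follow the same layout as the aws directory mentioned earlier.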

@thomasmhofmann

Thank you. Working now.
