Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add 'Improve MicroOS management' RFC #92

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

agraul
Copy link
Member

@agraul agraul commented Oct 11, 2024

No description provided.


These operations do not alter the system, i.e. they don't belong in a (new) snapshot. Uyuni applies them with `state.apply` for two reasons: structured output and no concurrency (`queue=True`).

- `ansible.runplaybook`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is ansible.runplaybook intended for SLM/MicroOS to act as a control node that executes the playbook against other minions? Because the playbook can be executed against the control node itself as well, which would then be OS-altering of course.

Then again, if we make this a transactional action, that means the playbook files must be present on the transactional filesystem. We should probably make a note somewhere in the doc at least that we don't support Uyuni/SUMA server to alter itself with Ansible.


Until now, we tried to hide the differences in Uyuni and relied on Salt to do the *right thing*. This strategy was easy for us to use, but it did not work well. We need to change the approach we take with transactional systems.

# Detailed design

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add info here about the changes that pertain to displaying packages? I understand that we want to also display what packages are in a new transaction, but not yet applied (or vice versa, what packages are removed, but still present in the currently-running snapshot). The details of this would go to a different squad, but they might be in scope for this RFC?

* obscure corner cases
* will it impact performance?
* what other parts of the product will be affected?
* will the solution be hard to maintain in the future?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this section complete?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's still the template.

[design]: #detailed-design

## Uyuni differentiates between OS-unchanging and OS-altering operations
Instead of using the `transactional_update` executor, Uyuni calls either `state.apply` or `transactional_update.apply`. Uyuni is in full control of which states are applied in a new snapshot and which states are not. The SLS file is the smallest unit Salt can handle, there is no way to apply only parts of an SLS file inside a snapshot. Therefore, we split SLS files that currently mix OS-unchanging and OS-altering operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, for internal states (example channels state) the decision if we should run it using state.apply or transactional_update.apply is made by uyuni automatically, and user doesn't need to know anything about it?
How about High State? Should we add the option in the UI state scheduler for the user to select in which mode it should run? If that is the case, we should hide this option for non-transational systems.

[design]: #detailed-design

## Uyuni differentiates between OS-unchanging and OS-altering operations
Instead of using the `transactional_update` executor, Uyuni calls either `state.apply` or `transactional_update.apply`. Uyuni is in full control of which states are applied in a new snapshot and which states are not. The SLS file is the smallest unit Salt can handle, there is no way to apply only parts of an SLS file inside a snapshot. Therefore, we split SLS files that currently mix OS-unchanging and OS-altering operations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: It would be good to list here as well how state.apply and transactional_update.apply are different from each other. As, it's defined a bit later so not very critical.


### OS-altering operations

These operations alter the operating system itself, i.e. they belong in a (new) snapshot. Uyuni applies them with `transactional_update.apply`, unless otherwise noted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless otherwise noted.

Can you elaborate it please, like in what possible situations this could be the case.


## Automatic reboots during bootstrapping

When bootstrapping a new system, Uyuni relies on information present on the client system to know what kind of system it is. This includes finding out if the new system is a transactional system. Bootstrapping happens with Salt SSH and `state.apply`. The bootstrap SLS file contains logic to install our Salt Minion package correctly on both traditionally-managed and transactional systems.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you're explaining how things are currently working, but reading it in flow, it seems a bit contradictory considering that just a few lines above, we defined state.apply for operations that shouldn't change the OS. Maybe we can list it as an exception.


### Request a reboot without delay

With a delay lock taken, `bootstrap/init.sls` can request a reboot from systemd from the main process. The reboot will be delayed until Salt SSH execution terminates and releases the lock.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to evaluate the option where rebootmgr is default but we can override that in special cases when immediate reboot is needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants