
timer_dnf-automatic_enabled doesn't remediate via Ansible on RHEL 9 #12831

Open · comps opened this issue Jan 15, 2025 · 5 comments

Labels: productization-issue (Issue found in upstream stabilization process.)

comps (Collaborator) commented Jan 15, 2025

Description of problem:

Running Ansible remediation on "host-os" (the SUT itself),


TASK [Enable timer dnf-automatic] **********************************************
changed: [localhost] => {"changed": true, "enabled": true, "name": "dnf-automatic.timer", "state": "started", "status": {"AccuracyUSec": "1min", "ActiveEnterTimestampMonotonic": "0", "ActiveExitTimestampMonotonic": "0", "ActiveState": "inactive", "After": "-.mount sysinit.target time-set.target time-sync.target", "AllowIsolate": "no", "AssertResult": "no", "AssertTimestampMonotonic": "0", "Before": "shutdown.target timers.target dnf-automatic.service", "CanClean": "state", "CanFreeze": "no", "CanIsolate": "no", "CanReload": "no", "CanStart": "yes", "CanStop": "yes", "CollectMode": "inactive", "ConditionResult": "no", "ConditionTimestampMonotonic": "0", "Conflicts": "shutdown.target", "DefaultDependencies": "yes", "Description": "dnf-automatic timer", "FailureAction": "none", "FixedRandomDelay": "no", "FragmentPath": "/usr/lib/systemd/system/dnf-automatic.timer", "FreezerState": "running", "Id": "dnf-automatic.timer", "IgnoreOnIsolate": "no", "InactiveEnterTimestampMonotonic": "0", "InactiveExitTimestampMonotonic": "0", "JobRunningTimeoutUSec": "infinity", "JobTimeoutAction": "none", "JobTimeoutUSec": "infinity", "LastTriggerUSecMonotonic": "0", "LoadState": "loaded", "Names": "dnf-automatic.timer", "NeedDaemonReload": "no", "NextElapseUSecMonotonic": "infinity", "OnClockChange": "no", "OnFailureJobMode": "replace", "OnSuccessJobMode": "fail", "OnTimezoneChange": "no", "Perpetual": "no", "Persistent": "yes", "RandomizedDelayUSec": "1h", "RefuseManualStart": "no", "RefuseManualStop": "no", "RemainAfterElapse": "yes", "Requires": "sysinit.target -.mount", "RequiresMountsFor": "/var/lib/systemd/timers", "Result": "success", "StartLimitAction": "none", "StartLimitBurst": "5", "StartLimitIntervalUSec": "10s", "StateChangeTimestampMonotonic": "0", "StopWhenUnneeded": "no", "SubState": "dead", "SuccessAction": "none", "TimersCalendar": "{ OnCalendar=*-*-* 06:00:00 ; next_elapse=(null) }", "Transient": "no", "Triggers": "dnf-automatic.service", "Unit": "dnf-automatic.service", "UnitFilePreset": "disabled", "UnitFileState": "disabled", "WakeSystem": "no", "Wants": "network-online.target"}}

results in a subsequent oscap xccdf eval FAILing when checking the rule, claiming that dnf-automatic.timer is inactive.
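
For a quick manual check of the state the rule looks at (just a rough by-hand equivalent, not the exact OVAL probe), something like:

systemctl is-enabled dnf-automatic.timer    # remediation should leave this "enabled"
systemctl is-active dnf-automatic.timer     # the failing scan effectively sees this as "inactive"
systemctl status dnf-automatic.timer --no-pager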

The fact that a VM-based Ansible test didn't hit this issue might indicate a (Beaker) environment-specific problem, or maybe some incompatibility with older RHELs, i.e. different unit file naming or a disabled-by-default service that we never enable, or perhaps an external override (in /etc?) that deactivates the unit.

Some more investigation is needed, sorry.

Run on 9.0 or 9.2 as

--rhel 9.2 --arch x86_64 --test /hardening/host-os/ansible/anssi_bp28_minimal

SCAP Security Guide Version:

master @ 60a184a

Operating System Version:

RHEL-9.0 EUS and RHEL-9.2 EUS

Additional Information/Debugging Steps:

Can't attach the report HTML / ARF XML here since both contain internal hostnames and IP addresses, given that the scan was done on the Beaker system itself, not in a nested VM. Please use the reproducer above.

comps added the productization-issue (Issue found in upstream stabilization process.) label on Jan 15, 2025
Mab879 changed the title from "timer_dnf-automatic_enabled doesn't remediate via Ansible on RHEL 9.0/9.2" to "timer_dnf-automatic_enabled doesn't remediate via Ansible on RHEL 9" on Jan 22, 2025
Mab879 (Member) commented Jan 22, 2025

This appears to now be happening on RHEL 9.6 as well.

jan-cerny (Collaborator) commented

This issue is a duplicate of #12119.

The problem is that starting dnf-automatic.timer takes a long time. At the moment of the final scan the timer isn't started yet, but it starts a few seconds after that. So the eventual state is that the timer is active and the rule would pass.

I reproduced this successfully. In my example, the rule evaluated as false at 15:49:46, and according to journalctl the timer started at 15:49:48.
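
For anyone reproducing this, the two timestamps can be compared roughly like this (assuming journald still has the logs from that boot):

journalctl -b -u dnf-automatic.timer -o short-precise
systemctl show dnf-automatic.timer -p ActiveEnterTimestamp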

jan-cerny (Collaborator) commented

I forgot to mention that there is a reboot in the /hardening/host-os/ansible contest test between the Ansible remediation and the oscap scan. The oscap scan starts sooner after the reboot than the dnf-automatic.timer does.

We think that there are these solutions:

  • waive it as a failed rule
  • add some delay or sleep to the contest test so that contest wouldn't start the oscap scan right after the reboot but only after some time
  • add a waiting condition to the contest test to ensure the timer is started before the scan (see the sketch below)
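
A rough sketch of that waiting condition (the timeout and poll interval are arbitrary, just for illustration):

timeout=120
until systemctl is-active --quiet dnf-automatic.timer; do
    sleep 2
    timeout=$((timeout - 2))
    [ "$timeout" -gt 0 ] || { echo "dnf-automatic.timer did not become active in time" >&2; break; }
done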

comps (Collaborator, Author) commented Feb 6, 2025

I'm not sure that's the issue - on my RHEL-9 VM, it starts in a fraction of a second (under 100 ms), and the delays from tmt waiting for the host to reboot + ssh-ing into it + running the test again definitely add up to more than 2 seconds. Measuring it now (tmt takes a long time to realize a host has booted), it's about 20-30 seconds after sshd becomes accessible.

That said, dnf-automatic.service does take a few seconds, but we shouldn't be checking that.

However, while dnf-automatic.timer waits for network-online.target, sshd waits only for network.target, so depending on how systemd figures out whether it's online (contacting a third-party service on the Internet?), that may account for the difference.
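
(As a side note, what each unit actually orders itself after, and which wait-online service provides network-online.target on the machine - typically NetworkManager-wait-online.service on RHEL 9 - can be checked with:)

systemctl show sshd.service -p After,Wants
systemctl show dnf-automatic.timer -p After,Wants
systemctl is-enabled NetworkManager-wait-online.service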

Maybe we could add an sshd.service dependency on network-online.target in the testing environment, to delay the startup a bit more and get a more representative OS state upon ssh connection.
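
A minimal sketch of that override (assuming a drop-in under /etc is acceptable in the testing environment):

mkdir -p /etc/systemd/system/sshd.service.d
cat > /etc/systemd/system/sshd.service.d/wait-online.conf <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
systemctl daemon-reload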

jan-cerny (Collaborator) commented

Maybe a local VM is different from the remote VM?

Projects: none yet
Development: no branches or pull requests
3 participants