Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kmods fact produces null byte in parameters datastructure #98

Closed
phihos opened this issue Jul 2, 2023 · 15 comments · Fixed by #107
Closed

kmods fact produces null byte in parameters datastructure #98

phihos opened this issue Jul 2, 2023 · 15 comments · Fixed by #107
Labels
bug Something isn't working

Comments

@phihos
Copy link

phihos commented Jul 2, 2023

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 6.28.0
  • Ruby: 2.5.9
  • Distribution: OSS
  • Module version: 3.2.0

How to reproduce (e.g Puppet code you use)

In my case Puppet ran in a libvirt VM where the parameter "virtio_mmio/device" has apparently no value. Running puppet facts produced:

[...]
      "virtio_mmio": {
        "parameters": {
          "device": "\u0000"
        },
        "used_by": [

        ]
      },
[...]

This produced the following error when uploading the fact to Foreman:

'Action failed' error (ArgumentError): string contains null byte

I know Foreman could handle such input more gracefully, but \u0000 does not look like a correct value either.

What are you seeing

See above.

What behaviour did you expect instead

An empty string like this:

[...]
      "virtio_mmio": {
        "parameters": {
          "device": ""
        },
        "used_by": [

        ]
      },
[...]

Output log

See above.

Any additional information you'd like to impart

@kenyon kenyon added the bug Something isn't working label Jul 2, 2023
@saz
Copy link
Contributor

saz commented Aug 24, 2023

I'm seeing a different error message since yesterday:

[2023-08-23 22:36:26.420436 ] ERROR Facter - Fact resolution fact='kmods', resolution='<anonymous>' resolved to an invalid value: String "\u0000" contains a null byte reference; unsupported for all facts.

It's also virtio_mmio in my case.

Puppet: 7.26.0
Ruby: 2.7.8p225
Distribution: OSS
Module version: 3.2.0

@saz
Copy link
Contributor

saz commented Aug 24, 2023

Changing the line

kmod[directory]['parameters'][param] = File.read("/sys/module/#{directory}/parameters/#{param}").chomp

to

kmod[directory]['parameters'][param] = File.read("/sys/module/#{directory}/parameters/#{param}").chomp.delete("\u0000")

fixes the issue, but I'm not sure, if that's the best way.

@elfranne
Copy link

The new error seems to be caused by new version of facter:
works fine with facter 4.4.1 but not 4.4.2. The new version is included in Puppet agent 7.26

@waipeng
Copy link

waipeng commented Aug 25, 2023

@elfranne thanks for finding this. This is very helpful, as Puppet release notes does not say about the Facter bump, nor Facter release notes have a note about this behaviour.

@waipeng
Copy link

waipeng commented Aug 25, 2023

@saz can confirm that patch is good. You should send a PR.

@smortex
Copy link
Member

smortex commented Aug 25, 2023

The new error seems to be caused by new version of facter: works fine with facter 4.4.1 but not 4.4.2. The new version is included in Puppet agent 7.26

It looks like the check was added in puppetlabs/facter#2590.

So basically, if the problem was already there, it was not reported before.

Looking at https://elixir.bootlin.com/linux/latest/source/drivers/virtio/virtio_mmio.c#L784 it seems that this is supposed to be an empty string or multiple lines of text. What is your kernel version?

@waipeng
Copy link

waipeng commented Aug 25, 2023

maybe I can answer

# cat /etc/issue
Ubuntu 20.04.6 LTS \n \l
# uname -a
Linux monitoring-melbourne-qh2 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
irb(main):001:0> File.read('/sys/module/virtio_mmio/parameters/device')
=> "\u0000"

@saz
Copy link
Contributor

saz commented Aug 25, 2023

@smortex this issue has been reported two months ago, maybe the error message was different due to the usage of Foreman?

I'm seeing this error message on Ubuntu 20.04 and Ubuntu jammy. As the kernel version is still the same, as two days ago, it seems unrelated.

I'll check, when facter has been upgraded on those affected systems.

@saz
Copy link
Contributor

saz commented Aug 25, 2023

The Debian package release for Ubuntu Focal and Jammy happened on 2023-08-23. This perfectly matches the first time I've seen this issue. As @waipeng pointed out, there's no information about that change in any of the release notes.

@smortex I don't understand why, but the error isn't shown on Debian Bookworm with kernel 6.1. Looks like it's only happening for me on Ubuntu systems with kernel 5.x
If you don't see a better way to fix this, I'll send in a PR, as this issue is quite annoying 😃

waipeng added a commit to NeCTAR-RC/puppet-kmod that referenced this issue Aug 28, 2023
This can be removed when upstream issue [1] has a patch, and a version
we can use.

[1]: voxpupuli#98
waipeng added a commit to NeCTAR-RC/puppet-kmod that referenced this issue Aug 28, 2023
This can be removed when upstream issue [1] has a patch, and a version
we can use.

[1]: voxpupuli#98
@smortex
Copy link
Member

smortex commented Aug 29, 2023

All Debian nodes I have access to have no /sys/module/virtio_mmio/parameters/device (Debian 10.13, 11.7, 12.1).
All Ubuntu nodes I have access to have a /sys/module/virtio_mmio/parameters/device with the unexpected content \0 (Ubuntu 18.04, 20.04. 22.04).

All these nodes are the same kind of nodes provided by the same service provider, so the Ubuntu kernel seems to expose some crap that the Debian kernel does not expose. Ubuntu being a Debian derivative, it's maybe worth looking for the root cause in the Ubuntu patches for the kernel?

It's not clear what break when you have this error. Is it just some spam in the log (in which case I would expect Puppet not to hide it because it is indeed wrong)? Does it break things (in which case we can detect if a value read has null bytes and emit a warning and discard that value if so)?

@waipeng
Copy link

waipeng commented Aug 29, 2023

@smortex what are the kernel versions you tested with? I suspect the difference may be due to modules that gets loaded with the vendor default kernels.

The impact is that there is an error emitted when running Puppet. E.g.

root@example:~# puppet agent -t
Info: Using environment 'XXX'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Facter: Fact resolution fact='kmods', resolution='<anonymous>' resolved to an invalid value: String "\u0000" contains a null byte reference; unsupported for all facts.
Info: Caching catalog for example.example.com
Info: Applying configuration version '12345678'
Notice: Applied catalog in 4.80 seconds

Puppet compilation still succeeds. However, this error makes it into /opt/puppetlabs/puppet/cache/state/last_run_report.yaml, triggering a Nagios alert for us (we monitor errors from puppet run).

nectar-gerrit pushed a commit to NeCTAR-RC/puppet-kmod that referenced this issue Aug 29, 2023
This can be removed when upstream issue [1] has a patch, and a version
we can use.

[1]: voxpupuli#98

Change-Id: Ia6471d558e1207aa29c0ef3fbcbab5556cd3ba87
@smortex
Copy link
Member

smortex commented Aug 29, 2023

@smortex what are the kernel versions you tested with?

Debian:

  • 4.19.0-24-cloud-amd64
  • 4.19.0-25-amd64
  • 4.19.0-25-cloud-amd64
  • 5.10.0-22-cloud-amd64
  • 5.10.0-23-amd64
  • 5.10.0-23-cloud-amd64
  • 5.10.0-25-amd64
  • 6.1.0-0.deb11.7-amd64

Ubuntu:

  • 4.15.0-213-generic
  • 5.15.0-67-generic
  • 5.19.0-43-generic

The Debian nodes have a lot of virtio modules loaded (the machines with a cloud kernel, the dedicated servers have no virtio modules):

$ facter kernelrelease
5.10.0-23-cloud-amd64
$ lsmod | grep virtio
virtio_console         40960  1
virtio_balloon         24576  0
virtio_net             61440  0
net_failover           24576  1 virtio_net
virtio_scsi            24576  2
scsi_mod              258048  4 virtio_scsi,sd_mod,libata,sg
virtio_pci             28672  0
virtio_ring            36864  5 virtio_console,virtio_balloon,virtio_scsi,virtio_pci,virtio_net
virtio                 16384  5 virtio_console,virtio_balloon,virtio_scsi,virtio_pci,virtio_net

The Ubuntu VMs have less modules, but these modules are loaded on Debian where we don't have this issue:

$ facter kernelrelease
4.15.0-213-generic
$ lsmod | grep virtio
virtio_scsi            20480  2
virtio_net             49152  0

@waipeng
Copy link

waipeng commented Aug 29, 2023

@smortex more info

On Ubuntu the virtio_mmio appears to be compiled into kernel

root@ubuntu:~# uname -a
Linux ubuntu 5.4.0-159-generic #176-Ubuntu SMP Mon Aug 14 12:04:20 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu:~# grep virtio_mmio /lib/modules/5.4.0-159-generic/modules.builtin
kernel/drivers/virtio/virtio_mmio.ko

On Debian 11, it is a module

root@debian:~# uname -a
Linux debian 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux
root@debian:~# modprobe virtio_mmio
root@debian:~# lsmod | grep virtio_mmio
virtio_mmio            16384  0
virtio_ring            36864  7 virtio_rng,virtio_mmio,virtio_console,virtio_balloon,virtio_pci,virtio_blk,virtio_net
virtio                 16384  7 virtio_rng,virtio_mmio,virtio_console,virtio_balloon,virtio_pci,virtio_blk,virtio_net

After loading the module, it appears that the structure of /sys is different

root@debian:~# ls /sys/module/virtio_mmio/
coresize  drivers  holders  initsize  initstate  notes  refcnt  sections  taint  uevent

It doesn't have parameters/device, and hence no null value.

Whether this is difference in kernel version or other compilation options, I have no idea.

@sid3windr
Copy link

The impact is that there is an error emitted when running Puppet. E.g. (...)

Puppet compilation still succeeds. However, this error makes it into /opt/puppetlabs/puppet/cache/state/last_run_report.yaml, triggering a Nagios alert for us (we monitor errors from puppet run).

This is not the only impact, the kmods fact is empty, making any code relying on it useless.

I have this error on "Debian 11" machines as well but they're actually running a Proxmox VE kernel (which is newer than the Debian ones, and likely comes from the same source as Ubuntu's kernels).

@saz
Copy link
Contributor

saz commented Aug 31, 2023

As the kmods fact is currently broken, I guess we should just strip the null value and that's it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants