Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop /dev/pts from bind mount locations #2692

Merged
merged 1 commit into from
Dec 11, 2024
Merged

Drop /dev/pts from bind mount locations #2692

merged 1 commit into from
Dec 11, 2024

Conversation

Conan-Kudo
Copy link
Member

@Conan-Kudo Conan-Kudo commented Dec 6, 2024

This has created havoc in the Fedora build environments by fully unmounting /dev/pts and breaking the builders for subsquent tasks.

This is a partial revert of commit daf1323.

@Conan-Kudo Conan-Kudo enabled auto-merge December 6, 2024 22:13
@Conan-Kudo Conan-Kudo requested a review from schaefi December 6, 2024 22:13
@schaefi
Copy link
Collaborator

schaefi commented Dec 6, 2024

@Conan-Kudo Hmm, this is an unexpected effect of the change that I don't see with the builds in obs. Can you give me a bit more details what exactly is causing trouble ? Thanks

@Conan-Kudo
Copy link
Member Author

Conan-Kudo commented Dec 6, 2024

@Conan-Kudo Hmm, this is an unexpected effect of the change that I don't see with the builds in obs. Can you give me a bit more details what exactly is causing trouble ? Thanks

According to @nirik, it is unmounting the real /dev/pts and breaking the builders for other tasks. Unlike OBS, Koji doesn't spawn ephemeral VMs for every build, so there are shared resources being manipulated here.

@Conan-Kudo
Copy link
Member Author

For what it's worth, this would be a problem with OBS deployments in the public cloud, since nested virtualization isn't a thing there and OBS falls back to chroots in that case and works just like Koji does.

@schaefi
Copy link
Collaborator

schaefi commented Dec 9, 2024

According to @nirik, it is unmounting the real /dev/pts and breaking the builders for other tasks. Unlike OBS, Koji doesn't spawn ephemeral VMs for every build, so there are shared resources being manipulated here.

Hmm, ok but that would then be a problem for everybody building with kiwi locally right ? I do not see this behavior when I build an image just with kiwi on my system. I tested this with

mountpoint /dev/pts/
/dev/pts/ is a mountpoint

kiwi-ng system build ... some image build

...
[ DEBUG   ]: 15:00:27 | Creating directory /tmp/mytest/build/image-root/dev/pts
[ DEBUG   ]: 15:00:27 | EXEC: [mountpoint -q /tmp/mytest/build/image-root/dev/pts]
[ DEBUG   ]: 15:00:27 | EXEC: [mount -n --bind /dev/pts /tmp/mytest/build/image-root/dev/pts]
[ DEBUG   ]: 15:00:38 | EXEC: [mountpoint -q /tmp/mytest/build/image-root/dev/pts]
[ DEBUG   ]: 15:00:38 | EXEC: [mountpoint -q /tmp/mytest/build/image-root/dev/pts]
[ DEBUG   ]: 15:00:38 | EXEC: [umount -l /tmp/mytest/build/image-root/dev/pts]

mountpoint /dev/pts/
/dev/pts/ is a mountpoint

So my system /dev/pts was not touched and I'm not aware that umounting a bind location will umount the origin too.
That would be actually fatal for any other bind mounts kiwi performs

So before we revert this I would really like to understand why kiwi is causing this trouble

Thanks

@nirik
Copy link

nirik commented Dec 9, 2024

Yes, this is very weird. :(

I have been unable to duplicate it here in just mock with old-chroot. :(
It definitely does seem caused by this though, as I untagged the new kiwi version and everything builds as expected now.

The /dev/pts mount is weird though. It's hard coded into systemd to mount it, it has no fstab entry or mount unit that I can find.

systemd-nspawn (the 'default' mock chroot) mounts it's own:

    * nspawn will now mount its own devpts file system instance
      into the container, in order not to leak pty devices from
      the host into the container.

I will try and duplicate in a old mock chroot with a kiwi task. I think thats the interaction that somehow causes it

@nirik
Copy link

nirik commented Dec 9, 2024

I can definitely duplicate it in koji / old-mock-chroot.

I wonder if it could be something around using lazy unmounts and /dev and /dev/pts are both umounted and there's some kind of race there with a submount?

Also, seems like mock is doing some umounting too...

DEBUG util.py:556:  Executing command: ['/bin/umount', '-n', '/var/lib/mock/f42-kiwi-build-55487250-6536159/root/dev'] with env {'TERM': 'vt100', 'SHELL': '/bin/sh', 'HOME': '/builddir', 'HOSTNAME': 'mock', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin', 'LANG': 'C.UTF-8'} and shell False
DEBUG util.py:556:  Executing command: ['/bin/umount', '-n', '/var/lib/mock/f42-kiwi-build-55487250-6536159/root/dev'] with env {'TERM': 'vt100', 'SHELL': '/bin/sh', 'HOME': '/builddir', 'HOSTNAME': 'mock', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin', 'LANG': 'C.UTF-8'} and shell False
DEBUG util.py:461:  [ DEBUG   ]: 20:35:17 | EXEC: [umount -l /builddir/result/image/build/image-root/dev/pts]
DEBUG util.py:461:  [ DEBUG   ]: 20:35:17 | EXEC: [umount -l /builddir/result/image/build/image-root/dev]
DEBUG util.py:461:  [ DEBUG   ]: 20:36:49 | EXEC: [umount /builddir/result/image/build/image-root/dev]
DEBUG util.py:461:  [ DEBUG   ]: 20:37:39 | EXEC: [umount /var/tmp/kiwi_mount_manager.5uf_ulnx/dev]
DEBUG util.py:461:  [ DEBUG   ]: 20:37:39 | EXEC: [umount /var/tmp/kiwi_mount_manager.8nxee3lk/dev]
DEBUG util.py:461:  [ DEBUG   ]: 20:37:42 | EXEC: [umount -l /builddir/result/image/build/image-root/dev]
DEBUG util.py:461:  [ DEBUG   ]: 20:37:42 | EXEC: [umount -l /builddir/result/image/build/image-root/dev]
DEBUG util.py:461:  [ DEBUG   ]: 20:37:43 | EXEC: [umount -l /builddir/result/image/build/image-root/dev]
DEBUG util.py:556:  Executing command: ['/bin/umount', '-n', '/var/lib/mock/f42-kiwi-build-55487250-6536159/root/dev'] with env {'TERM': 'vt100', 'SHELL': '/bin/sh', 'HOME': '/builddir', 'HOSTNAME': 'mock', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin', 'LANG': 'C.UTF-8'} and shell False

Happy to try and get more detailed info...

@schaefi
Copy link
Collaborator

schaefi commented Dec 10, 2024

@Conan-Kudo @nirik Thanks for digging into this. I'm fine with the revert to avoid issues on your side. I believe image builds that requires /dev/pts in some script coding could also temporary mount/umount it. The reason why we added it was that sudo in newer versions seems to require it and I did not expect such a massive impact.

@Conan-Kudo I saw you just reverted the complete commit, can you please make sure the test stays (reduced by the dev/pts mount) because I don't want to loose that.

Thanks

@Conan-Kudo Conan-Kudo disabled auto-merge December 10, 2024 18:14
@Conan-Kudo Conan-Kudo changed the title Revert "Added /dev/pts to bind mount locations" Drop /dev/pts from bind mount locations Dec 10, 2024
This has created havoc in the Fedora build environments by
fully unmounting /dev/pts and breaking the builders for
subsquent tasks.

This is a partial revert of commit daf1323.
@Conan-Kudo
Copy link
Member Author

Done!

Copy link
Collaborator

@schaefi schaefi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, looks good to me.

@nirik if you happen to find the root cause and it's not kiwi let's try to fix that part. If you or @Conan-Kudo believe it's better to revert and merge this PR I'm fine if you go ahead and merge it.

@Conan-Kudo Conan-Kudo merged commit 89959df into main Dec 11, 2024
13 checks passed
@Conan-Kudo Conan-Kudo deleted the revert-pr2687 branch December 11, 2024 16:01
@nirik
Copy link

nirik commented Dec 11, 2024

Yeah, will let you know. I am still not sure whats happening. ;(

I do note that with sudo you can drop in a file in /etc/sudoers.d/ with:

Defaults !requiretty

and it won't need a pty. But I suppose it depends on your use case there...

@Conan-Kudo
Copy link
Member Author

Well, you shouldn't need sudo in config.sh since the script runs privileged with the rest of the image build. So this is more of a "you're doing it wrong" kind of thing.

@schaefi
Copy link
Collaborator

schaefi commented Dec 12, 2024

Yeah, it was a customer use case and I did not took a closer look what they are doing. I just thought it can't hurt to map another common kernel filesystem to the build process... that assumption was pretty wrong as we see ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants