Unable to create containers when cgroup subsystems are mounted in multiple places #1367

Closed
teddyking opened this issue Mar 9, 2017 · 6 comments

Comments

@teddyking
Contributor

We recently ran into an issue whereby runc was not able to create containers and was failing with the following error:

EOF
container_linux.go:259: starting container process caused "process_linux.go:283: applying cgroup configuration for process caused \"open /sys/fs/cpuset.cpus: no such file or directory\""

This was happening due to an additional cpu,memory cgroup subsystem mounted somewhere on the system (at /cgroups in this case):

# cat /proc/self/mountinfo | grep cgroup
24 18 0:19 / /sys/fs/cgroup rw,relatime - tmpfs none rw
32 23 0:25 / /cgroups rw,relatime - cgroup cgroup rw,cpu,memory
33 24 0:26 / /sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
34 24 0:27 / /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
35 24 0:25 / /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu,memory
36 24 0:25 / /sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,cpu,memory

It seems that this is only a problem when two mounts of the same subsystem are performed in a certain order (i.e., in the example above, the /cgroups mount appears in /proc/self/mountinfo before the /sys/fs/cgroup/cpu and /sys/fs/cgroup/memory mounts of the same hierarchy).
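
To check whether a host is in this state, the mountinfo output above can be grouped by controller. The following is a minimal Go sketch, not runc's own code: cgroupMounts and v1Controllers are names made up for illustration, and the parser assumes the usual /proc/self/mountinfo layout in which " - " separates the per-mount fields from the filesystem type, source and super options.

```go
package main

import (
	"bufio"
	"strings"
)

// v1Controllers lists cgroup v1 controller names. Any other super option
// (rw, name=systemd, ...) is ignored. Illustrative, not taken from runc.
var v1Controllers = map[string]bool{
	"blkio": true, "cpu": true, "cpuacct": true, "cpuset": true,
	"devices": true, "freezer": true, "hugetlb": true, "memory": true,
	"net_cls": true, "net_prio": true, "perf_event": true, "pids": true,
}

// cgroupMounts returns, for each controller, every mountpoint found in the
// given mountinfo text, in the order the lines appear.
// Hypothetical helper for diagnosing duplicate mounts.
func cgroupMounts(mountinfo string) map[string][]string {
	mounts := make(map[string][]string)
	sc := bufio.NewScanner(strings.NewReader(mountinfo))
	for sc.Scan() {
		// Left of " - ": mount ID, parent ID, major:minor, root, mountpoint, ...
		// Right of " - ": filesystem type, source, super options.
		parts := strings.SplitN(sc.Text(), " - ", 2)
		if len(parts) != 2 {
			continue
		}
		pre := strings.Fields(parts[0])
		post := strings.Fields(parts[1])
		if len(pre) < 5 || len(post) < 3 || post[0] != "cgroup" {
			continue
		}
		mountpoint := pre[4]
		for _, opt := range strings.Split(post[2], ",") {
			if v1Controllers[opt] {
				mounts[opt] = append(mounts[opt], mountpoint)
			}
		}
	}
	return mounts
}
```

Fed the six lines above, this lists /cgroups first for both cpu and memory, followed by /sys/fs/cgroup/cpu and /sys/fs/cgroup/memory, while cpuset and devices each have a single mountpoint. Any controller with more than one entry is a candidate for the ordering problem described here.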

Is runc able/supposed to support this sort of setup?

Cheers,
Ed & @craigfurman

@cyphar
Member

cyphar commented Mar 9, 2017

Yeah, this is related to #798 and several other issues we've had in the past. IMO we need to rework how mountpoint checking is done (the code is basically unreadable and is [in my mind] fragile). In particular this code also had similar issues with mounting different cgroup versions (which has now been fixed but I don't like the fix 😉).

In my mind, the nice way for this to be handled (so that it can also accommodate #774) is this:

  • Have a list of "implementations" of cgroup (or rather resource) handlers [which includes both cgroupv2 and cgroupv1].
  • Go through the list, trying to see if they can set the limit for $resource. continue if any of them succeeds.
  • If none of them could set the limit, check whether the limit is the "default" and output an error if it isn't.
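
As a rough illustration of the three steps above, here is a minimal Go sketch. Everything in it is hypothetical: the Handler interface, the Limit struct, applyLimits and the fakeHandler stand-in are invented for this example and do not correspond to runc's actual cgroup manager API.

```go
package main

import "fmt"

// Handler is a hypothetical resource handler (for example one for cgroup v1
// and one for cgroup v2). It is not runc's real interface.
type Handler interface {
	SetLimit(resource string, value int64) error
}

// Limit is a hypothetical requested limit plus the value considered "default".
type Limit struct {
	Resource string
	Value    int64
	Default  int64
}

// applyLimits tries each handler in order for every limit. A limit that no
// handler can set is an error only when it differs from its default.
func applyLimits(handlers []Handler, limits []Limit) error {
	for _, l := range limits {
		applied := false
		for _, h := range handlers {
			if err := h.SetLimit(l.Resource, l.Value); err == nil {
				applied = true
				break // one handler succeeded; move on to the next limit
			}
		}
		if !applied && l.Value != l.Default {
			return fmt.Errorf("no cgroup handler could set %s=%d", l.Resource, l.Value)
		}
	}
	return nil
}

// fakeHandler accepts only the resources named in supported and records what
// it was asked to apply. It exists purely to exercise applyLimits.
type fakeHandler struct {
	supported map[string]bool
	applied   map[string]int64
}

func (f *fakeHandler) SetLimit(resource string, value int64) error {
	if !f.supported[resource] {
		return fmt.Errorf("%s: not supported by this handler", resource)
	}
	f.applied[resource] = value
	return nil
}
```

One consequence of this shape is that a non-default limit is never silently dropped: either some handler applies it or the caller gets an error, while untouched defaults pass through on hosts where the matching controller is missing.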

The Set and Apply split probably won't work with the above idea, but I'm unsure whether exposing that cgroup interface in our Go interface makes much sense. Then there's the whole kmemcg fiasco.

The above idea could be extended to check (with syscall.Access) whether we have Apply privileges in the rootless context.
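
A minimal sketch of such a check, assuming a Linux host: canApply is a hypothetical helper name, and the write-permission constant is defined locally as an assumption rather than taken from a library.

```go
package main

import "syscall"

// wOK mirrors W_OK from access(2). It is defined locally for this sketch.
const wOK = 0x2

// canApply reports whether the calling process may write to cgroupDir, which
// is a precondition for creating child cgroups or writing limit files there.
// Hypothetical helper, not part of runc.
func canApply(cgroupDir string) bool {
	return syscall.Access(cgroupDir, wOK) == nil
}
```

A handler could call this before attempting to set anything and report itself as unavailable when the check fails, instead of failing partway through.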

@teddyking
Contributor Author

Ok so iiuc this is a known issue and it looks like it may require a fairly large change in order to address it. Has anyone started to look at this? If not we could maybe try to get a PR organised. Is this something that would be likely to be accepted?

@cyphar
Member

cyphar commented Mar 10, 2017

I had started to look at it, but I've been swamped by other things recently. IMO if you put together a patch that managed to clean up the mountpoint-finding magic we're currently using to be more resistant to different setups I'd be okay with merging it (though I can't speak for the other maintainers). I wouldn't recommend going straight for the entire rewrite that I proposed above -- simply because it would touch a lot of code that would make people nervous (kmemcg being a prime example).

Though I would like to point out that having multiple mountpoints of the same subsystem is a bit of an odd thing to do (they track the same information) -- however as I mentioned above this is a symptom of a bigger issue we have in the current cgroup manager implementation in runC.

@teddyking
Contributor Author

Ok cool, that sounds reasonable to me.

Yeah I agree it is a bit strange... The background here is that a user was attempting to deploy Garden alongside some other software that was doing its own independent cgroup setup, which then caused runc to start erroring with the above message.

@craigfurman

@cyphar Opened a PR for your review.

@hqhq
Contributor

hqhq commented Aug 4, 2017

Fixed by #1372

hqhq closed this as completed Aug 4, 2017
kolyshkin added a commit to kolyshkin/runc that referenced this issue Nov 24, 2020
In cgroup v1, the parent of a cgroup mount is on tmpfs
(or maybe any other fs but definitely not cgroupfs).

Use this to traverse up the tree until we reach non-cgroupfs.

This should work in any setup (nested container, non-standard
cgroup mounts) so issue [1] won't happen.

Theoretically, the only problematic case is when someone mounts
cpuset cgroupfs into a directory on another cgroupfs. Let's
assume people don't do that -- if they do, they will get another
error (e.g. inability to read cpuset.cpus file).

[1] opencontainers#1367

Signed-off-by: Kir Kolyshkin <[email protected]>
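
The traversal described in this commit message can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not the code from the referenced commit: isCgroupfs and topCgroupDir are hypothetical names, and the filesystem check is passed in as a function so the walk can be tried without a real cgroup v1 mount.

```go
package main

import (
	"path/filepath"
	"syscall"
)

// cgroupSuperMagic is CGROUP_SUPER_MAGIC (cgroup v1) from linux/magic.h.
const cgroupSuperMagic = 0x27e0eb

// isCgroupfs reports whether path is on a cgroup v1 filesystem, using
// statfs(2). Errors are treated as "not cgroupfs".
func isCgroupfs(path string) bool {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return false
	}
	return int64(st.Type) == cgroupSuperMagic
}

// topCgroupDir walks up from dir and stops at the first directory whose
// parent is no longer on cgroupfs, i.e. the root of the cgroup mount.
// If dir's parent is already off cgroupfs, dir is returned unchanged.
func topCgroupDir(dir string, onCgroupfs func(string) bool) string {
	dir = filepath.Clean(dir)
	for {
		parent := filepath.Dir(dir)
		if parent == dir || !onCgroupfs(parent) {
			return dir
		}
		dir = parent
	}
}
```

On a real host the call would be topCgroupDir(path, isCgroupfs). Because the walk relies on the filesystem type rather than on the order of entries in /proc/self/mountinfo, an extra mount such as /cgroups does not change where it stops.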
kolyshkin added a commit to kolyshkin/runc that referenced this issue Nov 25, 2020
kolyshkin added a commit to kolyshkin/runc that referenced this issue Nov 27, 2020
kolyshkin added a commit to kolyshkin/runc that referenced this issue Dec 4, 2020
kolyshkin added a commit to kolyshkin/runc that referenced this issue Jan 6, 2021
dqminh pushed a commit to dqminh/runc that referenced this issue Feb 3, 2021
dims pushed a commit to dims/libcontainer that referenced this issue Oct 19, 2024
kolyshkin added a commit to kolyshkin/containerd-cgroups that referenced this issue Nov 6, 2024

4 participants