Unable to create containers when cgroup subsystems are mounted in multiple places #1367

teddyking · 2017-03-09T16:58:45Z

We recently ran into an issue whereby runc was not able to create containers and was failing with the following error:

EOF
container_linux.go:259: starting container process caused "process_linux.go:283: applying cgroup configuration for process caused \"open /sys/fs/cpuset.cpus: no such file or directory\""

This was happening due to an additional cpu,memory cgroup subsystem mounted somewhere on the system (at /cgroups in this case):

# cat /proc/self/mountinfo | grep cgroup
24 18 0:19 / /sys/fs/cgroup rw,relatime - tmpfs none rw
32 23 0:25 / /cgroups rw,relatime - cgroup cgroup rw,cpu,memory
33 24 0:26 / /sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
34 24 0:27 / /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
35 24 0:25 / /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu,memory
36 24 0:25 / /sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,cpu,memory

It seems that this is only a problem when two mounts of the same subsystem are performed in a certain order (i.e., in the example above, the /cgroups mount appears first in /proc/self/mountinfo).
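To make the duplicate mounts easier to see, here is a minimal Go sketch that groups cgroup mountpoints by subsystem in the order they appear in mountinfo. It is not runc's code: `cgroupMounts` is a hypothetical helper, the input is the excerpt pasted above, and treating every super option other than `rw`/`ro` as a subsystem name is a simplifying assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sample is the /proc/self/mountinfo excerpt from the report above.
const sample = `24 18 0:19 / /sys/fs/cgroup rw,relatime - tmpfs none rw
32 23 0:25 / /cgroups rw,relatime - cgroup cgroup rw,cpu,memory
33 24 0:26 / /sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
34 24 0:27 / /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
35 24 0:25 / /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu,memory
36 24 0:25 / /sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,cpu,memory`

// cgroupMounts (hypothetical helper) maps each cgroup v1 subsystem to its
// mountpoints, preserving the order in which they appear in mountinfo.
func cgroupMounts(mountinfo string) map[string][]string {
	mounts := make(map[string][]string)
	for _, line := range strings.Split(mountinfo, "\n") {
		// Fields before " - " describe the mount; fields after it describe
		// the filesystem type, source and super options.
		parts := strings.SplitN(line, " - ", 2)
		if len(parts) != 2 {
			continue
		}
		pre := strings.Fields(parts[0])
		post := strings.Fields(parts[1])
		if len(pre) < 5 || len(post) < 3 || post[0] != "cgroup" {
			continue
		}
		mountpoint := pre[4]
		for _, opt := range strings.Split(post[2], ",") {
			// Assumption: anything other than rw/ro is a subsystem name.
			if opt == "rw" || opt == "ro" {
				continue
			}
			mounts[opt] = append(mounts[opt], mountpoint)
		}
	}
	return mounts
}

func main() {
	mounts := cgroupMounts(sample)
	names := make([]string, 0, len(mounts))
	for name := range mounts {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if len(mounts[name]) > 1 {
			fmt.Printf("%s is mounted %d times: %v\n", name, len(mounts[name]), mounts[name])
		}
	}
}
```

For the excerpt above, cpu and memory each show up at three mountpoints, with /cgroups listed first. A lookup that simply takes the first match for a subsystem would therefore pick /cgroups rather than the mount under /sys/fs/cgroup.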

Is runc able/supposed to support this sort of setup?

Cheers,
Ed & @craigfurman


cyphar · 2017-03-09T23:17:44Z

Yeah, this is related to #798 and several other issues we've had in the past. IMO we need to rework how mountpoint checking is done (the code is basically unreadable and is [in my mind] fragile). In particular this code also had similar issues with mounting different cgroup versions (which has now been fixed but I don't like the fix 😉).

In my mind, the nice way for this to be handled (so that it can also accommodate #774) is this:

1. Have a list of "implementations" of cgroup (or rather resource) handlers [which includes both cgroupv2 and cgroupv1].
2. Go through the list, trying to see if they can set the limit for $resource. continue if any of them succeeds.
3. If none of them could set the limit, check whether the limit is the "default" and output an error if it isn't.
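The steps above could be sketched roughly as follows in Go. Everything here is hypothetical and for illustration only: `resourceHandler`, `fakeHandler` and `setLimit` are made-up names, not runc's API, and limits are modelled as plain strings.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported is returned by a handler that cannot manage a resource.
var errUnsupported = errors.New("resource not supported by this handler")

// resourceHandler is a hypothetical interface for one "implementation"
// (for example cgroupv1 or cgroupv2). It is not part of runc.
type resourceHandler interface {
	Name() string
	SetLimit(resource, value string) error
}

// fakeHandler is an in-memory stand-in used only for this sketch.
type fakeHandler struct {
	name      string
	supported map[string]bool
	applied   map[string]string
}

func (h *fakeHandler) Name() string { return h.name }

func (h *fakeHandler) SetLimit(resource, value string) error {
	if !h.supported[resource] {
		return errUnsupported
	}
	h.applied[resource] = value
	return nil
}

// setLimit tries each handler in order and stops at the first success.
// If no handler succeeds, it is only an error when the requested value
// differs from the default.
func setLimit(handlers []resourceHandler, resource, value, defaultValue string) error {
	for _, h := range handlers {
		if err := h.SetLimit(resource, value); err == nil {
			return nil
		}
	}
	if value == defaultValue {
		return nil
	}
	return fmt.Errorf("no handler could set %s=%q", resource, value)
}

func main() {
	v2 := &fakeHandler{name: "cgroupv2", supported: map[string]bool{"pids": true}, applied: map[string]string{}}
	v1 := &fakeHandler{name: "cgroupv1", supported: map[string]bool{"memory": true}, applied: map[string]string{}}
	handlers := []resourceHandler{v2, v1}

	fmt.Println(setLimit(handlers, "memory", "1024", "")) // handled by cgroupv1
	fmt.Println(setLimit(handlers, "hugetlb", "", ""))    // unsupported, but default
	fmt.Println(setLimit(handlers, "hugetlb", "2", ""))   // unsupported and non-default
}
```

The sketch discards the per-handler errors for brevity. A real version would probably want to collect them so the final error explains why each implementation declined.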

The Set and Apply split probably won't work with the above idea, but I'm unsure whether exposing that cgroup interface in our Go interface makes much sense. Then there's the whole kmemcg fiasco.

The above idea could be extended to check (with syscall.Access) whether we have Apply privileges in the rootless context.
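A minimal sketch of such a check in Go, assuming Linux: `canWrite` is a hypothetical helper, and the `wOK` constant is defined locally (mirroring `W_OK` from `<unistd.h>`) rather than taken from a package. Note that access(2) checks against the real UID/GID, which may or may not be what a rootless setup wants.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// wOK mirrors W_OK from <unistd.h>. It is defined here as an assumption of
// this sketch instead of relying on a package-provided constant.
const wOK = 0x2

// canWrite (hypothetical helper) reports whether the calling process is
// allowed to write to path, according to access(2).
func canWrite(path string) bool {
	return syscall.Access(path, wOK) == nil
}

func main() {
	// The cgroup path is only an example and may not exist on every system.
	for _, p := range []string{os.TempDir(), "/sys/fs/cgroup/memory"} {
		fmt.Printf("%s writable: %v\n", p, canWrite(p))
	}
}
```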

teddyking · 2017-03-10T11:00:53Z

Ok so iiuc this is a known issue and it looks like it may require a fairly large change in order to address it. Has anyone started to look at this? If not we could maybe try to get a PR organised. Is this something that would be likely to be accepted?

cyphar · 2017-03-10T11:06:14Z

I had started to look at it, but I've been swamped by other things recently. IMO if you put together a patch that managed to clean up the mountpoint-finding magic we're currently using to be more resistant to different setups I'd be okay with merging it (though I can't speak for the other maintainers). I wouldn't recommend going straight for the entire rewrite that I proposed above -- simply because it would touch a lot of code that would make people nervous (kmemcg being a prime example).

Though I would like to point out that having multiple mountpoints of the same subsystem is a bit of an odd thing to do (they track the same information) -- however as I mentioned above this is a symptom of a bigger issue we have in the current cgroup manager implementation in runC.

teddyking · 2017-03-10T13:06:09Z

Ok cool, that sounds reasonable to me.

Yeah I agree it is a bit strange... The background here is that a user was attempting to deploy Garden alongside some other software that was doing its own independent cgroup setup, which then caused runc to start erroring with the above message.

craigfurman · 2017-03-15T12:00:03Z

@cyphar Opened a PR for your review.

hqhq · 2017-08-04T01:36:53Z

Fixed by #1372

In cgroup v1, the parent of a cgroup mount is on tmpfs (or maybe any other fs but definitely not cgroupfs). Use this to traverse up the tree until we reach non-cgroupfs.

This should work in any setups (nested container, non-standard cgroup mounts) so issue [1] won't happen.

Theoretically, the only problematic case is when someone mounts cpuset cgroupfs into a directory on another cgroupfs. Let's assume people don't do that -- if they do, they will get other error (e.g. inability to read cpuset.cpus file).

[1] opencontainers#1367

Signed-off-by: Kir Kolyshkin <[email protected]>
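The upward traversal described in this commit message can be illustrated with a short Go sketch. This is not the code from the commit: `cgroupMountRoot` and `fakeIsCgroupfs` are hypothetical names, and the filesystem check is passed in as a function so the example runs without touching a real cgroup hierarchy. A real check would presumably use statfs(2) and compare the filesystem type against the cgroup magic number.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// cgroupMountRoot (hypothetical helper) walks up from dir for as long as the
// parent directory is still on cgroupfs, and returns the topmost directory
// reached. If dir's parent is not on cgroupfs, dir itself is returned.
func cgroupMountRoot(dir string, isCgroupfs func(string) bool) string {
	dir = filepath.Clean(dir)
	for {
		parent := filepath.Dir(dir)
		// Stop at the filesystem root or once the parent leaves cgroupfs.
		if parent == dir || !isCgroupfs(parent) {
			return dir
		}
		dir = parent
	}
}

// fakeIsCgroupfs pretends that only /sys/fs/cgroup/cpuset and everything
// below it is on cgroupfs. It stands in for a statfs-based check.
func fakeIsCgroupfs(path string) bool {
	const root = "/sys/fs/cgroup/cpuset"
	return path == root || strings.HasPrefix(path, root+"/")
}

func main() {
	start := "/sys/fs/cgroup/cpuset/docker/abc"
	fmt.Println(cgroupMountRoot(start, fakeIsCgroupfs))
}
```

With the fake predicate, starting from /sys/fs/cgroup/cpuset/docker/abc the walk stops at /sys/fs/cgroup/cpuset, because its parent /sys/fs/cgroup is tmpfs in the setup described above.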

craigfurman mentioned this issue Mar 15, 2017

Handle container creation when cgroups have already been mounted in another location #1372

Merged

hqhq closed this as completed Aug 4, 2017
