Implement workaround to clean up leaking cgroups #570
Conversation
This change implements a cleaner that scans for cgroups created by systemd-run --scope that do not have any pids assigned, indicating that the cgroup is unused and should be cleaned up. On some systems, due to either systemd or the kernel, the scope is not cleaned up when the pids within it have completed execution, leading to an eventual memory leak. Kubernetes uses systemd-run --scope when creating mount points that may require drivers to be loaded/running in a separate context from kubelet, which allows the above leak to occur. kubernetes/kubernetes#70324 kubernetes/kubernetes#64137 gravitational/gravity#1219
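For illustration, here is a minimal sketch of such a cleaner, assuming cgroup v1 with the systemd hierarchy mounted at /sys/fs/cgroup/systemd and transient scopes placed under system.slice; the function name `cleanupLeakedScopes`, the base directory, and the systemctl-based stop are illustrative assumptions, not the PR's actual implementation:

```go
package main

import (
	"log"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
	"time"
)

// Assumed location of scope cgroups; verify for your hierarchy.
const systemdCgroupDir = "/sys/fs/cgroup/systemd/system.slice"

func cleanupLeakedScopes(olderThan time.Time) error {
	entries, err := os.ReadDir(systemdCgroupDir)
	if err != nil {
		return err
	}
	for _, entry := range entries {
		// systemd-run --scope creates transient units named run-*.scope.
		if !entry.IsDir() || !strings.HasPrefix(entry.Name(), "run-") ||
			!strings.HasSuffix(entry.Name(), ".scope") {
			continue
		}
		dir := filepath.Join(systemdCgroupDir, entry.Name())
		info, err := os.Stat(dir)
		if err != nil || info.ModTime().After(olderThan) {
			// Too new: systemd may not have placed the launched pid yet.
			continue
		}
		// An empty cgroup.procs means no pids are assigned to the cgroup.
		procs, err := os.ReadFile(filepath.Join(dir, "cgroup.procs"))
		if err != nil || len(strings.TrimSpace(string(procs))) != 0 {
			continue
		}
		// Ask systemd to stop the leaked transient scope.
		if err := exec.Command("systemctl", "stop", entry.Name()).Run(); err != nil {
			log.Printf("failed to stop %v: %v", entry.Name(), err)
		}
	}
	return nil
}

func main() {
	// One-minute grace period, mirroring the baseTime value in the diff below.
	if err := cleanupLeakedScopes(time.Now().Add(-time.Minute)); err != nil {
		log.Fatal(err)
	}
}
```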
```go
var paths []string

baseTime := time.Now().Add(-time.Minute)
```
Do you think a 1 minute interval is enough? Maybe, to be on the safe side, make it something like an hour, or at least 10 minutes?
I was thinking the likelihood of a race here is something like a few ns, maybe a few ms if the system is busy. This should be systemd creating a scope, which starts out empty, and then placing the launched process into the particular cgroup. So the window is extremely tiny, and a minute is already overkill. If the logic here is incorrect and this stops scopes it shouldn't, I don't see much difference between stopping a scope that is 1 minute old and one that is 1 hour old in terms of the problems caused, except that 1 minute old is hopefully a lot more apparent.
I suppose the potential for a false positive is if a process inside planet specifically creates a scope and then doesn't use it for a few minutes, in which case some other process will have come in and cleaned it up manually.
It would be possible to get the scope object from systemd first and match specifically against Kubernetes mounts, but I was trying to keep the implementation relatively simple. A rough sketch of that alternative is below.
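For reference, a rough sketch of that alternative using the github.com/coreos/go-systemd/v22/dbus bindings to enumerate units and match on the unit description; the function name and the "Kubernetes transient mount for " prefix (the description kubelet has historically passed to systemd-run for mounts) are assumptions to verify, not code from this PR:

```go
package main

import (
	"context"
	"log"
	"strings"

	"github.com/coreos/go-systemd/v22/dbus"
)

func stopLeakedMountScopes(ctx context.Context) error {
	conn, err := dbus.NewWithContext(ctx)
	if err != nil {
		return err
	}
	defer conn.Close()

	units, err := conn.ListUnitsContext(ctx)
	if err != nil {
		return err
	}
	for _, unit := range units {
		// Transient scopes from systemd-run are named run-*.scope, and
		// kubelet labels its mount scopes with a recognizable description.
		if !strings.HasPrefix(unit.Name, "run-") ||
			!strings.HasSuffix(unit.Name, ".scope") ||
			!strings.HasPrefix(unit.Description, "Kubernetes transient mount for ") {
			continue
		}
		// "replace" queues the stop job, replacing any conflicting job.
		if _, err := conn.StopUnitContext(ctx, unit.Name, "replace", nil); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	if err := stopLeakedMountScopes(context.Background()); err != nil {
		log.Fatal(err)
	}
}
```

A real cleaner would still need to confirm the scope's cgroup has no pids before stopping it, as in the main approach.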
* Implement workaround to clean up leaking cgroups
* change logging level for cgroup cleanup
* address review feedback
* address review feedback

(cherry picked from commit 00ed8e6)
This change implements a cleaner that scans for cgroups created by
systemd-run --scope that do not have any pids assigned, indicating
that the cgroup is unused and should be cleaned up. On some systems,
due to either systemd or the kernel, the scope is not cleaned up
when the pids within it have completed execution, leading to an
eventual memory leak.
Kubernetes uses systemd-run --scope when creating mount points
that may require drivers to be loaded/running in a separate context
from kubelet, which allows the above leak to occur.
kubernetes/kubernetes#70324
kubernetes/kubernetes#64137
Updates gravitational/gravity#1219