Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The state.json should be generated prior to the creation of the cgroup. #4535

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jianghao65536
Copy link

Fix 4534

Make sure that the state.json is in place before setting up the cgroup or writing 'THAWED' into the freezer.state. This way, the 'runc delete --force' command will work as expected.

@wxx213
Copy link

wxx213 commented Nov 19, 2024

@kolyshkin Could you help to check this?

@@ -561,6 +561,13 @@ func (p *initProcess) start() (retErr error) {
}
}()

// A SIGKILL can happen at any time, and without the state.json,
// the 'runc delete --force' command won't be able to clear the cgroup.
_, err = p.container.updateState(p)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
We should know that we will write the pid of init process to state.json, so when we do delete -f, once this init process has dead, but runc stage 2 process is alive, I think maybe we still can't remove this cgroup path either because there is still one process in this cgroup.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my testing, I discovered that when systemd is used to handle cgroup, it gets cleaned up once runc init has exited. However, if we don't use systemd, some remnants of cgroup are left behind.

If we're not using systemd to manage, we might need to do a check in runc delete to see if runc init is still listed in cgroups.procs. This shouldn't take too much time since runc init is bound to exit due to an error. This error happens because when runc init gets to parts like procHooks that require synchronization with the parent process, it fails as the parent process has already been terminated, leading to errors when runc init tries to write or read values from the pipeline, and consequently, it exits.

{BF5FEB57-6023-4623-A04F-7C96FC189661} {F284436E-31E8-4967-AE89-18A124B1C24C}

@lifubang
Copy link
Member

I paste the CI error msgs here, you can refer it if you can't see the logs.

  • libcontainer/process_linux.go:564: File is not gofumpt-ed (gofumpt)
    // A SIGKILL can happen at any time, and without the state.json,
  • libcontainer/process_linux.go:564: // A SIGKILL can happen at any time, and without the state.json,
    ^^^ extra whitespace at EOL, please fix
  • Commit sha: 2026161, Author: jianghao53, Committer: jianghao53; The sign-off is missing.

To add your Signed-off-by line to every commit in this branch:

Ensure you have a local copy of your branch by checking out the pull request locally via command line.
In your local branch, run: git rebase HEAD~1 --signoff
Force push your changes to overwrite the branch: git push --force-with-lease origin generate-state-before-procHooks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants