Fix OOM slice removal race #2353
[APPROVALNOTIFIER] This PR is APPROVED
This pull request has been approved by: saschagrunert
Codecov Report
All modified and coverable lines are covered by tests ✅
@@            Coverage Diff             @@
##             main    #2353      +/-   ##
==========================================
- Coverage   37.53%   37.34%   -0.20%
==========================================
  Files          15       15
  Lines        1268     1264       -4
  Branches      414      420       +6
==========================================
- Hits          476      472       -4
+ Misses        526      524       -2
- Partials      266      268       +2
@haircommander @rphillips PTAL
If the slice is already removed then we mostly encounter two different errors:
- `get next line: No such device (os error 19)`
- `open memory events file: /sys/fs/cgroup/test.slice/crio-$ID.scope/memory.events: No such file or directory (os error 2)`
To avoid such a race we now check after the errors if the file still exists. If not, then we assume an OOM.
Signed-off-by: Sascha Grunert <[email protected]>
Nice!
What type of PR is this?
/kind bug
What this PR does / why we need it:
If the slice has already been removed, we mostly encounter one of two errors:
- `get next line: No such device (os error 19)`
- `open memory events file: /sys/fs/cgroup/test.slice/crio-$ID.scope/memory.events: No such file or directory (os error 2)`
To avoid this race, we now check whether the file still exists after either error occurs. If it does not, we assume an OOM.
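For illustration, the check described above could be sketched roughly as follows. This is a minimal sketch and not the code from this PR: `OomCheck`, `check_oom`, and `read_oom_kill` are hypothetical names, and reading the `oom_kill` counter from `memory.events` is an assumption about how the file is consumed.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Outcome of one attempt to inspect `memory.events` (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum OomCheck {
    /// The file was readable and reported this many OOM kills.
    Counted(u64),
    /// Reading failed and the file no longer exists, so an OOM is assumed.
    AssumedOom,
}

/// Read the `oom_kill` counter; returns 0 if the key is absent (assumption).
pub fn read_oom_kill(path: &Path) -> io::Result<u64> {
    // May fail with "No such file or directory (os error 2)".
    let file = File::open(path)?;
    for line in BufReader::new(file).lines() {
        // May fail with "No such device (os error 19)" once the cgroup is gone.
        let line = line?;
        if let Some(rest) = line.strip_prefix("oom_kill ") {
            return rest
                .trim()
                .parse::<u64>()
                .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e));
        }
    }
    Ok(0)
}

/// After any read error, re-check whether the file still exists.
/// If it does not, treat the failure as an OOM instead of propagating it.
pub fn check_oom(path: &Path) -> io::Result<OomCheck> {
    match read_oom_kill(path) {
        Ok(count) => Ok(OomCheck::Counted(count)),
        Err(_) if !path.exists() => Ok(OomCheck::AssumedOom),
        Err(e) => Err(e),
    }
}
```

Calling `check_oom` on a path that has already disappeared yields `OomCheck::AssumedOom`, while an error on a file that is still present is returned to the caller unchanged.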
Which issue(s) this PR fixes:
Fixes cri-o/cri-o#8411
Special notes for your reviewer:
None
Does this PR introduce a user-facing change?