-
Notifications
You must be signed in to change notification settings - Fork 307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kpatch-build: incorrect or missing correlation of references to .rodata.*.str1_1 leads to kernel crashes #1225
Comments
Tough problem. As I understand it, the patch inadvertently introduced "patched" code that now sets up pointers to the patch module, pointers that may outlive the patch module lifetime :( Atomic replace doesn't help, and in fact, may only encourage more kpatch module unloading. I don't know of any 100% perfect way to fix this problem. Substituting the relocations as suggested should solve this specific instance... but the general problem may manifest in other ways depending on the pointer type (thinking of function pointers, or did we do the same trick for those?). Perhaps it's worth kpatch-build highlighting these changes as a non-fatal warning? Could they reasonably be checked by reviewing the changed function list vs. the input patch? |
Exactly.
We dealt with function pointers in 495e619, to some extent. It relies on a couple heuristics, but has worked so far. Besides, processing functions is easier, because create-diff-object detects which ones are new or changed. As for
|
I wonder, would it help if create-diff-object used dynrelas for the references to |
Hmm, the idea is that the dynrelas would resolve them to their originals? Are there cases in which the string contents really do need to change? And in which case, a livepatch that starting proliferating pointers to livepatch provided string contents could never be safely disabled. Maybe I'm not thinking this through, but it feels like an impossible to solve problem, for all cases anyway. Using the original relocation/contents sounds reasonable, I just wonder how to catch any instances where it's not. Those seem like inherently unsafe patches that maybe we could bring attention to somehow. |
So far, I have found only 2 places in the kernel where the problem might happen: the one in sctp I described above and also one location in the module loader code (add_sect_attrs()), where I found the issue initially: kernel/module.c:
In both cases, it makes no sense to change the string in the patch. Most of the time, the kernel code copies the needed strings rather than refers to them directly - it is safer this way. |
…Stream The list of assignments of string pointers to non-local variables. I filtered out some records, not interesting for livepatch, e.g. the ones from lib/ and a few other places livepatch cannot handle. Note that there are still many records in the log, where such assignments are not a problem, i.e. false positives. '*error = ' and such in dm/* seem to be one common example: the assigned string seems "short-lived", it is likey only used by the callers of the given function rather than stored for later. On the other hand, the detected cases in crypto/asymmetric_keys/*.c, fs/nfs/client.c and some other might be problematic for livepatch. See dynup/kpatch#1225 for details.
Out of curiosity, I wrote a small plugin for GCC, which reports assignment of const string pointers to non-local variables: https://github.com/euspectre/stored_strings/ . I used it when rebuilding kernel 5.14.0-80.el9.x86_64 from CentOS 9 Stream, and it had show lots of such assignments in the kernel code. Here are the (slightly filtered) results: https://github.com/euspectre/stored_strings/blob/master/results/centos9-5.14.0-80.el9.x86_64/filtered01.log. About 1400 locations total. It seems that many such constructs create no problems for livepatch, though.
Unfortunately, it seems difficult to filter out such cases automatically. However, there are many locations where livepatching might still run into the issue. Examples:
The GCC plugin sometimes produces false positives, so one cannot just add it to KCFLAGS in kpatch-build as is. Still, it might give hints which kernel functions might require extra care when preparing livepatches. |
As for my original ideas how to fix the issue - they won't work that easily, I'm afraid. The problem is not because the contents of .rodata.*.str1.1 change. kpatch_include_standard_elements() in create-diff-object always includes .rodata.*.str1.* sections, even if they are the same. So, even a livepatch that adds a "nop" in proc_sctp_do_hmac_alg() would have the issue. proc_sctp_do_hmac_alg() from the original kernel refers to .rodata.str1.1, while the one in the patched module - to .rodata.proc_sctp_do_hmac_alg.str1.1. create-diff-object does not have access to the original .rodata.str1.1 section. Detection of such unsafe assignments in the binary code also seems difficult: they look the same as the assignments to locals, etc. Not sure what to do with this. |
This issue has been open for 30 days with no activity and no assignee. It will be closed in 7 days unless a comment is added. |
This issue has been open for 30 days with no activity and no assignee. It will be closed in 7 days unless a comment is added. |
kpatch: 0.9.4
OS: CentOS 8 x86_64
Kernel: 4.18.0-305.19.1.el8_4.x86_64
While investigating an unrelated bug, I found that kpatch-build/create-diff-object probably does not correlate references to strings from .rodata.*.str1.1 properly and that causes difficult to debug kernel crashes.
Consider the following simple patch for sctp:
Note that, the locations of strings "md5" and "sha1" in .rodata.proc_sctp_do_hmac_alg.str1.1 changed in the patched code, because the printk format string was added:
The strings are used as follows in proc_sctp_do_hmac_alg():
Although this part of the source code remains unchanged, the binary code of the patched file is different because it refers to different locations in .rodata.proc_sctp_do_hmac_alg.str1.1.
The patch module has its own "md5" and "sha1" strings now, and the patched
proc_sctp_do_hmac_alg()
refers to them rather than to the original ones.This is problematic because the addresses of the strings are stored in
net->sctp.sctp_hmac_alg
for later use. If we unload the patch module at the appropriate moment, the kernel will still keep the dangling addresses of these strings in the patch module and will crash when it tries to access them.From vmcore-dmesg.txt:
Note that ffffffffc075f60d mentioned in the "BUG" message is
.rodata.proc_sctp_do_hmac_alg.str1.1+0x18
, the address of "md5" string in the patch module.As you can see, it is quite easy to miss such string references when preparing patches. They are relatively rare in the kernel but can still be found.
I guess, create-diff-object could detect that a string did not change in the patched object and replace the relocation with the one for the original data. Or, perhaps, you have better ideas how to fix it.
The text was updated successfully, but these errors were encountered: