-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perform a checked add in the ptrace dumper ModuleMemory impl. #129
Conversation
I'm augmenting this with a bunch of other arithmetic checks throughout the module_reader code. |
Do you have the /proc/pid/maps from when you saw the incorrect / corrupted state? I wonder if it's not corruption but something else going on. |
I do not have the inputs, though I would like to have them. It was on Android in CI testing: https://treeherder.mozilla.org/logviewer?job_id=469290206&repo=autoland&lineNumber=8170. |
This is bizarre and might be hiding a deeper problem. Is the issue reproducible? Can you check what values are involved and where did they come from? |
I am going to try some runs with debug logging to try to track it down. |
Regardless of the underlying issue, we are doing arithmetic with values derived from inputs, so I think we should not panic if overflow occurs. |
Yes, I agree. |
The problem is reproducible, now I'm trying to figure out how to exfiltrate some debug information. |
|
|
I've still been unable to figure out why this fails, however the fix here does remedy the error. While I'd like to know the root cause, it'd be nice to get this merged. |
I just re-checked the failed workload and I think there might be a way to surface the contents of the memory map. You could try using the |
I'm not sure whether |
Worst case |
|
Okay, so it looks like |
The overflow occurs here:
This offset occurs when trying to get the build id of |
The file works as expected when our code runs on it directly, so something odd is happening when it is loaded in memory. I noticed that the section header offset in the file is |
Could you try checking if the issue presents itself with elfhack disabled? |
I've failed to reproduce the test failure locally (which is fairly annoying, I'm not sure what's different), so I'm going to do a terrible thing to get the memory contents of that library on try. |
It appears as if the section header table is not loaded (but a bunch of junk memory is there). This is not surprising, I've seen this before, though it's uncommon. I'll double-check the program headers to make sure this is expected. So in this case, there's two things we can do: verify the type of section headers we read/use (it's likely to be junk) before doing anything with them, and do checked arithmetic to avoid garbage values (which is already implemented). |
The program headers confirm that the section header table wasn't expected to be loaded. What's disappointing is that the program header PT_NOTE segment doesn't contain the build id (which is what we're trying to get), only the section has it. The section is loaded into memory: it happens to be right after the PT_NOTE data. |
Can you paste the program header table and the section table here? (I mean the output of readelf that displays those tables) |
The section header table is not in a program header (which is the problem). The It seems like this is the common case, I was wrong earlier to say that it's uncommon (though on linux I definitely recall having seen some binaries loaded wholly into memory, including the section header table). I think we normally succeed in getting the build id from the program headers, so we don't attempt the section headers (or we do but just get an access violation which we handle fine, rather than this overflow). |
Maybe if we can't get the build id by reading the in-memory binary we could try to read it from the file (if present)? We changed to reading in-memory because in some cases the files are not present, but it's not an unreasonable fallback. It might incur a lot of extra IO though, I'm unsure how commonly we fail to find the build id. Note that we don't have this issue for SONAME because we can get everything we need from the PT_DYNAMIC segment. The other no-good dirty-rotten hack we could do is just look ahead after the PT_NOTE segment(s) if we don't find the build id in the hopes that the note section is there (I'm willing to bet it's commonly found there). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a small change to make the code leaner.
This also bumps the version of `goblin` to at least 0.8.2, which allows us to clean up some code.
This avoids continuing the logic if the section header table isn't present (except for the contrived case where the random bytes happen to be the correct section type).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Without this, integer overflow will cause a panic when we should gracefully fail. It's very possible the start address or offset could be incorrect/corrupted.