
Unhandled mmap for a region without protection flags (PROT_READ | PROT_WRITE) #8

Open
marxin opened this issue Jan 15, 2024 · 5 comments


marxin commented Jan 15, 2024

Hi! It's me again ;) For a couple of weeks now I've been struggling to find out why my Rust debuginfod is consuming an unexpected amount of resident memory (even after malloc_trim), and I thought your tool might help me. The project is not that complex, but it does multithreaded inspection of RPM files.

For my demo test case I get something like:

$ RAYON_NUM_THREADS=100 mevi ./target/release/debuginfod-rs ~/Downloads/RPM-part
...
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
marxin   10970 21.7  0.1 8392676 105408 pts/4  Sl+  20:46   0:00 ./target/release/debuginfod-rs /home/marxin/Downloads/RPM-part

So my app consumes ~100 MiB, while mevi reports only < 1 MiB, which is what I'd expect:
(screenshot attached: Screenshot from 2024-01-15 20-47-56)

Full mevi log:
mevi-log.txt

Looking closer at the pmap output, there really are ranges where tens of mebibytes are mapped:

Address           Kbytes     RSS     PSS   Dirty    Swap Mode  Mapping
00007f8e88000000   17544   14340   14340   14340       0 rw-p-   [ anon ]
00007f8e89122000   47992       0       0       0       0 ---p-   [ anon ]
00007f8e8c000000     132       4       4       4       0 rw-p-   [ anon ]
00007f8e8c021000   65404       0       0       0       0 ---p-   [ anon ]
00007f8e90000000   25548   24048   24048   24048       0 rw-p-   [ anon ]
00007f8e918f3000   39988       0       0       0       0 ---p-   [ anon ]
00007f8e94000000   12944    6164    6164    6164       0 rw-p-   [ anon ]
00007f8e94ca4000   52592       0       0       0       0 ---p-   [ anon ]
00007f8e98000000   11556    6152    6152    6152       0 rw-p-   [ anon ]
00007f8e98b49000   53980       0       0       0       0 ---p-   [ anon ]
00007f8e9c000000   10992    4128    4128    4128       0 rw-p-   [ anon ]
00007f8e9cabc000   54544       0       0       0       0 ---p-   [ anon ]
00007f8ea0000000   16240   16156   16156   16156       0 rw-p-   [ anon ]
00007f8ea0fdc000   49296       0       0       0       0 ---p-   [ anon ]
00007f8ea4000000   11312    4132    4132    4132       0 rw-p-   [ anon ]
00007f8ea4b0c000   54224       0       0       0       0 ---p-   [ anon ]
00007f8ea8000000    9860    4112    4112    4112       0 rw-p-   [ anon ]
00007f8ea89a1000   55676       0       0       0       0 ---p-   [ anon ]
00007f8eac000000   11980    6160    6160    6160       0 rw-p-   [ anon ]
00007f8eacbb3000   53556       0       0       0       0 ---p-   [ anon ]
00007f8eb0000000    9528    1016    1016    1016       0 rw-p-   [ anon ]
00007f8eb094e000   56008       0       0       0       0 ---p-   [ anon ]
00007f8eb4000000    9636    1612    1612    1612       0 rw-p-   [ anon ]
00007f8eb4969000   55900       0       0       0       0 ---p-   [ anon ]
00007f8eb8000000    8980     364     364     364       0 rw-p-   [ anon ]
00007f8eb88c5000   56556       0       0       0       0 ---p-   [ anon ]
00007f8ebc000000    8988     572     572     572       0 rw-p-   [ anon ]
00007f8ebc8c7000   56548       0       0       0       0 ---p-   [ anon ]
00007f8ec0000000     176     176     176     176       0 rw-p-   [ anon ]
00007f8ec002c000   65360       0       0       0       0 ---p-   [ anon ]
00007f8ec41e1000       4       0       0       0       0 ---p-   [ anon ]
00007f8ec41e2000    2048       8       8       8       0 rw-p-   [ anon ]
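
For cross-checking those numbers without pmap, here is a small helper sketch (not part of mevi, just something I'd use) that sums Rss over the anonymous mappings in /proc/<pid>/smaps, i.e. the ranges pmap prints as [ anon ]:

```rust
// Sums resident memory (Rss) of all anonymous mappings of a process by
// parsing /proc/<pid>/smaps. Heuristics (field counts, prefixes) are my own.
use std::{env, fs};

fn is_map_header(line: &str) -> bool {
    // A mapping header looks like:
    // 7f8e88000000-7f8e89122000 rw-p 00000000 00:00 0
    let first = line.split_whitespace().next().unwrap_or("");
    match first.split_once('-') {
        Some((a, b)) => u64::from_str_radix(a, 16).is_ok() && u64::from_str_radix(b, 16).is_ok(),
        None => false,
    }
}

fn main() -> std::io::Result<()> {
    let pid = env::args().nth(1).expect("usage: anon-rss <pid>");
    let smaps = fs::read_to_string(format!("/proc/{pid}/smaps"))?;

    let mut anon_rss_kib = 0u64;
    let mut in_anon_mapping = false;

    for line in smaps.lines() {
        if is_map_header(line) {
            // Anonymous mappings have no pathname, i.e. only 5 header fields.
            in_anon_mapping = line.split_whitespace().count() == 5;
        } else if in_anon_mapping {
            if let Some(rest) = line.strip_prefix("Rss:") {
                let kib: u64 = rest.trim().trim_end_matches("kB").trim().parse().unwrap_or(0);
                anon_rss_kib += kib;
            }
        }
    }

    println!("anonymous RSS: {anon_rss_kib} KiB");
    Ok(())
}
```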

Again, I know the tool is just a hobby project, but if you are interested, I can debug the problem further. Any hint would be appreciated.

Thanks!


marxin commented Jan 15, 2024

I narrowed it down a bit further (a single rayon thread is sufficient for the demonstration):

[2024-01-15T20:26:52.574Z INFO  debuginfod_rs] walking 1917 RPM files (5.6 GB) <- here the rayon thread is created
2024-01-15T20:26:52.574397Z  INFO mevi::tracer: [20459] cloned into [20465] (with SIGTRAP)
2024-01-15T20:26:52.574642Z  INFO mevi::tracer: [20465] => [20459] mapping 7f26248e0000..7f26248e3000 (12 KiB) with NotResident
2024-01-15T20:26:52.574729Z  INFO mevi::tracer: [20465] => [20459] unmapping 7f261c1ff000..7f2620000000 (62.00 MiB)
2024-01-15T20:26:52.574751Z  INFO mevi::tracer: [20465] => [20459] unmapping 7f2624000000..7f26241ff000 (2.00 MiB)
2024-01-15T20:26:52.576913Z  INFO mevi::tracer: [20465] => [20459] mapping 7f262479e000..7f26247bf000 (132 KiB) with NotResident
2024-01-15T20:26:52.576955Z  INFO mevi::tracer: [20465] => [20459] mapping 7f261f79f000..7f2620000000 (8.38 MiB) with NotResident
2024-01-15T20:26:52.577456Z  INFO mevi::tracer: [20465] => [20459] unmapping 7f262479e000..7f26247bf000 (132 KiB)
2024-01-15T20:26:52.577498Z  INFO mevi::tracer: [20465] => [20459] unmapping 7f261f79f000..7f2620000000 (8.38 MiB)

That's what I get if I also add an info message for munmap (there is no remap or pageout; I added an assert to verify that).

Let's compare it with strace:

RAYON_NUM_THREADS=1 strace -s0 -f -e mmap,munmap ./target/release/debuginfod-rs ~/Downloads/RPM-part
[2024-01-15T20:29:33.335Z INFO  debuginfod_rs] walking 1917 RPM files (5.6 GB)
mmap(NULL, 2101248, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7fa62ebff000
strace: Process 20665 attached
[pid 20665] mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7fa62f168000
[pid 20665] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa626bff000
[pid 20665] munmap(0x7fa626bff000, 20975616) = 0
[pid 20665] munmap(0x7fa62c000000, 46133248) = 0
[pid 20665] mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa62f026000
[pid 20665] mmap(NULL, 8785920, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa62e39e000
[pid 20665] munmap(0x7fa62f026000, 135168) = 0
[pid 20665] munmap(0x7fa62e39e000, 8785920) = 0

(Note that the output comes from two runs of the application, but the symptoms are the same.)

What happens: after the stack is created for the thread, an mmap of 128 MiB happens, followed by munmaps of ~64 MiB in total. The problem is that mevi doesn't pick up the mmap:
[pid 20665] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa626bff000
(while the two munmaps of that region are properly covered).
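
For reference, a minimal sketch that should trigger the same sequence, assuming glibc's malloc (which is Rust's default system allocator on Linux): the first allocation on a freshly spawned thread makes glibc attach the thread to a new arena, and creating that arena is what issues the big PROT_NONE mmap above.

```rust
// Minimal trigger sketch; assumes glibc malloc is the process allocator.
fn main() {
    let handle = std::thread::spawn(|| {
        // A small heap allocation on the new thread should be enough to force
        // the creation of a new per-thread arena.
        let v = vec![42u8; 64 * 1024];
        v.len()
    });
    println!("worker allocated {} bytes", handle.join().unwrap());
}
```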

Any idea what could be happening?


marxin commented Jan 15, 2024

Got it! Oh, it's smart:

[pid 24127] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6d743ff000
[pid 24127] munmap(0x7f6d743ff000, 62918656) = 0
[pid 24127] munmap(0x7f6d7c000000, 4190208) = 0
[pid 24127] mprotect(0x7f6d78000000, 135168, PROT_READ|PROT_WRITE) = 0

So glibc first calls mmap for a chunk with no protection at all (PROT_NONE), then it shrinks the mapping with munmap, and only after that is the protection set to PROT_READ|PROT_WRITE via mprotect!
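
That call sequence can also be reproduced in isolation, which is handy for testing a tracer against this case. A sketch using the `libc` crate follows; the 64 MiB HEAP_MAX_SIZE and the 132 KiB mprotect length mirror the traces above and glibc's defaults, they are assumptions rather than anything taken from mevi:

```rust
use libc::{
    c_void, mmap, mprotect, munmap, MAP_ANONYMOUS, MAP_FAILED, MAP_PRIVATE, PROT_NONE, PROT_READ,
    PROT_WRITE,
};
use std::ptr;

const HEAP_MAX_SIZE: usize = 64 << 20; // 64 MiB, glibc's default on 64-bit

fn main() {
    unsafe {
        // 1. Reserve 2 * HEAP_MAX_SIZE with no access rights at all.
        let p = mmap(
            ptr::null_mut(),
            2 * HEAP_MAX_SIZE,
            PROT_NONE,
            MAP_PRIVATE | MAP_ANONYMOUS,
            -1,
            0,
        );
        assert_ne!(p, MAP_FAILED);

        // 2. Trim the unaligned head and tail so that a HEAP_MAX_SIZE-aligned
        //    region of HEAP_MAX_SIZE bytes remains; these are the two munmap
        //    calls visible in the strace output.
        let start = p as usize;
        let aligned = (start + HEAP_MAX_SIZE - 1) & !(HEAP_MAX_SIZE - 1);
        if aligned > start {
            munmap(p, aligned - start);
        }
        let tail = aligned + HEAP_MAX_SIZE;
        let end = start + 2 * HEAP_MAX_SIZE;
        if end > tail {
            munmap(tail as *mut c_void, end - tail);
        }

        // 3. Only now is the part that will actually be used made readable and
        //    writable; a tracer that only looks at the protection flags of the
        //    original mmap misses this region entirely.
        mprotect(aligned as *mut c_void, 132 << 10, PROT_READ | PROT_WRITE);
    }
}
```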

marxin changed the title from "RSS memory untracked by MEVI" to "Unhandled mmap for a region without protection flags (PROT_READ | PROT_WRITE)" on Jan 15, 2024

marxin commented Jan 15, 2024

The glibc implementation confirms my theory: https://github.com/bminor/glibc/blob/master/malloc/arena.c#L404-L441. A new heap is first reserved with PROT_NONE (extra space is mapped so a HEAP_MAX_SIZE-aligned region can be carved out), the excess is trimmed with munmap, and only then is the used part mprotect'ed to PROT_READ|PROT_WRITE.

fasterthanlime (Owner) commented

That makes sense to me! Are you up for writing a PR for this? I don't have a lot of time to look at mevi right now but I'd happily review something.

marxin added a commit to marxin/mevi that referenced this issue Jan 31, 2024
A newly created heap (alloc_new_heap) is mapped by glibc without any protection flags; the flags are adjusted with `mprotect` once the `mmap` succeeds:
https://github.com/bminor/glibc/blob/master/malloc/arena.c#L404-L441

Fixes: fasterthanlime#8
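
The shape of such a fix, roughly: remember PROT_NONE anonymous mappings as reserved address space and only start accounting for them as mapped-but-not-resident once an mprotect grants PROT_READ|PROT_WRITE. The sketch below is not mevi's actual implementation; the types and names are made up for illustration.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    /// mmap'ed with PROT_NONE: reserved address space, nothing can be resident.
    Reserved,
    /// Readable/writable, but the pages have not been faulted in yet.
    NotResident,
}

#[derive(Default)]
struct Tracker {
    /// start address -> (end address, state); assumes non-overlapping ranges.
    ranges: BTreeMap<usize, (usize, State)>,
}

impl Tracker {
    fn on_mmap(&mut self, range: Range<usize>, prot: i32) {
        let state = if prot & (libc::PROT_READ | libc::PROT_WRITE) == 0 {
            State::Reserved // previously this case was simply ignored
        } else {
            State::NotResident
        };
        self.ranges.insert(range.start, (range.end, state));
    }

    fn on_mprotect(&mut self, range: Range<usize>, prot: i32) {
        // Simplification: assume the mprotect starts exactly at the beginning
        // of a known range, which matches glibc's new-heap pattern.
        if prot & (libc::PROT_READ | libc::PROT_WRITE) != 0 {
            if let Some((_, state)) = self.ranges.get_mut(&range.start) {
                *state = State::NotResident;
            }
        }
    }
}

fn main() {
    let mut t = Tracker::default();
    // Mirrors the strace above: reserve with PROT_NONE, then mprotect a prefix.
    t.on_mmap(0x7f6d78000000..0x7f6d7c000000, libc::PROT_NONE);
    t.on_mprotect(0x7f6d78000000..0x7f6d78021000, libc::PROT_READ | libc::PROT_WRITE);
    println!("{:?}", t.ranges);
}
```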

marxin commented Jan 31, 2024

Sure, I've just created a pull request for it.
