-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Heap snapshot #42286
Heap snapshot #42286
Conversation
8d51726
to
b391f44
Compare
It would be great if we had a Julia-native parser and renderer for these snapshots so that we don't have to rely on a browser. Is there any intention of implementing that before this PR is merged? Or is it too complicated to be implemented directly in Julia Base/Stdlibs? |
@jpsamaroo the snapshots are written as JSON (we'll write some notes about the format; had to reverse-engineer it); so a separate Julia package could parse and render them somehow. cc @NHDaly |
+1 yeah that's the way to go. It was too weird to try to figure out how to
allocate the results in a way we could pass back to Julia, so I think just
reading the file back in would be the easiest way to do it. 👍
I also wanna do that and write a conversation to a flamegraph 😁
…On Wed, Sep 22, 2021, 4:53 PM Pete Vilter ***@***.***> wrote:
@jpsamaroo <https://github.com/jpsamaroo> the snapshots are written as
JSON (we'll write some notes about the format; had to reverse-engineer it);
so a separate Julia package could parse and render them somehow. cc
@NHDaly <https://github.com/NHDaly>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#42286 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAMCIELFCONAE2XZJ3GPWG3UDI637ANCNFSM5EFZZY6A>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
Looks like the Clang analyzer itself is crashing in Also: what performance benchmarks can we run on this to verify that it's not slowing down GC? |
Looks like the snapshots can't be processed by https://heapviz.com/ yet. Somewhat related: once this works, it would be nice to look into generating dumps for massif too, which has a pretty powerful visualizer that supports analyzing the heap over time (https://apps.kde.org/massif-visualizer/). |
I think it might be confused that
|
Devtools Heap Snapshot viewer (.heapsnapshot extension), to the given | ||
IO stream. | ||
""" | ||
function take_heap_snapshot(io) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function take_heap_snapshot(io) | |
function take_heap_snapshot(io::IOStream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tried that earlier, but it seems not to be defined yet:
$ make
...
LoadError(at "sysimg.jl" line 3: LoadError(at "Base.jl" line 97: LoadError(at "gcutils.jl" line 119: UndefVarError(var=:IOStream))))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this made the builds fail, e.g. https://build.julialang.org/#/builders/63/builds/3778/steps/8/logs/stdio
either we need to get IOStream in scope somehow, or take the annotation back off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taking the annotation back off until we can figure out how to get it in scope in this file… I guess we have to figure out what file IOStream
is defined in and make sure it gets included before this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this one has been resolved, yeah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still may be good to acquire the internal lock on io
here first also
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(partial review)
else if (flags.how == 3) { | ||
jl_value_t *owner = jl_array_data_owner(a); | ||
uintptr_t nptr = (1 << 2) | (bits & GC_OLD); | ||
// TODO: Keep an eye on the edge type here, we're _pretty sure_ it's right.. | ||
gc_heap_snapshot_record_internal_edge(new_obj, owner); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it is a reshape of a different array (the "owner"), so we can draw the graph * -> a -> owner -> data -> *
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, so maybe this should just be a normal edge then? We probably need to figure out more precisely what all the edge types mean.
|
||
|
||
struct StringTable { | ||
typedef unordered_map<string, size_t> MapType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typedef unordered_map<string, size_t> MapType; | |
typedef llvm::StringMap<size_t> MapType; |
(gives higher performance https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h)
public: | ||
|
||
// private: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public: | |
// private: |
// private:
b95820a
to
e9960d5
Compare
} else if (type == (jl_datatype_t*)jl_malloc_tag) { | ||
name = "<malloc>"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a sentinel value for a
. It cannot show up here.
} else if (type == (jl_datatype_t*)jl_malloc_tag) { | |
name = "<malloc>"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that true? We saw it's checked in objprofile_print
, which is what we copied it from:
Lines 688 to 689 in e9960d5
else if (ty == jl_malloc_tag) | |
jl_safe_printf("#<malloc>"); |
@@ -2403,12 +2422,16 @@ module_binding: { | |||
} | |||
void *vb = jl_astaggedvalue(b); | |||
verify_parent1("module", binding->parent, &vb, "binding_buff"); | |||
// Record the size used for the box for non-const bindings | |||
gc_heap_snapshot_record_internal_edge(binding->parent, b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b
isn't a jl_value_t, so this should fail to compile here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weirdly, it prints a warning here, not an error. We're not sure why. It's on our todo list to deal with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(hehe i think we were just gonna cast it to (jl_value_t*)
though)
@@ -2540,6 +2563,7 @@ mark: { | |||
if (flags.how == 1) { | |||
void *val_buf = jl_astaggedvalue((char*)a->data - a->offset * a->elsize); | |||
verify_parent1("array", new_obj, &val_buf, "buffer ('loc' addr is meaningless)"); | |||
gc_heap_snapshot_record_internal_edge(new_obj, jl_valueof(val_buf)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val_buf
is not a jl_value_t. You can probably charge this size directly against a
(e.g. with gc_heap_snapshot_record_hidden_edge
, or maybe better yet, when computing the summarysize of a
in record_node_to_gc_snapshot
)
} else if (type == (jl_datatype_t*)jl_buff_tag) { | ||
name = "<buffer>"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These can only show up in very specific contexts (jl_binding_t and jl_array_t->data), so it might be better to just avoid passing those here and instead charge their size directly to their parent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we considered doing that, too, but we figured it makes more sense to report them as "internal" edges, which (we think) tells the javascript frontend that there's some object there, but it's not a normal object and it's opaque. That seems like it's the right fit for this kind of value, too? That way we can still ask questions like "what is the size of all the array buffers," in case that was something we were interested in, i guess?
@jpsamaroo I wrote a small pure-Julia parser here :) https://github.com/vilterp/HeapSnapshotParser.jl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
snapshot->node_types.find_or_create_string_id("synthetic"), | ||
"(stack frame)", // name | ||
(size_t)frame, // id | ||
1, // size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vchuravy how should I get the size of a stack frame? (ofc it's jl_gcframe_t
, not the whole stack frame). frame->nroots
times the size of a pointer or something? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, want to attribute the size of the stack overall to the task… how do we get the size of the entire stack? 🤔 you said it's fixed now, but it may not always be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get the size of the stack from when copystack
. See the gc mark part for task.
I think size of the jl_gcframe_t
is (frame->nroots + 2)*sizeof()void*
, but you could be double counting if you also measure size of stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but you could be double counting if you also measure size of stack
Yeah i was gonna say the same thing. Just doing the size of the whole stack seems reasonable, yeah
gc_setmark_buf_(ptls, stkbuf, bits, ta->bufsz); | ||
// TODO: attribute size of stack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you can add a internal edge to count the ta->bufsz
against the task.
@@ -3065,8 +3077,10 @@ static int _jl_gc_collect(jl_ptls_t ptls, jl_gc_collection_t collection) | |||
// 2.1. mark every object in the `last_remsets` and `rem_binding` | |||
jl_gc_queue_remset(gc_cache, &sp, ptls2); | |||
// 2.2. mark every thread local root | |||
// TODO: treat these as roots |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO: treat these as roots |
@@ -3065,8 +3077,10 @@ static int _jl_gc_collect(jl_ptls_t ptls, jl_gc_collection_t collection) | |||
// 2.1. mark every object in the `last_remsets` and `rem_binding` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we could add:
gc_heap_snapshot_record_root_other(&jl_all_tls_states, "jl_all_tls_states");
for (int t_i = 0; t_i < jl_n_threads; t_i++) {
jl_ptls_t ptls2 = jl_all_tls_states[t_i];
gc_heap_snapshot_record_edge_other(&jl_all_tls_states, &ptls2);
....
Or something like that.
@@ -646,7 +646,7 @@ static int mark_reset_age = 0; | |||
|
|||
static int64_t scanned_bytes; // young bytes scanned while marking | |||
static int64_t perm_scanned_bytes; // old bytes scanned while marking | |||
static int prev_sweep_full = 1; | |||
int prev_sweep_full = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems that this is always set to 1? i.e. set to 1 here and never changed anywhere else? so it'd have no effect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no i think it's set, here:
https://github.com/vilterp/julia/blob/2b2c3ab4db8c35f4deb8a44330dc765e73e61647/src/gc.c#L3261
Jameson suggested this change. the idea is to make sure that we're only doing the GC Mark edge-reporting during the second GC, since our call to jl_gc_collect(JL_GC_FULL)
does two mark+sweeps: 1. mark, 2. sweep FULL, 3. mark, 4. sweep INCREMENTAL. Step 3 will be a full mark, because all mark bits have been reset. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see, this variable was already defined here (and set elsewhere, before this commit), but you added an extern declaration to heap-snapshot.h so we could check it there… gotcha. wasn't reading the diff close enough. nice change!
2c39875
to
6ab9e15
Compare
Rebased onto master; appears to still work (we need more tests lol) |
@NHDaly seems like the rebase broke
|
Any chance this could sneak into 1.8? |
x86-64 uses 48 bits virtual address space and ARM uses up to 52 bits, IIUC https://developer.arm.com/documentation/101811/0101/Address-spaces-in-AArch64 So, isn't JS number enough for the memory addresses (for now)? |
Just a quick note. I just tried rebasing this on current master, and gave it a go, and it worked at first after boot, but failed with similar errors to #42286 (comment) after running some intensive code. It might be helpful to note that I did have to use Chrome Canary, because the "load" button on the memory tab did nothing on latest chrome dev tools. I guess the segfault might be due to type mismatches reported during build:
Update: It didn't segfault with a minimal MWE while investigating the issue here #42566 (comment) |
@IanButterworth there seems to be a moderate number of TODO checkboxes left, and it needs to be gotten to a non-draft state where it is ready for review. There is about a week until the branch for that to get in though. |
Hey @IanButterworth, glad you were able to get it to run, at least in a simple case! @NHDaly and I have been focusing on finishing the allocation profiler (#43868) for 1.8, so we've put this on the back burner. There's the segfault we've been running into in the field path code (having to do with inlined structs), plus some other check boxes on the list. |
I'm being greedy. Both are really helpful. Thanks for taking them both on! If I can spare free time to helping debug the segfault here I'll try. |
Jeff told me that Christine was very interested in this too. @chflood: if you would like to collaborate on the Heap Snapshot tool, we'd love to get some help with it! There are just so many code paths in GC and little edge cases that kept us stuck on this PR. @IanButterworth: yay, thanks for the help! Excited to get more eyes and arms on it! 💪 |
Hey @chflood, thanks for checking it out :) Your invocation looks fine — did your |
I've added this idea to the PR description:
|
gc_mark_queue_obj(gc_cache, sp, ptls2->previous_exception); | ||
gc_heap_snapshot_record_root(ptls2->current_task, "previous exception"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be gc_heap_snapshot_record_root(ptls2->previous_exception, "previous exception");
. Similar comment for the lines above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, i think you're right! 👍 👍 👍 Thanks @whatsthecraic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. Fixed in e732d48.
…_thread_local Co-Authored-By: Dean De Leo <[email protected]> Co-Authored-By: @whatsthecraic
Closed in favor of #46862 |
We expose a function
GC.take_heap_snapshot(stream)
, which writes a heap snapshot in Chrome's.heapsnapshot
JSON format to the given IO stream.This can be loaded into Chrome Devtools' snapshot viewer to explore the heap and find memory leaks.
Usage
Then open the Chrome Devtools, go to the "Memory" tab, hit the "load" button, choose the file, and you should see something like:
Implementation:
Piggybacks on the mark phase of the garbage collector, which is already traversing all the live objects in the heap. Creates (in C++) a
Node
object for every object in the heap, and anEdge
object for every pointer from one object to another.Mimicks Node.js/V8's heap snapshotter, which dumps into the same format: https://github.com/nodejs/node/blob/5fd7a72e1c4fbaf37d3723c4c81dce35c149dc84/deps/v8/src/profiler/heap-snapshot-generator.h#L509-L509
TODOS
Core.live_bytes()
take_gc_snapshot
callGC.gc
within itselftypeof
Problems:
<malloc>
objects:387::<malloc>@3735923201
Co-Authored-By: @vchuravy
Co-Authored-By: @NHDaly