Win32 std::bad_alloc during make testall1 #11083
is this possibly just an OOM exception coming from C++? any idea what the memory usage profile looks like by that point in the test? |
In task manager it's on the order of 500-600 MB, but it doesn't look like it's obviously exploding into hitting swap. |
memory fragmentation could start to be a big issue at that size. if you have 3-4GB of memory in the machine, there's little reason (or ability) for the machine to use swap space, since the bigger problem is that the program is running out of virtual memory addresses. is that 500-600 MB total or per process? |
Per process. Gets to maybe 80% usage in the 3-workers case, but it's under 70% when doing edit: machine has 6 gb total, background other junk I have running takes maybe half of that though |
but since it's 32-bit julia, it can only access up to 3 GB before it runs out of virtual address space for that process. if llvm wants to allocate a big array, there may simply not be enough consecutive free addresses left to hand out |
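To illustrate the fragmentation point above, here is a minimal sketch of a toy address space. The layout (a 16 MB allocation every 64 MB) and the helper name `largest_free_run` are hypothetical and chosen only for illustration; they are not measurements from this issue. It shows that total free address space can be large while the largest contiguous run is small.

```python
def largest_free_run(address_space_size, allocations):
    """Return (total_free, largest_contiguous_free) for a toy address space.

    `allocations` is a list of (start, length) pairs; all values are
    hypothetical and only model the idea of fragmentation.
    """
    free_total = 0
    largest = 0
    cursor = 0  # end of the highest allocation seen so far
    for start, length in sorted(allocations):
        gap = start - cursor
        if gap > 0:
            free_total += gap
            largest = max(largest, gap)
        cursor = max(cursor, start + length)
    tail = address_space_size - cursor
    if tail > 0:
        free_total += tail
        largest = max(largest, tail)
    return free_total, largest


MB = 1 << 20
SPACE = 3 * 1024 * MB  # the roughly 3 GB figure mentioned in the comment above

# Hypothetical layout: one 16 MB allocation placed every 64 MB.
allocs = [(start, 16 * MB) for start in range(0, SPACE, 64 * MB)]
total_free, largest = largest_free_run(SPACE, allocs)
print(total_free // MB, "MB free in total")
print(largest // MB, "MB largest contiguous run")
```

In this toy layout only a quarter of the space is in use, yet a single request for 64 MB of contiguous addresses could not be satisfied.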
Mis-remembered, I'm running Do we never free JIT-ed code? This hadn't really been an issue with LLVM 3.3 that I can remember, but presumably the memory layout of JIT-ed code is much different with MCJIT. |
JIT code is probably a bit more expensive in LLVM 3.5 since it also comes with debug info. And no, we never free JIT code. It's pretty hard to analyze whether a particular function might have a pointer alive somewhere, so we don't bother. Generating the JIT code in the first place is pretty expensive, so the benefits of being able to free it eagerly are somewhat doubtful. |
Like not running out of memory in a long-running process? edit: doesn't necessarily have to be freed eagerly of course, but some mechanism for stale code cleanup seems like it might be needed |
the code is never really stale, since there is always the possibility it would be used again (for example, if you ran the same test again). but running a full coverage test suite isn't really a typical workload since it is doing a lot of one-off work, for the purpose of seeing if the code generation is correct, then moving on. |
True, the unit test suite is not a typical workload, but there's some major regression here. And I can reproduce the abysmal performance at least on a 32-bit Linux build, so this might not entirely be Windows-specific. edit: see https://gist.github.com/tkelman/52e8849048b9e32abf7b - pretty old machine, but the performance is unacceptably worse on llvm 3.6.0 vs 3.3 for 32 bit. No bad_alloc, but not sure if running a 32 bit executable in 64 bit Linux behaves differently here than running a 32 bit executable in 64 bit Windows does. |
what type of machine is this on? tests for me are getting much further and running faster, on a low-end 64-bit Atom (executing a cross-compiled julia on a smb mount inside cygwin):
|
The Windows runs are on a Sandy Bridge i7-2630QM laptop from 2011. The Linux runs are on an older Penryn Core 2. I've got the following in my Make.user, will try changing some of this to see if makes a difference:
update: removed the |
oops, that explains it -- I had commented out the LLVM_VER in my Make.user file and forgot to put it back |
I do suspect this isn't Windows-specific, the LLVM-svn nightly 32 bit Linux buildbot shows similar awful performance and an |
No change on llvm 3.6.1. edit: Or 3.7.0. |
Unfortunately, even with the memory problems fixed, this problem does not go away, so something else must be going on. The one thing that did not change is the amount of executable memory required, so maybe there's a problem with that memory allocator, rather than total memory allocation? |
Indeed it seems like LLVM's memory allocator is blowing through virtual memory (without actually using it). Will see if I can fix that. |
What's the linux/osx command for getting the amount of allocated virtual address space? I'd like to print this in the tests as well. |
You can probably use the |
Or |
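As one possible way to print virtual address space usage on Linux, the sketch below reads the `VmSize` (virtual size) and `VmRSS` (resident size) fields from `/proc/self/status`. This is an assumption about how it could be done, not necessarily the command suggested in the replies above; it does not cover OS X, which has no `/proc`. The helper names and the sample values are hypothetical.

```python
def parse_proc_status(text):
    """Extract VmSize and VmRSS (both in kB) from /proc/<pid>/status content."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Lines look like "VmSize:\t  123456 kB"; keep only the number.
            fields[key] = int(rest.split()[0])
    return fields


def current_vm_usage(path="/proc/self/status"):
    """Linux only; returns {} where the file is unavailable (e.g. OS X)."""
    try:
        with open(path) as f:
            return parse_proc_status(f.read())
    except OSError:
        return {}


# Made-up sample content, only to show the parsing; not data from this issue.
sample = "Name:\tjulia\nVmSize:\t 2950000 kB\nVmRSS:\t  580000 kB\n"
print(parse_proc_status(sample))
print(current_vm_usage())
```

A large gap between `VmSize` and `VmRSS` would be consistent with address space being reserved without actually being used.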
Fix pending as http://reviews.llvm.org/D15202. |
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366898(v=vs.85).aspx |
So you're saying the merging to reduce syscall overhead is invalid? |
yeah (on windows anyhow, posix doesn't say anything about it) |
Ok, thanks, I'll update the patch accordingly. |
Confirmed fixed by llvm-3.7.1_2.patch, when it actually gets applied instead of eaten by a parallel-make race condition. I'm seeing the socket test hang locally (not every time though), but appveyor didn't on #14623 so that's probably a different issue. |
Reopening since #15632 brought this back. Now as
|
needs retesting with #16777, hopefully fixed but we'll see |
I was briefly hitting a strange inference issue
but that seems fixed by some commits made within the last day.
|
Does malloc with large size use |
While testing #16777, I think we are "only" wasting a few hundred MB (~100MB for subarray test) of memory for code, which is usually less than 20-30% of the memory we use (10%-15% for a fresh subarray test run). I also noticed that the max memory depends on the GC threshold a lot. @carnaval any update from your heap profiling? Any idea what's the partition of the total memory usage? (JIT, llvm malloc, GC aware malloc, GC bigobj, GC pool)? |
this seems to work again. yay. |
happening again |
since #18357 this doesn't happen with |
Seems stale, let's reopen if it happens again. |
I got the same error message with both Atom and VS Code when I try to solve a MILP with Cbc. How can I work around this problem? Program: C:\Users\joyce.ramos-araujo\App Expression: status != CoinWarmStartBasis This application has requested the Runtime to terminate it in an unusual way. |
That is not the same issue. Your error message is an assertion failure inside the solver. I would recommend filing an issue with a reproducible example with the Cbc.jl package. |
ref #10394 (comment)
When run one at a time via something like
include("test/choosetests.jl"); testlist, net_on = choosetests(); for i in testlist; try Base.runtests(i, 1); catch; end; end
all tests pass in this configuration at b8aa30d. However, running the standard
make testall1
results in (also note the really awful performance)
If I do
make test
with 3 simultaneous workers, I get the same exception at a different place.