-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update OpenBLAS to v0.3.5 #30583
Update OpenBLAS to v0.3.5 #30583
Conversation
I think the right place is https://github.com/JuliaPackaging/Yggdrasil/tree/master/OpenBLAS |
I've started a build of |
We should increase the number of MAX THREADS to 40 or perhaps even 64 (on x86-64). |
We set that option low because it seemed to improve performance on small matrices and limits virtual memory usage (iirc, it reserves 256MB * MAX_THREADS). We can reconsider (esp. if OpenBLAS is improved now to initialize this lazily), but that should be a separate PR. edit: cf 716151d |
Yes, but the flip side now is that people complain they have bigger machines and Julia does not use all the cores. |
The small matrix performance is driven by the gemm threshold (which is 50, so multi-threading is not used for smaller matrices). The main driver was memory when we did that change in 2013. I say that core counts and memories have grown enough that doubling the max threads now should be ok (ideally only on linux x86-64). |
From the openblas 0.3.4 release notes: OpenBLAS will now provide enough buffer space for at least 50 threads by default. I am sure we override this with our setting. |
BB tarballs are good to go. @ViralBShah let's open a separate issue, do some performance measurements, then decide the new value to default to. |
Yes, agree that the max threads should not hold up the version update. |
Thanks for taking care of the BinaryBuilder stuff, Elliot! |
Travis macOS is trying to build GCC from source which filled up the max log length so Travis killed the build. |
The macOS failure should be fixed by #30599. |
1ae984a
to
5d46f2a
Compare
5d46f2a
to
05f874b
Compare
I'm seeing a substantial number of BLAS tests failing now on Intel Xeon Silver 4114 (none are present if I revert this PR): https://gist.github.com/vtjnash/c4f09f3019b335a690862134807da41c
|
Yep, looks like an OpenBLAS bug. Let's open an issue on OpenBLAS. I'll take a look to see if there's anything suspicious they've merged against |
Why not just move to an older version of openblas and adopt a new version when they fix things upstream? Would be lesser work overall. |
There might be some important bug fixes in the latest release #30460 (comment) |
Sure, we could revert back to 0.3.3 (0.3.4 won't build on all our platforms) but there are some nice things to have in 0.3.5, lots of bugs fixed, so it's better IMO to disable a few optimized kernels instead. |
I see pinv, bunchkaufman, qr, and triangular failing on skylake. I've been trying to find a reproducible test case for OpenMathLib/OpenBLAS#1955 The openblas failure is strange. On skylake, the tests that fail in |
Note that we're currently using v0.3.3 and this PR goes directly to the newest version, v0.3.5. See the release notes for both 0.3.4 and 0.3.5 for a more complete summary of the changes: https://github.com/xianyi/OpenBLAS/releases.
@staticfloat, we'll need some kind of BinaryBuilder magic for this as well, right? How do?