Is it possible to turn off SIMD to enable the demonstration of larger point counts? #311

Closed
ctessum opened this issue Sep 16, 2023 · 2 comments

ctessum (Contributor) commented Sep 16, 2023

Hello!

I'm working on a demo of a model built on top of MTK and MOL. It's an Earth-Science-type model, and I'm currently able to get it to work in 1D, as shown here: https://earthsci.dev/tutorials/example/ .

However, I've been finding that a 1D demonstration model doesn't really capture the imagination of the people I'm trying to impress so that they'll give us money to build the real thing, and when I try to run it in 2D or 3D I run into the large-point-count issue described here:

At the moment, MOL effectively generates an assignment statement to a calculation in its generated code for all points in space, for all variables. Due to the configuration of LLVM for Julia, this leads the compiler to check all operations against all other operations to see what to [SIMD](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data), leading to bad scaling properties with increasing point count.
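
To make the scaling concern in the quoted passage concrete, here is a minimal Python sketch. It is not MethodOfLines' actual code generator: the function names `generate_unrolled` and `pairwise_comparisons`, the 1D diffusion stencil, and the symbols `D`, `dx`, `u`, and `du` are all assumptions chosen for illustration. It also assumes, as the quoted description suggests, that some compiler pass compares every generated statement against every other one.

```python
# Illustrative sketch only: mimics "one assignment per grid point" code
# generation for a 1D diffusion stencil. Hypothetical names throughout;
# this is not how MethodOfLines or LLVM are actually implemented.

def generate_unrolled(n_points: int) -> list[str]:
    """Return one assignment statement per interior grid point."""
    statements = []
    for i in range(1, n_points - 1):
        statements.append(
            f"du[{i}] = D * (u[{i - 1}] - 2 * u[{i}] + u[{i + 1}]) / dx**2"
        )
    return statements


def pairwise_comparisons(n_statements: int) -> int:
    """Number of statement pairs an all-against-all check would visit."""
    return n_statements * (n_statements - 1) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        stmts = generate_unrolled(n)
        # Statement count grows linearly; the pair count grows quadratically.
        print(n, len(stmts), pairwise_comparisons(len(stmts)))
```

In this toy model, doubling the point count roughly quadruples the number of statement pairs. It says nothing about the actual exponent observed in Julia's compiler.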

Even on an HPC node with 190GB of memory, I still get an OOM error when trying to discretize with enough points to give a passable 3D model.

The description above gave me the idea that maybe it would be possible to turn SIMD optimizations off and therefore get a model that—while it may not run fast—would hopefully at least compile.

So I tried running Julia with the `-O0` flag to see if that would help, but as shown below it actually makes things worse compared to the `-O2` default. (The plot below shows compile time in seconds vs. the number of grid points for the system linked above.)

After some googling, however, I tried `export JULIA_LLVM_ARGS=-vectorize-loops=false` and that did improve things a little, as you can see below.

My question is: are there other flags I could try that remove additional optimizations and further reduce the compile time, hopefully making it linear with point count instead of quadratic? I also tried adding `--slp-vectorize-hor=false`, but that didn't seem to provide any additional help.
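
One way to check whether a flag change moves compile time from quadratic toward linear is to fit a slope on a log-log scale: a slope near 1 suggests linear scaling, near 2 quadratic. The Python sketch below does that with a plain least-squares fit. The function name `scaling_exponent` is hypothetical, and the numbers are synthetic placeholders, not measurements from this issue.

```python
import math


def scaling_exponent(points, seconds):
    """Least-squares slope of log(seconds) against log(points)."""
    xs = [math.log(p) for p in points]
    ys = [math.log(s) for s in seconds]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Synthetic placeholder data (exactly quadratic), NOT real timings.
    grid_points = [10, 20, 40, 80]
    compile_seconds = [0.5 * n**2 for n in grid_points]
    print(round(scaling_exponent(grid_points, compile_seconds), 2))
```

Running the same fit on timings collected with and without a given flag would show whether the flag changes the exponent or only the constant factor.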

Any ideas you can offer would be greatly appreciated!

[Image: plot of compile time in seconds vs. number of grid points]

ChrisRackauckas (Member) commented

The issue here isn't SIMD; it's the big codegen. You still get O(n^4) behavior without that being solved. You can try the new JuliaSimCompiler, though, using this branch (#298), which "mostly" fixes it. I say "mostly" because there are still a few passes missing.

ctessum (Contributor, Author) commented Sep 20, 2023

Thank you, this is great!

ctessum closed this as completed Sep 20, 2023