
Minimizing and Testing Memory Usage in Low-Memory RK Methods #640

Closed
ChrisRackauckas opened this issue Feb 3, 2019 · 21 comments

@ChrisRackauckas
Member

ChrisRackauckas commented Feb 3, 2019

@ranocha first tasked us with fixing memory usage in #178. We now have all of the pieces in place to do it correctly (see #178 (comment)). What remains is some calck optimization on the low-memory methods, plus testing to make sure it all works out.

How to do it is already described, so let's talk about testing. Memory tracking makes sense in the asymptotic regime where u0 is large: when u0 has, say, 100,000 elements, the uType and rateType vectors completely dominate the integrator's otherwise-constant overhead. In that regime we can measure the integrator's size to see how many registers we are actually using. It's quite accurate:

using OrdinaryDiffEq
u0 = rand(1000,1000)
prob = ODEProblem((du,u,p,t)-> du .= u,u0,(0.0,1.0))
integ = init(prob,Tsit5(),save_start=false,save_end=false,
                          save_everystep = false) # this sets dense = calck = false
Base.summarysize(integ)/Base.summarysize(u0) # 13.00020737396313

and understands aliasing:

Base.summarysize([u0 for i in 1:7]) / Base.summarysize(u0) # 1.0000119999400003

So with this, our goal should be to get these methods down to exactly the amount they require. Some things to note are:

  • When we first start out in solve.jl, we copy u0. We should add a new common-interface argument copy_start = true. It should default to true for safety, since mutating u0 would make all sorts of wonky things happen in people's code, but true performance purists may want us to re-use that array, so let's make it opt-out.
  • Low-memory RK methods, when calck is false, don't need the k values; they can make the algorithm work out by aliasing those to another array instead. Note that the k values are crucial to common use, though, since saveat is widely used, so we should see whether we can eliminate the allocations even when calck = true (the saveat case).
  • The low-memory RK papers state how many registers each method should use, so we had better match that integer exactly. If we don't, we should dig in and find out why.
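The summarysize trick above could be turned into a register-count regression test along these lines (a sketch: the registers helper is a made-up name, CarpenterKennedy2N54 is used as an illustrative low-storage method, and the comparison is a placeholder rather than a measured bound):

```julia
using OrdinaryDiffEq

u0 = rand(1000, 1000)
prob = ODEProblem((du, u, p, t) -> du .= u, u0, (0.0, 1.0))

# "Registers" ≈ integrator size relative to one state array,
# valid only in the asymptotic regime where u0 dominates.
registers(alg) = Base.summarysize(init(prob, alg,
    save_start = false, save_end = false,
    save_everystep = false)) / Base.summarysize(u0)

# A low-storage method should come in well under a full-storage one;
# the exact expected integer would come from the method's paper.
@assert registers(CarpenterKennedy2N54()) < registers(Tsit5())
```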
@ranocha
Member

ranocha commented Feb 4, 2019

Most low-storage methods make some assumption about which forms of assignment can be carried out with a limited number of arrays, cf. Ketcheson, David I. "Runge–Kutta methods with minimum storage implementations." Journal of Computational Physics 229.5 (2010): 1763-1773:

  • 2N methods (Williamson): Assignments of the form u2 = a*u2 + dt*f(u1) (or u2 = a*u2 + b*dt*f(u1)) can be made using only two arrays.
  • 2R methods (van der Houwen): Assignments of the form u1 = f(u1) can be made using only one array.
  • 2S methods (Ketcheson): Assignments of the form u1 = a*u1 + b*u2 + dt*f(u1) can be made using only two arrays.

As far as I know, the current setup of OrdinaryDiffEq.jl does not allow any of these assumptions, since function evaluations are of the form u2 = f(u1) (possibly in-place, of course). Thus, we need one additional array compared to the memory requirements advertised in research articles.

For the user, it might be relatively easy to write the RHS f such that the 2N assumption is fulfilled. The other ones are more complicated in general.
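For concreteness, here is how a 2N (Williamson) step looks with plain arrays, assuming the user supplies the RHS in the accumulating form the 2N assumption requires. The coefficients below are made up for illustration, not a real tableau:

```julia
# Williamson 2N low-storage RK: two registers S1 (solution) and S2 (accumulator).
# Requires f! of the accumulating form S2 = a*S2 + dt*f(S1), which is exactly
# the assumption that plain u2 = f(u1) evaluations break.
function williamson_2N_step!(S1, S2, f!, dt, A, B)
    for i in eachindex(A)
        f!(S2, S1, A[i], dt)     # S2 = A[i]*S2 + dt*f(S1), fused by the user
        @. S1 = S1 + B[i] * S2   # update the solution register in place
    end
    return S1
end

# Example with f(u) = u and made-up two-stage coefficients:
f!(S2, S1, a, dt) = (@. S2 = a * S2 + dt * S1)
S1 = ones(4); S2 = zeros(4)
williamson_2N_step!(S1, S2, f!, 0.1, [0.0, -0.5], [0.5, 1.0])
```

Only the two registers S1 and S2 are ever touched; the method never materializes a separate f(S1) array.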

@ChrisRackauckas
Member Author

We might want to support alternative AbstractODEProblems for these then.

@ranocha
Member

ranocha commented Feb 4, 2019

If we really want to minimise the memory usage this seems to be the way to go. This can also speed up the solution of the new specialised problems, e.g. if a linear ODE is to be solved using mul! for sparse arrays (cf. JuliaLang/julia#26117) or something similar if JuliaLang/julia#29634 is merged. An API might be similar to the five argument syntax of mul!.

@ChrisRackauckas
Member Author

We may just want to add these as possibilities to ODEFunction. The whole point is to allow other forms of the function for optimizations there. The documentation will need some work for it though, and then the methods will need to handle the existence/non-existence of such optimized form.

@ranocha
Member

ranocha commented Feb 7, 2019

I'm working on this and will soon push a PR.

Another possibility to reduce the memory would be to set uprev === u for some methods. In the current setup, this does not seem to be possible, since uprev is set in https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/ccd2287a2845a093638875762b74d4fce0bf14eb/src/solve.jl#L213
@ChrisRackauckas: Could it be safe if the algorithm/cache decided whether uprev is a new array or not?

@ChrisRackauckas
Member Author

@ChrisRackauckas: Could it be safe if the algorithm/cache decided whether uprev is a new array or not?

uprev===u breaks interpolation, so that is only safe when calck==false. Think of calck as meaning that we know the user only wants the solving to work and no interpolation.

http://docs.juliadiffeq.org/latest/basics/common_solver_opts.html#Miscellaneous-1

It's set by:

https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/solve.jl#L27-L29

So note that if you have a saveat value without a corresponding tstop, you will be using interpolations. But for PDEs of that size, you probably can't save any intermediate values anyway? If you can, you can drop the dt for a step.

@ranocha
Member

ranocha commented Feb 7, 2019

Yes, I would not save intermediate values for big PDEs (or would control that explicitly). There are some algorithms that use uprev and some that don't need it; we could reduce the size if the algorithm could tell us that. Maybe something like alias_uprev(alg), which defaults to false and can be set to true when this is safe? Or move the initialisation of uprev to the cache?
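A sketch of such a trait (every name here is hypothetical; SomeLowStorageAlg stands in for a real algorithm type):

```julia
# Hypothetical opt-in trait: does this algorithm allow uprev === u?
alias_uprev(alg) = false               # safe default: always copy
struct SomeLowStorageAlg end           # stand-in for e.g. a 2N method
alias_uprev(::SomeLowStorageAlg) = true

# The copy in solve.jl would then become conditional on the trait
# and on calck (aliasing breaks interpolation, so calck must be false):
init_uprev(u, alg, calck) = (alias_uprev(alg) && !calck) ? u : copy(u)

u = rand(10)
init_uprev(u, SomeLowStorageAlg(), false) === u   # aliased, no extra array
```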

@ChrisRackauckas
Member Author

ChrisRackauckas commented Feb 7, 2019

Or move the initialisation of uprev to the cache?

We really should do that. It's just waiting to be done, look at the proximity:

https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/solve.jl#L213-L220

Well actually, why not just universally do uprev = calck ? recursivecopy(u) : u?

@ranocha
Member

ranocha commented Feb 7, 2019

We could do that. But there are some methods that use uprev explicitly, e.g. https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/ccd2287a2845a093638875762b74d4fce0bf14eb/src/perform_step/low_storage_rk_perform_step.jl#L118
We would have to adapt them, but that's okay for me.

@ChrisRackauckas
Member Author

Yeah, thinking about it more, most RK methods need uprev in the b step at the end, so that won't work. uprev is also needed when adaptive, since it's required for step rejections. It's really only the low-storage methods that can drop it. So yeah, it would make sense to move it into the cache so the algorithms can take control of how that's done.

@ranocha
Member

ranocha commented Feb 7, 2019

I would leave that task open for another PR and just submit some first improvements and checks soon.

@ChrisRackauckas
Member Author

Yup that sounds good.

@ranocha
Member

ranocha commented Feb 8, 2019

If we move the initialisation of uprev to the cache, we should do the same with uprev2, shouldn't we? In that case, we can either forward the keyword argument allow_extrapolation from
https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/7dd04a406c9097f14542317684d26d2ddf0e193f/src/solve.jl#L58
to the cache or let the cache decide what to do. Which possibility do you prefer, @ChrisRackauckas?

@ranocha
Member

ranocha commented Feb 8, 2019

Thinking more about it, I don't really like the idea of moving uprev and uprev2 to the caches. Since there are lots of methods that do not need at least one of them, the structure of the caches would not reflect their needs.

@ranocha
Member

ranocha commented Feb 9, 2019

Short summary (since I don't have time to work on that part right now): After all the changes linked above, the remaining change is to add new forms of ODEFunctions:

We may just want to add these as possibilities to ODEFunction. The whole point is to allow other forms of the function for optimizations there. The documentation will need some work for it though, and then the methods will need to handle the existence/non-existence of such optimized form.

@kanav99
Contributor

kanav99 commented Mar 5, 2019

I'd like to work on this issue if it's not done yet.
As I understand it, adding a rewrite_u option to ODEProblem would do the job. For example, if the function is f(u, p, t) = A*u where A is an upper triangular matrix, then u is rewritable. If rewrite_u is set, we could alias du with u to carry out function evaluations of the form u1 = f(u1). All three forms could be incorporated through this option in a similar way.
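For the triangular case, the in-place evaluation itself already exists in the stdlib: lmul! overwrites the vector with the product, so u1 = f(u1) needs no second register (the rewrite_u flag is still hypothetical; only the LinearAlgebra call is standard):

```julia
using LinearAlgebra

A = UpperTriangular(rand(4, 4))
u = rand(4)

expected = A * u   # out-of-place reference, for comparison only
lmul!(A, u)        # in-place: u is overwritten with A*u, no du register
u ≈ expected       # the triangular structure lets the sweep reuse u
```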

@ranocha
Member

ranocha commented Mar 5, 2019

There is some current discussion on the matrix multiplication API. Hopefully, some decision on the API of inplace multiplication of the form we need for 2N methods will be made soon, cf.
JuliaLang/LinearAlgebra.jl#473.

@ChrisRackauckas
Member Author

As I understand it, adding a rewrite_u option to ODEProblem would do the job. For example, if the function is f(u, p, t) = A*u where A is an upper triangular matrix, then u is rewritable. If rewrite_u is set, we could alias du with u to carry out function evaluations of the form u1 = f(u1). All three forms could be incorporated through this option in a similar way.

I think this needs to be in-method. If a specific method can make good reuse of u, it should, and then it can likely delete a cache. Of course, not all can, and it can only be done when units aren't used.

I don't think the in-place matrix multiplication API is necessary to create the right ODEFunction infrastructure, but for a user to actually define the function without allocations it'll be needed.

@ranocha
Member

ranocha commented Mar 5, 2019

but for a user to actually define the function without allocations it'll be needed.

That's what I meant. We might want to use an API similar to the final decision for the matrix multiplication API. If that's mul!(C, A, B, α=1, β=0) for C = α*A*B + β*C, we can use f!(du, u, p, t, α, β) for du = α*f(u, p, t) + β*du, needed for 2N methods.
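A sketch of that signature for a linear ODE, assuming a five-argument mul!(C, A, B, α, β) computing C = α*A*B + β*C is available (the parameter field p.A is an assumption made for the example):

```julia
using LinearAlgebra, SparseArrays

# RHS of the linear ODE u' = A*u in accumulating form:
# du = α*f(u, p, t) + β*du, mirroring mul!(C, A, B, α, β).
f!(du, u, p, t, α, β) = mul!(du, p.A, u, α, β)

# A 2N method could then fuse S2 = a*S2 + dt*f(S1) into one call:
p = (A = sprand(100, 100, 0.05),)
S1, S2 = rand(100), zeros(100)
f!(S2, S1, p, 0.0, 0.1, 0.5)   # S2 = 0.1*A*S1 + 0.5*S2, no temporaries
```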

@kanav99
Contributor

kanav99 commented Mar 5, 2019

Yes, it would depend on both the problem and the algorithm. There may be functions where s = f(s) is not possible using a single register, so the user should specify that this transformation is possible for a given function.
If it is possible, then the method may exploit this fact (in fact, it may be possible to use this extra register in any method). I will make a PR that implements this to show an example.

@ChrisRackauckas
Member Author

Oh I see where you're going with this. Yeah, this would be a neat solution without requiring additional function interfaces. Great idea!
