Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/adjoint odes #1905

Merged
merged 236 commits into from
May 17, 2021
Merged

Feature/adjoint odes #1905

merged 236 commits into from
May 17, 2021

Conversation

bbbales2
Copy link
Member

@bbbales2 bbbales2 commented May 22, 2020

This PR implements the design as described in

https://github.com/stan-dev/design-docs/blob/master/designs/0027-adjoint-ode.md

The goal is to make reverse mode autodiff faster for large ODE systems.

Checklist

  • Math issue implement adjoint ODE solver #2486

  • Copyright holder: Columbia University, Sebastian Weber

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@bbbales2 bbbales2 mentioned this pull request May 22, 2020
5 tasks
@bbbales2
Copy link
Member Author

There are some questions to answer and things to do for this:

  • How to handle tolerances? There are three sets of tolerances now.

a. Tolerances for the forward solve
b. Tolerances for the backwards solve (sensitivities with respect to initial condition and t0)
c. Tolerances for the backwards quadrature (sensitivities with respect to parameters)

These can be parameterized in absolute and relative tolerances, or the absolute tolerances of the backwards stuff can be parameterized by a vector. We gotta decide what to expose.

  • We can use a different solver for the forward and backward problems (solve the forward with BDF and the backwards with adams)

  • If only sensitivities with respect to ts are needed, then we don't need to do the full backwards solve (this is an optimization to implement).

  • The implicit solves for the backwards solve can be done using iterative methods. In this case we only need to provide adjoint^T * Jacobian operations, not a full Jacobian calculation. The operations are there, but we need to test this to see if it is faster than computing the full Jacobian.

  • This hasn't been tested to be faster than the original implementation, it only just works

  • The current tests say there is a problem with precision in the lorenz tests here:

    test_ode_cvode(lorenz, t0, ts, y0, theta, x, x_int, 1e-6, 1e-1);
    . Need to track down if the finite differences are causing problems or the backwards solves

  • Figure out how many integration steps between checkpoints or expose this to the user (see docs for CVodeAdjInit)

@bbbales2
Copy link
Member Author

Basically the outline of the code is section 6.1 here: https://computing.llnl.gov/sites/default/files/public/cvs_guide.pdf

The set of backwards ODEs is something like:

lambda^T' = lambda^T * J_y
u^T' = lambda^T * J_theta

lambda(T) = adj_y(T)
u(T) = 0

The way to think about this is we're actually solving a single adjoint program between each output.

So integrating from ts[i + 1] to ts[i] is one adjoint problem. We have to restart the backwards integrator and the quadrature at every output time to pull in more y adjoints (

state_sens(j) += non_chaining_varis_[i * N_ + j]->adj_;
, ignore step_sens there).

The adjoints from the output look like a forcing function that is a series of delta functions at times ts. Given the expense of restarting the ODE solvers, it might be worth double checking these adjoints are not actually identically zero before doing the restarts.

@bbbales2
Copy link
Member Author

Oh yeah in that notation lambda corresponds roughly to the adjoints of y0 and u corresponds roughly to the adjoints of the input parameters.

@wds15 wds15 self-assigned this Jul 16, 2020
@wds15
Copy link
Contributor

wds15 commented Jan 1, 2021

After ages I finally found some time to look into this here. From what I see it looks as if we are close to getting this working, which should/would be great. A few Qs:

  • It was always a problem to run reverse mode AD during the backward-sweep as I recall. Can this still be a problem (there were a few fixes)? If yes, then we could switch to use the scopedchainable stack thing to do the needed nested AD calculations possibly. This would entirely separate the two AD data structures and thus fix the issue in a clean way as I would expect. EDIT: We should probably do that to make this robust / there are also some tricks for speed to be implemented as well.
  • Are there no benchmarks at all so far? Is this where we should probably start to check if this is all worth the trouble? My expectation would be that it is a matter of the number of parameters; the more parameters the more the adjoint method should be faster. Thus, an automatic switch between adjoint vs forward could be based on some heuristic (or we just expose a new ode_bdf_adjoint function). EDIT: We should probably introduce a ode_bdf_adjoint and stuff in a number of extra tuning parameters (checkpointing, tolerances)
  • I understand that we integrate backward in time the "adjoint problem" which has as many states as the original ODE system. Magically this gives us all the sensitivities we need... is there a more detailed write-up other than the CVODES doc which is good to read? The CVODES doc is not entirely clear to me. EDIT: OK, it dawns an me... g(t,y,p) is a sum over delta peaks as I understand your notes here.

Looks like cool beans.

@rok-cesnovar Would a new ode_bdf_adjoint function call be hard to implement in a stanc3 prototype? I will need to figure out what arguments we want to pass to it, so no hurry... but I think that this (amazing) work should land in Stan at some point provided benchmarks support this, which I do expect to do so.

@bbbales2
Copy link
Member Author

bbbales2 commented Jan 1, 2021

It was always a problem to run reverse mode AD

That's fixed I believe.

Are there no benchmarks at all so far?

Not on this code. Once the timing stuff gets in it's gonna be waaaay easier to start talking about ODE performance: stan-dev/cmdstan#960 (comment)

Hopefully that'll be within the month.

is there a more detailed write-up other than the CVODES doc which is good to read?

It's pretty confusing to track down. I think @charlesm93 has it written up somewhere. I know Charles also has some thoughts related to the adjoint problem as well.

Yeah a separate function makes sense for this as well as some benchmarks at the Stan model level. It will be much easier to get this info if we wait on timing -- that gives us forward and reverse mode timings and it's easy to benchmark the different algos.

I forget if this was passing all the existing tests, but it'll probably need a little dust knocked off to get up to date with develop. You wanna take over here?

@charlesm93
Copy link
Contributor

I think @charlesm93 has it written up somewhere.

I'm not sure how helpful this is, but @betanalpha and I have a writeup about the adjoint method, but we only cover the math that motivates what the adjoint system should be, not the implementation details (checkpointing, etc). Section 1 in https://arxiv.org/abs/2002.00326 is our attempt at explaining this in an intelligible manner. We have some work in progress with generalizations for other implicit functions, but this is taking time to write.

@wds15
Copy link
Contributor

wds15 commented Jan 2, 2021

@bbbales2 Sure, I can take this up - I may need some help here and there which would be great. One thing I wonder is if we can reduce the number of quadrature problems to just what we need. I mean, right now we output for every state the derivatives for every parameter. For N ode states and P parameters this makes up N x P quadrature problems per time point. If we restrict to just one output (ideally the lpdf value or for now only one ode state needed), then that should be cheaper.

@charlesm93 Skimming through your abstract it reads really similar to https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005331 (see supplements section 1). It's a bit more specialized though. Anyway, it's good to have another take on this, I will dive into yours as well.

So the next steps for me are:

  • make this it's own function
  • merge in develop
  • start a basic benchmark
  • tweak a few things as I see a need
  • consider restricting number of output states to reduce quadrature problems

Then we are a few steps further.

@charlesm93
Copy link
Contributor

@wds15 Thanks for sharing the paper. The supplement section 1 is similar to our section 1, as both review the classic adjoint method for ODEs. While the authors discuss discrete times, my understanding is that the system is still assumed to evolve continuously between these times. Ultimately, equations 1, 9, 10, and 11 all describe ODEs / integrals to evaluate and differentiate the objective function. The discretization my co-authors and I investigated concerns difference equations (initially motivated by HMMs). There's no ODE and the sensitivities are not given by an integral but by a summation. I think the term "discrete" is ambiguous and maybe a little misleading in both papers.

@wds15
Copy link
Contributor

wds15 commented Jan 2, 2021

@charlesm93 yup, things are confusing indeed. Your paper though is relatively verbose and straightforward to follow which is great.

@bbbales2 I just push a feature/adjoint-odes-v2 branch which is up to date with develop and I renamed the new stuff to ode_bdf_adjoint. The bad news is that a few things do not work right now:

  • Eigen data structures cannot anymore be part of the arguments due to changes in value_of related to expressions, I think. I don't have an idea what to change... but maybe it's an easy change to do?
  • The thing spits out wrong results whenever the parameters vary. I don't know why that is the case right now. @bbbales2 are you aware of any conventions being different? I will need to thoroughly debug and solve this before I can continue.

I am using the test test/unit/math/rev/functor/ode_bdf_adjoint_grad_rev_test.cpp to test for correctness. This test passes ok on the old branch feature/adjoint-odes; so it's something due to the merge with develop. No idea right now.

@rok-cesnovar
Copy link
Member

@rok-cesnovar Would a new ode_bdf_adjoint function call be hard to implement in a stanc3 prototype? I will need to figure out what arguments we want to pass to it, so no hurry... but I think that this (amazing) work should land in Stan at some point provided benchmarks support this, which I do expect to do so.

Sorry, missed this. Adding a new variadic signature is just a bit more work then adding a simple signature which is trivial. Once you figure out what you need do tag me and I can prepare that, really not a problem.

@betanalpha
Copy link
Contributor

betanalpha commented Jan 3, 2021 via email

@wds15
Copy link
Contributor

wds15 commented Jan 3, 2021

I will not implement this stuff for RK45 - just to make this clear from the start. If anyone makes this is a requirement for including this, then I will stop working now on this.

@betanalpha We are using CVODES callbacks and all the facilities they provide.

@bbbales2 Good news! I looked closely at what changed in cvodes_integrator in the meantime and found that some stuff changed wrt to Eigen expression stuff. By using plain_type_t for the value_of call with the variadic arguments things now work magically. However, I still cannot pass Eigen structures as shared arguments into any of this. When I do, I get the error below. Does that ring a bell? Things are pushed to the feature/adjoint-odes-v2 branch.

For now I can continue as unit tests are ok with the plain_type_t thing - is there somewhere some doc on the Eigen expression stuff (pinging @t4c1 & @SteveBronder )?

In file included from test/unit/math/rev/functor/ode_bdf_adjoint_rev_test.cpp:1:
In file included from ./stan/math/rev.hpp:4:
In file included from ./stan/math/prim/fun/Eigen.hpp:22:
In file included from lib/eigen_3.3.9/Eigen/Dense:1:
In file included from lib/eigen_3.3.9/Eigen/Core:463:
lib/eigen_3.3.9/Eigen/src/Core/Matrix.h:294:22: error: no matching member function for call to '_init1'
      Base::template _init1<T>(x);
      ~~~~~~~~~~~~~~~^~~~~~~~~
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/tuple:232:15: note: in instantiation of function template
      specialization 'Eigen::Matrix<double, -1, 1, 0, -1, 1>::Matrix<std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at ./stan/math/prim/fun/value_of.hpp:58:28), const
      Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > > >' requested here
            : __value_(_VSTD::forward<_Tp>(__t))
              ^
./stan/math/rev/functor/cvodes_integrator_adjoint.hpp:441:18: note: in instantiation of function template specialization 'stan::math::cvodes_integrator_adjoint_memory<2,
      stan::test::CosArg1, Eigen::Matrix<double, -1, 1, 0, -1, 1>, double, double, Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1>
      >::cvodes_integrator_adjoint_memory<nullptr>' requested here
    memory = new cvodes_integrator_adjoint_memory<Lmm, F, T_y0, T_t0, T_ts, T_Args...>(
                 ^
./stan/math/rev/functor/ode_bdf_adjoint.hpp:56:13: note: in instantiation of function template specialization 'stan::math::cvodes_integrator_adjoint_vari<2,
      stan::test::CosArg1, Eigen::Matrix<double, -1, 1, 0, -1, 1>, double, double, Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1>
      >::cvodes_integrator_adjoint_vari<nullptr>' requested here
      = new stan::math::cvodes_integrator_adjoint_vari<CV_BDF, F, T_y0, T_t0, T_ts,
            ^
./stan/math/rev/functor/ode_bdf_adjoint.hpp:154:10: note: in instantiation of function template specialization 'stan::math::ode_bdf_adjoint_tol_impl<stan::test::CosArg1,
      Eigen::Matrix<double, -1, 1, 0, -1, 1>, double, double, Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> >' requested here
  return ode_bdf_adjoint_tol_impl("ode_bdf_adjoint", f, y0, t0, ts, relative_tolerance, absolute_tolerance,
         ^
test/unit/math/rev/functor/ode_bdf_adjoint_rev_test.cpp:201:28: note: in instantiation of function template specialization 'stan::math::ode_bdf_adjoint<stan::test::CosArg1,
      Eigen::Matrix<double, -1, 1, 0, -1, 1>, double, double, Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> >' requested here
  var output = stan::math::ode_bdf_adjoint(stan::test::CosArg1(), y0, t0, ts, nullptr,
                           ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:774:30: note: candidate function template not viable: no known conversion from 'const
      std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at ./stan/math/prim/fun/value_of.hpp:58:28), const Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > >'
      to 'Eigen::Index' (aka 'long') for 1st argument
    EIGEN_STRONG_INLINE void _init1(Index size, typename internal::enable_if<    (Base::SizeAtCompileTime!=1 || !internal::is_convertible<T, Scalar>::value)
                             ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:810:30: note: candidate function template not viable: no known conversion from 'const
      std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at ./stan/math/prim/fun/value_of.hpp:58:28), const Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > >'
      to 'const Eigen::PlainObjectBase<Eigen::Matrix<double, -1, 1, 0, -1, 1> >::Scalar *' (aka 'const double *') for 1st argument
    EIGEN_STRONG_INLINE void _init1(const Scalar* data){
                             ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:824:30: note: candidate function template not viable: no known conversion from 'const
      std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at ./stan/math/prim/fun/value_of.hpp:58:28), const Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > >'
      to 'const Eigen::Matrix<double, -1, 1, 0, -1, 1>' for 1st argument
    EIGEN_STRONG_INLINE void _init1(const Derived& other){
                             ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:788:30: note: candidate template ignored: substitution failure [with T = std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at
      ./stan/math/prim/fun/value_of.hpp:58:28), const Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > >]: implicit instantiation of undefined template
      'Eigen::internal::enable_if<false, std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at ./stan/math/prim/fun/value_of.hpp:58:28), const
      Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > > >'
    EIGEN_STRONG_INLINE void _init1(const Scalar& val0, typename internal::enable_if<Base::SizeAtCompileTime==1 && internal::is_convertible<T, Scalar>::value,T>::type* = 0)
                             ^                                             ~~~~~~~~~
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:797:30: note: candidate template ignored: substitution failure [with T = std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at
      ./stan/math/prim/fun/value_of.hpp:58:28), const Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > >]: implicit instantiation of undefined template
      'Eigen::internal::enable_if<false, std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at ./stan/math/prim/fun/value_of.hpp:58:28), const
      Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > > *>'
    EIGEN_STRONG_INLINE void _init1(const Index& val0,
                             ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:817:30: note: candidate template ignored: could not match 'DenseBase' against 'tuple'
    EIGEN_STRONG_INLINE void _init1(const DenseBase<OtherDerived>& other){
                             ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:831:30: note: candidate template ignored: could not match 'EigenBase' against 'tuple'
    EIGEN_STRONG_INLINE void _init1(const EigenBase<OtherDerived>& other){
                             ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:837:30: note: candidate template ignored: could not match 'ReturnByValue' against 'tuple'
    EIGEN_STRONG_INLINE void _init1(const ReturnByValue<OtherDerived>& other)
                             ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:845:30: note: candidate template ignored: could not match 'RotationBase' against 'tuple'
    EIGEN_STRONG_INLINE void _init1(const RotationBase<OtherDerived,ColsAtCompileTime>& r)
                             ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:853:30: note: candidate template ignored: substitution failure [with T = std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at
      ./stan/math/prim/fun/value_of.hpp:58:28), const Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > >]: implicit instantiation of undefined template
      'Eigen::internal::enable_if<false, std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at ./stan/math/prim/fun/value_of.hpp:58:28), const
      Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > > >'
    EIGEN_STRONG_INLINE void _init1(const Scalar& val0,
                             ^
lib/eigen_3.3.9/Eigen/src/Core/PlainObjectBase.h:865:30: note: candidate template ignored: substitution failure [with T = std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at
      ./stan/math/prim/fun/value_of.hpp:58:28), const Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > >]: implicit instantiation of undefined template
      'Eigen::internal::enable_if<false, std::__1::tuple<Eigen::CwiseUnaryOp<(lambda at ./stan/math/prim/fun/value_of.hpp:58:28), const
      Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1> > > *>'
    EIGEN_STRONG_INLINE void _init1(const Index& val0,
                             ^
1 error generated.
make: *** [test/unit/math/rev/functor/ode_bdf_adjoint_rev_test.o] Error 1
make -j1 test/unit/math/rev/functor/ode_bdf_adjoint_rev_test failed
exit now (01/03/21 07:34:44 CET)

@bbbales2
Copy link
Member Author

bbbales2 commented Jan 3, 2021

I will not implement this stuff for RK45

Yeah this came up before and @yizhang-yiz said implementing checkpointing wasn't a business we wanted to get in to.

When I do, I get the error below.

Will give this a go later

@bbbales2
Copy link
Member Author

bbbales2 commented Jan 3, 2021

This test seems to be missing:

test/unit/math/rev/functor/ode_bdf_adjoint_rev_test.cpp

@wds15
Copy link
Contributor

wds15 commented Jan 3, 2021

I branched off to a v2 version of yours. The file you are looking for is here:

https://github.com/stan-dev/math/blob/feature/adjoint-odes-v2/test/unit/math/rev/functor/ode_bdf_adjoint_rev_test.cpp

@bbbales2
Copy link
Member Author

bbbales2 commented Jan 3, 2021

That built for me and the checks pass, though I got tons of sanitzer warnings about memory leaks (though that might just be the tests not doing a stan::math::recover_memory() at the end).

Ubuntu 18.04, Clang 6.0.0

@bbbales2
Copy link
Member Author

bbbales2 commented Jan 3, 2021

Oh also feel free to just merge that branch in here and work here. Or we can close this and move to another pull. Your working branch should be the main thing since nobody else is directly working on this right now.

@wds15
Copy link
Contributor

wds15 commented Jan 3, 2021

Sorry, you need to uncomment tests which use eigen things. I did not want to leave this not compiling. Sure, we can merge v2 into this one and keep going here. Sorry for the confusion.

@bbbales2
Copy link
Member Author

bbbales2 commented Jan 3, 2021

uncomment tests which use eigen things

My hopes dashed on the rocks

@wds15
Copy link
Contributor

wds15 commented May 16, 2021

@SteveBronder @bbbales2 @rok-cesnovar I think I solved the problem which we had with real[] arguments. First I was puzzled as to why our tests did not catch the issue in the first place. I recalled that we had changes to some of our test which were related to arguments of the ODE functor. I reverted these changes to the harmonic_oscillator with commit 69158dd . Then the test test/unit/math/rev/functor/sho_ode_typed_test.cpp started to show the same compilation problems. The main crux is that it is entirely legal to pass in an ODE function which has as part of the variadic arguments a const std::vector<T>&. When we now change this internally to work with the arena_allocator, then we cannot any longer call the ODE function, because the vector has changed (and there are not any conversions being defined as it looks). At least this is how I understand this. The solution I now did is to avoid the use of the arena stuff for the variadic arguments and instead make these part of the cvodes_solver object which makes sure that these copies are cleaned up once the AD tape is being cleared.

Once tests pass I think that this can be merged.

@rok-cesnovar any idea why not all tests are being fired off? Or do they take a moment to pop up?

@rok-cesnovar
Copy link
Member

any idea why not all tests are being fired off? Or do they take a moment to pop up?

Github seems to be a bit slow spawning the instances. Will monitor but this is typically resolved in a matter of a few hours.

@rok-cesnovar
Copy link
Member

There seems to be an outage with Github actions: https://www.githubstatus.com/incidents/zbpwygxwb3gw

stanc3 now properly errors if tolerances are not data. I started the build (https://github.com/stan-dev/cmdstan/actions/runs/846697281) but it needs the outage to be resolved.

I also opened the stanc3 PR, will start begging for reviews tomorrow: stan-dev/stanc3#900

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.16 3.17 1.0 -0.09% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.01 1.27% faster
eight_schools/eight_schools.stan 0.11 0.11 1.01 0.99% faster
gp_regr/gp_regr.stan 0.16 0.16 1.01 1.46% faster
irt_2pl/irt_2pl.stan 5.98 5.98 1.0 0.06% faster
performance.compilation 88.4 87.31 1.01 1.24% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.65 8.82 0.98 -1.97% slower
pkpd/one_comp_mm_elim_abs.stan 30.09 28.96 1.04 3.75% faster
sir/sir.stan 121.19 126.11 0.96 -4.06% slower
gp_regr/gen_gp_data.stan 0.03 0.04 0.99 -1.45% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.11 2.98 1.04 3.89% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.41 0.92 -8.43% slower
arK/arK.stan 1.9 1.86 1.02 1.84% faster
arma/arma.stan 0.72 0.73 0.98 -2.15% slower
garch/garch.stan 0.57 0.57 1.0 0.26% faster
Mean result: 0.998614602537

Jenkins Console Log
Blue Ocean
Commit hash: de69763


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@wds15
Copy link
Contributor

wds15 commented May 16, 2021

„All checks good“? Should we merge or fire off the other github actions somehow?

Copy link
Member Author

@bbbales2 bbbales2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some minor comments. I tested the latest branch in cmdstan and the array problem seems to be gone.

This is good to approve/merge. I don't have permissions to do that since I made this PR.

stan/math/rev/functor/cvodes_integrator_adjoint.hpp Outdated Show resolved Hide resolved
stan::math::zero_adjoints(a, b, va, vb, c, d, e, vva, vvb, vc, vd, ve);
stan::math::for_each(
[](auto&& x) { stan::math::zero_adjoints(x); },
std::forward_as_tuple(a, b, va, vb, c, d, e, vva, vvb, vc, vd, ve));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this foreach. The previous code was testing that the variadic thing worked with all types. Since we aren't doing the variadic thing here, no need for this anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is change by @SteveBronder (I think)... and this is exactly how we test this thing on develop right now. I recall that he said that the for_each thing is much easier on the compiler due to less nesting and it's potentially even faster. I leave it as is if you don't mind.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I mean this test for zero adjoints is not necessary, not the for_each itself. This for_each is replacing:

stan::math::zero_adjoints(a, b, va, vb, c, d, e, vva, vvb, vc, vd, ve);

which is just not a test we need anymore. I definitely don't think we should remove the for_each function -- just this specific zero_adjoints test seemed extra.

@bbbales2
Copy link
Member Author

Let me know when the docs are up and I'll review those.

@rok-cesnovar
Copy link
Member

Will approve once these few last comments are resolved. The Github Actions stuff has been resolved.

@wds15
Copy link
Contributor

wds15 commented May 16, 2021

Done. Found some not quite up-to-date doxygen comments which I addressed as well. All good now.

Copy link
Member

@rok-cesnovar rok-cesnovar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving assuming @SteveBronder agrees with the comment here: #1905 (comment)

@wds15
Copy link
Contributor

wds15 commented May 16, 2021

I changed things to what @bbbales2 suggested, but that leads to a compiler error. This is because @SteveBronder changed things from a recursion design to a loop over tuple thing. At least this is how I understood.... if @SteveBronder disagrees, then there is a few more hours for this...

Thanks for approving @rok-cesnovar !

@bbbales2
Copy link
Member Author

bbbales2 commented May 16, 2021

but that leads to a compiler error

I don't think the change I suggested should lead to compiler error -- was just a tiny cleanup of a specific test (delete 3 lines that should work but aren't really testing anything). Tried to clarify here.

Edit: But also that specific change is definitely optional -- just noticed it as I was scrolling through the code this morning

@wds15
Copy link
Contributor

wds15 commented May 17, 2021

Is there a shortage on Windows testing nodes? @serban-nicusor-toptal any idea?

@rok-cesnovar
Copy link
Member

Windows tests are skipped on PR (we run them with Github Actions anyway). Its waiting for GPU instances (the Columbia one is down). @serban-nicusor-toptal Maybe we up the restriction for these a bit for a few days around the release?

@serban-nicusor-toptal
Copy link
Contributor

Hey, sorry for the inconvenience. I've set the GPU machines to be more available so you don't need to wait that much sometimes.

@wds15
Copy link
Contributor

wds15 commented May 17, 2021

Looks like GPU tests have passed and now we are onto the upstream tests. Thanks!

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.09 3.06 1.01 0.96% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.93 -7.02% slower
eight_schools/eight_schools.stan 0.12 0.11 1.06 5.26% faster
gp_regr/gp_regr.stan 0.16 0.16 0.99 -0.75% slower
irt_2pl/irt_2pl.stan 6.08 6.05 1.01 0.54% faster
performance.compilation 89.77 87.47 1.03 2.56% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.68 8.69 1.0 -0.1% slower
pkpd/one_comp_mm_elim_abs.stan 29.71 29.64 1.0 0.22% faster
sir/sir.stan 127.99 122.1 1.05 4.6% faster
gp_regr/gen_gp_data.stan 0.03 0.03 1.0 0.16% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.05 2.99 1.02 1.97% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.41 0.4 1.03 2.54% faster
arK/arK.stan 1.88 1.9 0.99 -1.17% slower
arma/arma.stan 0.75 0.74 1.01 0.98% faster
garch/garch.stan 0.59 0.57 1.03 2.54% faster
Mean result: 1.00970209024

Jenkins Console Log
Blue Ocean
Commit hash: 01c9fa1


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@rok-cesnovar
Copy link
Member

Good to go?

@bbbales2
Copy link
Member Author

Yeah, it all looks good to me

@wds15
Copy link
Contributor

wds15 commented May 17, 2021

I think so yes... so I hit merge???!!!!

@wds15 wds15 merged commit 5534e22 into develop May 17, 2021
@rok-cesnovar rok-cesnovar deleted the feature/adjoint-odes branch May 17, 2021 15:08
@SteveBronder
Copy link
Collaborator

Hey sorry was apartment hunting all weekend. This all seems good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.