Improved benchmarks #259
Conversation
Current coverage is 85.04% (diff: 100%)
We should trace the total number of iterations as suggested in #145. I even think that issue can be closed once that is done. Edit: it's in there, so I guess that issue is fixed once this is merged.
Would love to get feedback on storing the benchmark CSVs in the repo. Is it annoying git-wise?
Will they make the repo dirty if someone runs the benchmarks? In that case Julia will stop automatically updating the package for everyone who ran the benchmarks and didn't remember to clean the repo.
Good point. I guess we could add
I would suggest keeping these files in a separate repo. In the long run, if you run them frequently enough, you're going to end up with more of the repo's mass being CSV files than code.
What about a separate branch on this repo? The script can just check out the branch, do what it's got to do, commit, and check out whatever branch was active before, as in the sketch below. Unless people request the branch, it won't clutter their .julia. I'd be fine with OptimBenchmarkReports.jl or whatever though.
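A minimal sketch of what that script could look like, assuming a hypothetical `benchmark-results` branch (the branch, function, and file names are illustrative, not part of this PR):

```julia
# Sketch of the branch-switching idea: commit the CSVs to a dedicated
# branch, then return to whatever branch was active before.
function commit_benchmarks(csvfile; branch = "benchmark-results")
    current = readchomp(`git rev-parse --abbrev-ref HEAD`)  # branch we started on
    run(`git checkout $branch`)
    try
        run(`git add $csvfile`)
        run(`git commit -m "add benchmark results"`)
    finally
        run(`git checkout $current`)  # always switch back
    end
end
```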
But that branch still gets downloaded when you do
JuliaSmoothOptimizers/CUTEst.jl#69: this is very nice for the variable-size problems!
@pkofod What do you think of a
Moved to the more appropriate place: JuliaNLSolvers/OptimTests.jl#7
Oookay, this took way too long, but here's a first go.
Caveats: I'm using non-`REQUIRE`d packages. This is not meant to be used by regular users, so let's not clutter `REQUIRE`.

So, what to show, what to show. This round of work on benchmarks is for internal benchmarking only. Let's not worry about benchmarking vs Matlab or R or Python just yet. We should do it at some point to see if our linesearch is good/bad, and so on.
So we can start to ask ourselves some questions. For example, on the CUTEst problems with dimension between 1 and 100 (`length(initial_x)`), we can log the best objective value obtained by any solver (we do not have any solutions from CUTEst as far as I can see), compute `f_solver - f*` for all problems and solvers, and calculate what proportion is within a given threshold.

If we log `x*`, the `x` associated with the best of all objectives, we can also calculate how "close" each solver comes measured in `x`, not `f`. This is relevant if you want to obtain parameters, for example, and aren't too concerned with hitting the exact minimum loss. Then we get the following. (The Newtons in the legend are a mistake.)
*Notes*
Be aware that nothing is normalized. I do not want to normalize objective values, as many of these problems have minima of exactly 0. I could normalize the `x`s, and probably should. (Edit: or can I?)

We see that (L-)BFGS are the best of the first-order methods, followed (but not so closely) by momentum gradient descent. Nelder-Mead (the new one) actually does quite well for a zeroth-order method. Again, since we're in levels, 1e-30 is really, really low in floating-point precision terms, so the most interesting part is perhaps in the middle of the figures.
For the good old UnconstrainedProblems we can do the same. There, we actually have minima and minimizers available, so here I compare to the true minimum, not the "best obtained by a solver". The picture is a bit different here. Quite a few of the problems are solved quite well by many solvers, and we also have second-order methods here. The good performance might be biased because we've had these problems for so long, and used them to test and fine-tune algorithms when they didn't solve the problems.
*TODO*
As for the problems that end up in the `catch` part, it's almost always linesearch problems. Could they be fixed?
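For reference, a minimal sketch of the kind of guard in question (hypothetical `problems` and `run_problem` names; not the actual benchmark script), which records which problems error out so the linesearch failures can be inspected afterwards:

```julia
# Collect the errors instead of just skipping the problem, so the failures
# (almost always linesearch-related) can be looked at later.
failures = Dict{String,Any}()
for (name, prob) in problems
    try
        run_problem(prob)
    catch err
        failures[name] = err
    end
end
```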