Covariance different from scipy curve_fit with the same data #70

Closed
kain88-de opened this issue Jun 14, 2018 · 8 comments

@kain88-de

I'm trying to fit an exponential decay of an autocorrelation function with LsqFit.jl. The optimized value I obtain is correct and matches what I get with scipy, but the reported covariance and associated standard error are unreasonably large with LsqFit.jl: scipy reports ~1e-5 while LsqFit gives ~1. I tend to believe scipy more, because visually my data fits a single exponential decay almost perfectly, with very small error. See the gist for the code.

I'm using the latest release of LsqFit (4ecb0ec). I'm not sure whether this is fixed on the current master branch, mostly because I'm new to Julia and don't know the best way to test against a checkout of master.
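For reference, a minimal, self-contained sketch of the kind of fit being compared (synthetic data; all names here are hypothetical stand-ins, and the real code and data are in the linked gist):

```julia
using LsqFit

# Single exponential decay, as in the gist.
model(t, p) = exp.(-t ./ p[1])

t = collect(range(0.0, 2.0, length=50))
y = model(t, [0.5]) .+ 0.005 .* randn(length(t))   # small, known noise

fit = curve_fit(model, t, y, [1.0])
fit.param             # fitted decay time, close to 0.5
estimate_covar(fit)   # parameter covariance; should be tiny for data this clean
standard_error(fit)   # likewise small, matching what scipy reports
```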

pkofod (Member) commented Jun 14, 2018

That is indeed a very large standard error for the given data. @iewaij will be working on improving some functionality in this package over the summer as part of Google Summer of Code. If there's a bug in the covariance code, that will be top priority.

Thanks for the bug report. We'll trace this 🐛 and squash it :)

iewaij (Contributor) commented Jun 14, 2018

Thanks for the bug report!

The bug seems to be related to the weight parameter. If you delete the weight parameter and run fit = curve_fit(func, t, log.(res_norm), [1.,]), the covariance decreases to 8.75554e-6.

Another related issue is #69. I'll have a look at this.

@kain88-de (Author)

Thanks for the quick response. The workaround of leaving out the weights parameter works for me too.

@rsturley

This issue is actually two issues:

  • The correct interpretation of the weighting vector. Right now, the weight vector as applied to the fits is 1/sigma, where sigma is the standard deviation. This is clear from the levenberg_marquardt code, where sum(abs2, f(p)) is what is minimized. Since f(p) = wt .* (model(xpts, p) - ydata) in the curve_fit routine with vector weights, the weight must be 1/sigma. The current code does the minimization correctly under this definition of wt.

  • The routine estimate_covar is calculating the covariance matrix incorrectly for the case of vector weights. The residual function f(p) in that case is already scaled by the weights, so its Jacobian already includes them, and applying the weights again in the covariance formula double-counts them. This means the correct formula should be covar = inv(J'*J) rather than covar = inv(J'*Diagonal(fit.wt)*J) as it is now (see the sketch after this list).
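A runnable sketch of the two formulas in question (hypothetical data; fit.jacobian here is the Jacobian of the already-weighted residuals):

```julia
using LsqFit, LinearAlgebra

model(x, p) = exp.(-x ./ p[1])
x = collect(range(0.0, 2.0, length=40))
σ = 0.01
y = model(x, [0.5]) .+ σ .* randn(length(x))
wt = fill(1 / σ, length(x))            # 1/sigma, per the convention above

fit = curve_fit(model, x, y, wt, [1.0])
J = fit.jacobian

covar_current  = inv(J' * Diagonal(fit.wt) * J)  # what estimate_covar does now
covar_proposed = inv(J' * J)                     # the corrected formula above
```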

This can be checked quite readily: do multiple fits to a curve with different random noise, of known magnitude, added each time. The standard deviation of the fit parameters across the fits should be approximately equal to the estimated standard error computed by the standard_error function. The attached file has results showing that the revised formula for covar suggested above is the correct one.

LsqFitCovariance.pdf
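A minimal sketch of this kind of Monte Carlo check (hypothetical names and values, not the attachment's actual setup):

```julia
using LsqFit, Statistics

model(x, p) = p[1] .* exp.(-x ./ p[2])
x = collect(range(0.0, 2.0, length=50))
p_true = [1.0, 0.5]
σ = 0.01
wt = fill(1 / σ, length(x))   # weights as 1/sigma, per the current convention

# Refit many independent noisy realizations with the same known noise level.
params = [curve_fit(model, x, model(x, p_true) .+ σ .* randn(length(x)),
                    wt, p_true).param for _ in 1:1000]

# Empirical spread of each parameter across fits; with a correct covariance
# formula this should roughly match standard_error(fit) from a single fit.
std(hcat(params...), dims=2)
```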

iewaij (Contributor) commented Aug 21, 2018

Hi @rsturley, thanks a lot for pointing out the issues and for the experiment.

I agree that the current implementation of the weights is wrong. For weighted least squares, the objective function should be sum(f(p).^2 .* wt); if we want to keep using sum(abs2, f(p)), then we should pass the square root of the weights, i.e. f(p) = sqrt.(wt) .* (model(xpts, p) - ydata). A possible fix is #72; there's more discussion in #69.
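A self-contained check of that equivalence (hypothetical data; model, xpts, ydata, wt stand in for the thread's variables):

```julia
model(xpts, p) = p[1] .* xpts
xpts  = [1.0, 2.0, 3.0]
ydata = [1.1, 1.9, 3.2]
wt    = [4.0, 1.0, 0.25]
p     = [1.0]

f = model(xpts, p) .- ydata   # unweighted residuals
g = sqrt.(wt) .* f            # residuals handed to the optimizer

sum(abs2, g) ≈ sum(f.^2 .* wt)   # true: the two objectives agree
```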

I think the current covariance matrix is still correct under the implementation above: with g(p) = sqrt.(wt) .* f(p), the Jacobian of g is sqrt.(wt) .* J, so inv(Jg' * Jg) (writing Jg for that Jacobian) equals inv(J' * Diagonal(wt) * J). There's documentation explaining the covariance computation using a linear approximation.

@yanyuechuixue

It seems this issue still exists?
I used Scipy, Matlab, and LsqFit with the same data.
Scipy and Matlab return the same covariance matrix, while LsqFit gives a different one.

pkofod (Member) commented Dec 18, 2018

> It seems this issue still exists?
> I used Scipy, Matlab, and LsqFit with the same data.
> Scipy and Matlab return the same covariance matrix, while LsqFit gives a different one.

Could you try LsqFit master? Also, if you could post the exact code for Scipy and Matlab, it would be easier to track down the problem.

pkofod (Member) commented Dec 20, 2018

Having played around with it a bit today, I'm quite certain that master LsqFit now has the correct behavior. The weights are either not present, present as a vector (in which case they're inverse variances), or present as a matrix (in which case it's an inverse variance-covariance matrix). This is also documented in the README. If you provide "stupid" weights, as in the thread starter's gist, you will still get the correct parameter values, but since the weights are not actually inverse variances, you will get "wrong" standard errors (not really wrong, though, because you claimed those were the inverse variances).
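A sketch of that convention (hypothetical data; σ is the known per-point noise standard deviation):

```julia
using LsqFit

model(x, p) = p[1] .* exp.(-x ./ p[2])
x = collect(range(0.0, 2.0, length=50))
σ = 0.01
y = model(x, [1.0, 0.5]) .+ σ .* randn(length(x))

wt  = fill(1 / σ^2, length(x))         # inverse variances, not 1/σ
fit = curve_fit(model, x, y, wt, [1.0, 1.0])

standard_error(fit)   # now on the scale of the actual noise in the data
```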
