Documentation -- standardization example: NMAD used instead of STD #381

MatteaE · 2023-06-29T09:10:03Z

At plot_standardization.py:101, it is claimed that "We perform a scale-correction for the standardization, to ensure that the standard deviation of the data is exactly 1." (and at line 141: "With standardized input, the variogram should converge towards one.").

But actually the code uses xdem.spatialstats.nmad() to compute the rescaling factor, and the empirical variogram later converges to 1.48 instead of 1 (see attachment).

xdem 0.0.10
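
For readers unfamiliar with the distinction, here is a minimal sketch of why rescaling by the NMAD does not in general force the standard deviation to 1. The `nmad` helper below is an illustrative re-implementation with the usual 1.4826 normalization factor, not xdem's own code, and the contaminated sample is synthetic, made up for this illustration.

```python
import numpy as np


def nmad(values: np.ndarray, nfact: float = 1.4826) -> float:
    """Normalized median absolute deviation (illustrative helper, not xdem's code)."""
    med = np.nanmedian(values)
    return float(nfact * np.nanmedian(np.abs(values - med)))


rng = np.random.default_rng(42)

# Purely Gaussian sample: NMAD and STD agree closely.
clean = rng.normal(0.0, 1.0, size=100_000)

# Same sample with 5% of the values replaced by a much wider distribution (outliers).
contaminated = clean.copy()
contaminated[:5_000] = rng.normal(0.0, 10.0, size=5_000)

print("clean:        NMAD =", nmad(clean), " STD =", np.std(clean))
print("contaminated: NMAD =", nmad(contaminated), " STD =", np.std(contaminated))
```

On the Gaussian sample the two measures agree, while on the contaminated one the STD is inflated by the outliers and the NMAD stays close to 1. Dividing such data by its NMAD therefore sets the NMAD to 1, not the STD.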

rhugonnet · 2023-06-29T19:20:06Z

@MatteaE Are you using different data than the example? The variogram looks OK at https://xdem.readthedocs.io/en/latest/advanced_examples/plot_standardization.html right after the standardization (which doesn't seem to affect the estimate: "Standard deviation before scale-correction: 1.0; Standard deviation after scale-correction: 1.0").

I realize that the default variogram estimator in sample_empirical_variogram is currently not "dowd" (the one to use to match the NMAD), whereas it is in the wrapper functions estimate_model_spatial_correlation and infer_spatial_correlation_from_stable... This is probably something to change, and it could explain your variogram (by default, your variogram is estimated with the "matheron" estimator).

MatteaE · 2023-06-30T07:54:44Z

@rhugonnet Yes, I am using a dh grid from SPOT and NASADEM.
Thanks for the suggestion about Dowd's estimator. I have now tried passing estimator = "dowd" to sample_empirical_variogram, and the variogram now converges to... 2.0! (See attachment.)
Interestingly, if I run the unmodified example plot_standardization.py, I also get a variogram that converges to 1.0, but if I pass estimator = "dowd" to sample_empirical_variogram (plot_standardization.py:127), the variogram again converges to about 2.0.

rhugonnet · 2023-06-30T20:56:40Z

OK perfect!
Yes, the factor of 2 is due to how Dowd's variogram is defined in SciKit-GStat (it needs to be divided by 2 to be comparable to an NMAD). This is inconsistent with Matheron's and Cressie's estimators, which are directly comparable to a STD, and it should be fixed.

It's on my list of things to modify in SciKit-GStat, I'll open a PR there during the summer! 😉
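
The factor of 2 can be reproduced with a small sketch, assuming the textbook form of Dowd's estimator, which estimates 2γ(h) as 2.198 times the squared median of the absolute pairwise differences. The function names below are hypothetical and this is not the SciKit-GStat source. Uncorrelated, unit-variance Gaussian samples stand in for pairs separated by more than the correlation range, where the semivariance should equal the sill of 1.

```python
import numpy as np


def matheron(diffs: np.ndarray) -> float:
    """Matheron semivariance: half the mean squared pairwise difference."""
    return float(0.5 * np.mean(diffs**2))


def dowd_unscaled(diffs: np.ndarray) -> float:
    """Dowd's estimate of 2*gamma(h): 2.198 * median(|difference|)^2."""
    return float(2.198 * np.median(np.abs(diffs)) ** 2)


def dowd_semivariance(diffs: np.ndarray) -> float:
    """Dowd's estimate halved, so that it is a semivariance like Matheron's."""
    return 0.5 * dowd_unscaled(diffs)


rng = np.random.default_rng(0)

# Pairwise differences between independent N(0, 1) values (no spatial correlation).
diffs = rng.normal(size=200_000) - rng.normal(size=200_000)

print("Matheron:        ", matheron(diffs))
print("Dowd (unscaled): ", dowd_unscaled(diffs))
print("Dowd (halved):   ", dowd_semivariance(diffs))
```

Under these assumptions the unscaled Dowd value sits near 2 while the Matheron and halved Dowd values sit near 1, which matches the behaviour reported above.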

To summarize, to-do list for closing this issue:

- Use Dowd's variogram consistently with the NMAD in the uncertainty tools,
- Open a PR to SciKit-GStat to fix the 2 scaling factor in Dowd's estimator,
- Clarify the section of the documentation on robust estimators,
- Put a link to that section in the "Standardization" gallery example.

Anything else I missed @MatteaE?
Also note that we're going to rework the structure of spatialstats.py soon: #378

MatteaE · 2023-07-01T10:06:42Z

Thanks a lot for the detailed explanation @rhugonnet! Just one last question then - right now, do I need to do the scaling by 2 manually to later use the variogram parameters and de-standardize the integrated uncertainty? If yes, where? (I use functions fit_sum_model_variogram, number_effective_samples and neff_circular_approx_numerical)

rhugonnet · 2023-07-03T20:06:30Z

@MatteaE Good question! You don't need to scale 🙂.
The function number_effective_samples assumes that the partial sills of the variogram add up to 100% of the total sill (https://github.com/GlacioHack/xdem/blob/main/xdem/spatialstats.py#L2007), so the absolute value of the total sill does not matter.
(The underlying assumption is that the variogram sampling captures all of the average variance, which is largely true for DEMs as long as you have enough stable terrain samples.)
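
A minimal sketch of why a constant factor on the sills drops out once they are expressed as fractions of the total sill. The `sill_fractions` helper and the partial-sill values are hypothetical, chosen for illustration only; this is not the implementation of number_effective_samples.

```python
import numpy as np


def sill_fractions(psills) -> np.ndarray:
    """Express each partial sill as a fraction of the total sill (hypothetical helper)."""
    psills = np.asarray(psills, dtype=float)
    return psills / psills.sum()


# Made-up partial sills for a two-model sum, then the same sills multiplied by 2.
psills = [0.6, 0.4]
psills_times_two = [2 * p for p in psills]

print(sill_fractions(psills))
print(sill_fractions(psills_times_two))
```

Both calls give the same fractions, so any computation that depends only on these fractions is unaffected by a global factor of 2 on the variogram.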

rhugonnet · 2023-08-01T19:58:20Z

Scale factor of Dowd's estimator fixed in scikit-gstat: mmaelicke/scikit-gstat#158
Will try to fix the rest linked to this issue today

rhugonnet added the priority Needs to be fixed rapidly label Aug 1, 2023

rhugonnet mentioned this issue Aug 2, 2023

Use NMAD consistently in examples and clarify link to Dowd's variogram in doc #390

Merged

rhugonnet closed this as completed in #390 Aug 2, 2023
