Since GSTools 1.3 will cover all variogram/covariance models provided by scikit-gstat, we should add a to_gstools method to the Variogram class in scikit-gstat. This should be a simple mapping of the describe dict output to CovModel instances.
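A minimal sketch of what such a mapping could look like, written in plain Python so it runs without either package. The function name `describe_to_covmodel_args`, the dict keys (`model`, `sill`, `effective_range`, `nugget`) and the name table are assumptions for illustration. It also assumes that the described sill is the partial sill, and it does not handle the differing range conventions of the two packages.

```python
# Hypothetical table: scikit-gstat model name -> GSTools class name.
MODEL_NAMES = {
    "spherical": "Spherical",
    "exponential": "Exponential",
    "gaussian": "Gaussian",
    "cubic": "Cubic",
    "stable": "Stable",
    "matern": "Matern",
}


def describe_to_covmodel_args(describe, dim=2):
    """Translate a describe-like dict into (class name, keyword arguments).

    All dict keys used here are assumptions about the describe output.
    """
    name = describe["model"]
    if name not in MODEL_NAMES:
        raise ValueError(f"no matching covariance model for {name!r}")
    kwargs = {
        "dim": dim,
        "var": describe["sill"],  # assumed to be the partial sill
        "len_scale": describe["effective_range"],  # range conventions may differ
        "nugget": describe["nugget"],
    }
    return MODEL_NAMES[name], kwargs


fitted = {"model": "exponential", "sill": 2.0, "effective_range": 5.0, "nugget": 0.1}
cls_name, kwargs = describe_to_covmodel_args(fitted, dim=2)
print(cls_name, kwargs)
```

With GSTools installed, the returned class name could then be looked up on the `gstools` module and called with these keyword arguments. Matching the length scale to the effective range would still need a per-model rescaling step on top of this.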
I'll open an issue, still no access to the project board. This is a discussion issue for the GS-Framework v2.
To make gstools and scikit-gstat work well together, I see a few avenues we could take:
- `Estimator` etc. and `Pipeline` to interface.
- `skgstat.Variogram` could export a fitted `CovModel`, which would partly contradict [Refactor] prefere "cor" to specify userdefined CovModel #90
A main challenge I see is that `skgstat.Variogram` already fits the model to the empirical variogram on instantiation. I cannot see how an object in that state can be transformed, in a concise and clean way, into a `CovModel`, which can be just the model without anything fitted, right @MuellerSeb?
1. using sklearn
This is as of now my favorite path. We can either implement the estimator interface directly in the existing classes, or provide a helper class that inherits from `Estimator` and handles the existing classes.
For scikit-gstat, direct inheritance will break my data flow, and therefore a major version shift will be necessary.
We would also have to define what the result or outcome of these classes is. In the case of Kriging: the PyKrige classes should stick to `Estimator`s or `Predictor`s as well, and would need the variogram estimation result for configuration and e.g. a meshgrid or something else to predict on (in the `predict` method).
The main advantage I see is that GS-Framework can easily be used in data science workflows, where sklearn is really common. GS-Framework would gain clean and clear interfaces between classes, and `skgstat.Variogram` could be used instead of `CovModel` and vice versa. In this case, scikit-gstat can focus more on fitting and variogram analysis, and `CovModel` with the Cython backbone would be the performant, fast big brother for production.
Just suggestions.
A possible result could be the variogram parameters along with the fitted model as a callable. Should be enough for Kriging.
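To make that result concrete, here is a minimal sketch of a helper class that follows the sklearn `fit`/`predict` convention (learned attributes carry a trailing underscore) without importing sklearn, so it runs standalone. The class, the attribute names `params_` and `model_`, the zero nugget and the coarse grid search are all assumptions for illustration. This is not the existing scikit-gstat interface.

```python
import math


def exponential(h, effective_range, sill, nugget=0.0):
    """Exponential variogram model, using the factor 3 for the effective range."""
    return nugget + sill * (1.0 - math.exp(-3.0 * h / effective_range))


class VariogramEstimator:
    """Hypothetical estimator-style wrapper; mimics fit/predict only."""

    def __init__(self, n_grid=200):
        self.n_grid = n_grid

    def fit(self, lags, gamma):
        best = None
        max_lag = max(lags)
        for i in range(1, self.n_grid + 1):
            r = max_lag * i / self.n_grid  # candidate effective range
            basis = [1.0 - math.exp(-3.0 * h / r) for h in lags]
            # For a fixed range, the least-squares sill has a closed form.
            sill = sum(b * g for b, g in zip(basis, gamma)) / sum(b * b for b in basis)
            sse = sum((sill * b - g) ** 2 for b, g in zip(basis, gamma))
            if best is None or sse < best[0]:
                best = (sse, r, sill)
        _, r, sill = best
        # The result: the parameters plus the fitted model as a callable.
        self.params_ = {"effective_range": r, "sill": sill, "nugget": 0.0}
        self.model_ = lambda h: exponential(h, r, sill)
        return self

    def predict(self, lags):
        return [self.model_(h) for h in lags]


# Synthetic empirical variogram generated from known parameters.
lags = [float(k) for k in range(1, 11)]
gamma = [exponential(h, 5.0, 2.0) for h in lags]
est = VariogramEstimator().fit(lags, gamma)
print(est.params_)
```

A Kriging class could then take `est.params_` for configuration and call `est.model_` on the distances it needs.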
2. Direct interface
I only have this in the list because it is way easier to implement and would not impact the whole framework the way option 1 would.
At the moment I have an experimental feature that does exactly this: https://github.com/mmaelicke/scikit-gstat/blob/master/skgstat/interfaces/gstools.py
It is not working correctly at the moment and I might remove it directly again if we go for another avenue.
The advantage of this implementation would be that we can keep both packages in their current logic while still offering a way to map between both to the users, which will definitely be appreciated.
The main challenge from my current point of view is: once you instantiate a `skgstat.Variogram`, it runs through the fitting procedure. That means it is per se a theoretical function fitted to empirical values. It is meant to be used as an analysis tool, because you can change all parameters at runtime, immediately yielding the new fit. `gstools.CovModel` is, as far as I understand it, different. You can create the model and use it, and variogram fitting is just one thing that you might or might not do with it.
Hence, the only thing that makes sense is that `CovModel` indicates whether it was fitted or not (which it might well do already). The interface paths would be
`skgstat.Variogram` --> `gstools.FittedCovModel` or
`skgstat.Variogram` <--> `gstools.FittedCovModel` if you see benefits here.
The only option to export the default `CovModel` would be to use it in `Variogram.model` somehow, but here it might be way easier not to allow that and instead make sure that all theoretical model functions are available for both packages. Like a `Gaussian` class that can return the `Variogram.model` and the `CovModel.cor`.
At the end of the day I think only the fitted versions would be helpful to interface, as for unfitted models the user can simply provide the data to both classes; that's not really a hassle.
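To illustrate the shared-model idea, here is a minimal sketch of a hypothetical `Gaussian` class that exposes both a normalized correlation function and the variogram derived from it. The class, its method names and the range convention (range parameter a = r/2 for effective range r) are assumptions for illustration, not the API of either package.

```python
import math


class Gaussian:
    """Hypothetical shared model definition usable from both sides."""

    def __init__(self, effective_range, sill, nugget=0.0):
        self.effective_range = effective_range
        self.sill = sill  # treated as the partial sill here
        self.nugget = nugget

    def cor(self, h):
        """Normalized correlation: 1 at h = 0, decaying towards 0."""
        a = self.effective_range / 2.0  # assumed range convention
        return math.exp(-(h * h) / (a * a))

    def variogram(self, h):
        """Variogram derived from the correlation function."""
        return self.nugget + self.sill * (1.0 - self.cor(h))


model = Gaussian(effective_range=10.0, sill=2.0, nugget=0.5)
print(model.cor(0.0), model.variogram(10.0))
```

Both views come from the same definition, so a change to the model function would show up in the variogram and in the correlation at once.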
3. Combination
Here, we would go for 1 but implement a low-level export functionality, like `Variogram.make_CovModel` and `CovModel_makeVariogram` or whatsoever.
The main advantage here would be that while `VariogramEstimator` could well return the result needed for Kriging, I am not sure if it could be used for Field generation etc. And why would it?
In any case the sklearn pathways (1, 3) would also have a very personal advantage. Implementing the interface via sklearn is not too complicated when using a helper function (although it has downsides) and I could do that in the near future. Then, future developments in one package do not necessarily have to be reflected in the other. Everyone could develop at their own pace.
For scikit-gstat I already played around and I have a working interface. It is far from unveiling all the power of sklearn and a bit clumsy, but it's working:
https://github.com/mmaelicke/scikit-gstat/blob/master/skgstat/interfaces/variogram_estimator.py
Open for discussion!