nkarasiak/MuseoToolBox

Museo ToolBox is a Python library that simplifies working with raster and vector data, especially for machine learning and remote sensing. It makes it easy to extract raster values from vector polygons and to run spatial or non-spatial cross-validation for scikit-learn from a raster.

The most meaningful contributions are, in my humble opinion, the RasterMath class and the spatial cross-validation.

What's the point?

Today, the main uses of Museo ToolBox are:

  • museotoolbox.cross_validation
    • Create validation/training sets from a vector file, and cross-validation strategies compatible with scikit-learn's GridSearchCV. The aim is to promote spatial cross-validation in order to estimate model performance more reliably (reducing the overestimation caused by spatial autocorrelation).
  • museotoolbox.processing
    • RasterMath applies any array-compatible function to your raster and saves the result. Load a raster with RasterMath and it returns the values of each pixel (in all bands), so you can do whatever you want: predict with a model, smooth a signal (Whittaker, double logistic...), compute a modal value... RasterMath reads and writes the raster block by block to avoid loading the full image in memory. It is compatible with any Python function (including NumPy functions), because the only requirement is that your function takes an array as its first argument.
    • Extract band values from a vector ROI (polygons/points) with the extract_ROI function.
  • museotoolbox.ai (AI based on scikit-learn)
    • SuperLearner simplifies the use of cross-validation by extracting each accuracy metric (kappa, F1, OA, and above all the confusion matrix) from each fold. It also makes it easier to predict a raster (just give the raster name and the model).
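
To make the RasterMath idea above more concrete, here is a minimal sketch of the "function that takes an array" contract combined with block-by-block processing. It does not call Museo ToolBox: apply_by_block and band_mean are hypothetical names invented for this illustration, the image is a small in-memory NumPy array standing in for a raster on disk, and the (n_pixels, n_bands) layout is an assumption based on the description "the value for each pixel (in all bands)".

```python
import numpy as np


def band_mean(pixels):
    """Per-pixel function: receives an array of shape (n_pixels, n_bands)
    and returns one value per pixel."""
    return pixels.mean(axis=1)


def apply_by_block(image, function, block_rows=2):
    """Hypothetical helper (not the Museo ToolBox API): apply `function`
    to an image of shape (rows, cols, bands), a few rows at a time."""
    rows, cols, bands = image.shape
    out = np.empty((rows, cols), dtype=float)
    for start in range(0, rows, block_rows):
        stop = min(start + block_rows, rows)
        block = image[start:stop]            # shape (stop - start, cols, bands)
        pixels = block.reshape(-1, bands)    # one row per pixel
        out[start:stop] = function(pixels).reshape(stop - start, cols)
    return out


# Toy 5 x 4 image with 3 bands, standing in for a real raster.
image = np.arange(5 * 4 * 3, dtype=float).reshape(5, 4, 3)
result = apply_by_block(image, band_mean, block_rows=2)
```

Because the user function only ever sees a two-dimensional array of pixels, the same function works whatever the block size, which is what lets a large image be processed without loading it fully in memory.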

That seems cool, but is there some help to use this?

I imagined Museo ToolBox as a tool to simplify raster processing and to promote spatial cross-validation, so of course there is some help: complete documentation with many examples is available on Read the Docs.

How do I install Museo ToolBox?

We recommend installing Museo ToolBox via conda, as it includes the GDAL dependency:

conda install -c conda-forge museotoolbox

However, if you prefer to install this library via pip, you need to install GDAL first, then run:

python3 -m pip install museotoolbox --user

Early adopters can install the latest development version directly from git:

python3 -m pip install https://github.com/nkarasiak/museotoolbox/archive/develop.zip --user -U

Feel free to remove --user if you want to install the library for every user on the machine or if some dependencies need root access. The -U flag upgrades the package if a newer version exists.

Using and citing the toolbox

If you use Museo ToolBox in your research and find it useful, please cite this library using the following BibTeX reference:

@article{Karasiak2020,
  doi = {10.21105/joss.01978},
  url = {https://doi.org/10.21105/joss.01978},
  year = {2020},
  publisher = {The Open Journal},
  volume = {5},
  number = {48},
  pages = {1978},
  author = {Nicolas Karasiak},
  title = {Museo ToolBox: A Python library for remote sensing including a new way to handle rasters.},
  journal = {Journal of Open Source Software}
}

Or copy this citation:

Karasiak, N., (2020). Museo ToolBox: A Python library for remote sensing including a new way to handle rasters. Journal of Open Source Software, 5(48), 1978, https://doi.org/10.21105/joss.01978

I want to improve Museo ToolBox, how can I contribute?

To contribute to this package, please read the instructions in CONTRIBUTING.rst.

Who built Museo ToolBox?

I am Nicolas Karasiak, a PhD student at Dynafor Lab. I work on tree species mapping from space through dense satellite image time series, especially with Sentinel-2. Special thanks go to Mathieu Fauvel, who introduced me to the beautiful world of open source.

Why this name?

As Orfeo ToolBox is one of my favorite and most useful libraries for working with raster data, I chose to name my work Museo because, in ancient Greek religion and myth, Museo is the son and disciple of Orfeo. If you want an acronym, let's say MUSEO means 'Multiple Useful Services for Earth Observation'.