In this tutorial, we build up the k-means algorithm step by step. This tutorial uses only standard Python.
The steps build on one another, each one asking and answering a simple question about a data distribution.
To use this notebook, you'll need an installation of Python (preferably a recent version of Python 3), Jupyter (to run the notebook, `pip install jupyter`) and matplotlib (for the plotting, `pip install matplotlib`).
To solve real problems, use scikit-learn instead of the code in this tutorial.
Enjoy!