k-Means clustering is one of the most widely used clustering methods. The Lloyd algorithm serves as the de facto standard for optimizing cluster centroids in k-Means by minimizing intra-cluster dissimilarity. Despite its age, Lloyd’s algorithm remains relevant due to numerous adaptations over time. One such adaptation is Elkan’s algorithm, which improves efficiency by leveraging the triangle inequality to reduce the number of distance computations during clustering.
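The triangle-inequality idea behind Elkan's algorithm can be sketched in a few lines. If a point x is currently assigned to center c, and another center c' satisfies d(c, c') >= 2 * d(x, c), then d(x, c') >= d(x, c), so the distance to c' never needs to be computed. The sketch below illustrates only this one pruning rule. It is not the code in means.py, and it omits the upper and lower bounds that the full algorithm carries across iterations. The function name `assign_with_triangle_pruning` and the sample data are hypothetical.

```python
import math


def assign_with_triangle_pruning(points, centers):
    """Assign each point to its nearest center, skipping centers that the
    triangle inequality proves cannot be closer than the current best.

    Returns (labels, number_of_point_to_center_distances_computed).
    """
    k = len(centers)
    # Center-to-center distances are computed once and reused for every point.
    cc = [[math.dist(a, b) for b in centers] for a in centers]
    labels = []
    computed = 0
    for x in points:
        best, best_d = 0, math.dist(x, centers[0])
        computed += 1
        for j in range(1, k):
            # d(c_best, c_j) >= 2 * d(x, c_best) implies d(x, c_j) >= d(x, c_best),
            # so center j cannot improve the assignment.
            if cc[best][j] >= 2.0 * best_d:
                continue
            d = math.dist(x, centers[j])
            computed += 1
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, computed


# Hypothetical toy data: two tight groups and one far-away, unused center.
points = [(0.0, 0.1), (0.2, -0.1), (10.0, 10.1), (9.8, 10.0)]
centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
labels, computed = assign_with_triangle_pruning(points, centers)
print(labels, computed)  # prints [0, 0, 1, 1] 6
```

A brute-force assignment would compute 4 * 3 = 12 point-to-center distances on this toy input, so the pruning rule skips half of them while returning the same labels.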
This work explores the integration of Ptolemy’s inequality, a four-point counterpart of the triangle inequality that holds in Euclidean space, into the k-Means framework. The goal is to further enhance computational efficiency while maintaining clustering quality. The results demonstrate the theoretical and practical potential of this approach.
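For reference, Ptolemy's inequality states that any four points x, y, p, q in Euclidean space satisfy d(x, y) * d(p, q) <= d(x, p) * d(y, q) + d(x, q) * d(y, p). Rearranging gives a lower bound on d(x, y) from distances to two reference points p and q: d(x, y) >= |d(x, p) * d(y, q) - d(x, q) * d(y, p)| / d(p, q). The sketch below only demonstrates this bound on sample points. How means.py actually chooses the reference points and applies the bound is not described in this README, so the function name, the argument order, and the sample points are assumptions made for illustration.

```python
import math


def ptolemy_lower_bound(d_xp, d_xq, d_yp, d_yq, d_pq):
    """Lower bound on d(x, y) from Ptolemy's inequality, given the distances
    of x and y to two distinct reference points p and q (Euclidean space)."""
    if d_pq <= 0.0:
        raise ValueError("reference points p and q must be distinct")
    return abs(d_xp * d_yq - d_xq * d_yp) / d_pq


# Hypothetical sample points in the plane.
x, y = (0.0, 0.0), (3.0, 4.0)
p, q = (1.0, 0.0), (0.0, 1.0)

bound = ptolemy_lower_bound(
    math.dist(x, p), math.dist(x, q),
    math.dist(y, p), math.dist(y, q),
    math.dist(p, q),
)
true_distance = math.dist(x, y)

# The bound never exceeds the true distance, so a candidate whose bound is
# already larger than the best distance found so far can be skipped.
print(bound <= true_distance)
```

The bound is used in the same way as the triangle-inequality bound: when it already exceeds the distance to the currently assigned centroid, the exact distance to the candidate centroid does not need to be computed.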
- main.py: Runs all experiments for the paper and creates the results/* files; it calls:
  - means.py: Contains the implementation of the standard k-Means algorithm, Elkan’s algorithm, and the novel extension using Ptolemy’s inequality.
  - plot_results.py: Creates the single plot of the paper.
The repo also contains:
- notebook.ipynb: A Jupyter Notebook demonstrating how to use the means.py module and showcasing key functionalities.
- Makefile: Programmatic overview of how the files interact, to be run with GNU Make.
- requirements.txt and pyproject.toml: The two most common formats for listing all Python dependencies (and minimal versions, in the case of the latter).
- results/clustering_performance_results_formatted.csv: Contains the results of experimental evaluations, including performance metrics and comparisons across different implementations.
- Install prerequisites: Building this paper requires Python 3.9.21 or higher, lualatex 1.18.0, and GNU Make 4.4.1. It optionally requires qpdf 11.9.1 to linearize the resulting paper. See pyproject.toml for the version requirements of the Python libraries used. You might be able to use older versions of the listed software, but we cannot guarantee compatibility or identical outputs. The code was run and the paper created on Linux 6.12.8.
- Clone the repository:
    git clone [email protected]:nikitaaveritchev/kmeans-Extension-Ptolemy.git
    cd kmeans-Extension-Ptolemy
- Run the build script:
    make all
This project contains code under two licenses:
- CC BY 4.0: All original contributions in this project are licensed under the Creative Commons Attribution 4.0 International License.
- MIT License: Portions of this project are derived from jjcordano's project, which is licensed under the MIT license. These portions remain under the MIT license.
If using this project, please provide attribution for both:
- The original work by jjcordano under the MIT license.
- This project under the CC BY 4.0 license.