gagolews committed Oct 29, 2023
1 parent de99caf commit 8e16f29
Showing 49 changed files with 310 additions and 145 deletions.
37 changes: 18 additions & 19 deletions .devel/sphinx/conf.py
@@ -6,6 +6,9 @@

import genieclust




pkg_name = "genieclust"
pkg_title = "genieclust"
pkg_version = genieclust.__version__
@@ -36,6 +39,20 @@
highlight_language = "python"
html_last_updated_fmt = today_fmt

+plot_include_source = True
+plot_html_show_source_link = False
+plot_pre_code = """
+import numpy as np
+import genieclust
+import matplotlib.pyplot as plt
+np.random.seed(123)
+"""
+doctest_global_setup = plot_pre_code
+numpydoc_use_plots = True
+# https://www.sphinx-doc.org/en/master/usage/extensions/autosummary.html
+autosummary_imported_members = True
+autosummary_generate = True
+
extensions = [
'myst_parser',
'sphinx.ext.mathjax',
@@ -78,25 +95,7 @@
'code-block': 'Listing %s',
'section': 'Section %s'
}
-numfig_secnum_depth = 1
-
-plot_include_source = True
-plot_html_show_source_link = False
-plot_pre_code = """
-import numpy as np
-import genieclust
-import matplotlib.pyplot as plt
-np.random.seed(123)
-"""
-
-doctest_global_setup = plot_pre_code
-
-numpydoc_use_plots = True
-
-# https://www.sphinx-doc.org/en/master/usage/extensions/autosummary.html
-autosummary_imported_members = True
-autosummary_generate = True
-
+numfig_secnum_depth = 0

html_theme = 'furo'

20 changes: 0 additions & 20 deletions .devel/sphinx/fix-code-blocks.sh

This file was deleted.

18 changes: 18 additions & 0 deletions .devel/sphinx/fix-html.sh
@@ -0,0 +1,18 @@
#!/bin/bash

# Copyright (C) 2020-2023, Marek Gagolewski <https://www.gagolewski.com/>

set -e

if [ ! -d "${1}" ]; then
    echo "The input directory does not exist or was not provided."
    exit 1
fi


cd "${1}"

for f in *.html; do
    # merge input and output chunks:
    sed -rz --in-place 's/<\/pre><\/div>\n<\/div>\n<div class="highlight-(r|python) notranslate"><div class="highlight"><pre>//g' "${f}"
done
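
Reviewer's note: for readers less familiar with `sed -z`, the substitution in `fix-html.sh` can be sketched in Python. This is only an illustration under stated assumptions: the function name `merge_chunks` and the sample HTML are hypothetical and are not part of this commit; only the pattern itself is taken from the script.

```python
import re

# Same pattern as in the sed call: the closing tags of one highlighted chunk,
# immediately followed by the opening tags of the next R or Python chunk.
CHUNK_BOUNDARY = re.compile(
    r'</pre></div>\n</div>\n'
    r'<div class="highlight-(?:r|python) notranslate">'
    r'<div class="highlight"><pre>'
)


def merge_chunks(html: str) -> str:
    """Delete every boundary between two adjacent highlighted chunks."""
    return CHUNK_BOUNDARY.sub("", html)


# Made-up sample: an input chunk directly followed by its output chunk.
sample = (
    '<div class="highlight-r notranslate"><div class="highlight"><pre>'
    'gini_index(x)\n'
    '</pre></div>\n</div>\n'
    '<div class="highlight-r notranslate"><div class="highlight"><pre>'
    '## [1] 0\n'
    '</pre></div>\n</div>\n'
)

merged = merge_chunks(sample)
print(merged)
```

Because the whole boundary is deleted, the output lines end up inside the same `<pre>` element as the code that produced them, which appears to be the purpose of the "merge input and output chunks" step.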
2 changes: 1 addition & 1 deletion .devel/sphinx/news.md
@@ -86,7 +86,7 @@
`wcnn_index`.

These cluster validity measures are discussed
-in more detail at <https://clustering-benchmarks.gagolewski.com>.
+in more detail at <https://clustering-benchmarks.gagolewski.com/>.

* [BACKWARD INCOMPATIBILITY] `normalized_confusion_matrix`
now solves the maximal assignment problem instead of applying
9 changes: 9 additions & 0 deletions .devel/sphinx/rapi/cluster_validity.md
@@ -87,7 +87,16 @@ X <- as.matrix(iris[,1:4])
X[,] <- jitter(X) # otherwise we get a non-unique solution
y <- as.integer(iris[[5]])
calinski_harabasz_index(X, y) # good
```

```
## [1] 486.6681
```

```r
calinski_harabasz_index(X, sample(1:3, nrow(X), replace=TRUE)) # bad
```

```
## [1] 2.836713
```
75 changes: 75 additions & 0 deletions .devel/sphinx/rapi/compare_partitions.md
@@ -108,32 +108,107 @@ Gagolewski M., <span class="pkg">genieclust</span>: Fast and robust hierarchical
y_true <- iris[[5]]
y_pred <- kmeans(as.matrix(iris[1:4]), 3)$cluster
normalized_clustering_accuracy(y_true, y_pred)
```

```
## [1] 0.84
```

```r
normalized_pivoted_accuracy(y_true, y_pred)
```

```
## [1] 0.84
```

```r
pair_sets_index(y_true, y_pred)
```

```
## [1] 0.7568238
```

```r
pair_sets_index(y_true, y_pred, simplified=TRUE)
```

```
## [1] 0.7470968
```

```r
adjusted_rand_score(y_true, y_pred)
```

```
## [1] 0.7302383
```

```r
rand_score(table(y_true, y_pred)) # a confusion matrix is accepted, too
```

```
## [1] 0.8797315
```

```r
adjusted_fm_score(y_true, y_pred)
```

```
## [1] 0.7304411
```

```r
fm_score(y_true, y_pred)
```

```
## [1] 0.8208081
```

```r
mi_score(y_true, y_pred)
```

```
## [1] 0.8255911
```

```r
normalized_mi_score(y_true, y_pred)
```

```
## [1] 0.7581757
```

```r
adjusted_mi_score(y_true, y_pred)
```

```
## [1] 0.7551192
```

```r
normalized_confusion_matrix(y_true, y_pred)
```

```
##      [,1] [,2] [,3]
## [1,]   50    0    0
## [2,]    0   48    2
## [3,]    0   14   36
```

```r
normalizing_permutation(y_true, y_pred)
```

```
## [1] 1 2 3
```
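
Reviewer's note: several of the scores printed above can be re-derived by hand from the confusion matrix alone. The sketch below uses the textbook pair-counting definitions of the Rand and adjusted Rand indices and a brute-force search over column permutations for the maximal assignment. It is an independent check and not the package's implementation: the helper names are hypothetical, and the formulas are assumed to match the package only because they reproduce the printed values.

```python
from itertools import permutations
from math import comb

# Confusion matrix copied from the normalized_confusion_matrix output above.
C = [[50, 0, 0],
     [0, 48, 2],
     [0, 14, 36]]


def pair_counts(C):
    """Pairs co-clustered in both partitions, in rows, in columns, and in total."""
    n = sum(map(sum, C))
    cells = sum(comb(v, 2) for row in C for v in row)
    rows = sum(comb(sum(row), 2) for row in C)
    cols = sum(comb(sum(col), 2) for col in zip(*C))
    return cells, rows, cols, comb(n, 2)


def rand_index(C):
    cells, rows, cols, total = pair_counts(C)
    return (total + 2 * cells - rows - cols) / total


def adjusted_rand_index(C):
    cells, rows, cols, total = pair_counts(C)
    expected = rows * cols / total
    return (cells - expected) / ((rows + cols) / 2 - expected)


def best_assignment(C):
    """Column permutation maximising the matched counts (brute force, small k only)."""
    k = len(C)
    return max(permutations(range(k)),
               key=lambda p: sum(C[i][p[i]] for i in range(k)))


def pivoted_accuracy_normalized(C):
    """(accuracy - 1/k) / (1 - 1/k) under the best column matching."""
    k, n = len(C), sum(map(sum, C))
    p = best_assignment(C)
    accuracy = sum(C[i][p[i]] for i in range(k)) / n
    return (accuracy - 1 / k) / (1 - 1 / k)


print(rand_index(C), adjusted_rand_index(C), pivoted_accuracy_normalized(C))
```

Trying all `k!` permutations is only reasonable for a handful of clusters; for larger matrices a dedicated linear sum assignment solver would be the usual choice.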
12 changes: 12 additions & 0 deletions .devel/sphinx/rapi/gclust.md
@@ -142,9 +142,21 @@ plot(iris[,2], iris[,3], col=y_pred,

```r
adjusted_rand_score(y_test, y_pred)
```

```
## [1] 0.8857921
```

```r
pair_sets_index(y_test, y_pred)
```

```
## [1] 0.9049708
```

```r
# Fast for low-dimensional Euclidean spaces:
# h <- gclust(emst_mlpack(X))
```
69 changes: 69 additions & 0 deletions .devel/sphinx/rapi/inequality.md
@@ -87,27 +87,96 @@ Gagolewski M., <span class="pkg">genieclust</span>: Fast and robust hierarchical

```r
gini_index(c(2, 2, 2, 2, 2)) # no inequality
```

```
## [1] 0
```

```r
gini_index(c(0, 0, 10, 0, 0)) # one has it all
```

```
## [1] 1
```

```r
gini_index(c(7, 0, 3, 0, 0)) # give to the poor, take away from the rich
```

```
## [1] 0.85
```

```r
gini_index(c(6, 0, 3, 1, 0)) # (a.k.a. Pigou-Dalton principle)
```

```
## [1] 0.75
```

```r
bonferroni_index(c(2, 2, 2, 2, 2))
```

```
## [1] 0
```

```r
bonferroni_index(c(0, 0, 10, 0, 0))
```

```
## [1] 1
```

```r
bonferroni_index(c(7, 0, 3, 0, 0))
```

```
## [1] 0.90625
```

```r
bonferroni_index(c(6, 0, 3, 1, 0))
```

```
## [1] 0.8333333
```

```r
devergottini_index(c(2, 2, 2, 2, 2))
```

```
## [1] 0
```

```r
devergottini_index(c(0, 0, 10, 0, 0))
```

```
## [1] 1
```

```r
devergottini_index(c(7, 0, 3, 0, 0))
```

```
## [1] 0.7662338
```

```r
devergottini_index(c(6, 0, 3, 1, 0))
```

```
## [1] 0.6493506
```
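
Reviewer's note: the values printed above are consistent with the usual normalised forms of the three indices, so they can be checked with a few lines of plain Python. The sketch below is an independent re-derivation and not the package's code. The formulas are assumptions in the sense that they were chosen because they reproduce the printed outputs: Gini as the sum of absolute pairwise differences divided by `(n-1)*sum(x)`, Bonferroni from the means of the `i` smallest values, and De Vergottini from the means of the `i` largest values.

```python
def gini_index(x):
    """Normalised Gini index: pairwise absolute differences over (n-1)*sum(x)."""
    n, total = len(x), sum(x)
    pair_diffs = sum(abs(a - b) for i, a in enumerate(x) for b in x[i + 1:])
    return pair_diffs / ((n - 1) * total)


def _mean_ratio_sum(values):
    """Sum over i of (mean of the first i values) / (overall mean)."""
    mu = sum(values) / len(values)
    running, acc = 0.0, 0.0
    for i, v in enumerate(values, start=1):
        running += v
        acc += (running / i) / mu
    return acc


def bonferroni_index(x):
    """Normalised Bonferroni index, based on means of the i smallest values."""
    n = len(x)
    acc = _mean_ratio_sum(sorted(x))
    return n / (n - 1) * (1 - acc / n)


def devergottini_index(x):
    """Normalised De Vergottini index, based on means of the i largest values."""
    n = len(x)
    acc = _mean_ratio_sum(sorted(x, reverse=True))
    normaliser = sum(1 / i for i in range(2, n + 1))
    return (acc / n - 1) / normaliser


for x in ([2, 2, 2, 2, 2], [0, 0, 10, 0, 0], [7, 0, 3, 0, 0], [6, 0, 3, 1, 0]):
    print(x, gini_index(x), bonferroni_index(x), devergottini_index(x))
```

All three indices return 0 for the perfectly equal vector and 1 when a single element holds everything, and each decreases when 1 unit is moved from the richest element to a poorer one, as in the last two examples.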
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -2,7 +2,7 @@ Package: genieclust
Type: Package
Title: Fast and Robust Hierarchical Clustering with Noise Points Detection
Version: 1.1.5-2
-Date: 2023-10-26
+Date: 2023-10-29
Authors@R: c(
person("Marek", "Gagolewski",
role = c("aut", "cre", "cph"),
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,5 +1,5 @@
genieclust package for R and Python
-Copyleft (C) 2018-2023, Marek Gagolewski <https://www.gagolewski.com>
+Copyleft (C) 2018-2023, Marek Gagolewski <https://www.gagolewski.com/>

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License