Commit

Merge pull request #88 from mrucker/fix_distances_documentation
fix: Improved distances_ and height documentation.
gagolews authored Jun 17, 2024
2 parents 5f04949 + 8bf5247 commit 44cd449
Showing 3 changed files with 8 additions and 7 deletions.
2 changes: 1 addition & 1 deletion .devel/sphinx/rapi/gclust.md
@@ -92,7 +92,7 @@ If `d` is a numeric matrix or an object of class `dist`, [`mst()`](mst.md) will

Given an minimum spanning tree, the algorithm runs in $O(n \sqrt{n})$ time. Therefore, if you want to test different `gini_threshold`s, (or `k`s), it is best to explicitly compute the MST first.

-According to the algorithm\'s original definition, the resulting partition tree (dendrogram) might violate the ultrametricity property (merges might occur at levels that are not increasing w.r.t. a between-cluster distance). Departures from ultrametricity are corrected by applying `height = rev(cummin(rev(height)))`.
+According to the algorithm\'s original definition, the resulting partition tree (dendrogram) might violate the ultrametricity property (merges might occur at levels that are not increasing w.r.t. a between-cluster distance). `gclust()` automatically corrects departures from ultrametricity by applying `height = rev(cummin(rev(height)))`.

## Value

4 changes: 2 additions & 2 deletions R/gclust.R
@@ -85,8 +85,8 @@
#' the resulting partition tree (dendrogram) might violate
#' the ultrametricity property (merges might occur at levels that
#' are not increasing w.r.t. a between-cluster distance).
-#' Departures from ultrametricity are corrected by applying
-#' \code{height = rev(cummin(rev(height)))}.
+#' \code{gclust()} automatically corrects departures from
+#' ultrametricity by applying \code{height = rev(cummin(rev(height)))}.
#'
#'
#' @param d a numeric matrix (or an object coercible to one,
9 changes: 5 additions & 4 deletions genieclust/genie.py
@@ -793,10 +793,11 @@ class Genie(GenieBase):
the distance between two clusters merged in each iteration,
see the description of ``Z[:,2]`` in `scipy.cluster.hierarchy.linkage`.
-As the original Genie algorithm does not guarantee that that distances
-are ordered increasingly (there are other hierarchical
-clustering linkages that violate the ultrametricity property as well),
-these are corrected by applying
+As the original Genie algorithm does not guarantee that distances
+are ordered increasingly (there are other hierarchical clustering
+linkages that violate the ultrametricity property as well), Genie
+automatically applies the following correction:
``distances_ = genieclust.tools.cummin(distances_[::-1])[::-1]``.
counts_ : None or ndarray
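For readers unfamiliar with the correction named in these docstrings, here is a minimal sketch of what a reversed cumulative minimum does to a sequence of merge heights. It assumes that `genieclust.tools.cummin` computes a running minimum and uses NumPy's `np.minimum.accumulate` as a stand-in; the function name `enforce_monotone_heights` and the sample values are hypothetical and are not part of genieclust.

```python
import numpy as np


def enforce_monotone_heights(heights):
    """Replace each height by the minimum of itself and all later heights.

    Hypothetical helper: mirrors ``rev(cummin(rev(height)))`` from the
    documentation, with np.minimum.accumulate standing in for cummin.
    """
    h = np.asarray(heights, dtype=float)
    # Reverse, take the running minimum, then reverse back.
    return np.minimum.accumulate(h[::-1])[::-1]


# Made-up merge heights that are not ordered increasingly.
heights = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
corrected = enforce_monotone_heights(heights)
print(corrected)  # [1. 2. 2. 4. 4.]
```

The result is non-decreasing because every entry is capped by the entries that follow it, so an out-of-order merge level is lowered to the smallest level that comes after it.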