Commit
Added helper functions and fixed mrd memory leak.
Showing 34 changed files with 907 additions and 434 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,21 +1,22 @@ | ||
Package: dbscan | ||
-Version: 1.1-9 | ||
-Date: 2022-01-10 | ||
-Title: Density Based Clustering of Applications with Noise (DBSCAN) and Related | ||
-Algorithms | ||
+Version: 1.1-10 | ||
+Date: 2022-01-14 | ||
+Title: Density-Based Spatial Clustering of Applications with Noise (DBSCAN) | ||
+and Related Algorithms | ||
Authors@R: c(person("Michael", "Hahsler", role = c("aut", "cre", "cph"), | ||
email = "[email protected]"), | ||
person("Matthew", "Piekenbrock", role = c("aut", "cph")), | ||
person("Sunil", "Arya", role = c("ctb", "cph")), | ||
person("David", "Mount", role = c("ctb", "cph"))) | ||
Description: A fast reimplementation of several density-based algorithms of | ||
-the DBSCAN family for spatial data. Includes the clustering algorithms | ||
-DBSCAN (density-based spatial clustering of applications with noise) | ||
-and HDBSCAN (hierarchical DBSCAN), the ordering algorithm | ||
-OPTICS (ordering points to identify the clustering structure), | ||
-and the outlier detection algorithm LOF (local outlier factor). | ||
-The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. | ||
-An R interface to fast kNN and fixed-radius NN search is also provided. | ||
+the DBSCAN family. Includes the clustering algorithms DBSCAN (density-based | ||
+spatial clustering of applications with noise) and HDBSCAN (hierarchical | ||
+DBSCAN), the ordering algorithm OPTICS (ordering points to identify the | ||
+clustering structure), shared nearest neighbor clustering, and the outlier | ||
+detection algorithms LOF (local outlier factor) and GLOSH (global-local | ||
+outlier score from hierarchies). The implementations use the kd-tree data | ||
+structure (from library ANN) for faster k-nearest neighbor search. An R | ||
+interface to fast kNN and fixed-radius NN search is also provided. | ||
Hahsler, Piekenbrock and Doran (2019) <doi:10.18637/jss.v091.i01>. | ||
SystemRequirements: C++11 | ||
Imports: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
####################################################################### | ||
# dbscan - Density Based Clustering of Applications with Noise | ||
# and Related Algorithms | ||
# Copyright (C) 2017 Michael Hahsler | ||
|
||
# This program is free software; you can redistribute it and/or modify | ||
# it under the terms of the GNU General Public License as published by | ||
# the Free Software Foundation; either version 2 of the License, or | ||
# any later version. | ||
# | ||
# This program is distributed in the hope that it will be useful, | ||
# but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
# GNU General Public License for more details. | ||
# | ||
# You should have received a copy of the GNU General Public License along | ||
# with this program; if not, write to the Free Software Foundation, Inc., | ||
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. | ||
|
||
#' Find Connected Components in a NN Graph | ||
#' | ||
#' Generic function and methods to find connected components in nearest neighbor graphs. | ||
#' | ||
#' Note that for kNN graphs, one point may be in the kNN of the other but not vice versa. | ||
#' `mutual = TRUE` requires that both points are in each other's kNN. | ||
#' | ||
#' @family NN functions | ||
#' @aliases components | ||
#' | ||
#' @param x the [NN] object representing the graph or a [dist] object | ||
#' @param eps threshold on the distance | ||
#' @param mutual for a pair of points, do both have to be in each other's neighborhood? | ||
#' @param ... further arguments are currently unused. | ||
#' | ||
#' @return an integer vector with component assignments. | ||
#' | ||
#' @author Michael Hahsler | ||
#' @keywords model | ||
#' @examples | ||
#' set.seed(665544) | ||
#' n <- 100 | ||
#' x <- cbind( | ||
#' x=runif(10, 0, 5) + rnorm(n, sd = 0.4), | ||
#' y=runif(10, 0, 5) + rnorm(n, sd = 0.4) | ||
#' ) | ||
#' plot(x, pch = 16) | ||
#' | ||
#' # Connected components on a graph where each pair of points | ||
#' # with a distance less than or equal to eps are connected | ||
#' d <- dist(x) | ||
#' components <- comps(d, eps = .8) | ||
#' plot(x, col = components, pch = 16) | ||
#' | ||
#' # Connected components in a fixed radius nearest neighbor graph | ||
#' # Gives the same result as the threshold on the distances above | ||
#' frnn <- frNN(x, eps = .8) | ||
#' components <- comps(frnn) | ||
#' plot(frnn, data = x, col = components) | ||
#' | ||
#' # Connected components on a k nearest neighbors graph | ||
#' knn <- kNN(x, 3) | ||
#' components <- comps(knn, mutual = FALSE) | ||
#' plot(knn, data = x, col = components) | ||
#' | ||
#' components <- comps(knn, mutual = TRUE) | ||
#' plot(knn, data = x, col = components) | ||
#' | ||
#' # Connected components in a shared nearest neighbor graph | ||
#' snn <- sNN(x, k = 10, kt = 5) | ||
#' components <- comps(snn) | ||
#' plot(snn, data = x, col = components) | ||
#' @export comps | ||
comps <- function(x, ...) UseMethod("comps", x) | ||
|
||
#' @rdname comps | ||
comps.dist <- function(x, eps, ...) | ||
stats::cutree(stats::hclust(x, method = "single"), h = eps) | ||
|
||
#' @rdname comps | ||
comps.kNN <- function(x, mutual = FALSE, ...) | ||
as.integer(factor(comps_kNN(x$id, as.logical(mutual)))) | ||
|
||
# sNN and frNN are symmetric so no need for mutual | ||
#' @rdname comps | ||
comps.sNN <- function(x, ...) comps.kNN(x, mutual = FALSE) | ||
|
||
#' @rdname comps | ||
comps.frNN <- function(x, ...) comps_frNN(x$id, mutual = FALSE) |
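
The following base R sketch illustrates two ideas from the file above. It is not the package's implementation: `label_components` and `same_partition` are hypothetical helpers written for this illustration, and the compiled routines `comps_kNN` and `comps_frNN` are not shown in this diff. The first part checks that cutting a single-linkage dendrogram at height `eps`, as `comps.dist` does, yields the same partition as the connected components of the graph that links every pair of points at distance `eps` or less. The second part shows how requiring mutual neighbors can only remove edges from a kNN graph. Treating `mutual = FALSE` as "an edge in either direction suffices" is an assumption based on the documentation text.

```r
# Label connected components of an undirected graph given as a logical
# adjacency matrix, using breadth-first search.
# Hypothetical helper for illustration; not part of dbscan.
label_components <- function(adj) {
  n <- nrow(adj)
  labels <- integer(n)  # 0 means "not visited yet"
  current <- 0L
  for (start in seq_len(n)) {
    if (labels[start] != 0L) next
    current <- current + 1L
    labels[start] <- current
    queue <- start
    while (length(queue) > 0L) {
      v <- queue[1L]
      queue <- queue[-1L]
      nb <- which(adj[v, ] & labels == 0L)  # unvisited neighbors of v
      labels[nb] <- current
      queue <- c(queue, nb)
    }
  }
  labels
}

# Two label vectors describe the same partition if they agree after
# renumbering labels by order of first appearance.
same_partition <- function(a, b) {
  all(match(a, unique(a)) == match(b, unique(b)))
}

set.seed(1)
x <- cbind(x = runif(30), y = runif(30))
d <- dist(x)
dm <- unname(as.matrix(d))
eps <- 0.15

# Part 1: eps-threshold graph versus single linkage cut at eps.
adj_eps <- dm <= eps
diag(adj_eps) <- FALSE
by_graph <- label_components(adj_eps)
by_hclust <- unname(stats::cutree(stats::hclust(d, method = "single"), h = eps))
print(same_partition(by_graph, by_hclust))

# Part 2: kNN graph; row i of `id` holds the k nearest neighbors of point i.
k <- 3
diag(dm) <- Inf  # a point is never its own neighbor
id <- t(apply(dm, 1, function(row) order(row)[seq_len(k)]))
n <- nrow(dm)
directed <- matrix(FALSE, n, n)
directed[cbind(rep(seq_len(n), k), as.vector(id))] <- TRUE

either <- directed | t(directed)  # assumed meaning of mutual = FALSE
mutual <- directed & t(directed)  # both points are in each other's kNN

comp_either <- label_components(either)
comp_mutual <- label_components(mutual)
print(c(either = length(unique(comp_either)),
        mutual = length(unique(comp_mutual))))
```

Because the mutual graph keeps only a subset of the edges, it can never have fewer components than the graph that accepts an edge in either direction.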