Hnsw0.8.0 #19

Merged 9 commits on Feb 4, 2024
41 changes: 24 additions & 17 deletions DESCRIPTION
@@ -1,23 +1,30 @@
 Package: RcppHNSW
-Title: 'Rcpp' Bindings for 'hnswlib', a Library for Approximate Nearest Neighbors
-Version: 0.5.9000
-Authors@R: c(person("James", "Melville", email = "[email protected]",
-role = c("aut", "cre")),
-person("Aaron", "Lun", role = "ctb"),
-person("Samuel", "Granjeaud", role = "ctb"),
-person("Dmitriy", "Selivanov", role = "ctb"),
-person("Yuxing", "Liao", role = "ctb"))
-Description: 'Hnswlib' is a C++ library for Approximate Nearest Neighbors. This
-package provides a minimal R interface by relying on the 'Rcpp' package. See
-<https://github.com/nmslib/hnswlib> for more on 'hnswlib'. 'hnswlib' is
-released under Version 2.0 of the Apache License.
+Title: 'Rcpp' Bindings for 'hnswlib', a Library for Approximate Nearest
+Neighbors
+Version: 0.6.0
+Authors@R: c(
+person("James", "Melville", , "[email protected]", role = c("aut", "cre", "cph")),
+person("Aaron", "Lun", role = "ctb"),
+person("Samuel", "Granjeaud", role = "ctb"),
+person("Dmitriy", "Selivanov", role = "ctb"),
+person("Yuxing", "Liao", role = "ctb")
+)
+Description: 'Hnswlib' is a C++ library for Approximate Nearest Neighbors.
+This package provides a minimal R interface by relying on the 'Rcpp'
+package. See <https://github.com/nmslib/hnswlib> for more on
+'hnswlib'. 'hnswlib' is released under Version 2.0 of the Apache
+License.
 License: GPL (>= 3)
 URL: https://github.com/jlmelville/rcpphnsw
 BugReports: https://github.com/jlmelville/rcpphnsw/issues
+Imports:
+methods,
+Rcpp (>= 0.11.3)
+Suggests:
+covr,
+testthat
+LinkingTo:
+Rcpp
 Encoding: UTF-8
-Imports: methods, Rcpp (>= 0.11.3)
-LinkingTo: Rcpp
-RoxygenNote: 7.2.3
-Suggests: testthat,
-covr
 Roxygen: list(markdown = TRUE)
+RoxygenNote: 7.3.1
6 changes: 5 additions & 1 deletion NEWS.md
@@ -1,4 +1,8 @@
-# RcppHNSW 0.5.9000
+# RcppHNSW 0.6.0
+
+## New features
+
+* Updated hnswlib to [version 0.8.0](https://github.com/nmslib/hnswlib/releases/tag/v0.8.0).

 # RcppHNSW 0.5.0

48 changes: 15 additions & 33 deletions cran-comments.md
@@ -1,64 +1,46 @@
 ## Release Summary

-This is a patch release to fix various CRAN check errors.
+This is a patch release for a new version of the underlying hnswlib library.

 ## Test environments

-* ubuntu 22.04 (on github actions), R 4.2.3, R 4.3.1, devel
-* local ubuntu 23.04 R 4.2.2
+* ubuntu 22.04 (on github actions), R 4.2.3, R 4.3.2, devel
+* local ubuntu 23.10 R 4.3.1
 * Debian Linux, R-devel, GCC ASAN/UBSAN (via rhub)
 * Debian Linux, R-release, GCC (via rhub)
 * Debian Linux, R-release, GCC valgrind (via rhub)
 * Ubuntu Linux 20.04.1 LTS, R-release, GCC (via rhub)
 * Fedora Linux, R-devel, clang, gfortran (via rhub)
-* Windows Server 2022 (on github actions), R 4.2.3, R 4.3.1
+* Windows Server 2022 (on github actions), R 4.2.3, R 4.3.2
 * Windows Server 2022, R-devel, 64 bit (via rhub)
-* local Windows 11 build, R 4.3.1
+* local Windows 11 build, R 4.3.2
 * win-builder (devel)
-* mac OS X Monterey (on github actions) R 4.3.1
+* local mac OS X Sonoma R 4.3.2
+* mac OS X Monterey (on github actions) R 4.3.2

 ## R CMD check results

 There were no ERRORs or WARNINGs.

 There was one NOTE:

-N checking installed package size ...
-installed size is 6.6Mb
-sub-directories of 1Mb or more:
-libs 6.3Mb
+* checking installed package size ... NOTE
+installed size is 6.7Mb
+sub-directories of 1Mb or more:
+libs 6.4Mb

 This is expected due to the use of C++ templates in hnswlib.

-This is spelled correctly.
-
 ## CRAN checks

 There are no ERRORs or WARNINGs.

-There is a NOTE:
-
-Check: C++ specification
-Result: NOTE
-Specified C++11: please drop specification unless essential
-
-This submission fixes this.
-
-There is a NOTE:
-
-Check: Rd metadata
-Result: NOTE
-Invalid package aliases in Rd file ‘RcppHnsw-package.Rd’:
-‘RcppHnsw-package’
-
-This submissions fixes this.
-
-There are four flavors with NOTEs about installed package size (r-release-macos-arm64,
-r-release-macos-x86_64, r-oldrel-macos-arm64, r-oldrel-macos-x86_64). This is expected and won't be
-fixed.
+There are three flavors with NOTEs about installed package size (r-release-macos-arm64,
+r-release-macos-x86_64, r-oldrel-macos-arm64). This is expected and won't be fixed.

 ## Downstream dependencies

-We checked 2 reverse dependencies (0 from CRAN + 2 from Bioconductor), comparing R CMD check
+We checked 3 reverse dependencies (1 from CRAN + 2 from Bioconductor), comparing R CMD check
 results across CRAN and dev versions of this package.

 * We saw 0 new problems
14 changes: 10 additions & 4 deletions inst/include/bruteforce.h
@@ -84,10 +84,16 @@ class BruteforceSearch : public AlgorithmInterface<dist_t> {


     void removePoint(labeltype cur_external) {
-        size_t cur_c = dict_external_to_internal[cur_external];
+        std::unique_lock<std::mutex> lock(index_lock);

-        dict_external_to_internal.erase(cur_external);
+        auto found = dict_external_to_internal.find(cur_external);
+        if (found == dict_external_to_internal.end()) {
+            return;
+        }

+        dict_external_to_internal.erase(found);
+
+        size_t cur_c = found->second;
         labeltype label = *((labeltype*)(data_ + size_per_element_ * (cur_element_count-1) + data_size_));
         dict_external_to_internal[label] = cur_c;
         memcpy(data_ + size_per_element_ * cur_c,
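Review note on the `removePoint` hunk: the new code reads `found->second` after calling `dict_external_to_internal.erase(found)`. For the standard map containers, erasing through an iterator invalidates that iterator, so copying the slot index out before the erase is the safer ordering. The sketch below shows that ordering on a toy swap-remove container. It is not the hnswlib code: `ToyIndex`, `labels_`, and `slot_of_` are hypothetical names, and the map type is assumed to be `std::unordered_map`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the index: labels_[i] is the external label in slot i.
struct ToyIndex {
    std::vector<std::size_t> labels_;                       // slot -> external label
    std::unordered_map<std::size_t, std::size_t> slot_of_;  // external label -> slot

    void addPoint(std::size_t label) {
        slot_of_[label] = labels_.size();
        labels_.push_back(label);
    }

    // Returns true if the label was present and has been removed.
    bool removePoint(std::size_t label) {
        auto found = slot_of_.find(label);
        if (found == slot_of_.end()) {
            return false;  // unknown label: nothing to do
        }
        // Copy the slot out first; erase() invalidates `found`.
        const std::size_t slot = found->second;
        slot_of_.erase(found);

        const std::size_t last = labels_.size() - 1;
        if (slot != last) {
            // Move the last element into the freed slot and update its mapping.
            labels_[slot] = labels_[last];
            slot_of_[labels_[slot]] = slot;
        }
        labels_.pop_back();
        assert(slot_of_.size() == labels_.size());
        return true;
    }
};
```

The `slot != last` guard is an extra choice in this sketch: it avoids re-inserting the label that was just erased when the removed point happens to occupy the last slot.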
@@ -106,7 +112,7 @@ class BruteforceSearch : public AlgorithmInterface<dist_t> {
             dist_t dist = fstdistfunc_(query_data, data_ + size_per_element_ * i, dist_func_param_);
             labeltype label = *((labeltype*) (data_ + size_per_element_ * i + data_size_));
             if ((!isIdAllowed) || (*isIdAllowed)(label)) {
-                topResults.push(std::pair<dist_t, labeltype>(dist, label));
+                topResults.emplace(dist, label);
             }
         }
         dist_t lastdist = topResults.empty() ? std::numeric_limits<dist_t>::max() : topResults.top().first;
@@ -115,7 +121,7 @@ class BruteforceSearch : public AlgorithmInterface<dist_t> {
             if (dist <= lastdist) {
                 labeltype label = *((labeltype *) (data_ + size_per_element_ * i + data_size_));
                 if ((!isIdAllowed) || (*isIdAllowed)(label)) {
-                    topResults.push(std::pair<dist_t, labeltype>(dist, label));
+                    topResults.emplace(dist, label);
                 }
                 if (topResults.size() > k)
                     topResults.pop();
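Review note on the two search hunks: both keep the best k candidates in `topResults`, adding a candidate and popping the current worst once the size exceeds k. The change only swaps `push(std::pair<dist_t, labeltype>(dist, label))` for `emplace(dist, label)`, which constructs the pair in place. Below is a minimal sketch of that bounded max-heap pattern, assuming `topResults` is a `std::priority_queue` of `(distance, label)` pairs. The function `topK` and its use of vector positions as labels are hypothetical, and the control flow is simplified relative to the diff.

```cpp
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Returns the k smallest (distance, label) pairs, nearest first.
// Labels are simply positions in `dists` (an assumption made for this sketch).
std::vector<std::pair<float, std::size_t>> topK(const std::vector<float>& dists,
                                                std::size_t k) {
    // Max-heap on distance: top() is the worst of the current candidates.
    std::priority_queue<std::pair<float, std::size_t>> heap;
    for (std::size_t label = 0; label < dists.size(); ++label) {
        if (heap.size() < k) {
            // Still filling the first k slots.
            heap.emplace(dists[label], label);
        } else if (k > 0 && dists[label] <= heap.top().first) {
            // At least as good as the current worst: add it, then drop the worst.
            heap.emplace(dists[label], label);
            heap.pop();
        }
    }
    // Drain the heap from worst to best, filling the output back to front.
    std::vector<std::pair<float, std::size_t>> out(heap.size());
    for (std::size_t i = out.size(); i-- > 0;) {
        out[i] = heap.top();
        heap.pop();
    }
    return out;
}
```

Calling `topK({5.0f, 1.0f, 4.0f, 2.0f, 3.0f}, 3)` keeps the candidates at positions 1, 3, and 4, in that order.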