jaccard R package

This R package enables statistical testing of similarity between binary data using the Jaccard/Tanimoto similarity coefficient, the ratio of intersection to union. Biochemical fingerprints, genomic intervals, and ecological communities are some examples of binary data in the life sciences. For example, competition between two operational taxonomic units (OTUs) is often evaluated by the Jaccard/Tanimoto coefficient between their presence/absence vectors across multiple bioregions.
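
As a minimal illustration of the coefficient itself, the following base R sketch computes the ratio of intersection to union for two short presence/absence vectors. The vectors and variable names are made up for illustration, and no function from this package is used.

```r
# Two hypothetical presence/absence vectors over five sites
x <- c(1, 1, 0, 1, 0)
y <- c(1, 0, 0, 1, 1)

n_intersection <- sum(x & y)  # sites where both are present
n_union <- sum(x | y)         # sites where at least one is present

# Jaccard/Tanimoto coefficient: intersection over union
j <- n_intersection / n_union
print(j)
```

The coefficient is undefined when both vectors are entirely zero, since the union is then empty.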

We provide four methods for computing the statistical significance of such similarity coefficients for binary data: the exact solution, the asymptotic approximation, the bootstrap method, and the measure concentration algorithm. We recommend using either the bootstrap method or the measure concentration algorithm, since the exact solution can be slow and the asymptotic approximation can be inaccurate, depending on the data size.
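
To give a feel for what a resampling-based significance test of this kind does, here is a generic one-sided permutation test written in base R. It is a sketch under stated assumptions, not the package's bootstrap implementation: the helper names `jaccard_coef` and `perm_pvalue` are hypothetical, and the simulated data are for illustration only.

```r
# Hypothetical helper: Jaccard coefficient of two 0/1 vectors of equal length
jaccard_coef <- function(x, y) {
  n_union <- sum(x | y)
  if (n_union == 0) return(NA_real_)  # undefined when both vectors are all zero
  sum(x & y) / n_union
}

# Hypothetical helper: one-sided permutation p-value.
# Shuffling y breaks any association with x while keeping its number of ones,
# so the shuffled coefficients approximate the null distribution.
perm_pvalue <- function(x, y, B = 1000) {
  observed <- jaccard_coef(x, y)
  permuted <- replicate(B, jaccard_coef(x, sample(y)))
  # Add one to numerator and denominator so the p-value is never exactly zero
  (1 + sum(permuted >= observed)) / (B + 1)
}

set.seed(1)
x <- rbinom(200, 1, 0.3)
# y copies x at about 80% of positions, so the two vectors are strongly similar
y <- ifelse(runif(200) < 0.8, x, rbinom(200, 1, 0.3))

p <- perm_pvalue(x, y, B = 999)
print(p)
```

A small p-value here indicates that the observed similarity is larger than expected when the two vectors are unrelated.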

Reference

Chung, N., Miasojedow, B., Startek, M., and Gambin, A. "Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data" BMC Bioinformatics (2019) 20(Suppl 15): 644. https://doi.org/10.1186/s12859-019-3118-5

Basic Usage

To install this package:

install.packages("devtools")
library("devtools")
install_github("ncchung/jaccard")

To load this package:

library("jaccard")

To compute the exact p-value of similarity between two binary vectors, x and y:

jaccard.test(x,y,method="exact")$pvalue

When the binary vectors are moderately long, the bootstrap method and the measure concentration algorithm (mca) are much faster while maintaining high accuracy:

jaccard.test(x,y,method="bootstrap",B=1000)
jaccard.test(x,y,method="mca",accuracy=1e-05)
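
The calls above assume that x and y already exist. A possible end-to-end sketch follows, using simulated vectors; the length, the presence probability, and the seed are arbitrary choices for illustration. It assumes that the bootstrap and mca results expose a `pvalue` element in the same way as the exact method shown earlier, and it skips the package calls if jaccard is not installed.

```r
set.seed(42)
n <- 500
# Two independent simulated presence/absence vectors (illustrative data only)
x <- rbinom(n, 1, 0.2)
y <- rbinom(n, 1, 0.2)

if (requireNamespace("jaccard", quietly = TRUE)) {
  library("jaccard")
  # Same calls as shown above, with the results stored for inspection
  res_boot <- jaccard.test(x, y, method = "bootstrap", B = 1000)
  res_mca <- jaccard.test(x, y, method = "mca", accuracy = 1e-05)
  # Assumption: pvalue is available for these methods as it is for "exact"
  print(res_boot$pvalue)
  print(res_mca$pvalue)
}
```

Because x and y are generated independently here, large p-values would be the expected outcome in most runs.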

Help pages can be opened in R, for example:

? jaccard.test.bootstrap

License

GNU General Public License 2

