[MRG] rename containment to contained_by #199

Merged
merged 52 commits into from
May 24, 2017
23f9137
enrich -o output, provide column headers, eliminate --csv on sbt_gather
ctb Apr 17, 2017
609bf01
fix tests, add --save-matches test
ctb Apr 17, 2017
0b19027
simplify sbt_gather a bit by doing legacy hll compute in minhash max_…
ctb Apr 19, 2017
fc89bd3
hackity hack add in downsample
ctb Apr 19, 2017
b8eac7d
Merge branch 'master' of github.com:dib-lab/sourmash into update/gath…
ctb Apr 19, 2017
0956b4f
add in containment
ctb Apr 19, 2017
4fc8aff
fix legacy max_hash
ctb Apr 22, 2017
66f1c10
fixed some super dumb errors in sourmash sbt_gather
ctb Apr 22, 2017
a11fdf3
fix print -> notify
ctb Apr 22, 2017
65e5928
fix notifications
ctb Apr 22, 2017
69b8ae6
matplotlib fix
ctb Apr 22, 2017
bc74e1f
avoid printing out really large matrices
ctb Apr 23, 2017
6d3a07d
update output format
ctb Apr 24, 2017
1a238f5
Modify TOC depth
betatim Apr 25, 2017
97aab1a
added scaled_to_max_hash function
ctb Apr 26, 2017
f5a7140
updated output
ctb Apr 26, 2017
f343269
Merge branch 'fix/sbt_search' into spacegraphcats
ctb Apr 26, 2017
5031b54
fix sbt_search --best-only
ctb May 2, 2017
1ff3e69
fix sbt_search --best-only
ctb May 2, 2017
e0a4c0b
clean up sbt_search output a bit
ctb May 2, 2017
98d2da8
Merge branch 'fix/sbt_search' into spacegraphcats
ctb May 2, 2017
de999b7
add scaled property to MinHashes
ctb May 4, 2017
5d89849
rename containment to contained_by
ctb May 4, 2017
82ba57a
Merge branch 'master' of github.com:dib-lab/sourmash into fix/sbt_search
ctb May 14, 2017
3b1eae6
added test for search --containment
ctb May 14, 2017
4670a1c
test similarity calculation with downsampling
ctb May 14, 2017
b2cff29
rework sbt_search output a bit; test downsampling
ctb May 14, 2017
a3971a0
comment as to why output count changes
ctb May 14, 2017
157b05d
Merge branch 'master' of github.com:dib-lab/sourmash into update/gath…
ctb May 14, 2017
b63303b
get sbt_gather output to line up properly:
ctb May 14, 2017
9411fdb
Merge branch 'docs' of https://github.com/betatim/sourmash into updat…
ctb May 14, 2017
b72c958
fix up nav links to be a bit clearer
ctb May 14, 2017
a4c4e63
add link to databases; update requirements for markdown support
ctb May 14, 2017
f15b650
add a draft tutorial
ctb May 14, 2017
b0df50b
Merge branch 'update/gather_out' into spacegraphcats
ctb May 14, 2017
09b6ec8
Merge branch 'fix/sbt_search' into spacegraphcats
ctb May 14, 2017
c764843
upd with data
ctb May 14, 2017
6db102f
upd links/data
ctb May 14, 2017
a99e6ab
update output
ctb May 14, 2017
11332c9
whoops, forgot to add this :)
ctb May 14, 2017
e468fae
whoops, forgot to add this :)
ctb May 14, 2017
f1cac81
Merge branch 'master' of github.com:dib-lab/sourmash into update/doc_…
ctb May 16, 2017
fdcf507
Merge branch 'master' of github.com:dib-lab/sourmash into fix/sbt_search
ctb May 16, 2017
02f3468
Merge branch 'master' of github.com:dib-lab/sourmash into update/gath…
ctb May 16, 2017
c4626bc
Merge branch 'update/gather_out' into spacegraphcats
ctb May 16, 2017
aad924e
Merge branch 'fix/sbt_search' into spacegraphcats
ctb May 16, 2017
02b791e
Merge branch 'update/doc_sbts' into spacegraphcats
ctb May 16, 2017
96de9e9
Merge branch 'master' of github.com:dib-lab/sourmash into spacegraphcats
ctb May 17, 2017
60366a7
remove unneeded code (again)
ctb May 17, 2017
ef09bb5
Merge branch 'master' of github.com:dib-lab/sourmash into spacegraphcats
ctb May 21, 2017
7b14709
Merge branch 'master' of github.com:dib-lab/sourmash into spacegraphcats
ctb May 23, 2017
150a0f2
Merge branch 'master' into spacegraphcats
luizirber May 24, 2017
5 changes: 2 additions & 3 deletions sourmash_lib/_minhash.pyx
@@ -261,7 +261,6 @@ cdef class MinHash(object):

     def downsample_scaled(self, new_num):
         max_hash = self.max_hash
-
         if max_hash is None:
             raise ValueError('no max_hash available - cannot downsample')

@@ -364,9 +363,9 @@ cdef class MinHash(object):
         distance = 2*math.acos(prod) / math.pi
         return 1.0 - distance

-    def containment(self, other):
+    def contained_by(self, other):
         """\
-        Calculate containment of self by other.
+        Calculate how much of self is contained by other.
         """
         return self.count_common(other) / len(self.get_mins())

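The rename to contained_by makes the direction of the measure explicit: it is the fraction of self's hashes that also appear in other, so it is not symmetric the way Jaccard similarity is. A minimal sketch of that difference on plain Python sets follows. The standalone functions `jaccard` and `contained_by` here are hypothetical illustrations, not the sourmash_lib API, and they assume each sketch is simply a set of integer hash values.

```python
def jaccard(a, b):
    """Symmetric similarity: shared hashes over all distinct hashes."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def contained_by(a, b):
    """Fraction of a's hashes that also occur in b (asymmetric)."""
    a, b = set(a), set(b)
    if not a:
        # Assumption for this sketch: an empty query has no defined containment.
        raise ValueError("cannot compute containment of an empty set")
    return len(a & b) / len(a)


if __name__ == "__main__":
    small = {1, 2, 3, 4}
    large = {3, 4, 5, 6, 7, 8}
    print(jaccard(small, large))        # same value in either direction
    print(contained_by(small, large))   # how much of small is in large
    print(contained_by(large, small))   # how much of large is in small
```

Swapping the arguments changes the denominator, which is why a name that reads as "self is contained by other" is less ambiguous than "containment".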
2 changes: 1 addition & 1 deletion sourmash_lib/commands.py
@@ -625,7 +625,7 @@ def search(args):
     # similarity vs containment
     query_similarity = lambda x: query.similarity(x, downsample=True)
     if args.containment:
-        query_similarity = lambda x: query.containment(x, downsample=True)
+        query_similarity = lambda x: query.contained_by(x, downsample=True)

     # set up the search databases
     databases = sourmash_args.load_sbts_and_sigs(args.databases,
6 changes: 3 additions & 3 deletions sourmash_lib/signature.py
@@ -113,15 +113,15 @@ def jaccard(self, other):
         "Compute Jaccard similarity with the other MinHash signature."
         return self.minhash.similarity(other.minhash, True)

-    def containment(self, other, downsample=True):
+    def contained_by(self, other, downsample=False):
         "Compute containment by the other signature. Note: ignores abundance."
         try:
-            return self.minhash.containment(other.minhash)
+            return self.minhash.contained_by(other.minhash)
         except ValueError as e:
             if 'mismatch in max_hash' in str(e) and downsample:
                 xx = self.minhash.downsample_max_hash(other.minhash)
                 yy = other.minhash.downsample_max_hash(self.minhash)
                 return xx.contained_by(yy)
             else:
                 raise

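The fallback in signature.py retries the comparison after downsampling both sketches when their max_hash values differ. Note that the new signature-level default is downsample=False, while search still passes downsample=True explicitly. The idea can be sketched roughly as below. This is an illustration under stated assumptions, not the library's implementation: `downsample_to` and `contained_by_downsampled` are hypothetical helpers, sketches are modeled as plain sets, a sketch is assumed to retain every hash at or below its max_hash, and both sides are assumed to be reduced to the smaller of the two max_hash values.

```python
def downsample_to(hashes, max_hash):
    """Keep only hash values at or below max_hash (assumed retention rule)."""
    return {h for h in hashes if h <= max_hash}


def contained_by_downsampled(a_hashes, a_max_hash, b_hashes, b_max_hash):
    """Containment of a by b after reducing both to a common max_hash."""
    # Assumption: the comparable range is bounded by the smaller max_hash.
    common_max = min(a_max_hash, b_max_hash)
    a = downsample_to(a_hashes, common_max)
    b = downsample_to(b_hashes, common_max)
    if not a:
        # Nothing of a survives downsampling; report zero for this sketch.
        return 0.0
    return len(a & b) / len(a)


if __name__ == "__main__":
    # a was sketched with a larger max_hash than b, so a is cut down first.
    a = {10, 20, 30, 80}
    b = {10, 30}
    print(contained_by_downsampled(a, 100, b, 40))
```

Comparing without downsampling would count the hash 80 against a even though b could never have retained it, which understates how much of a is contained by b.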