Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit names and hierarchy #88

Closed
ctb opened this issue Jan 4, 2017 · 2 comments
Closed

Revisit names and hierarchy #88

ctb opened this issue Jan 4, 2017 · 2 comments

Comments

@ctb
Copy link
Contributor

ctb commented Jan 4, 2017

A few things for @lgautier and @luizirber to weigh in on --

  • the sourmash_lib.signature module is badly named; I keep on colliding names with sig, etc. I was thinking of renaming it sigutils or something.

  • the Estimators class is badly named. It's a legacy class anyway, held over from when MinHash didn't exist. It's still a convenient wrapper around the CPython MinHash class, at least for now, since pure Python is much easier to write, change, and test than new C code. One approach might be to deprecate the direct use of MinHash and wrap it more tightly in Estimators, and then rename Estimators to MinHash. Or if that's too confusing, MinHashWrapper. Thoughts?

  • regardless of what we do with the names, Estimators should be moved out of __init__.py.

Not hugely urgent but if you have super strong opinions, now is the time :)

@lgautier
Copy link
Contributor

lgautier commented Jan 4, 2017

Just opinions:

  • I prefer signature to sigutils (the later bringing little IMHO - to take a parallel, I like better the stdlib package os to be called that way rather than osutils). Not saying that signature should not be named anything else though...

  • I agree that Estimators is not the best name for what it represents. The C-level class could be _MinHash (or better _MinHashSketch, since this is a collection of hash values) while the Python-level wrapper be simply MinHash (or MinHashSketch). That's strategy I like because it lets one implement ideas in Python and only move what is both meant to stay and performance-critical down to C as needed.

@ctb
Copy link
Contributor Author

ctb commented Apr 19, 2017

The 'estimators' class was completely removed and now 'Signature' has a 'minhash' member, as of #155. We'll leave 'signature' as is, as suggested by @lgautier.

@ctb ctb closed this as completed Apr 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants