UMap support #110
Comments
Hello @MohammadFakhreddin, thank you for your interest; it is nice to read that you are finding the library useful. I might be able to help with the k-NN question. Although Tapkee already includes tree data structures for it, since several dimensionality reduction techniques are based on nearest neighbors, you could take a look at Shogun. This notebook should give you a quick idea of how to do k-NN in Shogun using the Python interface. Although the notebook is about LMNN (you can think of it as an extension of k-NN), see e.g. code cell [14] for an example applying k-NN to a metagenomics dataset. So :)
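For readers who only need the idea behind a k-NN classifier, here is a minimal brute-force sketch in plain Python. It is not Shogun's or Tapkee's API; the function name `knn_predict` and its parameters are hypothetical, and it assumes Euclidean distance and majority voting. It scans every training point, so it is only meant for small datasets, whereas the tree structures mentioned above exist to avoid that full scan.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration; brute force, Euclidean distance.
    """
    if len(train_points) != len(train_labels):
        raise ValueError("train_points and train_labels must have the same length")
    if k < 1 or k > len(train_points):
        raise ValueError("k must be between 1 and the number of training points")

    # Order the training indices by distance from the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )

    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(points, labels, (0.2, 0.3), k=3))
```

Choosing an odd `k` for two-class problems avoids most voting ties.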
@iglesias Thanks a lot! I'll look into it and try to implement something based on that. At the moment I'm trying to keep the build as simple as possible, so I'm doing my best to avoid a complex library. One of our goals is the project's accessibility. We have some prototypes in Python using scikit-learn, but currently my aim is the project's longevity and ease of build. As a side note, I think the CMake minimum version is too high :) Let me know if anyone knows about the current state of the UMAP library.
Thanks for reaching out! UMAP would be a good addition to the library, but neither of us has had enough time recently to implement it. As of now there is no implementation, not even in a branch.
Indeed. On the open-source side, I am busy with the CodeQL stuff and with contributions to GitHub's coding-standards repo. If I were to look into something in Tapkee at the moment, I'd be more interested in a topic related to that (even broadly, such as safety with Circle, or just trying the new Clang real-time sanitizer on it). I recall that for UMAP there was already #95. The UMAP Python repo on GitHub looks quite popular, and there's also a C++ repo. What would be the goal of adding a new DR method to Tapkee now? I wondered and couldn't think of any besides completeness in Tapkee.
So, I integrated Tapkee into my project and fed it the dataset I had used for testing PCA with OpenCV. Strangely, OpenCV's PCA took 3-4 seconds, while Tapkee took 4-5 minutes. I noticed that OpenCL is present in the CMake files. Are you guys using OpenCL for optimization? Could it be that by not including OpenCL in my project, I made Tapkee much slower than OpenCV?
Hello @MohammadFakhreddin, assuming I understood your message and questions correctly after reading them a few times: since PCA is amenable to data parallelism, a comparison between OpenCV with GPU acceleration and Tapkee without it would be expected to show a large difference on a large dataset.
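As background for the timing comparison, the following is a rough sketch of what a PCA computes and where the work goes. It is not how Tapkee or OpenCV implement PCA; the function name `top_principal_component` is hypothetical, and it assumes power iteration on the sample covariance matrix to find only the first component. Building the covariance matrix is a sum of independent per-sample products, which is the part that parallelizes well over the data.

```python
import math


def top_principal_component(rows, iterations=200):
    """Return a unit vector along the direction of largest variance.

    Hypothetical helper for illustration: power iteration on the sample
    covariance matrix. It needs at least two rows, and it assumes the start
    vector of all ones is not orthogonal to the leading eigenvector.
    """
    n = len(rows)
    d = len(rows[0])

    # Center the data by subtracting the per-column mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d); each entry is a sum over all samples.
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]

    # Repeatedly multiply by the covariance matrix and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # Data has no variance along the current direction.
        v = [x / norm for x in w]
    return v


if __name__ == "__main__":
    data = [(t, 2 * t) for t in range(-5, 6)]
    print(top_principal_component(data))
```

When comparing two libraries, it helps to time them on the same machine with the same data, build type, and number of requested components, so that differences in hardware acceleration can be told apart from differences in configuration.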
Hello,
First of all, I want to thank you for this library. I've been looking for a library that I can use to integrate dimensionality reduction techniques into our tool for our paper, and this is perfect for that. (I will make sure to cite it :))
I would like to ask about the current status of UMAP. Is it ready to use?
Also, as a side question, are you guys aware of any good library for a k-NN classifier?