
Musing #130

Open
SamGG opened this issue Dec 3, 2024 · 1 comment

Comments


SamGG commented Dec 3, 2024

Hi,
In case you haven't come across these:

ActUp: Analyzing and Consolidating tSNE and UMAP
Andrew Draganov, Jakob Rødsgaard Jørgensen, Katrine Scheel Nellemann, Davide Mottin, Ira Assent, Tyrus Berry, Cigdem Aslay
tSNE and UMAP are popular dimensionality reduction algorithms due to their speed and interpretable low-dimensional embeddings. Despite their popularity, however, little work has been done to study their full span of differences. We theoretically and experimentally evaluate the space of parameters in both tSNE and UMAP and observe that a single one -- the normalization -- is responsible for switching between them. This, in turn, implies that a majority of the algorithmic differences can be toggled without affecting the embeddings. We discuss the implications this has on several theoretic claims behind UMAP, as well as how to reconcile them with existing tSNE interpretations.
Based on our analysis, we provide a method (\ourmethod) that combines previously incompatible techniques from tSNE and UMAP and can replicate the results of either algorithm. This allows our method to incorporate further improvements, such as an acceleration that obtains either method's outputs faster than UMAP. We release improved versions of tSNE, UMAP, and \ourmethod that are fully plug-and-play with the traditional libraries at this https URL

sleepwalk: Interactively Explore Dimension-Reduced Embeddings
A tool to interactively explore the embeddings created by dimension reduction methods such as Principal Components Analysis (PCA), Multidimensional Scaling (MDS), T-distributed Stochastic Neighbour Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP) or any other.
https://anders-biostat.github.io/sleepwalk/
https://cran.r-project.org/package=sleepwalk
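For a quick look, a minimal sketch of pairing it with uwot (untested here; the sleepwalk() call follows my reading of its docs, and the iris data is only an illustrative stand-in):

```r
library(uwot)
library(sleepwalk)

# Any numeric feature matrix will do; iris is just a placeholder.
X <- as.matrix(iris[, 1:4])

# 2D embedding from uwot.
emb <- umap(X)

# Opens an interactive view: hovering over a point colours every point by its
# distance to that point in the original high-dimensional feature space.
sleepwalk(emb, X)
```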

sleepwalk is used by TJ Burns in a demo at https://www.linkedin.com/posts/tylerjburns_in-light-of-recent-scrutiny-around-umap-activity-7169341694348324865-sCmq/

Thanks for all your "unpublished" work.

@jlmelville (Owner) commented

Sleepwalk and the related projects are very cool and useful. I hadn't seen the LinkedIn demo, thank you for pointing that out.

There is a lot of interesting material to process in those ActUp/GiDR-DUN papers, especially the finding that a simple squared loss on the affinities does as well as KL divergence or the UMAP loss. On the t-SNE-to-UMAP connection, I think the approach in "From t-SNE to UMAP with contrastive learning" is more general, and I read that one first. Personal opinion incoming: I don't really buy that you can get truly t-SNE-looking results without doing a lot of negative sampling, which is not at all efficient under these frameworks. You can get results reminiscent of t-SNE by scaling up the repulsion, but they don't look like the real McCoy to me. Perhaps I have spent too much time staring at t-SNE output, though. YMMV.

As for the speed-up they get, I would need to look at the source code to fully understand what they are doing (whether I have the time to do that is another matter). I don't quite get it from the papers themselves, as there seems to be some interaction between the optimization method and whether you weight the gradient contributions or stick with the sampling-based expectation approach.

You can emulate what I think they say is the main source of the speed-up with negative_sample_rate = 1, repulsion_strength = 5.0, as sketched below. The optimization will certainly go faster and gives qualitatively similar output, but the results are definitely not identical. NCVis scales up the amount of negative sampling over time, which is probably the better compromise, but I don't know whether that strategy would be easy to integrate into uwot.
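In uwot that would look something like this (a rough sketch of the emulation, not the papers' actual implementation; the data and every other setting are just placeholders/defaults):

```r
library(uwot)

X <- as.matrix(iris[, 1:4])  # placeholder data

# One negative sample per positive edge, with repulsion scaled up to
# compensate: faster optimization, qualitatively (not identically) similar
# output compared with the default settings.
emb <- umap(X, negative_sample_rate = 1, repulsion_strength = 5.0)
```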

In a similar vein, the GiDR-DUN paper is mentioned by Leland McInnes in a Reddit comment, which also suggests doing negative sampling deterministically; that would be a good thing to try.

I hadn't realized I needed to activate the Discussions part of this repo, but I've done that now. Feel free to use that for anything non-issue-related going forward if you want (it doesn't matter to me though either way).
