Results differ based on number of threads #7
Comments
I explored this further; here is a minimal working example:
and some demo output from RStudio:
As you can see, the results are consistent between runs that use the same number of threads (here 1 or 2 threads), yet they differ between different thread counts. This is quite puzzling to me.
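One possible explanation, offered as an assumption and not as a confirmed diagnosis of this package: floating-point addition is not associative, so a reduction that splits a sum into a different number of chunks can round differently, while the same split always reproduces the same result. The Python sketch below only illustrates that effect; `chunked_sum` and the input values are hypothetical and are not taken from the Rtsne.multicore code.

```python
import math

def sequential_sum(values):
    """Add the values left to right with plain double-precision additions."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, n_chunks):
    """Split values into contiguous chunks, sum each chunk separately,
    then add the partial sums. This mimics how a parallel reduction
    might divide the work among n_chunks threads (an assumption)."""
    size = math.ceil(len(values) / n_chunks)
    partials = [sequential_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return sequential_sum(partials)

# Hypothetical inputs chosen so that rounding is easy to see.
values = [1e16, 1.0, -1e16, 1.0]

for n in (1, 2):
    # The same chunk count gives the same answer on every run,
    # but different chunk counts can disagree.
    print(n, chunked_sum(values, n), chunked_sum(values, n))
```

If something like this is at play, identical results for a fixed thread count and slightly different results across thread counts would be expected behaviour and not a sign of a race condition.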
Not sure, as we didn't implement the multicore support ourselves; we just wrapped the existing implementation.
Thanks for the answer. Maybe you can tell me what I have missed, but the Rtsne package does not (yet) seem to contain parallelisation support. Rather, there seems to be a "derivative" of it (https://github.com/rappdw/tsne), which appears to be Python-based and currently lacks a wrapper for convenient use in R.
I observed that the results differ based on the number of threads specified.
In my application, which uses BH-SNE to create a 2D embedding followed by automated clustering with DBSCAN, I have replaced the single-threaded Rtsne call with a call to your multi-threaded Rtsne.multicore. This was nice and easy thanks to the similarity of the two interfaces.
However, when I run the application, the results differ ever so slightly, as indicated below (just the first couple of points each time):
Using 1 thread
Using 2 threads
Using 3 threads
Using 4 threads
The results using the same number of threads seem to be consistent between different runs, though, which is good at least :)
Using 1 thread - a second run
And for all the points, computing the MD5SUM:
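For readers who want to repeat this kind of check, here is a minimal Python sketch of fingerprinting an embedding with MD5 so that two runs can be compared at a glance. It is an illustration only: `embedding_md5` is a hypothetical helper, and the coordinates are made-up placeholders, not results from this issue.

```python
import hashlib
import struct

def embedding_md5(points):
    """Return the MD5 hex digest of a list of (x, y) coordinates.
    Each coordinate is packed as a little-endian IEEE 754 double, so the
    digest changes if any coordinate differs, however slightly."""
    h = hashlib.md5()
    for x, y in points:
        h.update(struct.pack("<dd", x, y))
    return h.hexdigest()

# Placeholder embeddings: run_b repeats run_a, run_c differs by 1e-12 in one value.
run_a = [(0.1, 0.2), (1.5, -2.25)]
run_b = [(0.1, 0.2), (1.5, -2.25)]
run_c = [(0.1, 0.2), (1.5, -2.25 - 1e-12)]

print(embedding_md5(run_a) == embedding_md5(run_b))  # identical runs
print(embedding_md5(run_a) == embedding_md5(run_c))  # tiny numerical difference
```

A matching digest shows the runs are bit-for-bit identical; a mismatch only says that something differs, not by how much.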
While the differences are hard to spot by eye in a 2D scatterplot, the automatic clustering is affected by them.
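To illustrate why a tiny coordinate change can alter a density-based clustering, the Python sketch below applies the DBSCAN core-point test (at least `min_pts` points within distance `eps`) to two nearly identical point sets. The values of `eps` and `min_pts`, the coordinates, and the helper name are all hypothetical and chosen only to put one point right on the neighbourhood boundary.

```python
import math

def neighbors_within(points, index, eps):
    """Count the points (including the point itself) that lie within
    distance eps of points[index], as in DBSCAN's core-point test."""
    px, py = points[index]
    return sum(1 for (x, y) in points if math.hypot(x - px, y - py) <= eps)

eps, min_pts = 1.0, 3  # hypothetical DBSCAN parameters

# Two made-up embeddings that differ by 1e-9 in a single coordinate.
run_a = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
run_b = [(0.0, 0.0), (0.5, 0.0), (1.0 + 1e-9, 0.0)]

core_a = neighbors_within(run_a, 0, eps) >= min_pts
core_b = neighbors_within(run_b, 0, eps) >= min_pts
print(core_a, core_b)
```

Because the core-point decision is a hard threshold, a shift far too small to see in a plot can still flip it, and that in turn can change which points end up in which cluster.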
Your input is greatly appreciated!
Best,
Cedric