[ENH] Speed up cluster-extent thresholding function #239
Conversation
This also clusters positive and negative values separately.
Codecov Report
@@ Coverage Diff @@
## master #239 +/- ##
==========================================
- Coverage 46.39% 46.02% -0.37%
==========================================
Files 33 33
Lines 2037 2049 +12
==========================================
- Hits 945 943 -2
- Misses 1092 1106 +14
I have a brief thought about the separation of positive and negative clustering. It absolutely makes sense if we are looking for clusters of BOLD activity, since there you expect a smooth pattern, all in one direction. For things like GRAPPA artifacts, particularly those associated with movement, the artifacts sometimes lead to alternating positive and negative bands or grid-like manifestations (likely due to their Fourier origin), and I worry that this change would reduce the rho metric for those. Perhaps I am misunderstanding the change, however. I also don't have a figure handy for that, so I'm going off of memory. I'll see if I can dig up something tomorrow.
That is an extremely interesting point that relates to my concerns about how we should treat betas in calculating our metrics. When you have a chance to look at that issue, please make sure to bring up that point. In the case of banding, I don't think that this proposed change will hide it (unless the bands of alternating positive and negative betas are extremely small), since our minimum cluster size is almost always 20 voxels. At most, this will peel off the outlying couple of voxels that might glom onto a larger cluster of a different sign. The resulting differences in the component table are minimal.
I could also update the function to support different kinds of thresholding. We could use 3dClustSim's terminology. To paraphrase 3dClustSim: "1-sided" thresholds a single tail (e.g., only values above a positive threshold), "2-sided" thresholds both tails but clusters positive and negative voxels together, and "bi-sided" thresholds both tails and clusters positive and negative voxels separately.
We could then discuss what is most appropriate on a map-by-map basis.
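These three modes could be sketched roughly as follows. This is a minimal illustration, not tedana's actual implementation; the function name and signature are assumptions, and only the terminology follows 3dClustSim:

```python
import numpy as np
from scipy import ndimage


def cluster_extent_threshold(data, thresh, min_cluster, mode="bi-sided"):
    """Zero out supra-threshold voxels in clusters smaller than min_cluster.

    mode follows 3dClustSim's terminology: "one-sided" thresholds a single
    tail, "two-sided" thresholds both tails and clusters them together, and
    "bi-sided" thresholds both tails but clusters each sign separately.
    """
    if mode == "one-sided":
        masks = [data > thresh]
    elif mode == "two-sided":
        masks = [np.abs(data) > thresh]
    elif mode == "bi-sided":
        masks = [data > thresh, data < -thresh]
    else:
        raise ValueError(mode)

    out = np.zeros_like(data)
    for mask in masks:
        labeled, _ = ndimage.label(mask)
        sizes = np.bincount(labeled.ravel())  # sizes[i] = voxels with label i
        keep = sizes >= min_cluster
        keep[0] = False  # label 0 is background, never kept
        retained = keep[labeled]
        out[retained] = data[retained]
    return out
```

Under bi-sided clustering, a small negative cluster touching a large positive one is sized (and possibly discarded) on its own, which is the behavior being discussed above.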
I'll add this to the agenda for tomorrow, at least as a non-verbal update, to get folks to weigh in!
I changed the default to be two-sided (but with the options to use one-sided or bi-sided) and renamed the function to `threshold_map`.
At this point the spatial clustering should be equivalent to the old version (but much faster and with control over binarization and thresholding method). Does anyone have any concerns/thoughts about the method I'm using?
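A plausible source of the speedup (the changes list below mentions `numpy.unique`) is replacing a per-cluster scan of the labeled array with one vectorized tally of all cluster sizes. A rough sketch, with made-up data, of why the two give identical results:

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
data = rng.standard_normal((40, 40, 40))
mask = data > 2.0

# ndimage.label assigns consecutive integer labels 1..n to connected clusters.
labeled, n_clusters = ndimage.label(mask)

# Slow pattern: one full pass over the labeled array per cluster.
sizes_slow = np.array([(labeled == i).sum() for i in range(1, n_clusters + 1)])

# Fast pattern: a single pass that tallies every label at once.
labels, sizes_fast = np.unique(labeled[labeled > 0], return_counts=True)

assert np.array_equal(sizes_slow, sizes_fast)
```

The slow pattern is O(n_clusters × n_voxels); the fast one is a single sort-and-count over the supra-threshold voxels, which matters when a map has hundreds of clusters.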
Sounds excellent to me! How much time was spatial clustering contributing to the whole? I know you've done some work clocking tedana as a whole, but I'm now curious how much each step takes.
I haven't done proper profiling, but based on the run logs when running with MLE on the five-echo test dataset on my laptop, the old method took ~2:50 to threshold the relevant maps for 159 components, and with these changes it took ~6 seconds.
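For "proper profiling," the standard library's `cProfile` can attribute time to each step directly instead of inferring it from run-log timestamps. A generic sketch; `threshold_step` is a hypothetical stand-in for whatever call is being timed:

```python
import cProfile
import io
import pstats


def threshold_step():
    # Stand-in for the map-thresholding step; swap in the real call.
    return sum(i * i for i in range(100_000))


profiler = cProfile.Profile()
profiler.enable()
threshold_step()
profiler.disable()

# Render the ten most expensive entries, sorted by cumulative time.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(10)
report = buf.getvalue()
print(report)
```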
That's a pretty substantial speedup! Awesome!
* Speed up `spatclust` function. This also clusters positive and negative values separately.
* Rename `spatclust` to `threshold_map` and add binarize/sided arguments.
* Replace manually generated binary structure with function-made one.
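The third bullet can be illustrated with `scipy.ndimage.generate_binary_structure`, which builds the same connectivity array that would otherwise be written out by hand. The 6-connected (faces-only) 3-D structure shown here is an assumed example of such a hand-written array, not necessarily the one the code replaced:

```python
import numpy as np
from scipy import ndimage

# Hand-written 3x3x3 connectivity array: the center voxel plus its six
# face neighbors (6-connectivity).
manual = np.zeros((3, 3, 3), dtype=bool)
manual[1, 1, :] = True
manual[1, :, 1] = True
manual[:, 1, 1] = True

# The same structure produced by a single library call:
# rank 3 (3-D), connectivity 1 (faces only).
generated = ndimage.generate_binary_structure(3, 1)

assert np.array_equal(manual, generated)
```

The library call is self-documenting and harder to get wrong than a block of index assignments, which is presumably the point of the change.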
Closes None.
Changes proposed in this pull request:

* Use `numpy.unique` to cluster-extent threshold and binarize maps with `spatclust`.
* Rename `spatclust` to `threshold_map` and move from `model.fit` to `utils`.