Address divide by zero when getelbow called with only one component #808

handwerkerd · 2021-09-22T21:53:25Z

Summary

This is a continuation of issue #752 and PR #760. That discussion focused on removing a really uninformative error when getelbow was called on an empty array. I now have a situation where getelbow is called on only one component and it runs, but has a divide by zero warning.

The basic solution to this is extremely easy: The error goes away if getelbow is only run with a minimum of 2 components. I'm creating an issue rather than a PR because there may be some relevant discussion here. One can calculate an elbow of a curve with only two values, but that's essentially meaningless. If we're setting a threshold for a number of components than than which we shouldn't calculate an elbow, what should it be? 3 is mathematically plausible, but still non-ideal. I'm leaning towards "5" while acknowledging that's arbitrary.

Additional Detail

The block of code where I get the divide by zero error is:

tedana/tedana/selection/tedica.py

Lines 261 to 269 in 5399ca5

    
           kappas_nonsig = comptable.loc[comptable["kappa"] < f01, "kappa"] 
        
           if not kappas_nonsig.size: 
        
               LGR.warning( 
        
                   "No nonsignificant kappa values detected. " 
        
                   "Only using elbow calculated from all kappa values." 
        
               ) 
        
               kappas_nonsig_elbow = np.nan 
        
           else: 
        
               kappas_nonsig_elbow = getelbow(kappas_nonsig, return_val=True)

The error goes away by replacing if not kappas_nonsig.size with if kappas_nonsig.size<=1 but the question is whether it should be something like <=5 instead.

The last PR also added a check and warning in getelbow. We can adjust that too to throw a warning if the number of components are under a threshold rather than just empty:

tedana/tedana/selection/_utils.py

Lines 97 to 105 in 5399ca5

    
           if not arr.size: 
        
               raise ValueError( 
        
                   "Empty array detected during elbow calculation. " 
        
                   "This error happens when getelbow is incorrectly called on no components. " 
        
                   "If you see this message, please open an issue at " 
        
                   "https://github.com/ME-ICA/tedana/issues with the full traceback and any data " 
        
                   "necessary to reproduce this error, so that we create additional data checks to " 
        
                   "prevent this from happening." 
        
               )

As far as I can tell, getelbow is called in two situations. (1) On most or all of the components. In that situation, this will never be an issue because, if there are only 5 total components, then tedana will have bigger problems. (2) A subset of components where that second elbow threshold is used in conjunction with a threshold on all components. The current approach was to just ignore that second threshold when there are zero components.

Another approach is that the second threshold on non-significant components can only lower the kappa elbow and result in accepting more components and T2* signal. Excluding this option if there are less than X components means that denoising may be slightly more aggressive.

Next Steps

Decide on the appropriate threshold for setting an elbow to nan rather than using just a few components to calculate an elbow.
Options:
1. If there's only one component, set that value as the elbow
2. Set the minimum to 2, which means the code will run as is and the elbow won't actually be an elbow, but will be somewhere around those two value
3. Set a minimum closer to 5 where an elbow calculation has a chance of being meaningful
I originally was leaning towards iii, but I might be overthinking this, and ii might be simplest.
Once a decision is made, implement this.

The text was updated successfully, but these errors were encountered:

handwerkerd added bug issues describing a bug or error found in the project priority: medium Should get addressed soon effort: low Theoretically less than a day's work impact: medium Improves code/documentation functionality for some users labels Sep 22, 2021

handwerkerd mentioned this issue Sep 23, 2021

Run getelbow only if there are enough components #809

Closed

handwerkerd mentioned this issue Nov 17, 2021

[REF] Decision Tree Modularization #756

Merged

22 tasks

tsalo closed this as completed in #756 May 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Address divide by zero when getelbow called with only one component #808

Address divide by zero when getelbow called with only one component #808

handwerkerd commented Sep 22, 2021 •

edited

Loading

Address divide by zero when getelbow called with only one component #808

Address divide by zero when getelbow called with only one component #808

Comments

handwerkerd commented Sep 22, 2021 • edited Loading

Summary

Additional Detail

Next Steps

handwerkerd commented Sep 22, 2021 •

edited

Loading