Hard coded number of echoes in Kappa & Rho Estimates #77

Closed · handwerkerd opened this issue Jun 8, 2018 · 11 comments · Fixed by #95
Labels: bug (issues describing a bug or error found in the project)

@handwerkerd (Member)
I was going through some of the code and I think I've noticed a bug in the kappa & rho calculations, both here and in the meica code. If you look at Appendix A of Olafsson et al., NeuroImage 2015 (https://doi.org/10.1016/j.neuroimage.2015.02.052), you can see that the degrees-of-freedom term in the denominator of the F-tests is (number of echoes - 1)/1. When there are 3 echoes, this becomes the hard-coded '* 2' in the following two lines of code. I think it should be (n_echos - 1). Does this look right to others?

F_S0 = (alpha - SSE_S0) * 2 / (SSE_S0)

F_R2 = (alpha - SSE_R2) * 2 / (SSE_R2)

In practice, this would scale all kappa & rho values by a constant factor based on the number of echoes, which shouldn't affect the elbow thresholds in the selection step, but I don't know whether it might affect some of the non-elbow-based selection criteria.
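For illustration, here is a minimal sketch of the proposed change (not the actual tedana function; the values below are made-up stand-ins for the arrays computed earlier in that function):

```python
import numpy as np

# Hedged sketch of the proposed fix, with made-up values standing in for the
# arrays computed earlier in the function. The denominator degrees of freedom
# of the F-tests is (n_echos - 1), which equals the hard-coded 2 only for
# three-echo data.
n_echos = 4                                  # hypothetical echo count
alpha = np.array([10.0, 12.0, 8.0])          # stand-in for the total sum of squares
SSE_S0 = np.array([2.0, 3.0, 1.5])           # stand-in residuals for the S0 model
SSE_R2 = np.array([1.0, 4.0, 2.0])           # stand-in residuals for the R2* model

F_S0 = (alpha - SSE_S0) * (n_echos - 1) / SSE_S0   # was: * 2
F_R2 = (alpha - SSE_R2) * (n_echos - 1) / SSE_R2   # was: * 2
```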

@emdupre (Member) commented Jun 9, 2018

Thanks for pointing this out, @handwerkerd! Yes, reviewing the paper, it looks like you're right. Do you want to open a PR to patch this?

Relatedly, there are a few other hard-coded values that I've been having trouble understanding. For example, here:

# get 33rd %ile of `first_echo` and find corresponding index

Do you know if this percentile should also be adjusted by the number of echoes?

@emdupre added the bug label on Jun 9, 2018
@handwerkerd (Member, Author)
I'll aim to do this after OHBM. I'm trying to think whether any of the criteria might be altered by this scale change; it would be good to know the potential effects before altering the code.

Line 195, which you reference, looks arbitrary to me. The variable is med_val, which I suspect was originally the median value until, at some point, the 33rd percentile worked better. That said, a few lines down, at

lthrs = np.squeeze(echo_means[med_val].T) / 3 # QUESTION: why divide by 3?

it looks like these values are divided by 3, and that might be a hard-coded number of echoes.

@tsalo (Member) commented Jul 24, 2018

I have an open PR that I'm working on and can fix this (i.e., add the line n_echos = data.shape[1] and change the 3 to n_echos) fairly easily while I'm at it. Should I do that?
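For what it's worth, a rough, hypothetical sketch of that change (the data shape and the med_val definition below are assumptions for illustration, not tedana's actual code):

```python
import numpy as np

# Hedged sketch of the proposed change, not the actual tedana code: derive the
# echo count from the data instead of hard-coding 3. `data` is assumed to be
# shaped (samples, echoes, time); `echo_means` and `med_val` are stand-ins for
# the variables in the quoted lines, with made-up definitions for illustration.
data = np.random.rand(1000, 5, 150)                    # hypothetical 5-echo input
n_echos = data.shape[1]

echo_means = data.mean(axis=-1)                        # per-sample, per-echo means over time
first_echo = echo_means[:, 0]
med_val = first_echo >= np.percentile(first_echo, 33)  # stand-in for the 33rd %ile selection

lthrs = np.squeeze(echo_means[med_val].T) / n_echos    # was: / 3
```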

@handwerkerd (Member, Author)
Your choice. I've looked through the code enough to see that this may non-trivially alter results. If there's a benefit to making a distinct PR for a result-altering change, then this change should be a distinct PR. If there's no benefit to that, I have no problem with you making the change as part of your PR.

@tsalo (Member) commented Jul 24, 2018

That's a really good point that I hadn't considered. I've been doing a lot of random minor things in my PR and it would probably be a bad idea to bury an important (though small) change among all of them. I can open a new, separate PR really quickly though.

@emdupre (Member) commented Jul 25, 2018

#95 is passing CI, so it seems like it's not altering results too significantly. It'd be great to get your review there, though, @handwerkerd!

@tsalo (Member) commented Jul 25, 2018

I think the only reason the results aren't affected is that the input data have three echoes. Is that correct?
Having a second test dataset with four echoes might be good at some point.
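To make that concrete, here's a small illustrative check (made-up numbers, not part of the test suite) showing why three-echo data can't distinguish the old and corrected formulas:

```python
import numpy as np

# Illustrative check (made-up numbers, not part of the test suite): with three
# echoes, (n_echos - 1) == 2, so the corrected F-statistics match the old
# hard-coded ones exactly, and CI on a 3-echo dataset cannot detect the change.
alpha = np.array([10.0, 12.0])
sse = np.array([2.0, 3.0])

old = (alpha - sse) * 2 / sse
new_3 = (alpha - sse) * (3 - 1) / sse
new_5 = (alpha - sse) * (5 - 1) / sse

assert np.allclose(old, new_3)        # indistinguishable with three echoes
assert not np.allclose(old, new_5)    # a 4- or 5-echo dataset would expose the change
```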

@handwerkerd (Member, Author)
I can share a 5-echo dataset (3T MRI, no SMS, 150 volumes, block-design checkerboard task). It's probably better to pass it along to someone else to integrate into the CI testing. Who should I send it to, and what would you need? I can send either separate echo volumes or a zcat file plus the echo times.

That said, I just ran this version of tedana on one of these datasets and the results are problematic. A component with 3% of the normalized variance is ending up in the ignored bin, which, to my understanding, should never happen. I could see how it might incorrectly end up in midK, but, at least with the older selcomps code, it should never have ended up in ignore. It might be useful to figure out what's happening before using this dataset as part of CI.

@emdupre (Member) commented Jul 25, 2018

It'd be great to have the five-echo dataset added as preprocessed, individual echoes so that we can test the I/O with individual echo files.

Adding them brings up a great point, which I might open another issue for: I've been hosting the data files on Dropbox, but I'd like to move them to another location, as I'm likely closing that account soon. My first thought is to try to host the files on NITRC, unless y'all can think of an SSH server or S3 bucket to drop them in.

@KirstieJane (Member)
These could be great examples of BIDS datasets, so maybe @chrisfilo has a suggestion for where to put them?

(This information could also go in the BIDS Starter Kit for future ref!)

@chrisgorgo
If the preprocessed data come along with the raw data, we would be more than happy to host them on OpenNeuro.org. Just put the preprocessed data in the derivatives folder.

KirstieJane added a commit to KirstieJane/tedana that referenced this issue on Nov 7, 2018.
emdupre pushed a commit to emdupre/tedana that referenced this issue on Nov 9, 2018.