No convergence after 5000 steps #144

Closed
daraucla opened this issue Oct 23, 2018 · 12 comments
Labels
documentation issues related to improving documentation for the project question issues detailing questions about the project or its direction

@daraucla

Hi,

We are hitting a convergence failure during the call into ica_nodes.py (see the traceback below). We're running the following in Python 3.6.1:

tedana -d BOLD_SmokeCues_MultiTE_10.nii BOLD_SmokeCues_MultiTE_10a.nii BOLD_SmokeCues_MultiTE_10b.nii -e 13.0 35.9 58.0

The job ran for 18 hours with 36 GB of RAM. Each 4D image file contains 288 volumes.

Any ideas what may be the root of the issue?

/u/local/apps/python/3.6.1/lib/python3.6/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
/u/local/apps/python/3.6.1/lib/python3.6/site-packages/scipy/stats/stats.py:1633: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]. In the future this will be interpreted as an array index, arr[np.array(seq)], which will result either in an error or a different result.
return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
/u/home/d/darag/.local/lib/python3.6/site-packages/sklearn/cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
/u/home/d/darag/.local/lib/python3.6/site-packages/sklearn/ensemble/weight_boosting.py:29: DeprecationWarning: numpy.core.umath_tests is an internal NumPy module and should not be imported. It will be removed in a future NumPy release.
from numpy.core.umath_tests import inner1d
/u/home/d/darag/.local/lib/python3.6/site-packages/sklearn/grid_search.py:42: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. This module will be removed in 0.20.
DeprecationWarning)
/u/home/d/darag/.local/lib/python3.6/site-packages/sklearn/learning_curve.py:22: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the functions are moved. This module will be removed in 0.20
DeprecationWarning)
Traceback (most recent call last):
File "/u/home/d/darag/.local/bin/tedana", line 11, in
load_entry_point('tedana==0.0.4', 'console_scripts', 'tedana')()
File "/u/home/d/darag/.local/lib/python3.6/site-packages/tedana/workflows/tedana.py", line 447, in _main
tedana_workflow(**vars(options))
File "/u/home/d/darag/.local/lib/python3.6/site-packages/tedana/workflows/tedana.py", line 400, in tedana_workflow
verbose=debug)
File "/u/home/d/darag/.local/lib/python3.6/site-packages/tedana/decomposition/eigendecomp.py", line 298, in tedica
smaps = icanode.execute(dd) # noqa
File "", line 1, in
File "/u/home/d/darag/.local/lib/python3.6/site-packages/mdp/signal_node.py", line 646, in execute
self._pre_execution_checks(x)
File "/u/home/d/darag/.local/lib/python3.6/site-packages/mdp/signal_node.py", line 521, in _pre_execution_checks
self._if_training_stop_training()
File "/u/home/d/darag/.local/lib/python3.6/site-packages/mdp/signal_node.py", line 500, in _if_training_stop_training
self.stop_training()
File "", line 1, in
File "/u/home/d/darag/.local/lib/python3.6/site-packages/mdp/signal_node.py", line 627, in stop_training
self._train_seq[self._train_phase][1](*args, **kwargs)
File "/u/home/d/darag/.local/lib/python3.6/site-packages/mdp/nodes/ica_nodes.py", line 140, in _stop_training
convergence = core(data)
File "/u/home/d/darag/.local/lib/python3.6/site-packages/mdp/nodes/ica_nodes.py", line 533, in core
raise mdp.NodeException(errstr)
mdp.NodeException: No convergence after 5000 steps

@tsalo
Member

tsalo commented Oct 24, 2018

Just to make sure I understand, it ran for 18 hours on a single run for a single subject? Even if there was no convergence in the ICA, that seems like a very long time. I honestly have no clue why that would happen.

We've noticed that the ICA can fail to converge in certain circumstances, and have been discussing some possible solutions in #101 and offline. Two possible causes are that the current PCA selection method is removing too many components (something we're working on and hope to resolve soon) or that the preprocessing is changing the scaling of the data.

What preprocessing steps have you applied?

@daraucla
Author

daraucla commented Oct 24, 2018 via email

@tsalo
Member

tsalo commented Oct 24, 2018

You may want to apply some preprocessing (especially motion correction and slice timing correction), based on @handwerkerd's comment here. I haven't had similar issues, but @dowdlelt has mentioned playing around with the arguments kdaw and rdaw.

Would you be willing to share some of the outputs that are created? We can take a look at the T2* and S0 maps, along with the pcastate pickle file. From the pcastate file we can see which, and how many, PCA components were retained, and we can check that the T2* map at least fits expectations.
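For anyone who wants a first look at that pickle before sharing it, here is a minimal sketch that lists what a pickle file contains. It assumes nothing about how tedana lays out the pcastate file: it only reports each top-level key with its type and, where available, its shape or length. The function name `summarize_pickle` is made up for this example.

```python
import pickle


def summarize_pickle(path):
    """Load a pickle and describe its top-level contents.

    Only unpickle files you trust: loading a pickle can run arbitrary code.
    """
    with open(path, "rb") as handle:
        state = pickle.load(handle)

    # No particular layout is assumed: a dict is summarized key by key,
    # anything else is summarized as a single "<root>" entry.
    items = state.items() if isinstance(state, dict) else [("<root>", state)]
    summary = {}
    for key, value in items:
        shape = getattr(value, "shape", None)  # present on NumPy arrays
        if shape is None and hasattr(value, "__len__"):
            shape = (len(value),)
        summary[key] = (type(value).__name__, shape)
    return summary
```

Calling `summarize_pickle(...)` on the pcastate file in the output directory (the exact filename depends on the tedana version) returns a dict that can be printed entry by entry. If the pickle holds NumPy arrays, NumPy has to be installed for the load to succeed.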

@dowdlelt
Collaborator

Definitely second preprocessing as a necessary step, unless these are ultra-low-motion subjects. Even in that case, however, there will likely be drift over time due to hardware heating up, which should be corrected by timeseries realignment.

Regarding kdaw and rdaw, those arguments can dramatically alter the number of components selected, or at least they have in the past for me. Going out on a limb: since the ICA run took ages, I suspect it may have selected a very large number of components. Reducing kdaw to 5 and rdaw to 0 is one option, based on one of the last commits at the original MEICA repository.
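As a rough sketch of that option, the snippet below assembles the original command with the two extra arguments and prints it without running anything. The `--kdaw` and `--rdaw` flag spellings are an assumption here, and `build_tedana_command` is just a helper for this example; check `tedana --help` for the installed version before relying on either flag.

```python
import shlex

# Echo files and echo times (ms) taken from the command at the top of this issue.
echo_files = [
    "BOLD_SmokeCues_MultiTE_10.nii",
    "BOLD_SmokeCues_MultiTE_10a.nii",
    "BOLD_SmokeCues_MultiTE_10b.nii",
]
echo_times = [13.0, 35.9, 58.0]


def build_tedana_command(files, tes, kdaw=5, rdaw=0):
    """Return the tedana call as an argument list (nothing is executed).

    ASSUMPTION: the flags are spelled --kdaw and --rdaw in the installed
    tedana version; confirm with `tedana --help`.
    """
    cmd = ["tedana", "-d", *files, "-e", *(str(te) for te in tes)]
    cmd += ["--kdaw", str(kdaw), "--rdaw", str(rdaw)]
    return cmd


command = build_tedana_command(echo_files, echo_times)
print(" ".join(shlex.quote(part) for part in command))
```

Keeping the command as a list means it can be passed to `subprocess.run` later without worrying about shell quoting.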

I've most often struggled with convergence when a very high number of components was selected, rather than a small number (say >250, in a run with 445 timepoints).
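To put a number on "very high", a quick sanity check is the fraction of timepoints taken up by retained components. This is only a sketch: the 0.5 cutoff is a hypothetical rule of thumb chosen so that the 250-of-445 case above gets flagged, not a threshold from tedana or ME-ICA.

```python
def component_fraction(n_components, n_timepoints):
    """Fraction of the available timepoints used up by retained components."""
    if n_timepoints <= 0:
        raise ValueError("n_timepoints must be positive")
    return n_components / n_timepoints


def too_many_components(n_components, n_timepoints, max_fraction=0.5):
    """Flag a decomposition that keeps a large share of the timepoints.

    HYPOTHETICAL: max_fraction=0.5 is an illustrative cutoff, not a value
    taken from tedana or ME-ICA.
    """
    return component_fraction(n_components, n_timepoints) > max_fraction
```

With this cutoff, 250 components in a 445-timepoint run is flagged, and for the 288-volume runs in this issue anything above 144 components would be.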

@daraucla
Author

daraucla commented Oct 25, 2018 via email

@emdupre
Member

emdupre commented Oct 25, 2018

Hi @daraucla! Yes, we actually have an open PR (#143) to re-implement logging. Minimal preprocessing is definitely mandatory. If there is any way we can make this clearer, please let me know! I'm also anxious to hear how your run with minimally preprocessed data goes!

@daraucla
Author

daraucla commented Oct 26, 2018 via email

@dowdlelt
Collaborator

dowdlelt commented Oct 27, 2018

The filenames are a bit different from earlier implementations. The dn_ts_OC.nii is identical to the medn.nii file. This can be confirmed by looking back at the bitbucket code's meica.py, in particular, the following line:
export_result('TED/dn_ts_OC.nii','%s_medn' % (outprefix), "Denoised timeseries (including thermal noise), produced by ME-ICA %s"

EDIT - Didn't even think to check the GitHub version; it is the same there:
https://github.com/ME-ICA/me-ica/blob/6ae63c719c439d5aa202bb775230135796939425/meica.py#L760-L769

wherein meica copies the dn_ts_OC.nii file and renames it with the desired prefix + medn.
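If you want the old-style filename for downstream scripts, something like the following would reproduce the rename. It is a minimal sketch, assuming the tedana outputs sit in a directory containing dn_ts_OC.nii; `copy_as_medn` and its arguments are names invented for this example, and it makes a plain file copy, so any header annotation that meica's export_result adds is not reproduced.

```python
import shutil
from pathlib import Path


def copy_as_medn(ted_dir, out_prefix, dest_dir="."):
    """Copy <ted_dir>/dn_ts_OC.nii to <dest_dir>/<out_prefix>_medn.nii.

    Hypothetical helper: mirrors only the naming described above, as a
    plain file copy with no changes to the image or its header.
    """
    source = Path(ted_dir) / "dn_ts_OC.nii"
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found")
    target = Path(dest_dir) / f"{out_prefix}_medn.nii"
    shutil.copyfile(str(source), str(target))
    return target
```

For example, `copy_as_medn("TED", "sub01_run1")` would write sub01_run1_medn.nii to the current directory (the prefix is just an illustration).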

The dn_ts_OC_T1c file may also be of interest; it reflects the removal of global noise via minimum image regression. There is a paper (with Power and Kundu) that used a similar method (GODEC) to further reduce the impact of motion. Unfortunately I am unable to link it at the moment, nor can I dig up the issue in which prantikk mentioned the approximate equivalence of GODEC and the T1c output. I'll try later...

EDIT:

Manuscript details here:
https://www.jonathanpower.net/paper-multiecho.html

and Kundu mentions similarity here
ME-ICA/me-ica#4

@emdupre emdupre added the question issues detailing questions about the project or its direction label Oct 30, 2018
@KirstieJane KirstieJane added this to the documentation milestone Oct 31, 2018
@KirstieJane KirstieJane added the documentation issues related to improving documentation for the project label Oct 31, 2018
@KirstieJane
Member

Hi @daraucla & @dowdlelt! I'm just tidying up the issue list and it looks like we could do something to improve the user documentation to prevent future folks from having the same problems!!

Do either of you want to take that on? Basically - what would you have liked to read in the docs before you opened this question 👾

@tsalo
Member

tsalo commented Nov 10, 2018

I was thinking that we could add a FAQ/Common Issues section to either Processing pipeline details or Support and communication. Either that, or we could add warning boxes to the relevant sections of Processing pipeline details.

Personally, I think a FAQ section would be optimal. We can incorporate some of the more important questions both from here and from NeuroStars in that section, including this one.

@jbteves
Collaborator

jbteves commented Apr 20, 2019

I'd like to propose that we take @tsalo's comment above and turn it into an issue to update the FAQ. WDYT, @emdupre @KirstieJane @dowdlelt @handwerkerd? Please tag anybody pertinent to the discussion that I missed.

@tsalo
Member

tsalo commented Apr 20, 2019

We do have a FAQ with information about convergence failure and the outputs page has info about the files one should care about. The convergence failure question reflects our current knowledge, but I think that it will change dramatically once we've finished the reliability analysis (i.e., once we know how big of a problem convergence failure is). I think we can put that on hold until the analysis is finished and can close this issue.
