Reduce intermediate outputs #17
Comments
I don't like having an |
I'd like to continue the discussion started in #133 regarding intermediate outputs and the possible addition of a new argument to control their generation here. Here is a hopefully comprehensive list of files that are generated. I've taken the liberty of bolding files that I think should be optional.
If
|
Filename | Content |
---|---|
t2ss.nii | Voxel-wise T2* estimates using ascending numbers of echoes, starting with 2. |
s0vs.nii | Voxel-wise S0 estimates using ascending numbers of echoes, starting with 2. |
t2svG.nii | Full T2* map/time series. |
s0vG.nii | Full S0 map/time series. |
__meica_mix.1D | Mixing matrix (component time series) from ICA decomposition. |
hik_ts_e[echo].nii | High-Kappa time series for echo number echo |
midk_ts_e[echo].nii | Mid-Kappa time series for echo number echo |
lowk_ts_e[echo].nii | Low-Kappa time series for echo number echo |
dn_ts_e[echo].nii | Denoised time series for echo number echo |
If global signal correction is employed:
Filename | Content |
---|---|
T1gs.nii | Spatial global signal |
glsig.1D | Time series of global signal from optimally combined data. |
tsoc_orig.nii | Optimally combined time series with global signal retained. |
tsoc_nogs.nii | Optimally combined time series with global signal removed. Same as ts_OC.nii when GSR is used. |
If T1-GS correction is employed:
Filename | Content |
---|---|
sphis_hik.nii | T1-like effect |
hik_ts_OC_T1c.nii | T1 corrected high-kappa time series by regression |
dn_ts_OC_T1c.nii | T1 corrected denoised time series |
betas_hik_OC_T1c.nii | T1-GS corrected high-kappa components |
meica_mix_T1c.1D | T1-GS corrected mixing matrix |
In order to binderize the walkthrough notebooks, we'll also need the following. All of these should be optional.
Filename | Content | Reason |
---|---|---|
adaptive_mask.nii | Adaptive mask. Each voxel has value corresponding to number of echoes with good signal. | Needed to show adaptive mask. |
mask.nii | Binary mask of voxels with good data. | Applied to all other images needed for walkthrough. |
tsoc_whitened.nii | Optimally combined data after dimensionality reduction with PCA. | Needed to show time series plot of whitened vs. original OC data (i.e., to show impact of TEDPCA). |
meica_betas_catd.nii | Echo-specific weight maps for each of the ICA components. | Needed to show how the component weights align with predicted weights from S0 and R2 models in line plots. |
meica_metric_weights.nii | Weight maps used to average metrics (R2 F, S0 F, predicted R2 model values, and predicted S0 model values) in the same manner as fitmodels_direct. | Needed to show how the component weights align with predicted weights from S0 and R2 models in line plots. |
meica_R2_pred.nii | Echo-specific maps of predicted values for R2 model for each component. | Needed to show how the component weights align with predicted weights from R2 models in line plots. |
meica_S0_pred.nii | Echo-specific maps of predicted values for S0 model for each component. | Needed to show how the component weights align with predicted weights from S0 models in line plots. |
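To illustrate how the two mask files in the table relate, here is a minimal sketch using NumPy. It assumes the binary mask is obtained by thresholding the adaptive mask; the threshold `min_good_echoes` and the toy arrays are hypothetical and are not taken from the tedana code.

```python
import numpy as np

# Toy adaptive mask for six voxels in a three-echo acquisition:
# each value is the number of echoes with good signal in that voxel.
adaptive_mask = np.array([0, 1, 2, 3, 3, 0])

# Assumption: a voxel counts as "good data" if at least this many
# echoes have good signal. The real threshold may differ.
min_good_echoes = 1
binary_mask = adaptive_mask >= min_good_echoes

# Apply the binary mask to a toy (voxels x time) data array,
# keeping only the rows for voxels with good data.
data = np.arange(6 * 4, dtype=float).reshape(6, 4)
masked_data = data[binary_mask]
```

In this sketch, `masked_data` keeps only the four voxels with at least one good echo, which is how a binary mask could be applied to the other images used in the walkthrough.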
I don't have time to comment on each file, but I wanted to point out some key things. As of now, I think comp_table_ica.txt is where the provenance and selection metrics for each ICA component are stored. If this ends up being stored elsewhere, that's fine, but until then, this file is vital to save. I have never used lowk_ts_OC.nii, midk_ts_OC.nii, and hik_ts_OC.nii for anything meaningful, and those are very easy to regenerate if you have meica_mix.1D, betas_OC.nii, and comp_table_ica.txt.
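As a rough sketch of why those time series are easy to regenerate: given the component weight maps (betas), the mixing matrix, and a list of component indices from the component table, a partial time series is just a matrix product over the selected components. The function name, array shapes, and the toy classification below are assumptions for illustration; the actual tedana code may scale or demean the data differently.

```python
import numpy as np


def regenerate_time_series(betas, mixing, component_idx):
    """Rebuild a (voxels x time) series from a subset of ICA components.

    betas : (n_voxels, n_components) component weight maps
    mixing : (n_timepoints, n_components) component time series
    component_idx : indices of the components to include
    """
    idx = np.asarray(component_idx, dtype=int)
    return betas[:, idx] @ mixing[:, idx].T


# Toy stand-ins for betas_OC.nii and meica_mix.1D.
rng = np.random.default_rng(0)
n_voxels, n_timepoints, n_components = 5, 20, 4
betas = rng.normal(size=(n_voxels, n_components))
mixing = rng.normal(size=(n_timepoints, n_components))

# Hypothetical classification, standing in for comp_table_ica.txt.
high_kappa = [0, 2]
other = [1, 3]

hik_ts = regenerate_time_series(betas, mixing, high_kappa)
other_ts = regenerate_time_series(betas, mixing, other)
```

Because the two index lists partition the components, the two partial series sum to the full reconstruction `betas @ mixing.T`.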
@handwerkerd Thanks for the feedback. I suppose this is a general question for everyone, but how much do we expect regular tedana users to examine and/or manually perform the component selection? To be honest, I assumed that that would be something only power users would do, and those users would have We also want to generate visual reports, which should, at minimum, include the following: component time series, component maps, component statistics (Kappa, Rho, and variance explained). Should the report be, by default, in addition to related files (comp_table_pca.txt, comp_table_ica.txt, mepca_mmix.1D, meica_mmix.1D, betas_OC.nii, and feats_OC2.nii), or in lieu of those files? I think it's worth it to keep the high-Kappa time series, but dropping low-Kappa and mid-Kappa makes sense. I don't want regular users to have to regenerate files, but if no one ever uses those files then there's no reason to keep them. |
Skimming the components and where they are classified is highly recommended for all users, since odd things do happen. One of the top requests for help I get is from end users who see a component that is clearly misclassified and want to know how to either add it back in or remove it. I've slacked on setting up a mockup of the report, but I think it's good to have this information in formats that are easily accessed by programs. The viewing-friendly report will either copy some of that information into another format or access the files where that information resides. There's really no end-use application for the high-Kappa time series. They're sometimes useful to figure out what's going on in a weird dataset, but they shouldn't be used in analyses, so I don't think there's a need to save them by default.
High kappa can be optional too, I guess. I've updated the tables so that the component tables, mixing matrices, and betas file are all required. Should we use |
@tsalo @emdupre do you believe that some of the above discussion should be updated in light of @dowdlelt's |
I think that our current set of optional (via verbose) and required outputs is good, although this issue may not accurately reflect them at this point. I think we can probably close this issue, to be honest. We've moved a lot of intermediate outputs over to verbose, which should address the original request. |
I'm OK with closing this and creating a new issue with a more specific request!
From @emdupre on November 15, 2017 16:23
The `niwrite` function is called throughout `tedana` to output many intermediate files; however, these are poorly documented and of unclear value to the user. It would make more sense to only have these intermediate files output if the user provides a `--verbose` flag.
Copied from original issue: emdupre/tedana#2
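The proposal above could look roughly like the following sketch. It is not the tedana implementation: `write_output` is a hypothetical stand-in for `niwrite` that writes plain text instead of NIfTI images, and the `intermediate` argument is an assumed way of marking optional files.

```python
import argparse
import tempfile
from pathlib import Path


def write_output(data, filename, out_dir, intermediate=False, verbose=False):
    """Write `data` to out_dir/filename, skipping intermediate files
    unless verbose output was requested. Returns the path or None."""
    if intermediate and not verbose:
        return None
    path = Path(out_dir) / filename
    path.write_text(data)
    return path


def build_parser():
    """Command-line parser with a single --verbose switch."""
    parser = argparse.ArgumentParser(description="Sketch of verbose-gated outputs.")
    parser.add_argument("--verbose", action="store_true",
                        help="Also write intermediate files.")
    return parser


# Simulate a run without --verbose: only the required file is written.
args = build_parser().parse_args([])
with tempfile.TemporaryDirectory() as out_dir:
    kept = write_output("required", "comp_table_ica.txt", out_dir,
                        verbose=args.verbose)
    skipped = write_output("optional", "t2ss.nii", out_dir,
                           intermediate=True, verbose=args.verbose)
    written = sorted(p.name for p in Path(out_dir).iterdir())
```

Passing `["--verbose"]` to `parse_args` instead would cause both files to be written, so the default run stays small while power users can still request everything.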