Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] --debug flag appear now in the help & documentation #385

Merged

Conversation

benoitberanger
Copy link
Contributor

Hello,

This PR is a solution for #361, as discussed with @tsalo.
The changes are :

  • make --debug visible in the help
  • added some help documentation about this --debug option
  • this --debug option is is now the default behavior

For now, I see 2 drawbacks that can easily be fixed :

  1. changing the behavior so the option can be controlled : --debug yes/no
    and / or
  2. create a new option separate such as --report or --log and separate if from the --debug option

What do you think ?

Best,
Benoît

- writting report is the default behaviour
- help is now provided for the --debug (enable report) option
@emdupre
Copy link
Member

emdupre commented Aug 21, 2019

Thanks so much, @benoitberanger !

I missed this conversation, so just to check in -- why are we changing this to be the default behavior ?

@benoitberanger
Copy link
Contributor Author

Proposition
Changing the default behavior of the log file is a proposition that I'd like to discuss with everyone.
Writing the log file takes only a few kB, and opening it is a simple way to check if the ICA converged of stalled after the maximum iterations is reached.

Why ?
Because when the workflow is done, I don't see how you can check, without the log file, if the ICA converged or not.

Side notes
In my usage of tedana I'm testing this workflow on several cohorts (controls, patients, different sequences, different pre-processing steps, ...), the procedure is performed on a computing cluster. I could save the logs of the jobs executed on the cluster, but that's messy. Saving the report.txt on the disk is a very comfortable way to save the info. Afterwards, I can parse the comp_table_ica.txt and report.txt to summarize and compare the number of volumes, number of accepted components after the decision tree, how many iterations for ICA, did it converge, how many accepted, ignored, rejected final components, etc.

@emdupre
Copy link
Member

emdupre commented Sep 6, 2019

Thanks for summarizing, @benoitberanger ! I think I understand the situation better, now.

The test failure is just because it's checking the outputs of the files, and it's not expecting the report.txt to be an output.

That being said, I'm not yet sure that this is something we should have on by default. For your specific situation: usually with e.g. SLURM schedulers, you should be able to request that the logs are written to a specific file. Or, could you just add the --debug flag to your calls ?

For a more general case: I think it's useful to have different levels of verbosity, depending on your interest. Maybe this is too optimistic, but I'd like to think (hope, aspire to) that whether the ICA converged won't be something that everyone wants access to, as it's assumed to be true. If it's something we definitely want folks to review, maybe we should think about putting it into the new report that @tsalo created or as a disclaimer on the QC images.

What do you think ?

@benoitberanger
Copy link
Contributor Author

@emdupre SLURM is writing log files, but only in one directory, not next to the nifti files being processed, making the parsing very difficult. The --debug flag is the best solution, as @tsalo suggested in #361.

I do agree different levels of verbosity is more ergonomic, and an option to write down the log file by default is debatable.
My point is the --debug flag does not appear in the help with tedana --help, or the documentation https://tedana.readthedocs.io/en/latest/usage.html

What do you think ?

@tsalo
Copy link
Member

tsalo commented Sep 9, 2019

I agree that debug needs to be included in the documentation. However, I think it would be better to make writing out the report and log files the default (i.e., divorce them from any arguments) and to keep debug off by default. The files are small enough that space is not an issue.

Sorry to be so late in joining this discussion.

@jbteves
Copy link
Collaborator

jbteves commented Sep 9, 2019

We should move discussion from this PR to the issue IMO.

@emdupre
Copy link
Member

emdupre commented Sep 9, 2019

We should move discussion from this PR to the issue IMO.

Since work has started, I'd prefer to keep it linked to the PR. That way we can more easily comment on the changes 😸

My point is the --debug flag does not appear in the help with tedana --help, or the documentation https://tedana.readthedocs.io/en/latest/usage.html

Absolutely ! Yes, this part of the PR I love. My concern is here:

https://github.com/ME-ICA/tedana/pull/385/files#diff-10962f3ba21a7497257aefd948c6c750R199

I think we're all converging on:

We should probably have a log file made by default and a verbose flag instead.

to quote @jbteves. So, that would be a bit more of an extensive change -- we could keep the --debug argument (off by default), and when it was called the log files would have more information. But the log files should be written by default. This would also mean stepping through our logging right now and making sure everything is at the appropriate level we want (e.g., DEBUG).

That might be out of scope for this PR, in which case I think this one could just document that --debug exists. WDYT ?

@benoitberanger
Copy link
Contributor Author

OK I try to summarize :

  • report.txt is written by default
  • tedana_run.txt is written by default
  • --debug flag is disabled by default, and when activated increase the verbosity of tedana_run.txt, using logging.basicConfig(level=logging.DEBUG

Does everyone agree ?

Side question : is everyone ok with "tedana_run.txt" ? I would like to change the name in something a bit more meaningful, such as tedana.log / tedana_log.txt / log.txt

@jbteves
Copy link
Collaborator

jbteves commented Sep 10, 2019

I have no strong feelings about tedana_run.txt as the name; I chose the it fairly whimsically and there are no alternatives suggested in the original PR, so I doubt opinions are strong. You'll have to modify the name in the CircleCI config though.

Copy link
Collaborator

@jbteves jbteves left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this behavior negates the option's existence, but I could be wrong. When I hear back I'll change my review status accordingly.

help=('Logs in the terminal will have increased '
'verbosity, and also will also be written '
'into a .txt file in the output directory.'),
default=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emdupre @tsalo I thought that store_true args could only store True; in other words, a default of True makes the argument always true. Is that incorrect? In some other code I'm working on that was my experience, but maybe I was handling it incorrectly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Yeah, debug's default should be set to False here and in the function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, but then in this setup it won't get written because it requires --debug to execute. Okay, this PR will unfortunately need changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I'm ok if this PR is only to make the debug option visible in the documentation, and if we make the additional changes in a separate PR ! It's up to @benoitberanger how much to tackle, here.
But definitely agree that regardless, this and the other boolean will need to change back to False 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected in e9a5fb1

@emdupre
Copy link
Member

emdupre commented Sep 10, 2019

Does everyone agree ?

Yes, that sounds exactly right to me ! Thanks @benoitberanger

Side question : is everyone ok with "tedana_run.txt" ? I would like to change the name in something a bit more meaningful, such as tedana.log / tedana_log.txt / log.txt

tedana.log sounds great. We could optionally timestamp it, too, so it'd look more like tedana-HHMMSS.log No strong preferences, just thinking out loud.

Copy link
Collaborator

@jbteves jbteves left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting these changes @benoitberanger ! Due to the behavior of the argument parser, these changes will unfortunately have --debug behave in an always-on manner. Consequently, you will have to make the value default to False as before, and then remove the check for --debug's presence before creating the logfile. As I recall you'll basically need to edit the logic here only.

@tsalo
Copy link
Member

tsalo commented Sep 10, 2019

tedana.log sounds great. We could optionally timestamp it, too, so it'd look more like tedana-HHMMSS.log No strong preferences, just thinking out loud.

I agree that tedana.log is good, but we originally had the timestamps and those made checking the outputs in the CI difficult (impossible maybe?), which is why we switched to a single name with the backup (tedana_run_old.txt) system currently used.

@jbteves
Copy link
Collaborator

jbteves commented Sep 10, 2019

@emdupre I strongly oppose timestamps! It makes testing much more difficult since the filenames aren't static. You can put the timestamp at the top of the file instead.

@emdupre
Copy link
Member

emdupre commented Sep 10, 2019

I agree that tedana.log is good, but we originally had the timestamps and those made checking the outputs in the CI difficult (impossible maybe?), which is why we switched to a single name with the backup (tedana_run_old.txt) system currently used.

@emdupre I strongly oppose timestamps! It makes testing much more difficult since the filenames aren't static. You can put the timestamp at the top of the file instead.

Yes, I remember this now (thanks both for the reminder), but I wonder if the simpler solution isn't just to ignore them in testing. I think it's a much more natural pattern for users, and if they're written by default I think we should cater to users rather than developers.

EDIT: It could be as simple as adding a ! -name '*.log' to the end of this line:

find /tmp/data/five-echo/TED.five-echo/* \

(and to the 3 echo test as well, of course).

@jbteves
Copy link
Collaborator

jbteves commented Sep 10, 2019

@emdupre in what way do you think it's more natural? Timestamps only feel natural to me if the timing has some inherent meaning, for example if it marks an event that is time-sensitive, not a post-hoc data run. I would guess that non-power-users would find no timestamps more natural, and that power-users would be suited well enough by the old suffix, but maybe that's not correct. Also, sorry, I hadn't seen @tsalo's comment that made mine redundant.

@emdupre
Copy link
Member

emdupre commented Sep 10, 2019

Timestamps only feel natural to me if the timing has some inherent meaning, for example if it marks an event that is time-sensitive, not a post-hoc data run. I would guess that non-power-users would find no timestamps more natural, and that power-users would be suited well enough by the old suffix, but maybe that's not correct.

I think this is a really common pattern in software and I'm reticent to introduce a new one to users, especially since on our end it's straightforward to implement. Why I think it's a common pattern is if, for example, I'm submitting a job on a SLURM cluster I get output logs by default. These are each stamped with the unique job ID, so I can easily trace back to each subject as appropriate.

In the tedana context, a "new" and "old" qualifier won't work if, for example, I'm trying out all three of the PCA flags. it'd be more natural to give the time, so I can continue to experiment with parameters without looking at an "_old_old" file.

@jbteves
Copy link
Collaborator

jbteves commented Sep 10, 2019

That makes sense, I hadn't thought of that. I do have some programs that use the old appending instead, but I agree that on Linux environments it's quite uncommon. My only objection now is an emotional objection to reducing our test cases; I would feel reasonably cozy if we could still test for file existence. @tsalo can we check for something like tedana_run_*?

@benoitberanger
Copy link
Contributor Author

@jbteves @emdupre @tsalo thank you all for your comments.
I will correct the PR content about argument parsing behaving correctly (#385 (review)), and rename the PR title to something more related to it's final objective : make --debug flag appear in the help & documentation.

  • --debug flag appear now in the help & documentation : main objective of this PR
  • report.txt is written by default : already done before, not covered in the PR
  • Log file written by default is accepted, but it's name is still in debate

The log file written by default will be assessed in another PR, to keep development segmented.

@benoitberanger benoitberanger changed the title [ENH] Writting report.txt is now default [ENH] --debug flag appear now in the help & documentation Sep 10, 2019
@jbteves
Copy link
Collaborator

jbteves commented Sep 10, 2019

That sounds awesome @benoitberanger, thank you!

- typo correction
- set to False, according to the PR discussion
@emdupre emdupre requested a review from jbteves September 10, 2019 15:29
Copy link
Collaborator

@jbteves jbteves left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks @benoitberanger! Feel free to squash and merge when ready. @all-contributors please add @benoitberanger for code.

@emdupre
Copy link
Member

emdupre commented Sep 19, 2019

This looks great, so I'm going to go ahead and merge. Thanks again, @benoitberanger !

@emdupre emdupre merged commit 2ebb34a into ME-ICA:master Sep 19, 2019
@emdupre
Copy link
Member

emdupre commented Sep 19, 2019

@all-contributors please add @benoitberanger for code contributions !

@allcontributors
Copy link
Contributor

@emdupre

I've put up a pull request to add @benoitberanger! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants