ENH: Allow returning candidate BIDSPaths #1083

Merged (5 commits) on Oct 15, 2022

Conversation

larsoner
Member

For MNE-BIDS-Pipeline, we want to know the set of input files that each function uses during each step. In one step, we call raw_bids_path.find_empty_room(), which either uses the sidecar or (potentially) traverses a bunch of files to find the best-matching empty-room file. This PR adds a return_candidates kwarg (default False) that, when enabled, returns None for the candidates if the sidecar was used, or a list of BIDSPath instances if files were traversed.

@codecov

codecov bot commented Oct 14, 2022

Codecov Report

Merging #1083 (d22a2f6) into main (bd3c92b) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1083   +/-   ##
=======================================
  Coverage   95.19%   95.19%           
=======================================
  Files          24       24           
  Lines        3847     3853    +6     
=======================================
+ Hits         3662     3668    +6     
  Misses        185      185           
Impacted Files Coverage Δ
mne_bids/path.py 97.60% <100.00%> (+0.01%) ⬆️


@hoechenberger
Member

I don't understand why we need this?

We can already enforce using the sidecar only. If that fails, we can call the function again without that restriction, and it will find a candidate empty-room recording based on recording dates.

Why do we need more than that?

@larsoner
Member Author

> If that fails, we can call the function again without that restriction, and it will find a candidate empty-room recording based on recording dates.

Let's only consider this use case -- where the sidecar isn't there -- for now since it's the only case that matters for this functionality.

In MNE-BIDS-Pipeline, we use file-based caching. For this to work, we need to know all input files that affect a given function.

We want to process empty-room data. To do this, we call find_empty_room. And we want this step to be cached because it can take tens of seconds for each subject all by itself on some filesystems. So "just call the function and see what it gives" is exactly what we're trying to avoid, as it is not always fast.

But to cache this step -- i.e., to know that the result of run.find_empty_room() has not changed -- we need to know the set of candidate files run.find_empty_room() iterates over. In other words, find_empty_room implements what amounts to an argmin(abs(er.info['meas_date'] - run.info['meas_date']) for er in candidates). Without knowing these candidates, our caching can silently fail.
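To make the argmin description concrete, here is a minimal sketch in plain Python. It is not the mne-bids implementation: the function name closest_empty_room, the dict of dates, and the file names are all hypothetical, and real code would read meas_date from each recording rather than take it as an argument.

```python
from datetime import datetime, timezone


def closest_empty_room(run_date, er_dates):
    """Pick the empty-room recording measured closest in time to the run.

    er_dates maps a (hypothetical) file name to its measurement date.
    Returns the best match plus the sorted list of every candidate considered.
    """
    if not er_dates:
        return None, []
    candidates = sorted(er_dates)
    # argmin over the absolute time difference to the run
    best = min(candidates, key=lambda name: abs(er_dates[name] - run_date))
    return best, candidates


# Hypothetical dates and file names, for illustration only
run_date = datetime(2022, 10, 14, 9, 30, tzinfo=timezone.utc)
er_dates = {
    "sub-emptyroom_ses-20221001_meg.fif": datetime(2022, 10, 1, 8, 0, tzinfo=timezone.utc),
    "sub-emptyroom_ses-20221014_meg.fif": datetime(2022, 10, 14, 8, 0, tzinfo=timezone.utc),
    "sub-emptyroom_ses-20221020_meg.fif": datetime(2022, 10, 20, 8, 0, tzinfo=timezone.utc),
}
best, candidates = closest_empty_room(run_date, er_dates)
print(best)
print(candidates)
```

The point of returning candidates alongside best is that the result depends on every entry in that list, not only on the winner.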

You could imagine that a dataset update to any one of the existing empty-room files in an OpenNeuro dataset ("oops I date-shifted incorrectly!"), or the addition of a new empty-room file ("oops I forgot to include this in v1.0!"), could change the find_empty_room result. Allowing the function to output the list of candidates would fix both cases.
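As a rough illustration of why the candidate list matters for file-based caching, the sketch below fingerprints a set of candidate files so that editing any one of them, or adding a new one, changes the cache key. The helper name candidates_fingerprint is hypothetical and this is not how MNE-BIDS-Pipeline actually computes its cache keys; hashing full file contents is an assumption made for simplicity.

```python
import hashlib
from pathlib import Path


def candidates_fingerprint(paths):
    """Return a hex digest that changes if any candidate file changes,
    or if a file is added to or removed from the candidate set."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        # Include the path so that renames and additions alter the digest
        digest.update(str(path).encode("utf-8"))
        digest.update(b"\0")
        # Include the contents so that in-place edits alter the digest
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

A cached find_empty_room result would then be reused only while the fingerprint of the candidate list is unchanged. For large recordings, file size plus modification time would be a cheaper, though weaker, stand-in for the content hash.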

@larsoner
Member Author

(Conceptually as well -- this just shows "behind the curtain" of the somewhat magical empty-room matching algorithm, so it's nice in terms of transparency to say "if you want to know what files we considered, here they are".)

@hoechenberger
Member

Thanks for the explanation, this makes sense!

@agramfort
Member

@larsoner can you just update the what’s new?

@larsoner
Member Author

Done!

@sappelhoff sappelhoff merged commit 9231836 into mne-tools:main Oct 15, 2022
@sappelhoff
Member

Thanks @larsoner 🚀

@larsoner larsoner deleted the find_files branch October 17, 2022 13:35
@larsoner larsoner mentioned this pull request Nov 1, 2022
7 tasks
4 participants