-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
restrict find_matching_paths
to only look outside of derivatives/
#1127
Comments
It does, so your proposal sounds like a good solution! |
it does, but such a folder |
My idea was to only allow sub-X folders in the top level. If people wish to search the derivates folder, I think it would make more sense to change the 'root'-parameter in the call? |
I agree. I would also vote for ignoring derivatives by default. We wouldn't even know if the derivatives are BIDS-compliant |
okay, sounds reasonable to me, ignoring derivatives by default ... and if someone is really keen on getting matches there, they can pass the derivatives directory as the root. |
@kaare-mikkelsen do you want to make a PR for this? |
@sappelhoff yes and no. It's on my to-do list (I agree with @hoechenberger that it's a nice first contribution opportunity), but unfortunately that list is always growing with more urgent stuff. If someone is more able/eager to solve it, they should just do it. |
find_matching_paths
to only look outside of derivatives/
Description of the problem
This command is supposed to find 4 .set files for the given subject, from 4 different subjects:
setFilePaths = mb.find_matching_paths(root=root,extensions='.set',datatypes='eeg')
However, it finds 8. Inspecting these 8, it turns out that 4 of them are 'bastard-files' with directories matching the subejct/session directories of the correct files, but basenames matching .set files in a derivatives folder. Looking in path.py, it seems that
fpaths = _return_root_paths(root, datatype=datatypes, ignore_json=False)
looks in all subfolders beneath root, including 'derivatives', after which_filter_fnames
does not check whether anything has been collected accidentally. Finally,_fnames_to_bidspaths
creates bidspaths using the inferred subject/session folder based on the basename, but with root specifically set to the root command passed to `find_matching_paths which leads to these 'bastard' bidspaths.Steps to reproduce
search for bidsfiles in a root directory which also contains a derivatives folder with similar bids-formatted files inside.
Expected results
I suggest ensuring that "find_matching_paths" ignores 'sourcedata' and 'derivatives' when walking the folder tree.
Actual results
find_matching_paths returns too many bidspaths.
Additional information
I don't remember whether the BIDS standard requires subject folders be called 'sub-X', if so, it would be easy to only search folders matching that pattern. Otherwise, 'derivatives', 'sourcedata', 'code' and 'stimuli' might be good folders to ignore when searching for matching files.
The text was updated successfully, but these errors were encountered: