Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding format parameter to save function #159

Merged
merged 16 commits into from
Nov 9, 2024

Conversation

drpedapati
Copy link
Contributor

@drpedapati drpedapati commented Jul 7, 2024

This pull request adds a format and 'event_id' parameter to the pipeline.save() method in pipeline.py. A new pipeline method gets event_ids from existing event codes and annotations.

Changes Made

  • Added a format parameter to the pipeline.save() method in pipeline.py
  • Added a event_id parameter to the pipeline.save() method in pipeline.py
  • Updated relevant documentation to reflect these new parameters
  • Existing event codes retain their numeric values (i.e., 10003) and event codes derived from annotations are numbered after the highest existing event code.
  • Ensured backward compatibility with existing code

Motivation

When working with non-EDF formats, such as BrainVision, the current implementation leads to errors. This change allows for greater flexibility in output formats, improving the library's usability across different EEG data types. The event_id field perserves existing event codes, otherwise arbritary codes are assigned.

Potential error messages these changes address:

ValueError: The provided raw data contains annotations, but "event_id" does not contain entries for all annotation descriptions. The following entries are missing: BAD_LL_noisy, BAD_LL_noisy_ICs

added format parameter to pipeline save
updated documentation
added event-id
@scott-huberty
Copy link
Member

scott-huberty commented Jul 9, 2024

Thanks for the contribution! And sorry for the belated reply. I will try to take a look this week. This is the first time that the CI's have run in a while so I suspect that there will be CI failures that are unrelated to this PR.

@scott-huberty
Copy link
Member

I haven't forgot about this, you just caught me in a busy week last week. Will try again to get to it this week.

@drpedapati
Copy link
Contributor Author

drpedapati commented Jul 15, 2024 via email

@scott-huberty
Copy link
Member

Would you have some interest in closer collaboration in terms of helping optimize pipeline parameters for neurodevelopmental studies? We have over 5000 EEGs for autism, Fragile X, TSC etc. and happy to share authorship or joint grant applications.

Nice! My first research job was on a TSC study.

I'm happy to improve this pipeline in whatever ways will meet people's needs. This pipeline was initially developed for our neurodevelopmental EEG study so your use-case certainly sounds like one we'd want to support.

Regarding closer collaboration/papers/grants, this opportunity sounds interesting to me, but I want to give @christian-oreilly, the PI of this project and other core dev, a chance to weigh in (he is Out-of-Office until late July and so may be delayed in responding).

Copy link
Member

@scott-huberty scott-huberty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @drpedapati , thanks for your patience! Just a few nitpicks from me, left in code suggestions.

If I'm understanding correctly, this PR adds two new features:

1. Allow the user to save their EEG data in a format other than EDF, to a format is already supported by mne-bids (i.e. brain vision, fif, eeglab`).

I agree that we should support this. We initially forced the .edf format for historical reasons (the MATLAB lossless pipeline did this).

2. A new function, get_all_event_ids, that returns the event ids for all existing events and annotations in the raw object.

If I'm understanding correctly, the use-case for this new function would be:

pipeline.save(...,  event_id=pipeline.get_all_event_ids())

Which will effectively save the lossless annotations to the event.tsv sidecar file. Is that how you are using it?

Other thoughts:

  • The currently failing test is unrelated to this PR. MNE-Python switched from using edflib-python to edfio for exporting to EDF. We need to install edfio on the CI's, I will do this in another PR.
  • Since a user no long has to save their data to edf, then maybe we should make edifo an optional dependency?
  • It would be nice if we can add a tiny for get_all_event_ids, and its use with save. I can help with this.

Potential error messages these changes address:

ValueError: The provided raw data contains annotations, but "event_id" does not contain entries for all annotation descriptions. The following entries are missing: BAD_LL_noisy, BAD_LL_noisy_ICs

Is this an error that you were getting?

pylossless/pipeline.py Outdated Show resolved Hide resolved
pylossless/pipeline.py Outdated Show resolved Hide resolved
pylossless/pipeline.py Outdated Show resolved Hide resolved
pylossless/pipeline.py Outdated Show resolved Hide resolved
pylossless/pipeline.py Outdated Show resolved Hide resolved
pylossless/pipeline.py Outdated Show resolved Hide resolved
pylossless/pipeline.py Outdated Show resolved Hide resolved
pylossless/pipeline.py Outdated Show resolved Hide resolved
pylossless/pipeline.py Outdated Show resolved Hide resolved
pylossless/pipeline.py Show resolved Hide resolved
@drpedapati
Copy link
Contributor Author

drpedapati commented Jul 23, 2024 via email

@scott-huberty
Copy link
Member

@drpedapati no worries Ernie, no rush. In the mean time I may merge main into this branch so that we can make the CI's happy, and I might add a tiny test.

drpedapati and others added 9 commits July 23, 2024 19:20
Co-authored-by: Scott Huberty <[email protected]>
Co-authored-by: Scott Huberty <[email protected]>
Co-authored-by: Scott Huberty <[email protected]>
Co-authored-by: Scott Huberty <[email protected]>
Co-authored-by: Scott Huberty <[email protected]>
Co-authored-by: Scott Huberty <[email protected]>
Co-authored-by: Scott Huberty <[email protected]>
Co-authored-by: Scott Huberty <[email protected]>
Co-authored-by: Scott Huberty <[email protected]>
@christian-oreilly
Copy link
Collaborator

Would you have some interest in closer collaboration in terms of helping optimize pipeline parameters for neurodevelopmental studies? We have over 5000 EEGs for autism, Fragile X, TSC etc. and happy to share authorship or joint grant applications.

Thanks for contributing this PR, @drpedapati, and sorry for the late response due to vacation. As mentioned by @scott-huberty, we welcome input from the community and will make our best effort to integrate it so that this pipeline can best serve the needs of the community. @scott-huberty already answered this comment for his side; for mine, I can add that we (my lab at the University of South Carolina) are always happy to collaborate on topics related to EEG in ASD and neurodevelopment. This is a significant axis of our research, in collaboration with the broader Carolina Autism and Neurodevelopment Research Center. Do you have specific objectives in mind where you think our involvement could be beneficial? I am happy to discuss these at your convenience.

  1. Allow the user to save their EEG data in a format other than EDF, to a format is already supported by mne-bids (i.e. brain vision, fif, eeglab`).
    I agree that we should support this. We initially forced the .edf format for historical reasons (the MATLAB lossless pipeline did this).

I think there is a bit more than historical reasons and backward compatibility; PyLossless was designed to work with BIDS. If I remember well, BIDS (or the MNE support of BIDS) was oriented toward the EDF format. It now supports more formats (including BrainVision: https://bids-specification.readthedocs.io/en/stable/modality-specific-files/electroencephalography.html) and I think we should aim to support all formats supported by BIDS with as little format-specific code as possible (i.e., having external libraries such as mne-bids dealing with most of BIDS support directly).

Since a user no long has to save their data to edf, then maybe we should make edifo an optional dependency?

Agreed... but let's make sure that this is transparent to the user (e.g., having a very descriptive error message on how to pip install edfio if someone tries to output with EDF and does not install this optional dependency). I am wondering if we could install it on-the-fly (e.g., https://stackoverflow.com/a/55188705/1825043 ) when necessary. I am not sure if this is seen as a bad practice, however...

@drpedapati I assume you answered to this issue directly through email? If not too inconvenient, it is probably better to click on the "view on GitHub" link (or something to that effect) in the email and respond directly on GitHub. I am suggesting this because your answers currently copy all the history of the discussion which clutters the ticket and makes it more cumbersome to follow. I edited your previous answer to remove all this duplication. I made a local copy of this duplicated text in case you embedded some relevant info in this duplicated response that I missed. Just let me know if this is the case and I'll restore this duplicated text.

pylossless/pipeline.py Show resolved Hide resolved
pylossless/pipeline.py Show resolved Hide resolved
pylossless/pipeline.py Outdated Show resolved Hide resolved
@christian-oreilly christian-oreilly merged commit 0cc087b into lina-usc:main Nov 9, 2024
@christian-oreilly
Copy link
Collaborator

Apologies, @drpedapati, for not merging this earlier. Thanks for your contribution!

@scott-huberty
Copy link
Member

Thanks @drpedapati !!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants