Convert data #104

Merged: 6 commits from convert-data into 1.4.3 on Nov 11, 2020

Conversation

@lawhead (Collaborator) commented Nov 6, 2020

Overview

Added a module for data conversions that will be useful for data sharing. Implemented a conversion function to the EDF format.

Ticket

https://www.pivotaltracker.com/story/show/175193894

Contributions

  • Added the convert.py helper module to convert bcipy raw data to EDF format.
  • Added new constructor to parameters.py to make it easier to mock parameters in unit tests.
  • Added functionality to the triggers.py helper for reading in a triggers.txt file.
  • Demo script
  • Unit tests

Test

  • Run the unit tests.
  • Run the demo script against a data folder.
  • In a Python REPL, use the demo function plot_edf to plot the result (a usage sketch follows below).
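
For reference, a minimal usage sketch of the conversion. The parameter names (raw_data_dir, output_path, overwrite) come from the docstring quoted in the review below; the module path and the function name convert_to_edf are assumptions, since the PR text only says a conversion function to EDF was implemented.

```python
# Hypothetical usage sketch; convert_to_edf and its module path are assumed
# names, not confirmed by this PR.
from bcipy.helpers.convert import convert_to_edf

# The data directory must also contain a parameters.json configuration file.
edf_path = convert_to_edf('data/my_session', overwrite=True)

# With no output_path, the file is written as raw.edf inside the data folder.
# The demo function plot_edf can then be used on the result (signature assumed).
```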

@lawhead lawhead requested a review from tab-cmd November 6, 2020 23:12
@tab-cmd tab-cmd requested a review from AlisterD November 10, 2020 17:28
----------
raw_data_dir - directory which contains the data to be converted. This
    location must also contain a parameters.json configuration file.
output_path - optional path to write converted data; defaults to writing
Contributor:

These need to be updated

durations = trigger_durations(params) if use_event_durations else {}

with open(Path(data_dir, params['trigger_file_name']), 'r') as trg_file:
    triggers = read_triggers(trg_file)
Contributor:

We don't correct the offset here, right? I may be following this wrong. We should add an argument to handle any offset correction (if present).

Collaborator (Author):

read_triggers is a new helper function that does correct the offset for the returned values. I can add a parameter to be able to return the values uncorrected, but I'm not sure if this would be useful without also providing the offset. Are there any instances in our code when we want these values without the offset correction?
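
For illustration, a minimal sketch of the behavior described here. Only the fact that the returned values are offset-corrected comes from this thread; the row layout of triggers.txt and the helper's real signature are assumptions.

```python
# Hypothetical sketch; bcipy's actual read_triggers signature and the
# triggers.txt row layout may differ.
from typing import List, TextIO, Tuple


def read_triggers_sketch(trg_file: TextIO,
                         offset: float = 0.0) -> List[Tuple[str, str, float]]:
    """Parse (label, targetness, timestamp) rows, returning each timestamp
    with the offset correction already applied."""
    corrected = []
    for line in trg_file:
        label, targetness, stamp = line.split()
        corrected.append((label, targetness, float(stamp) + offset))
    return corrected
```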

Contributor:

We may want to provide it as an argument because there are both system and static offsets, so it may be useful to let callers supply the correction themselves. As long as it's handled, though, we can ask the group during the demo how useful that may or may not be.

    location must also contain a parameters.json configuration file.
output_path - optional path to write converted data; defaults to writing
    a file named raw.edf in the raw_data_dir.
overwrite - If True, the destination file (if it exists) will be overwritten.
Contributor:

A couple of parameters to add

Collaborator (Author):

Thanks, good catch.

from bcipy.helpers.parameters import Parameters


def sample_data(rows: int = 1000, ch_names: List[str] = ['c1', 'c2',
Contributor:

I wonder if there's a better place for this function to live - acquisition? Otherwise, let's pull in the random data generator function to populate the channel_data.

Collaborator (Author):

The generation of sample data should definitely come from the acquisition generators. I'll update the code here.

As far as reading and writing a raw_data file, I think we're missing an abstraction/module. This logic is repeated in a number of places throughout the code, so I will create a follow-up ticket to clean it up.
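
For illustration, a rough sketch of the kind of shared abstraction being proposed here. The class, its method names, and the file layout (metadata rows such as daq_type and sample_rate followed by CSV channel data, as in the mocked file in the next snippet) are assumptions, not part of this PR.

```python
# Hypothetical sketch of a shared raw_data read/write module; the names and
# file layout are assumptions based on the mocked data in this PR.
import csv
from typing import List


class RawData:
    """In-memory representation of a bcipy raw_data file."""

    def __init__(self, daq_type: str, sample_rate: float,
                 rows: List[List[str]]):
        self.daq_type = daq_type
        self.sample_rate = sample_rate
        self.rows = rows

    @classmethod
    def load(cls, path: str) -> 'RawData':
        """Read the metadata rows, then the channel data rows."""
        with open(path, 'r', newline='') as csvfile:
            reader = csv.reader(csvfile)
            daq_type = next(reader)[1]            # e.g. 'daq_type,LSL'
            sample_rate = float(next(reader)[1])  # e.g. 'sample_rate,256.0'
            rows = list(reader)
        return cls(daq_type, sample_rate, rows)
```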

"""
# Mock the raw_data file
sep = '\r\n'
meta = sep.join([f'daq_type,LSL', 'sample_rate,256.0'])
Contributor:

linting
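
Presumably this flags the f-string without placeholders (f'daq_type,LSL'); a plain string literal satisfies the linter:

```python
# A plain string avoids the 'f-string is missing placeholders' warning.
meta = sep.join(['daq_type,LSL', 'sample_rate,256.0'])
```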

channel_data = [str(random.uniform(-1000, 1000)) for _ in range(3)]
trg = triggers_by_time.get(timestamp, NONE_VALUE)
channel_data = [
    str(random.uniform(-1000, 1000)) for _ in range(len(ch_names))
Contributor:

I want to standardize how we mock our data - I think we have a random data generator function we can use / extend for your cases.

Collaborator (Author):

Yes, good point. I'll update this to use the acquisition module.
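
For illustration, a minimal sketch of the kind of reusable random-data generator being discussed; the name and signature are hypothetical, not bcipy's actual acquisition API.

```python
# Hypothetical sketch; bcipy's acquisition module has its own generators
# whose names and signatures may differ.
import random
from typing import Iterator, List


def random_channel_rows(n_channels: int, low: float = -1000.0,
                        high: float = 1000.0) -> Iterator[List[float]]:
    """Yield endless rows of uniform random values, one per channel."""
    while True:
        yield [random.uniform(low, high) for _ in range(n_channels)]


# Example: one mocked sample for each of three named channels.
row = next(random_channel_rows(3))
```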

@@ -16,4 +16,5 @@ pandas==1.1.3
psutil==5.7.2
Pillow==8.0.0
Contributor:

I think you mentioned needing to upgrade mne for this - what version are you on?

Collaborator (Author):

I did some testing and the issue I was encountering wasn't related to mne. I left the version as-is for this review so we could do more stress-testing on it.

@lawhead merged commit 0b74181 into 1.4.3 on Nov 11, 2020
@lawhead deleted the convert-data branch on November 11, 2020, 21:58
@tab-cmd mentioned this pull request on Jan 20, 2021