This repository contains tools for preprocessing and analyzing EEG data alongside behavioral data. The primary tool is a data scrubber designed to clean and prepare data for analysis, specifically targeting reaction time data from an EEG-based video game experiment.
To run the data scrubber, you need to install Jupyter and associated libraries. It is recommended to download the Python Anaconda distribution, which includes most of the necessary packages. You can find it on the Anaconda website.
- Start Jupyter Notebook: Run `jupyter notebook DataScrub.ipynb`. This will open the Jupyter notebook in your web browser.
- Run All Cells: If you do not need to modify the scrub file, you can run all cells by pressing `Ctrl+P` (or `Cmd+P` on Mac) and typing `run all cells` in the command search field.
- File Selection: Select your CSV file containing the EEG data when prompted. The scrubber assumes that your game file has the same name but with a `.txt` extension. For example, if you select `Subject1.csv`, it will process `Subject1.txt` for the game data.
- Completion Check: Ensure the final cell printed "Done". If it did not, this usually indicates a missing Python module. Check for error messages and install any missing modules using `conda` or `pip`.
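The file-pairing convention from the File Selection step can be sketched in a few lines of Python. This is only an illustration of the naming rule, not the notebook's own code; the helper name `game_file_for` is hypothetical.

```python
from pathlib import Path


def game_file_for(eeg_csv):
    """Return the game-data path implied by an EEG CSV path.

    Assumes the convention described above: same directory and base name,
    with the extension swapped to .txt.
    """
    eeg_path = Path(eeg_csv)
    if eeg_path.suffix.lower() != ".csv":
        raise ValueError(f"expected a .csv file, got {eeg_path.name!r}")
    return eeg_path.with_suffix(".txt")


print(game_file_for("Subject1.csv"))  # Subject1.txt
```

If the `.txt` file is missing or named differently, renaming it to match the CSV is likely the simplest fix.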
Three structs are provided after the data scrubbing process. Fields within each struct match the structure of the EEG, game, and behavioral data:
- eeg: Fields separated by the type of EEG data. Includes a `config` field with extracted configuration information such as sampling rate and modes of operation.
- behavior: Contains information about the muscle and headband signals.
- game: Contains information about the game, e.g., when key presses are rendered, or the type of stimuli presented.
The data are placed into terminal fields (fields that do not contain sub-fields) as follows:
- Non-time-based data: Single value, vector of values, or string.
- Time-based data: A `T x N` matrix where `T` is the number of time samples. The first column contains timestamps, followed by `N-1` columns of data.
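To make the layout concrete, here is a minimal Python sketch that models a struct as a nested dict and walks its terminal fields. Apart from `config`, every field name, the 250 Hz sampling rate, and the helper functions are assumptions made for illustration; the real structs may be organized differently.

```python
import numpy as np

# Hypothetical stand-in for one scrubbed struct; field names other than
# "config" are made up for illustration.
T = 5
timestamps = np.arange(T) / 250.0  # assumed 250 Hz sampling rate
channels = np.random.default_rng(0).normal(size=(T, 2))

eeg = {
    "config": {"sampling_rate": 250.0, "mode": "example"},  # non-time-based
    "raw": {"signal": np.column_stack([timestamps, channels])},  # T x N, N = 3
}


def is_time_based(value):
    """Treat any 2-D array as time-based (a simplifying assumption)."""
    return isinstance(value, np.ndarray) and value.ndim == 2


def terminal_fields(struct, prefix=""):
    """Yield (dotted_name, value) for every field that has no sub-fields."""
    for name, value in struct.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            yield from terminal_fields(value, path)
        else:
            yield path, value


for path, value in terminal_fields(eeg):
    if is_time_based(value):
        times, data = value[:, 0], value[:, 1:]  # first column = timestamps
        print(path, "time-based:", times.shape, data.shape)
    else:
        print(path, "non-time-based:", value)
```

Splitting the first column off as `times` leaves the remaining `N-1` columns as the data proper.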
The data scrubber breaks down into four major sections:
- Preamble
- Scrubbing Section for EEG Data: Uses Pandas operations to create a DataFrame from the raw data.
- Scrubbing Section for Game Data: Similar approach as the EEG data, but adapted for game data specifics.
- Save Section: Saves the processed data in Python- and Matlab-compatible formats.
The provided Matlab library includes tools to aid in the selection of time inclusion periods. These are useful when working with lists of data that do not share the same exact timestamps.
- getTime: Gets slices of time from a nested struct of data.
- applyTimes: Applies a set of inclusion ranges to every piece of data within a struct.
- unionTimes: Merges inclusion ranges.
- intersectTimes: Intersects inclusion ranges.
- cutSegments: Cuts out data from the struct in special time ranges.
- runGLM: Runs a general linear model to predict a target sequence based on EEG and behavioral data.
- combineSegments: Combines data from multiple subjects for analysis.
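The idea behind inclusion ranges can be illustrated with a short Python analogue. This is not the Matlab library's implementation: the snake_case function names, the `(start, end)` tuple representation, and the inclusive endpoints are all assumptions chosen for the sketch.

```python
def union_times(ranges):
    """Merge overlapping or touching (start, end) ranges into a sorted list."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect_times(a, b):
    """Return the ranges covered by both a and b."""
    a, b = union_times(a), union_times(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever range finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def apply_times(rows, ranges):
    """Keep rows whose timestamp (first element) falls in any inclusion range."""
    ranges = union_times(ranges)
    return [r for r in rows if any(s <= r[0] <= e for s, e in ranges)]


eeg_ok = [(0.0, 2.0), (1.5, 4.0), (6.0, 8.0)]
game_ok = [(1.0, 3.0), (7.0, 9.0)]
print(union_times(eeg_ok))               # [(0.0, 4.0), (6.0, 8.0)]
print(intersect_times(eeg_ok, game_ok))  # [(1.0, 3.0), (7.0, 8.0)]
```

Because the ranges are expressed in time rather than sample indices, the same set of ranges can be applied to EEG and game data even when their timestamps do not line up exactly.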
The reaction time data used in the analysis comes from an EEG-based video game experiment. Subjects played a game that required quick responses to visual and auditory stimuli, and their reaction times were recorded for further analysis.
Using the data scrubber and subsequent analysis tools, preliminary results (N=6) suggested the following:
- General linear models (GLM) applied to EEG data may predict a small portion of subject-level reaction times with better-than-chance accuracy, but more work remains on this front.
- Examining the linear model coefficients, frontal electrode beta-band activity dominated reaction time prediction.
- Predicting correct versus incorrect trial responses using EEG data remained challenging due to low error rates in the data set. This is not surprising, as correct/error is often one of the hardest signals to pin down, even with dense multi-electrode extracellular data.
The task involving audiovisual congruence revealed that reaction times are generally faster and more accurate when stimuli are congruent. This supports the hypothesis that multisensory integration plays a role in reaction time and accuracy.
Processed data can be saved in the following formats:
- Pickle Files: For Python/Numpy/Scipy analysis.
- Matlab Files: Separate structs for `eeg`, `behavior`, and `game`, or a master struct containing all sub-structs.
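As a rough sketch of the Python side of saving, the snippet below pickles a master dict and reads it back. The dict contents, the output file name, and the use of a temporary directory are assumptions for illustration; the notebook's actual save code may differ.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical scrubbed output; the real structs hold many more fields.
master = {
    "eeg": {"config": {"sampling_rate": 250.0}},
    "behavior": {},
    "game": {"stimulus_type": "example"},
}

out_dir = Path(tempfile.mkdtemp())
pickle_path = out_dir / "Subject1.pkl"  # file name is an assumption

with open(pickle_path, "wb") as fh:
    pickle.dump(master, fh)

with open(pickle_path, "rb") as fh:
    restored = pickle.load(fh)

print(restored == master)  # True
```

For the Matlab side, `scipy.io.savemat` takes a dict of variables and writes nested dicts as structs, so it is one plausible route to the `.mat` output; whether the notebook uses it is not stated here.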
This project is licensed under the MIT License. See the LICENSE file for details.