Fairseq-signals

Fairseq-signals is a collection of deep learning models for ECG data processing, built on fairseq.

We provide implementations of various deep learning methods for ECG data, including official implementations of our own work.

List of implemented papers:

Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training
Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training
Lead-agnostic Self-supervised Learning for Local and Global Representations of Electrocardiogram*
3KG: Contrastive Learning of 12-Lead Electrocardiograms using Physiologically-Inspired Augmentations
CLOCS: Contrastive Learning of Cardiac Signals Across Space, Time, and Patients
wav2vec 2.0: A Framework for Self-supervised Learning of Speech Representations
A Simple Framework for Contrastive Learning of Visual Representations
ECG-FM: An Open Electrocardiogram Foundation Model*

* denotes an official implementation

We will keep adding new methods to this repo. If you have any recommendations, please contact us by opening an issue or sending an e-mail.

Requirements and Installation

PyTorch version >= 1.5.0
Python version >= 3.6, and <= 3.9
pip version <= 24.0; if your pip version is newer than 24.0, please run:
```
pip install pip==24.0
```
For training new models, you'll also need an NVIDIA GPU and NCCL.
To install fairseq-signals from source and develop locally:

git clone https://github.com/Jwoo5/fairseq-signals
cd fairseq-signals
pip install --editable ./

To preprocess ECG datasets: pip install pandas scipy wfdb
To build cython components: python setup.py build_ext --inplace
For large datasets install PyArrow: pip install pyarrow
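
The version bounds listed above can be checked before installing. The following is a minimal sketch and is not part of fairseq-signals: the helper names `parse_version` and `check_requirements` are hypothetical, and the bounds are simply copied from the requirements list above.

```python
import re


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def check_requirements(python_version, pip_version, torch_version):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Only major.minor is compared, so any 3.9.x release counts as 3.9.
    if not (3, 6) <= parse_version(python_version)[:2] <= (3, 9):
        problems.append(f"Python {python_version} is outside the 3.6 to 3.9 range")
    if parse_version(pip_version)[:2] > (24, 0):
        problems.append(f"pip {pip_version} is newer than 24.0; run: pip install pip==24.0")
    if parse_version(torch_version)[:2] < (1, 5):
        problems.append(f"PyTorch {torch_version} is older than 1.5.0")
    return problems
```

In a live environment the three arguments could be taken from `platform.python_version()`, `pip.__version__`, and `torch.__version__`.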

Getting Started

For uni-modal tasks (ECG Classification, ...)

Prepare ECG dataset

We provide pre-processing scripts for various ECG datasets.

Pre-process

Given a directory that contains WFDB directories to be pre-processed for PhysioNet2021:

$ python fairseq_signals/data/ecg/preprocess/preprocess_physionet2021.py \
    /path/to/physionet2021/ \
    --dest /path/to/output \
    --workers $N

Given a directory that contains .dat files from PTB-XL:

$ python fairseq_signals/data/ecg/preprocess/preprocess_ptbxl.py \
    /path/to/ptbxl/records500/ \
    --dest /path/to/output
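
When several datasets are pre-processed from one driver script, it can help to assemble these commands in Python. The sketch below is an illustration only: `build_preprocess_command` is a hypothetical helper, and it uses just the positional argument and the `--dest` and `--workers` flags that appear in the commands above.

```python
import shlex


def build_preprocess_command(script, data_dir, dest, workers=None, interpreter="python"):
    """Assemble the argument list for one of the pre-processing scripts shown above."""
    command = [interpreter, script, data_dir, "--dest", dest]
    # --workers appears only in the PhysioNet2021 example, so it is optional here.
    if workers is not None:
        command += ["--workers", str(workers)]
    return command


command = build_preprocess_command(
    "fairseq_signals/data/ecg/preprocess/preprocess_physionet2021.py",
    "/path/to/physionet2021/",
    "/path/to/output",
    workers=4,
)
# Print the command for inspection before running it.
print(" ".join(shlex.quote(part) for part in command))
```

Passing `command` to `subprocess.run(command, check=True)` would execute it and raise an error if the script exits with a non-zero status.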

Prepare data manifest

Given a directory that contains pre-processed data:

$ python fairseq_signals/data/ecg/preprocess/manifest.py \
    /path/to/data/ \
    --dest /path/to/manifest \
    --valid-percent $valid
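
To make the role of `--valid-percent` concrete, here is a minimal sketch of a random train/validation split over a list of files. It is not the code in `manifest.py`: the function name `split_train_valid` is hypothetical, and it assumes the value is a fraction between 0 and 1, which may differ from how the script interprets `$valid`.

```python
import random


def split_train_valid(paths, valid_fraction, seed=0):
    """Shuffle the paths reproducibly and hold out a fraction for validation."""
    if not 0.0 <= valid_fraction <= 1.0:
        raise ValueError("valid_fraction must be between 0 and 1")
    items = sorted(paths)  # sort first so the result depends only on the seed
    random.Random(seed).shuffle(items)
    n_valid = int(round(len(items) * valid_fraction))
    return items[n_valid:], items[:n_valid]


# Hypothetical file names, used only to demonstrate the split.
files = [f"sample_{i:03d}.mat" for i in range(10)]
train, valid = split_train_valid(files, valid_fraction=0.2, seed=1)
print(len(train), len(valid))
```

Fixing the seed keeps the split identical across runs, so training and validation sets do not drift between experiments.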

For patient identification:

$ python fairseq_signals/data/ecg/preprocess/manifest_identification.py \
    /path/to/data \
    --dest /path/to/manifest \
    --valid-percent $valid

Please find more details about pre-processing and data manifests here.

For multi-modal tasks (Multi-modal pre-training or ECG question answering)

Prepare ECG dataset

We provide pre-processing scripts for the following datasets.

Pre-process

For multi-modal pre-training of ECGs with reports using the PTB-XL dataset:

$ python fairseq_signals/data/ecg_text/preprocess/preprocess_ptbxl.py \
   /path/to/ptbxl \
   --dest /path/to/output

For multi-modal pre-training of ECGs with reports using the MIMIC-IV-ECG dataset:

$ python fairseq_signals/data/ecg_text/preprocess/preprocess_mimic_iv_ecg.py \
   /path/to/mimic-iv-ecg \
   --dest /path/to/output

For the ECG question answering task with the ECG-QA dataset:

Map each ecg_id to the corresponding ECG file path (you can find these scripts in the ECG-QA repository).

For PTB-XL-based ECG-QA:

$ python mapping_ptbxl_samples.py ecgqa/ptbxl \
    --ptbxl-data-dir $ptbxl_dir \
    --dest $dest_dir
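
As an illustration of what mapping an ecg_id to a file path involves for PTB-XL, the sketch below builds a record path from an id. It is not the ECG-QA mapping script: `ptbxl_record_path` is a hypothetical helper, and the directory layout (`records500/<thousands>/<id>_hr`, `records100/<thousands>/<id>_lr`) is an assumption about the PTB-XL release that should be checked against your local copy.

```python
from pathlib import PurePosixPath


def ptbxl_record_path(ecg_id, sampling_rate=500):
    """Build the assumed relative WFDB record path (no file extension) for a PTB-XL ecg_id."""
    if sampling_rate == 500:
        folder, suffix = "records500", "hr"
    elif sampling_rate == 100:
        folder, suffix = "records100", "lr"
    else:
        raise ValueError("PTB-XL is assumed to provide only 100 Hz and 500 Hz records")
    # Records are assumed to be grouped into sub-directories of 1000 ids each.
    subdir = f"{(ecg_id // 1000) * 1000:05d}"
    return PurePosixPath(folder) / subdir / f"{ecg_id:05d}_{suffix}"


print(ptbxl_record_path(1))
print(ptbxl_record_path(21837, sampling_rate=100))
```

The returned path has no extension because a WFDB record is referred to by its base name, with the `.dat` and `.hea` files sharing it.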

For MIMIC-IV-ECG-based ECG-QA:

$ python mapping_mimic_iv_ecg_samples.py ecgqa/mimic-iv-ecg \
    --mimic-iv-ecg-data-dir $mimic_iv_ecg_dir \
    --dest $dest_dir

Preprocess ECG-QA and prepare manifests

$ python fairseq_signals/data/ecg_text/preprocess/preprocess_ecgqa.py /path/to/ecgqa \
    --dest /path/to/output \
    --apply_paraphrase

You don't need to run additional scripts to prepare manifest files for the ECG-QA dataset, because the pre-processing script generates them automatically.

Prepare data manifest

Given a directory that contains pre-processed PTB-XL data:

$ python fairseq_signals/data/ecg_text/preprocess/manifest.py \
    /path/to/data \
    --dest /path/to/manifest \
    --valid-percent $valid

Please find more details about pre-processing and data manifests here.

Examples

We provide detailed READMEs for each model implementation:

* denotes an official implementation

Contact

If you have any questions or recommendations, please contact us by opening an issue or sending an e-mail.

[email protected]
