This repository contains code to process tabular and imaging data from the Parkinson's Progression Markers Initiative (PPMI) dataset using the Nipoppy framework.
Image collections
- `idaSearch.csv`
  - Advanced Search
    - Check every box in "Display in result" column
    - Check "DTI" + "MRI" + "fMRI" in "Modality"
Study data
- Study Docs: Data & Databases
  - `Code_List_-__Annotated_.csv`
  - `Data_Dictionary_-__Annotated_.csv`
- Subject Characteristics: Patient Status
  - `Participant_Status.csv`
- Subject Characteristics: Subject Demographics
  - `Age_at_visit.csv`
  - `Demographics.csv`
  - `Socio-Economics.csv`
- Medical History: Medical
  - `Clinical_Diagnosis.csv`
  - `Primary_Clinical_Diagnosis.csv`
- Motor Assessments: Motor / MDS-UPDRS
  - `MDS-UPDRS_Part_I.csv`
  - `MDS-UPDRS_Part_III.csv`
  - `MDS_UPDRS_Part_II__Patient_Questionnaire.csv`
  - `MDS-UPDRS_Part_I_Patient_Questionnaire.csv`
  - `MDS-UPDRS_Part_IV__Motor_Complications.csv`
- Non-motor Assessments: ALL
  - All downloaded, though not all used or up-to-date
  - `Benton_Judgement_of_Line_Orientation.csv`
  - `Clock_Drawing.csv`
  - `Cognitive_Categorization.csv`
  - `Cognitive_Change.csv`
  - `Epworth_Sleepiness_Scale.csv`
  - `Geriatric_Depression_Scale__Short_Version_.csv`
  - `Hopkins_Verbal_Learning_Test_-_Revised.csv`
  - `Letter_-_Number_Sequencing.csv`
  - `Lexical_Fluency.csv`
  - `Modified_Boston_Naming_Test.csv`
  - `Modified_Semantic_Fluency.csv`
  - `Montreal_Cognitive_Assessment__MoCA_.csv`
  - `Neuro_QoL__Cognition_Function_-_Short_Form.csv`
  - `Neuro_QoL__Communication_-_Short_Form.csv`
  - `QUIP-Current-Short.csv`
  - `REM_Sleep_Behavior_Disorder_Questionnaire.csv`
  - `SCOPA-AUT.csv`
  - `State-Trait_Anxiety_Inventory.csv`
  - `Symbol_Digit_Modalities_Test.csv`
  - `Trail_Making_A_and_B.csv`
  - `University_of_Pennsylvania_Smell_Identification_Test_UPSIT.csv`
- Some search fields in the LONI search tool cannot be trusted
  - Examples:
    - `Modality`
      - `Modality=DTI` can have anatomical images, and there are diffusion images with `MRI` modality
    - `Weighting` (under `Imaging Protocol`)
      - Some T1s have `Weighting=PD`
  - We classify image modalities/contrast only based on the `Image Description` column
    - This can also lead to issues, for example when a subject has the same description string for all of their scans. In that case, we manually determine the image modality/contrast and hard-code the mapping in `heuristic.py` for HeuDiConv
- LONI viewer sometimes shows seemingly bad/corrupted files but they are actually fine once we convert them
  - Observed for some diffusion images (tend to have ~2700 slices according to the LONI image viewer)
- Some subjects have a huge number of small DICOM files, which causes us to exceed the inode quota on `/scratch`
  - We opted to create SquashFS archives/filesystems, which count as 1 inode and can be mounted as a filesystem in a Singularity container (using the `--overlay` argument). This is similar to how McGill/NeuroHub stores UK Biobank data on Compute Canada
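
The SquashFS approach can be sketched as two commands: one that packs a subject's DICOM directory into a single archive, and one that mounts that archive read-only in a container. This is a minimal sketch, not the repository's actual tooling. The paths, the container name, and both helper functions are hypothetical; only `mksquashfs` and the Singularity `--overlay` argument are taken as given.

```python
from __future__ import annotations

from pathlib import Path


def mksquashfs_command(dicom_dir: Path, archive: Path) -> list[str]:
    """Build the command that packs a DICOM directory into one SquashFS file."""
    return ["mksquashfs", str(dicom_dir), str(archive)]


def singularity_exec_command(
    container: Path, archive: Path, command: list[str]
) -> list[str]:
    """Build a `singularity exec` call that mounts the archive read-only."""
    # The ":ro" suffix asks Singularity to mount the overlay read-only.
    return [
        "singularity",
        "exec",
        "--overlay",
        f"{archive}:ro",
        str(container),
        *command,
    ]


# Hypothetical paths, for illustration only.
pack = mksquashfs_command(
    Path("/scratch/dicom/sub-0001"), Path("/scratch/squashfs/sub-0001.squashfs")
)
run = singularity_exec_command(
    Path("container.sif"), Path("/scratch/squashfs/sub-0001.squashfs"), ["ls", "/"]
)
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a system where both tools are installed.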
The tabular/ppmi_imaging_descriptions.json file is used to determine the BIDS datatype and suffix (contrast) associated with an image's MRI series description. It will be updated as new data is processed.
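
As an illustration of how such a file could be used, the sketch below inverts a datatype/suffix/descriptions mapping into a per-description lookup. The JSON layout shown here is an assumption and may not match the real file, and `MPRAGE` is a made-up entry; `DTI_gated` and `Ax DTI` are description strings mentioned elsewhere in this README.

```python
from __future__ import annotations

import json

# Hypothetical excerpt: {datatype: {suffix: [series descriptions]}}.
# The real file's layout and entries may differ.
EXAMPLE_JSON = """
{
  "anat": {"T1w": ["MPRAGE"]},
  "dwi": {"dwi": ["DTI_gated", "Ax DTI"]}
}
"""


def build_lookup(descriptions_by_type: dict) -> dict[str, tuple[str, str]]:
    """Map each series description to its (datatype, suffix) pair."""
    lookup: dict[str, tuple[str, str]] = {}
    for datatype, suffixes in descriptions_by_type.items():
        for suffix, descriptions in suffixes.items():
            for description in descriptions:
                lookup[description] = (datatype, suffix)
    return lookup


LOOKUP = build_lookup(json.loads(EXAMPLE_JSON))


def classify(description: str) -> tuple[str, str] | None:
    """Return (datatype, suffix), or None for an unknown description."""
    return LOOKUP.get(description)
```

Returning `None` for unknown descriptions makes it easy to list the series that still need to be added to the file.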
Here is a description of the available BIDS data and the tags that can appear in their filenames:
anat
- The available suffixes are: `T1w`, `T2w`, `T2starw`, and `FLAIR`
- Most images have an `acq` tag:
  - Non-neuromelanin images: `acq-<plane><type>`, where
    - `<plane>` is one of: `sag`, `ax`, or `cor` (for sagittal, axial, or coronal scans respectively)
    - `<type>` is one of: `2D` or `3D`
  - Neuromelanin images: `acq-NM`
- For some images, the acquisition plane (`sag`/`ax`/`cor`) or type (`2D`/`3D`) cannot be easily obtained. In those cases, the filename will not contain an `acq` tag.
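
The `acq` tag rules for anatomical images can be summarized in a few lines. This is a sketch of the naming convention only; the function name and signature are hypothetical and not taken from the repository's `heuristic.py`.

```python
from __future__ import annotations

PLANES = {"sag", "ax", "cor"}
TYPES = {"2D", "3D"}


def anat_acq_tag(
    plane: str | None = None,
    scan_type: str | None = None,
    neuromelanin: bool = False,
) -> str | None:
    """Return the `acq` tag for an anatomical image, or None if there is none."""
    if neuromelanin:
        return "acq-NM"
    # The tag is only written when both the plane and the type are known.
    if plane in PLANES and scan_type in TYPES:
        return f"acq-{plane}{scan_type}"
    return None
```

For example, `anat_acq_tag("sag", "3D")` gives `acq-sag3D`, while a call with an unknown plane or type gives `None`.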
dwi
- All imaging files have the `dwi` suffix.
- Most images have a `dir` tag corresponding to the phase-encoding direction. This is one of: `LR`, `RL`, `AP`, or `PA`
- Images where the phase-encoding direction cannot be easily inferred from the series description string do not have a `dir` tag.
- Some participants have multi-shell sequences for their diffusion data. These files will have an additional `acq-B<value>` tag, where `value` is the b-value for that sequence.
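
To make the tag combinations concrete, here is a sketch that assembles a diffusion filename from its parts. The participant ID, the `.nii.gz` extension, and the function itself are illustrative assumptions; real filenames may carry additional entities (for example a run index). The `acq` entity is placed before `dir`, following the BIDS entity ordering.

```python
from __future__ import annotations

DIRECTIONS = {"LR", "RL", "AP", "PA"}


def dwi_filename(
    participant: str,
    session: str,
    direction: str | None = None,
    b_value: int | None = None,
    extension: str = ".nii.gz",
) -> str:
    """Assemble a BIDS-style diffusion filename from the available tags."""
    entities = [f"sub-{participant}", f"ses-{session}"]
    if b_value is not None:
        # Only multi-shell sequences carry the acq-B<value> tag.
        entities.append(f"acq-B{b_value}")
    if direction is not None:
        if direction not in DIRECTIONS:
            raise ValueError(f"Unexpected phase-encoding direction: {direction}")
        entities.append(f"dir-{direction}")
    return "_".join(entities + ["dwi"]) + extension
```

With these assumptions, `dwi_filename("0001", "BL", "PA", 1000)` yields `sub-0001_ses-BL_acq-B1000_dir-PA_dwi.nii.gz`, and omitting both optional tags yields `sub-0001_ses-BL_dwi.nii.gz`.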
Currently, only structural (`anat`) and diffusion (`dwi`) MRI data are supported. Functional (`func`) data has not been converted to the BIDS format yet.
AttributeError: 'Dataset' object has no attribute 'StackID'
- Vincent previously had the same issue, unclear if/how it was fixed. Error could be because the images are in a single big DICOM instead of many small DICOM files
AssertionError: Conflicting study identifiers found
- Could be because all of a subject's DICOMs are pooled together in the `dicom_org` step, in which case this can be fixed by manually running HeuDiConv for each image
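
A per-image run could look like the sketch below, which builds one `heudiconv` command for a single image's DICOM directory. It assumes each image's DICOMs can be isolated in their own directory; all paths and the helper function are hypothetical, and the exact options used by this repository may differ.

```python
from __future__ import annotations

from pathlib import Path


def heudiconv_command(
    image_dir: Path,
    participant: str,
    session: str,
    heuristic: Path,
    bids_dir: Path,
) -> list[str]:
    """Build a HeuDiConv call that converts a single image's DICOMs."""
    return [
        "heudiconv",
        "--files", str(image_dir),  # only this image's DICOMs
        "-s", participant,
        "-ss", session,
        "-f", str(heuristic),
        "-c", "dcm2niix",
        "-b",  # write BIDS-style output
        "-o", str(bids_dir),
    ]


# Hypothetical layout: one directory per image under the subject directory.
image_dirs = [
    Path("/data/dicom/sub-0001/image-1"),
    Path("/data/dicom/sub-0001/image-2"),
]
commands = [
    heudiconv_command(d, "0001", "BL", Path("heuristic.py"), Path("/data/bids"))
    for d in image_dirs
]
```

Running the commands one at a time (for example with `subprocess.run(cmd, check=True)`) keeps each image's study identifiers separate.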
numpy.AxisError: axis 1 is out of bounds for array of dimension 1
- Only happened for one image so far
- See nipy/heudiconv#670 and nipy/nibabel#1245
AssertionError (assert HEUDICONV_VERSION_JSON_KEY not in json_)
- Thrown by HeuDiConv
AssertionError: we do expect some files since it was called (assert bids_files, "we do expect some files since it was called")
- Thrown by HeuDiConv
- Some subjects only have a single diffusion image (e.g., `Ax DTI`), might not be usable
- Some subjects have 2 diffusion images, but they have the same description string (e.g., `DTI_gated`)
  - Checked some cases after BIDS conversion, and the JSON sidecars seem to have the same `PhaseEncodingDirection` (`j-`)
- Some subjects have multi-shell sequences. Their files seem to follow this pattern:
  - `dir-PA`: 1 `B0`, 1 `B700`, 1 `B1000`, and 1 `B2000` image
  - `dir-AP`: 4 `B0` images
- Some (~2 for `ses-BL`) subjects have `dir-AP` for all their diffusion images
  - Seem to have 4 `dir-AP` `B0` images and 4 other `dir-AP` images (according to their description string)
- Some diffusion images do not contain raw data, but rather tensor model results (`FA`, `ADC`, `TRACEW`). Some of these have been excluded before BIDS conversion, but not all of them
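
Several of the patterns above (a single diffusion image, all images sharing `dir-AP`, the multi-shell layout) can be spotted by tallying a subject's diffusion files by their `dir` and `acq` tags. The sketch below does this from filenames alone. The filenames are made up to mirror the multi-shell pattern described above, including the hypothetical `run` entity.

```python
from __future__ import annotations

import re
from collections import Counter

# Capture the dir and acq entities from a BIDS-style filename.
ENTITY_PATTERN = re.compile(r"(?:^|_)(dir|acq)-([A-Za-z0-9]+)")


def tally_dwi(filenames: list[str]) -> Counter:
    """Count diffusion files per (dir, acq) pair; missing tags become None."""
    counts: Counter = Counter()
    for name in filenames:
        entities = dict(ENTITY_PATTERN.findall(name))
        counts[(entities.get("dir"), entities.get("acq"))] += 1
    return counts


# Hypothetical filenames for one multi-shell subject.
example_files = [
    "sub-0001_ses-BL_acq-B0_dir-PA_dwi.nii.gz",
    "sub-0001_ses-BL_acq-B700_dir-PA_dwi.nii.gz",
    "sub-0001_ses-BL_acq-B1000_dir-PA_dwi.nii.gz",
    "sub-0001_ses-BL_acq-B2000_dir-PA_dwi.nii.gz",
] + [
    f"sub-0001_ses-BL_acq-B0_dir-AP_run-{i:02d}_dwi.nii.gz" for i in range(1, 5)
]

counts = tally_dwi(example_files)
```

A subject whose tally contains only `AP` keys, or only a single file, can then be flagged for manual review.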