This repository provides the official PyTorch implementation and reproduction for our SIGIR'24 paper titled "Leave No Patient Behind: Enhancing Medication Recommendation for Rare Disease Patients".
See the paper for a full description.
If this work helps you, please cite our paper:
@article{zhao2024leave,
title={Leave No Patient Behind: Enhancing Medication Recommendation for Rare Disease Patients},
author={Zhao, Zihao and Jing, Yi and Feng, Fuli and Wu, Jiancan and Gao, Chongming and He, Xiangnan},
journal={arXiv preprint arXiv:2403.17745},
year={2024}
}
-
Clone this repository and change into its directory:
git clone https://github.com/zzhUSTC2016/RAREMed.git
cd RAREMed/
-
We suggest creating a new conda environment.
conda create --name RAREMed
-
Activate the newly created environment.
conda activate RAREMed
-
Install the required modules.
sh install.sh
-
You must obtain access to the MIMIC-III and MIMIC-IV databases before running the code.
-
Download the MIMIC-III and MIMIC-IV datasets, then unzip and put them in the data/input/ directory. Specifically, you need to download the following files from MIMIC-III: DIAGNOSES_ICD.csv, PRESCRIPTIONS.csv, and PROCEDURES_ICD.csv, and the following files from MIMIC-IV: DIAGNOSES_ICD.csv, PRESCRIPTIONS.csv, and PROCEDURES_ICD.csv.
-
Download the drugbank_drugs_info.csv and drug-DDI.csv files, and put them in the data/input/ directory.
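Before processing, it can help to confirm that the expected files are in place. The following is a minimal sketch, not part of this repository: the helper name `missing_input_files` is hypothetical, and it assumes a single flat folder, since the text does not say whether the MIMIC-III and MIMIC-IV files (which share names) are kept in separate subfolders.

```python
from pathlib import Path

# File names copied from the download instructions above.
MIMIC_FILES = ["DIAGNOSES_ICD.csv", "PRESCRIPTIONS.csv", "PROCEDURES_ICD.csv"]
DRUG_FILES = ["drugbank_drugs_info.csv", "drug-DDI.csv"]


def missing_input_files(input_dir="data/input", names=MIMIC_FILES + DRUG_FILES):
    """Return the expected file names that are absent from input_dir.

    Assumption: all files sit directly inside input_dir. Adjust the
    folder or the name list if your layout uses subfolders per dataset.
    """
    root = Path(input_dir)
    return [name for name in names if not (root / name).is_file()]
```

Calling `missing_input_files()` from the repository root returns an empty list when every listed file is present, and otherwise the names that still need to be downloaded.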
Run the following command to process the data:
python process.py
If things go well, the processed data will be saved in the data/output/ directory. You can run the models now!
usage: main_RAREMed.py [-h] [-n NOTE] [--model_name MODEL_NAME] [--dataset DATASET]
                       [--early_stop EARLY_STOP] [-t] [-l LOG_DIR_PREFIX]
                       [-p PRETRAIN_PREFIX] [--cuda CUDA] [-s] [-e] [-nsp] [-mask]
                       [--pretrain_epochs PRETRAIN_EPOCHS] [--mask_prob MASK_PROB]
                       [--embed_dim EMBED_DIM] [--encoder_layers ENCODER_LAYERS]
                       [--nhead NHEAD] [--batch_size BATCH_SIZE]
                       [--adapter_dim ADAPTER_DIM] [--lr LR] [--dropout DROPOUT]
                       [--weight_decay WEIGHT_DECAY] [--weight_multi WEIGHT_MULTI]
                       [--weight_ddi WEIGHT_DDI]

RAREMed model training and evaluation script

optional arguments:
  -h, --help            show this help message and exit
  -n NOTE, --note NOTE  User notes
  --model_name MODEL_NAME
                        model name (default: 'RAREMed')
  --dataset DATASET     dataset (default: 'mimic-iii')
  --early_stop EARLY_STOP
                        early stop after this many epochs without improvement (default: 10)
  -t, --test            test mode
  -l LOG_DIR_PREFIX, --log_dir_prefix LOG_DIR_PREFIX
                        log dir prefix like "log0", for model test (default: None)
  -p PRETRAIN_PREFIX, --pretrain_prefix PRETRAIN_PREFIX
                        log dir prefix like "log0", for finetune (default: None)
  --cuda CUDA           which cuda (default: 6)
  -s, --patient_seperate
                        whether to combine diseases and procedures
  -e, --seg_rel_emb     whether to use segment and relevance embedding layer (default: True)
  -nsp, --pretrain_nsp  whether to use nsp pretrain
  -mask, --pretrain_mask
                        whether to use mask prediction pretrain
  --pretrain_epochs PRETRAIN_EPOCHS
                        number of pretrain epochs (default: 20)
  --mask_prob MASK_PROB
                        mask probability (default: 0)
  --embed_dim EMBED_DIM
                        dimension of node embedding (default: 512)
  --encoder_layers ENCODER_LAYERS
                        number of encoder layers (default: 3)
  --nhead NHEAD         number of encoder head (default: 4)
  --batch_size BATCH_SIZE
                        batch size during training (default: 1)
  --adapter_dim ADAPTER_DIM
                        dimension of adapter layer (default: 128)
  --lr LR               learning rate (default: 1e-5)
  --dropout DROPOUT     dropout probability of transformer encoder (default: 0.3)
  --weight_decay WEIGHT_DECAY
                        weight decay (default: 0.1)
  --weight_multi WEIGHT_MULTI
                        weight of multilabel margin loss (default: 0.005)
  --weight_ddi WEIGHT_DDI
                        weight of ddi loss (default: 0.1)
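To see how a few of these options combine on the command line, here is a small illustrative parser. It is a sketch only, not the parser in main_RAREMed.py: `build_parser` is a hypothetical name, only a subset of the options is reproduced, and treating the mode switches as `store_true` flags is an assumption based on the usage line.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring a handful of the options listed above."""
    parser = argparse.ArgumentParser(
        prog="main_RAREMed.py",
        description="RAREMed model training and evaluation script",
    )
    parser.add_argument("--dataset", type=str, default="mimic-iii", help="dataset")
    # Assumption: the mode switches are plain boolean flags (store_true).
    parser.add_argument("-t", "--test", action="store_true", help="test mode")
    parser.add_argument("-l", "--log_dir_prefix", type=str, default=None,
                        help='log dir prefix like "log0", for model test')
    parser.add_argument("-nsp", "--pretrain_nsp", action="store_true",
                        help="whether to use nsp pretrain")
    parser.add_argument("-mask", "--pretrain_mask", action="store_true",
                        help="whether to use mask prediction pretrain")
    parser.add_argument("--lr", type=float, default=1e-5, help="learning rate")
    return parser
```

With this sketch, `build_parser().parse_args(["-nsp", "-mask"])` enables both pretraining flags and leaves `test` off, while `parse_args(["-t", "-l=log0"])` turns on test mode and sets `log_dir_prefix` to `"log0"`.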
Example:
python main_RAREMed.py -nsp -mask # pretrain and train
python main_RAREMed.py -t -l=log0 # test
For Baselines:
python main_GBert_pretrain.py
python main_GBert.py -p=0
python main_GBert.py -t -l=log1
python main_GAMENet.py
python main_GAMENet.py -t -l=log0
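The GBert commands above are meant to be run in order, so a small driver can stop the sequence as soon as one step fails. This is a sketch under assumptions: `run_steps` and `GBERT_STEPS` are hypothetical names, the arguments are copied verbatim from the commands above, and the log prefixes (`0`, `log1`) are taken as given rather than derived from anything the scripts report.

```python
import subprocess
import sys

# Script names and arguments copied from the GBert baseline commands above.
GBERT_STEPS = [
    ["main_GBert_pretrain.py"],
    ["main_GBert.py", "-p=0"],
    ["main_GBert.py", "-t", "-l=log1"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each script with the current interpreter, in order.

    With the default runner, check=True raises CalledProcessError on a
    non-zero exit code, so later steps are skipped after a failure.
    Returns the list of commands that completed.
    """
    executed = []
    for script_args in steps:
        command = [sys.executable, *script_args]
        runner(command, check=True)
        executed.append(command)
    return executed
```

Run `run_steps(GBERT_STEPS)` from the repository root inside the activated environment. The `runner` parameter exists only so the sequence can be dry-run with a stand-in function.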
Thanks to Jing Yi (井怡) for her help in implementing the RAREMed model. This repository is partially based on the SafeDrug repository, where you can find some additional details.
Feel free to contact me at [email protected] with any questions.