Code for the COLING'22 paper "TreeMAN: Tree-enhanced Multimodal Attention Network for ICD Coding"
- Place the MIMIC-III dataset files under data/mimic-data/
- The ID files under data/preprocessed/ are from caml-mimic
- data/preprocessed/top50_icds.txt contains the top 50 ICD codes (same as caml-mimic/dataproc_mimic_III.ipynb)
- Preprocess the dataset: preprocess.py
- Preprocess the dataset for the decision trees: tree_datasets.py
- Train the decision trees and generate the leaf information: tree_method.py
- Train the word embedding: text_models/word_embed.py (modify this file to change its config)
- Train TreeMAN, predict with TreeMAN, and evaluate the result: run.py
To change the configuration, see conf.py, which contains all configuration for this project (except the configuration for training the word embedding).
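
The steps above can be chained with a small helper script. The following is a minimal sketch, not part of this repository. It assumes that every script is launched from the repository root and takes no command-line arguments (settings come from conf.py or, for the word embedding, from the script itself), and that the steps run in the order listed. The names REQUIRED_PATHS, PIPELINE, missing_paths, build_commands and run_pipeline are hypothetical and chosen only for illustration.

```python
import subprocess
import sys
from pathlib import Path

# Paths named in this README, relative to the repository root.
REQUIRED_PATHS = [
    Path("data/mimic-data"),
    Path("data/preprocessed"),
    Path("data/preprocessed/top50_icds.txt"),
]

# Scripts in the order the README lists them.
PIPELINE = [
    "preprocess.py",
    "tree_datasets.py",
    "tree_method.py",
    "text_models/word_embed.py",
    "run.py",
]


def missing_paths(root="."):
    """Return the required paths that do not exist under `root`."""
    return [p for p in REQUIRED_PATHS if not (Path(root) / p).exists()]


def build_commands(python=sys.executable):
    """Build one command per step.

    Assumption: each script runs without command-line arguments.
    """
    return [[python, script] for script in PIPELINE]


def run_pipeline(root=".", dry_run=True):
    """Print each step, and execute it when dry_run is False.

    Raises FileNotFoundError if a required path is missing, and
    subprocess.CalledProcessError if a step exits with a non-zero status.
    """
    missing = missing_paths(root)
    if missing:
        raise FileNotFoundError("missing: " + ", ".join(str(p) for p in missing))
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=root, check=True)
```

Calling run_pipeline() from the repository root only prints the commands. Calling run_pipeline(dry_run=False) executes them and stops at the first failing step, so a later step never runs on incomplete output from an earlier one.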