LABRAD-OR: Lightweight Memory Scene Graphs for Accurate Bimodal Reasoning in Dynamic Operating Rooms
PDF Paper: https://link.springer.com/content/pdf/10.1007/978-3-031-43996-4_29.pdf?pdf=inline%20link
PDF Book: https://link.springer.com/content/pdf/10.1007/978-3-031-43996-4.pdf
Authors: Ege Özsoy, Tobias Czempiel, Felix Holm, Chantal Pellegrini, Nassir Navab
@inproceedings{Özsoy2023_LABRAD_OR,
title={LABRAD-OR: Lightweight Memory Scene Graphs for Accurate Bimodal Reasoning in Dynamic Operating Rooms},
author={Ege Özsoy and Tobias Czempiel and Felix Holm and Chantal Pellegrini and Nassir Navab},
booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
year={2023},
organization={Springer}
}
@Article{Özsoy2023,
author={{\"O}zsoy, Ege
and Czempiel, Tobias
and {\"O}rnek, Evin P{\i}nar
and Eck, Ulrich
and Tombari, Federico
and Navab, Nassir},
title={Holistic OR domain modeling: a semantic scene graph approach},
journal={International Journal of Computer Assisted Radiology and Surgery},
year={2023},
doi={10.1007/s11548-023-03022-w},
url={https://doi.org/10.1007/s11548-023-03022-w}
}
@inproceedings{Özsoy2022_4D_OR,
title={4D-OR: Semantic Scene Graphs for OR Domain Modeling},
author={Ege Özsoy and Evin Pınar Örnek and Ulrich Eck and Tobias Czempiel and Federico Tombari and Nassir Navab},
booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
year={2022},
organization={Springer}
}
@inproceedings{Özsoy2021_MSSG,
title={Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical Procedures},
author={Ege Özsoy and Evin Pınar Örnek and Ulrich Eck and Federico Tombari and Nassir Navab},
booktitle={Arxiv},
year={2021}
}
The 4D-OR dataset itself, the human and object pose prediction methods, and the downstream task of role prediction are not part of this repository. Please refer to the original 4D-OR repository for instructions on downloading the dataset and running 2D and 3D human pose prediction, 3D object pose prediction, and role prediction.
- Recommended PyTorch Version: pytorch==1.10.0
- conda create --name labrad-or python=3.7
- conda activate labrad-or
- conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
- `cd` into scene_graph_prediction and run `pip install -r requirements.txt`
- Run `wget https://github.com/egeozsoy/LABRAD-OR/releases/download/v0.1/pretrained_models.zip` and unzip
- (Optional) To use the pretrained models, move the files ending with `.ckpt` from the unzipped directory into the folder `scene_graph_prediction/scene_graph_helpers/paper_weights`
- `cd` into pointnet2_dir and run `CUDA_HOME=/usr/local/cuda-11.3 pip install pointnet2_ops_lib/.`
- Run `pip install torch-scatter==2.0.9 torch-sparse==0.6.12 torch-cluster==1.5.9 torch-spline-conv==1.2.1 torch-geometric==2.0.2 -f https://data.pyg.org/whl/torch-1.10.0+cu113.html`
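The optional checkpoint step in the setup list can also be scripted. The following is a minimal sketch, not part of the repository: the helper name `move_checkpoints` is hypothetical, and the name of the directory produced by unzipping is an assumption you should adjust to your local layout.

```python
import shutil
from pathlib import Path


def move_checkpoints(unzipped_dir, weights_dir):
    """Move every *.ckpt file found under unzipped_dir into weights_dir.

    Hypothetical helper: it searches subdirectories as well, creates
    weights_dir if needed, and returns the new paths of the moved files.
    """
    src = Path(unzipped_dir)
    dst = Path(weights_dir)
    dst.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() materializes the listing before any file is moved.
    for ckpt in sorted(src.rglob("*.ckpt")):
        target = dst / ckpt.name
        shutil.move(str(ckpt), str(target))
        moved.append(target)
    return moved
```

Assuming the archive was unzipped into a directory called `pretrained_models`, a call such as `move_checkpoints("pretrained_models", "scene_graph_prediction/scene_graph_helpers/paper_weights")` would place the weights where the setup steps expect them. Files with other extensions are left untouched.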
As we built upon https://github.com/egeozsoy/4D-OR, the code is structured similarly.
- `cd` into scene_graph_prediction
- To train a new visual-only model which uses only point clouds, run `python -m scene_graph_prediction.main --config visual_only.json`.
- To train a new visual-only model which uses point clouds and images, run `python -m scene_graph_prediction.main --config visual_only_with_images.json`.
- To train LABRAD-OR using only point clouds, run `python -m scene_graph_prediction.main --config labrad_or.json`. This requires the pretrained visual-only model to be present.
- To train LABRAD-OR using point clouds and images, run `python -m scene_graph_prediction.main --config labrad_or_with_images.json`. This requires the pretrained visual-only model to be present.
- We provide all four pretrained models at https://github.com/egeozsoy/LABRAD-OR/releases/download/v0.1/pretrained_models.zip. You can use them instead of training your own models, as described in the environment setup.
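The four training variants differ only in the config file passed to `--config`. As a rough sketch of that mapping, the snippet below builds the command line for a given variant. The keys `"visual_only"` and `"labrad_or"` and both function names are hypothetical labels introduced for illustration; only the config file names and the module path come from the commands listed above.

```python
# Config files named in the training instructions, keyed by
# (model variant, whether images are used). The keys are illustrative labels.
CONFIGS = {
    ("visual_only", False): "visual_only.json",
    ("visual_only", True): "visual_only_with_images.json",
    ("labrad_or", False): "labrad_or.json",
    ("labrad_or", True): "labrad_or_with_images.json",
}


def build_train_command(model, with_images=False):
    """Return the training command for a variant as an argument list."""
    try:
        config = CONFIGS[(model, with_images)]
    except KeyError:
        raise ValueError(f"unknown model variant: {model!r}") from None
    return ["python", "-m", "scene_graph_prediction.main", "--config", config]


def needs_pretrained_visual_model(model):
    """LABRAD-OR training requires the pretrained visual-only model."""
    return model == "labrad_or"
```

The returned list can be passed to `subprocess.run(..., check=True)`. Checking `needs_pretrained_visual_model` first is one way to fail early if the visual-only weights are missing.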
- To evaluate either a model you trained or one of our pretrained models, change the mode to `evaluate` in main.py and rerun using the same commands as before.
  - If you want to replicate the results from the paper, you can hardcode the corresponding weight checkpoint as `checkpoint_path` in main.py.
- To infer on the test set, change the mode to `infer` and again run one of the four corresponding commands. Again, you can hardcode the corresponding weight checkpoint as `checkpoint_path` in main.py.
  - By default, evaluation is done on the validation set and inference on the test set, but these can be changed.
- You can evaluate on the test set as well by using https://bit.ly/4D-OR_evaluator and uploading your inferred predictions.
- If you want to continue with role prediction, please refer to the original 4D-OR repository. You can use the inferred scene graphs from the previous step as input to the role prediction.
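The relationship between the mode, the default data split, and an optional hardcoded checkpoint described above can be summarized in a few lines. This is an illustrative sketch only and does not reproduce main.py: the function `resolve_run` and the split names `"val"` and `"test"` are assumptions, while `mode` values and `checkpoint_path` follow the instructions above.

```python
# Default split per mode, as stated in the instructions:
# evaluation uses the validation set, inference uses the test set.
# The split identifiers "val" and "test" are assumed names.
DEFAULT_SPLIT = {"evaluate": "val", "infer": "test"}


def resolve_run(mode, checkpoint_path=None, split=None):
    """Return the settings for an evaluate or infer run.

    Hypothetical helper: an explicit split overrides the default, and
    checkpoint_path stays None unless a checkpoint is hardcoded.
    """
    if mode not in DEFAULT_SPLIT:
        raise ValueError(f"mode must be one of {sorted(DEFAULT_SPLIT)}")
    return {
        "mode": mode,
        "split": split if split is not None else DEFAULT_SPLIT[mode],
        "checkpoint_path": checkpoint_path,
    }
```

For example, `resolve_run("infer")` selects the test split, while `resolve_run("evaluate", split="test")` overrides the default validation split.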