[MISC] Federated Chest X-ray Screening #1819

Draft: wants to merge 50 commits into base branch `misc`.

Commits (50):
28da542
Intial Commit Nodule Detection
Rakshith2597 Dec 7, 2022
c83f842
updates in readme
Rakshith2597 Dec 30, 2022
36f27fd
pylint fix 1
Rakshith2597 Jan 3, 2023
beaab80
updates to codes
Rakshith2597 Jan 9, 2023
efd425f
Error in ONNX to IR conversion for sumnet
Rakshith2597 Jan 10, 2023
a67acf0
fixed IR conversion issue, unit test for train and export added
Rakshith2597 Jan 11, 2023
0b48ba9
added unit tests for inference
Rakshith2597 Jan 12, 2023
049c4d3
updated readme
Rakshith2597 Jan 14, 2023
04b3436
pylint fixes
Rakshith2597 Jan 14, 2023
09b8f4e
Add files via upload
Kasliwal17 Feb 9, 2023
4975a5d
Correcting the documentary for arguments in training and inference sc…
Kasliwal17 Feb 11, 2023
80a8fc2
Update misc/pytorch_toolkit/chest_xray_screening_federated_gcn/test/t…
Kasliwal17 Feb 13, 2023
d0e6c9a
Update misc/pytorch_toolkit/chest_xray_screening_federated_gcn/test/t…
Kasliwal17 Feb 13, 2023
8b847d8
Update misc/pytorch_toolkit/chest_xray_screening_federated_gcn/test/t…
Kasliwal17 Feb 13, 2023
41cdb38
Update README.md
Kasliwal17 Feb 20, 2023
4976df1
Update init_venv.sh
Kasliwal17 Feb 20, 2023
a6cd217
Update setup.py
Kasliwal17 Feb 20, 2023
1ae065c
Update inference_utils.py
Kasliwal17 Feb 20, 2023
14e3e76
Update model.py
Kasliwal17 Feb 20, 2023
00de0c2
Update transformations.py
Kasliwal17 Feb 20, 2023
b498cc5
Update test_export.py
Kasliwal17 Feb 20, 2023
d5f8420
Updating loss.py
Kasliwal17 Feb 20, 2023
449711d
Update get_config.py
Kasliwal17 Feb 20, 2023
acddf37
removing pycache
Kasliwal17 Feb 20, 2023
9070926
Update train_utils.py
Kasliwal17 Feb 20, 2023
10a9f16
Update misc.py
Kasliwal17 Feb 20, 2023
df3be4d
Update get_config.py
Kasliwal17 Feb 20, 2023
117b408
pylint fixes
Kasliwal17 Feb 21, 2023
bac5baf
Update README.md
Kasliwal17 Feb 21, 2023
fab6f10
adding relative paths
Kasliwal17 Feb 21, 2023
5030e9f
Update misc/pytorch_toolkit/chest_xray_screening_federated_gcn/src/ut…
Kasliwal17 Feb 21, 2023
efbb424
Update misc/pytorch_toolkit/chest_xray_screening_federated_gcn/src/ut…
Kasliwal17 Feb 21, 2023
18b8fbe
Update misc/pytorch_toolkit/chest_xray_screening_federated_gcn/src/ut…
Kasliwal17 Feb 21, 2023
e0a5b29
Update misc/pytorch_toolkit/chest_xray_screening_federated_gcn/src/ut…
Kasliwal17 Feb 21, 2023
2404d4f
Update misc/pytorch_toolkit/chest_xray_screening_federated_gcn/src/ut…
Kasliwal17 Feb 21, 2023
79e5502
Update misc/pytorch_toolkit/chest_xray_screening_federated_gcn/src/ut…
Kasliwal17 Feb 21, 2023
e838f36
Update test_export.py
Kasliwal17 Feb 21, 2023
023a75a
splitting train_utils
Kasliwal17 Feb 21, 2023
61d3827
updating model weights links
Kasliwal17 Feb 21, 2023
d515b1c
Update README.md
Kasliwal17 Feb 21, 2023
82a988b
splitting train_utils file
Kasliwal17 Feb 21, 2023
73ad71a
pylint changes
Kasliwal17 Feb 21, 2023
2a7481d
pylint changes
Kasliwal17 Feb 21, 2023
181fad0
Update test_inference.py
Kasliwal17 Feb 21, 2023
fdb9563
Updating dataset links
Kasliwal17 Feb 22, 2023
3ead981
Update README.md
Kasliwal17 Feb 22, 2023
f530e50
Merge pull request #4 from Kasliwal17/bmi8_new
Rakshith2597 Feb 23, 2023
50db8a9
Merge branch 'misc' of https://github.com/openvinotoolkit/training_ex…
Rakshith2597 Feb 23, 2023
52ec7b5
Merge branch 'bmi8_new' of https://github.com/Rakshith2597/training_e…
Rakshith2597 Feb 23, 2023
591d2ce
updated readme
Rakshith2597 Feb 23, 2023
168 changes: 168 additions & 0 deletions misc/pytorch_toolkit/chest_xray_screening_federated_gcn/README.md
@@ -0,0 +1,168 @@
# Federated Learning for Site Aware Chest Radiograph Screening

The shortage of Radiologists is inspiring the development of
Deep Learning (DL) based solutions for detecting cardio, thoracic and pulmonary pathologies in Chest radiographs through
multi-institutional collaborations. However, sharing the training data across multiple sites is often impossible due to privacy, ownership and technical challenges. Although Federated
Learning (FL) has emerged as a solution to this, the large
variations in disease prevalence and co-morbidity distributions
across the sites may hinder proper training. We propose a DL
architecture with a Convolutional Neural Network (CNN) followed by a Graph Neural Network (GNN) to address this issue.
The CNN-GNN model is trained by modifying the Federated
Averaging algorithm. The CNN weights are shared across all
sites to extract robust features while separate GNN models are
trained at each site to leverage the local co-morbidity dependencies for multi-label disease classification. The CheXpert
dataset is partitioned across five sites to simulate the FL set
up. Federated training did not show any significant drop in
performance over centralized training. The site-specific GNN
models also demonstrated their efficacy in modelling local disease co-occurrence statistics leading to an average area under
the ROC curve of 0.79 with a 1.74% improvement.
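
The split between shared CNN weights and site-specific GNN weights can be illustrated with a minimal sketch. It is not the training code from this PR: the flat dictionary representation, the `cnn.` / `gnn.` name prefixes, and the function name `federated_round` are all assumptions made for illustration, and a plain unweighted mean stands in for the modified Federated Averaging step.

```python
from typing import Dict, List

# Hypothetical representation: each site's model is a flat dict mapping a
# parameter name to a list of floats. The "cnn." / "gnn." prefixes are assumed.
StateDict = Dict[str, List[float]]


def federated_round(site_states: List[StateDict],
                    shared_prefix: str = "cnn.") -> List[StateDict]:
    """Average parameters whose names start with shared_prefix across sites.

    All other parameters (for example a site-specific GNN) are left untouched.
    """
    n_sites = len(site_states)
    shared_keys = [k for k in site_states[0] if k.startswith(shared_prefix)]

    # Element-wise mean of every shared parameter over all sites.
    averaged = {}
    for key in shared_keys:
        columns = zip(*(state[key] for state in site_states))
        averaged[key] = [sum(col) / n_sites for col in columns]

    # Each site receives the averaged shared weights and keeps its local ones.
    return [{**state, **{k: list(v) for k, v in averaged.items()}}
            for state in site_states]


sites = [
    {"cnn.w": [1.0, 2.0], "gnn.w": [0.1]},
    {"cnn.w": [3.0, 4.0], "gnn.w": [0.9]},
]
updated = federated_round(sites)
print(updated[0])  # cnn.w is averaged across sites, gnn.w stays local
```

A sample-count-weighted mean, as in standard Federated Averaging, would replace `sum(col) / n_sites` with a weighted sum.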

The figure below shows the overall schematic of the federated learning setup proposed in <a href="#comp_journal">[1]</a>.
<img src = "./media/scheme.png" width=650>

## The proposed CNN-GNN architecture

<img src = "./media/architecture.jpeg" width=650>

Separate Fully Connected (FC) layers are employed to obtain different 512-D
features for each class. These are used as node features to construct a graph whose edges capture the co-occurrence dependencies
between the classes at each site. The graph is provided as input to a Graph Neural Network to obtain the prediction labels for
each node. The entire CNN-GNN architecture is trainable in an end-to-end manner.
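
One way to picture the co-occurrence edges is as a matrix of conditional frequencies estimated from a site's training labels. The sketch below assumes that edge `(i, j)` is the fraction of samples with class `i` that also have class `j`; the exact edge definition used in this PR is not shown here, and the function name is hypothetical.

```python
from typing import List


def cooccurrence_adjacency(labels: List[List[int]]) -> List[List[float]]:
    """Return A where A[i][j] estimates P(class j present | class i present)."""
    n_classes = len(labels[0])
    counts = [[0] * n_classes for _ in range(n_classes)]
    for row in labels:
        present = [c for c, flag in enumerate(row) if flag == 1]
        for i in present:
            for j in present:
                # counts[i][i] ends up as the number of samples with class i.
                counts[i][j] += 1

    adjacency = [[0.0] * n_classes for _ in range(n_classes)]
    for i in range(n_classes):
        if counts[i][i] == 0:
            continue  # class never observed at this site: leave the row at zero
        for j in range(n_classes):
            adjacency[i][j] = counts[i][j] / counts[i][i]
    return adjacency


# Toy multi-label matrix for one site: rows are samples, columns are classes.
site_labels = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
]
adj = cooccurrence_adjacency(site_labels)
for row in adj:
    print([round(v, 2) for v in row])
```

Because each site computes this matrix from its own labels, two sites with different co-morbidity patterns end up with different graphs over the same set of classes.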

## Results

The overall performance of the proposed CNN-GNN model is shown below.
<img src = "./media/results.jpeg" width=650>

## Model

Download the `.pth` checkpoint for the CNN-GNN model from this [link](http://kliv.iitkgp.ac.in/projects/miriad/model_weights/bmi8/model_weights_w_gnn.zip).

> Note: The ONNX and IR models accept inputs of a fixed size, set by `input_shape` in the configuration file. Update this value to match the size of your input images.

> Note: PyTorch to ONNX conversion is not currently supported for GNN models. Hence, export to ONNX and IR is currently disabled for the GNN.

## Setup

* Ubuntu 20.04
* Python 3.8
* NVIDIA GPU for training
* 16 GB RAM for inference

## Code and Directory Organisation

```
federated_chest_screening/
    src/
        utils/
            dataloader.py
            downloader.py
            exporter.py
            get_config.py
            inference_utils.py
            loss.py
            metric.py
            misc.py
            model.py
            train_utils_cnn.py
            train_utils_gnn.py
            train_utils.py
            transformation.py
        export.py
        inference.py
        train.py
    configs/
        download_configs.json
        fl_with_gnn.json
        fl_without_gnn.json
        loss_weights.json
    media/
    tests/
        test_export.py
        test_inference.py
        test_train.py
    init_venv.sh
    README.md
    requirements.txt
    setup.py
```

## Code Structure

1. `train.py` in the src directory contains the code for training the model.
2. `inference.py` in the src directory contains the code for evaluating the model on the test set.
3. `export.py` in the src directory contains the code for generating the ONNX and OpenVINO IR representations of the trained model.
4. All dependencies are provided in the **utils** folder.
5. The **tests** directory contains unit tests.
6. The **configs** directory contains model configurations for the network.

## Create Environment

```
sh init_venv.sh
source venv/bin/activate

```
In addition to the packages listed in requirements.txt, install the following packages:

```
pip install pyg-lib torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
pip install torch-geometric

```

where `${TORCH}` and `${CUDA}` should be replaced by the installed PyTorch and CUDA versions, respectively.
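
As a small illustration of the substitution, the helper below fills the two placeholders in the wheel index URL. The function name is hypothetical, and the example values `1.13.1` and `cu117` are assumptions; use the versions that match your installation.

```python
def pyg_wheel_index(torch_version: str, cuda_tag: str) -> str:
    """Fill the ${TORCH} and ${CUDA} placeholders of the wheel index URL."""
    return f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag}.html"


# Example values only; substitute the versions installed on your machine.
print(pyg_wheel_index("1.13.1", "cu117"))
```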


## Run Tests

The necessary unit tests are provided in the tests directory. The sample (toy) dataset used by the tests can be downloaded from [here](http://kliv.iitkgp.ac.in/projects/miriad/sample_data/bmi8/data_samples.zip).

## Acknowledgement

The model and architecture were first published at the 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI).

This work is supported through a research grant from Intel
India Grand Challenge 2016 for Project MIRIAD.

**Principal Investigators**

<a href="https://www.linkedin.com/in/debdoot/">Dr Debdoot Sheet</a>,<a href="http://www.iitkgp.ac.in/department/EE/faculty/ee-nirmalya"> Dr Nirmalya Ghosh (Co-PI) </a></br>
Department of Electrical Engineering,</br>
Indian Institute of Technology Kharagpur</br>
email: [email protected]

<a href="https://www.linkedin.com/in/ramanathan-sethuraman-27a12aba/">Dr Ramanathan Sethuraman</a>,</br>
Intel Technology India Pvt. Ltd.</br>
email: [email protected]

**Contributor**

The code and model were contributed to the OpenVINO project by

<a href="https://github.com/Kasliwal17"> Aditya Kasliwal</a>,</br>
Department of Data Science and Computer Applications,</br>
Manipal Institute of Technology, Manipal</br>
email: [email protected]</br>
Github username: Kasliwal17

<a href="https://github.com/Rakshith2597"> Rakshith Sathish</a>,</br>
Advanced Technology Development Center,</br>
Indian Institute of Technology Kharagpur</br>
email: [email protected]</br>
Github username: Rakshith2597

<a href="https://github.com/anupam-kliv"> Anupam Borthakur</a>,</br>
Center of Excellence in Artificial Intelligence,</br>
Indian Institute of Technology Kharagpur</br>
email: [email protected] </br>
Github username: anupam-kliv


## References


<div id="comp_journal">
<a href="#results">[1]</a> Chakravarty, Arunava and Kar, Avik and Sethuraman, Ramanathan and Sheet, Debdoot; Federated Learning for Site Aware Chest Radiograph Screening; In 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI). <a href="https://ieeexplore.ieee.org/document/9433876"> (link) </a>
</div>
@@ -0,0 +1,16 @@
{
"data":{
"dest_path_data": "dataset/data/data.zip",
"url_data":"http://kliv.iitkgp.ac.in/projects/miriad/sample_data/bmi8/data_samples.zip",
"url_split": "http://kliv.iitkgp.ac.in/projects/miriad/sample_data/bmi8/split.npz",
"dest_path_split": "dataset/split.zip"
},
"fl_with_gnn":{
"url_model": "http://kliv.iitkgp.ac.in/projects/miriad/model_weights/bmi8/model_weights_w_gnn.zip",
"dest_path_model": "model_weights/with_gnn/model_weights_w_gnn.zip"
},
"fl_without_gnn":{
"url_model": "http://kliv.iitkgp.ac.in/projects/miriad/model_weights/bmi8/model_weights_wo_gnn.zip",
"dest_path_model": "model_weights/without_gnn/model_weights_wo_gnn.zip"
}
}
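
The download configuration pairs each `url_*` key with a `dest_path_*` key inside the same section. The sketch below shows one way to enumerate those pairs before fetching anything; it is an illustration only, `planned_downloads` is a hypothetical helper, and how `downloader.py` actually consumes this file is not shown in this diff.

```python
import json
from typing import Dict, List, Tuple

# Two sections copied from the configuration above, embedded for a
# self-contained example.
CONFIG_TEXT = """
{
  "data": {
    "dest_path_data": "dataset/data/data.zip",
    "url_data": "http://kliv.iitkgp.ac.in/projects/miriad/sample_data/bmi8/data_samples.zip"
  },
  "fl_with_gnn": {
    "url_model": "http://kliv.iitkgp.ac.in/projects/miriad/model_weights/bmi8/model_weights_w_gnn.zip",
    "dest_path_model": "model_weights/with_gnn/model_weights_w_gnn.zip"
  }
}
"""


def planned_downloads(config: Dict[str, Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (section, url, destination) for every url_X / dest_path_X pair."""
    pairs = []
    for section, entries in config.items():
        for key, url in entries.items():
            if not key.startswith("url_"):
                continue
            suffix = key[len("url_"):]
            dest = entries.get("dest_path_" + suffix)
            if dest is not None:  # skip URLs without a matching destination
                pairs.append((section, url, dest))
    return pairs


downloads = planned_downloads(json.loads(CONFIG_TEXT))
for section, url, dest in downloads:
    print(f"[{section}] {url} -> {dest}")
```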
@@ -0,0 +1,38 @@
{
"train": {
"data": "dataset/data/",
"split_npz": "dataset/split.npz",
"batch_size": 8,
"epochs": 1,
"gpu": "True",
"lr": 1e-5,
"checkpoint": "model_weights/with_gnn/model_weights_w_gnn.pt",
"savepath": "model_weights/with_gnn/",
"backbone":"resnet",
"gnn":"True"
},
"inference": {
"data": "dataset/data/",
"split_npz": "dataset/split.npz",
"batch_size": 1,
"gpu": "True",
"gnn":"True",
"model_file": "model_weights/with_gnn/model_weights_w_gnn.pt",
"checkpoint": "model_weights/with_gnn/model_weights_w_gnn.pt",
"backbone":"resnet",
"max_samples":10
},
"export": {
"checkpoint": "model_weights/with_gnn/model_weights_w_gnn.pt",
"backbone":"resnet",
"split_path":"dataset/split.npz",
"input_shape": [
1,
1,
320,
320
],
"model_name_onnx": "model_weights_w_gnn.onnx",
"model_name": "model_weights_w_gnn"
}
}
@@ -0,0 +1,38 @@
{
"train": {
"data": "dataset/data/",
"split_npz": "dataset/split.npz",
"batch_size": 12,
"epochs": 1,
"gpu": "True",
"lr": 1e-5,
"checkpoint": "model_weights/without_gnn/model_weights_wo_gnn.pth",
"savepath": "model_weights/without_gnn/",
"backbone":"resnet",
"gnn":"False"
},
"inference": {
"data": "dataset/data/",
"split_npz": "dataset/split.npz",
"batch_size": 1,
"gpu": "True",
"gnn":"False",
"model_file": "model_weights/without_gnn/model_weights_wo_gnn.pth",
"checkpoint": "model_weights/without_gnn/model_weights_wo_gnn.pth",
"backbone":"resnet",
"max_samples":10
},
"export": {
"checkpoint": "model_weights/without_gnn/model_weights_wo_gnn.pth",
"backbone":"resnet",
"split_path":"dataset/split.npz",
"input_shape": [
1,
1,
320,
320
],
"model_name_onnx": "model_weights_wo_gnn.onnx",
"model_name": "model_weights_wo_gnn"
}
}
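
Both configuration files store flags such as `gpu` and `gnn` as the strings `"True"` and `"False"`. In Python, `bool("False")` evaluates to `True`, so these values need explicit parsing. The sketch below shows one way to do that and to sanity-check the export `input_shape`; `as_bool` and `check_export_shape` are hypothetical helpers, and reading the shape as batch, channels, height, width is an assumption, since the repository's `get_config.py` is not shown in this diff.

```python
import json


def as_bool(value) -> bool:
    """Interpret the string flags used in the configs ("True" / "False")."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def check_export_shape(export_cfg: dict) -> tuple:
    """Validate that input_shape holds four positive integers."""
    shape = tuple(export_cfg["input_shape"])
    if len(shape) != 4 or any(not isinstance(d, int) or d <= 0 for d in shape):
        raise ValueError(f"input_shape must be four positive integers, got {shape}")
    return shape


# Values taken from the fl_without_gnn configuration above.
cfg = json.loads("""
{
  "train": {"gpu": "True", "gnn": "False"},
  "export": {"input_shape": [1, 1, 320, 320]}
}
""")

use_gnn = as_bool(cfg["train"]["gnn"])
use_gpu = as_bool(cfg["train"]["gpu"])
shape = check_export_shape(cfg["export"])
print(use_gnn, use_gpu, shape)
```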
@@ -0,0 +1,48 @@
{
"-999":{
"wts_pos":[ 6.07201409, 12.57545272, 5.07639982, 1.29352719, 14.83679525,
2.61834939, 9.25154963, 22.75312856, 4.12082252, 7.02592567,
1.58836049, 38.86513797, 15.04438092, 1.17096019],
"wts_neg":[0.68019345, 0.64294624, 0.69547317, 1.16038896, 0.63797481, 0.79812282,
0.6549775, 0.62857107, 0.71829276, 0.67000328, 0.99474773, 0.62145383,
0.63759652, 1.2806393 ]
},
"0":{
"wts_pos":[636.94267516, 13.93728223, 5.28038864, 5.26537489, 87.87346221, 8.61623298,
67.56756757, 228.83295195, 20.40816327, 79.42811755, 5.70450656, 276.24309392,
87.79631255, 5.6840789 ],
"wts_neg":[3.14594016, 4.0373047, 7.68875903, 7.72081532, 3.24612089, 4.91690432,
3.28256303, 3.17389786, 3.69767786, 3.2589213, 6.93769946, 3.16636059,
3.24622626, 6.96815553]
},
"1":{
"wts_pos":[ 31.82686187, 649.35064935, 568.18181818, 11.06439478, 75.75757576,
16.73920321, 11.19319454, 27.94076558, 25.4517689, 158.73015873,
11.25999324, 387.59689922, 88.73114463, 7.74653343],
"wts_neg":[3.40901343, 3.09386795, 3.09597523, 4.26657565, 3.20965464, 3.77330013,
4.24772747, 3.46056684, 3.50299506, 3.14011179, 4.23818606, 3.10385499,
3.18989441, 5.11064547]
},
"2":{
"wts_pos":[653.59477124, 662.25165563, 584.79532164, 4.56350112, 45.12635379, 11.55401502,
675.67567568, 746.26865672, 14.69723692, 29.20560748, 5.70418116, 159.23566879,
87.03220191, 5.50721445],
"wts_neg":[3.00057011, 3.00039005, 3.0021916, 8.645284, 3.19856704, 4.02819738, 3.00012,
2.99886043, 3.74868796, 3.3271227, 6.26998558, 3.04395471, 3.09300671, 6.52656311]
},
"3":{
"wts_pos":[359.71223022, 675.67567568, 515.46391753, 6.02772755, 65.40222368, 16.94053871,
800.0, 740.74074074,19.9960008, 11.29433025, 10.53962901, 78.49293564, 87.2600349,
7.8486775 ],
"wts_neg":[3.18775901, 3.17460317, 3.17924588, 6.64098818, 3.32016335, 3.88424937, 3.1722869,
3.17329356, 3.75276767, 4.38711942, 4.51263538, 3.29228946, 3.27847354, 5.28904638]
},
"4":{
"wts_pos":[7.84990973, 308.64197531, 454.54545455, 9.28074246, 186.21973929, 16.51800463,
819.67213115, 909.09090909, 27.52546105, 1515.15151515, 10.49538203, 1960.78431373,
47.93863854, 4.16684029],
"wts_neg":[ 4.71720364, 2.97495091, 2.96577496, 4.31723007, 2.99392234, 3.58628604,
2.95718003, 2.95613102, 3.29978551, 2.95229098, 4.09668169, 2.95098415,
3.13952028, 10.06137438]
}
}
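
Each entry above holds 14 positive-class and 14 negative-class weights, one per label. A common way to use such weights is a class-weighted binary cross-entropy, sketched below. This is an assumption about how the weights are applied: `loss.py` is not shown in this diff, `weighted_bce` is a hypothetical name, and reading the top-level keys (`"-999"`, `"0"` to `"4"`) as site identifiers is likewise a guess.

```python
import math
from typing import Sequence


def weighted_bce(probs: Sequence[float], targets: Sequence[int],
                 wts_pos: Sequence[float], wts_neg: Sequence[float],
                 eps: float = 1e-7) -> float:
    """Mean over classes of -(w_pos * y * log(p) + w_neg * (1 - y) * log(1 - p))."""
    total = 0.0
    for p, y, w_pos, w_neg in zip(probs, targets, wts_pos, wts_neg):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# First two class weights from the "-999" entry above.
wts_pos = [6.07201409, 12.57545272]
wts_neg = [0.68019345, 0.64294624]

# One sample, two classes: the first label is positive, the second negative.
loss = weighted_bce(probs=[0.5, 0.5], targets=[1, 0],
                    wts_pos=wts_pos, wts_neg=wts_neg)
print(loss)
```

With this form, a missed positive for a rare class (large `wts_pos`) costs far more than a false alarm on the same class, which counteracts the class imbalance at each site.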
@@ -0,0 +1,29 @@
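#!/usr/bin/env bash

# Resolve the directory containing this script so it can be run from anywhere.
work_dir=$(realpath "$(dirname "$0")")

# Optional first argument: virtualenv directory (defaults to "venv").
venv_dir=${1:-venv}

cd "${work_dir}" || exit 1

# Do not recreate an existing environment; tell the user how to activate it.
if [ -e "${venv_dir}" ]; then
  echo
  echo "Virtualenv already exists. Use command to start working:"
  echo "$ . ${venv_dir}/bin/activate"
  exit 0
fi

virtualenv "${venv_dir}" -p python3 --prompt="(federated_gnn)"

. "${venv_dir}/bin/activate"

# Install the dependencies one requirement line at a time.
xargs -L 1 pip3 install < requirements.txt

# Install this package in editable mode.
pip install -e .

echo
echo "Activate a virtual environment to start working:"
echo "$ . ${venv_dir}/bin/activate"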
@@ -0,0 +1,15 @@
torch==1.13.1
torchvision==0.14.1
torchmetrics
pydicom
scikit-learn
tqdm
pandas
requests==2.26.0
openvino-dev[onnx]==2021.4.2
onnxruntime==1.8.1
numpy==1.19.2
matplotlib==3.5.2
wget
pytest
@@ -0,0 +1,6 @@
from setuptools import setup, find_packages

setup(name='federated_gnn',
version='1.0',
packages=find_packages())
