BS-LDM: Effective Bone Suppression in High-Resolution Chest X-Ray Images with Conditional Latent Diffusion Models
This repository is a PyTorch implementation of our paper "BS-LDM: Effective Bone Suppression in High-Resolution Chest X-Ray Images with Conditional Latent Diffusion Models".
- We introduce an end-to-end LDM-based framework for high-resolution bone suppression, named BS-LDM. It utilizes a multi-level hybrid loss-constrained VQGAN for effective perceptual compression. This framework consistently generates soft tissue images with high levels of bone suppression while preserving fine details and critical lung lesions.
- To enhance the quality of generated images, we incorporate offset noise and a temporal adaptive thresholding strategy. These innovations help minimize discrepancies in low-frequency information, thereby improving the interpretability of the soft tissue images.
- We have compiled a comprehensive bone suppression dataset, SZCH-X-Rays, which includes 818 pairs of high-resolution CXR and DES soft tissue images from our partner hospital. Additionally, we processed 241 pairs of images from the JSRT dataset into negative formats more commonly used in clinical settings.
- Our clinical evaluation focused on image quality and diagnostic utility. The results demonstrated excellent image quality scores and substantial diagnostic improvements, underscoring the clinical significance of our work.
Overview of the proposed BS-LDM: (a) The training process of BS-LDM, where CXR and noised soft tissue data in the latent space are transmitted to the noise estimator for offset noise prediction and L2 loss calculation; (b) The training process of ML-VQGAN, where a multi-level hybrid loss-constrained VQGAN is used to construct a latent space by training the reconstruction of CXR and soft tissue images, using a codebook to represent the discrete features of the images; (c) The sampling process of BS-LDM, where the latent variables obtained after each denoising step are clipped using a temporal adaptive thresholding strategy for the sake of contrast stability.
Visualization of high-frequency and low-frequency feature decomposition of latent variables before and after Gaussian noise addition using Discrete Fourier Transform
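The kind of decomposition named in this caption can be approximated as follows. This is a minimal NumPy sketch, not the code used to produce the figure: the function name `split_frequencies`, the ideal circular mask, and the cutoff `radius` are all assumptions made for illustration, and a random array stands in for a real latent channel.

```python
import numpy as np

def split_frequencies(x, radius):
    """Split a 2D array into low- and high-frequency parts with a circular mask."""
    h, w = x.shape
    # Centered 2D spectrum: fftshift moves the zero-frequency term to (h//2, w//2).
    spectrum = np.fft.fftshift(np.fft.fft2(x))
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low_mask = dist <= radius  # True inside the low-frequency disc
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high

rng = np.random.default_rng(0)
latent = rng.standard_normal((64, 64))  # stand-in for one latent channel
low, high = split_frequencies(latent, radius=8)
```

Because the two masks are complementary, `low + high` reconstructs the input, and the mean of the input ends up entirely in `low`.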
Power spectral densities of soft tissue images in SZCH-X-Rays, corresponding latent variables and Gaussian noise on 201 spectrogram components, averaged over 10000 samples
Visualization of ablation studies of offset noise and the temporal adaptive thresholding strategy on BS-LDM, with histograms given to visualize the pixel intensity distribution more intuitively
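For readers unfamiliar with offset noise, the sketch below shows the commonly used formulation: ordinary Gaussian noise plus one extra Gaussian scalar per sample and channel. It is written in NumPy for brevity and is not taken from this repository; the function name `offset_noise` and the default `strength` are assumptions, and the exact form used in BS-LDM may differ.

```python
import numpy as np

def offset_noise(shape, strength=0.1, rng=None):
    """Gaussian noise plus a per-sample, per-channel constant offset.

    shape is (batch, channels, height, width). `strength` is a hypothetical
    default, not a value taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    b, c, _, _ = shape
    white = rng.standard_normal(shape)
    # One scalar per (sample, channel), broadcast over all spatial positions.
    # It shifts the channel mean, which white noise alone changes very little.
    offset = rng.standard_normal((b, c, 1, 1))
    return white + strength * offset

noise = offset_noise((2, 4, 32, 32), strength=0.1, rng=np.random.default_rng(0))
```

Setting `strength=0` recovers plain Gaussian noise, which makes it easy to compare the two settings in an ablation.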
The soft tissue images generated by the BS-LDM on the SZCH-X-Rays dataset were independently evaluated for image quality using established clinical criteria that are commonly applied to assess bone suppression efficacy. Three radiologists, with 6, 11, and 21 years of experience respectively, conducted these evaluations at our partner hospital. The average scores for lung vessel visibility, airway visibility, and the degree of bone suppression were 2.758, 2.714, and 2.765, respectively, out of a maximum score of 3. These findings indicate that BS-LDM effectively suppresses bone while preserving fine details and lung pathology.
Clinical Evaluation Criteria | Scoring scale | Junior Physician (6 years) | Intermediate Physician (11 years) | Senior Physician (21 years) |
---|---|---|---|---|
Lung vessel visibility | Clearly displayed (3); Displayed (2); Not displayed (1) | 2.431 | 2.858 | 2.984 |
Airway visibility | Lobar and intermediate bronchi (3); Main bronchus and rump (2); Trachea (1) | 2.561 | 2.643 | 2.937 |
Degree of bone suppression | Nearly perfect suppression (3); Unsuppressed bones less than 5 (2); 5 or more bones unsuppressed (1) | 2.781 | 2.793 | 2.722 |
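The averages quoted in the paragraph above the table follow directly from the three per-reader scores. A short check, using only the numbers in the table (the variable names are illustrative):

```python
# Per-reader mean scores from the table: junior, intermediate, senior.
scores = {
    "Lung vessel visibility": [2.431, 2.858, 2.984],
    "Airway visibility": [2.561, 2.643, 2.937],
    "Degree of bone suppression": [2.781, 2.793, 2.722],
}

# Unweighted mean over the three radiologists, rounded to three decimals.
averages = {name: round(sum(vals) / len(vals), 3) for name, vals in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.3f}")
```

This yields 2.758, 2.714, and 2.765, matching the reported averages.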
The diagnostic value of soft tissue imaging was independently evaluated by two radiologists with 6 and 11 years of experience, following the X-ray diagnosis standard. This analysis employed the SZCH-X-Rays dataset for bone suppression, using computed tomography to confirm lesions, which included common lung diseases such as inflammation, tuberculosis, and masses or nodules. Out of 818 data pairs assessed, 79 pairs contained one or more of these lesions. The radiologists independently evaluated both conventional CXR and the soft tissue images generated by our model. The findings suggest that the soft tissue images produced by BS-LDM enable more thorough and accurate lesion diagnosis compared to CXR images, thereby confirming its high clinical diagnostic value.
Radiologist | Image | Precision (↑) | Recall (↑) | F1 Score (↑) |
---|---|---|---|---|
Junior | CXR | 0.70 | 0.40 | 0.51 |
Junior | Tissue | 0.73 | 0.56 | 0.63 |
Senior | CXR | 0.74 | 0.51 | 0.60 |
Senior | Tissue | 0.75 | 0.75 | 0.75 |
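The F1 scores in the table are the harmonic mean of the listed precision and recall. The snippet below recomputes them from the table values; the helper name `f1_score` is illustrative and not part of this repository.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs copied from the table.
reported = {
    ("Junior", "CXR"): (0.70, 0.40),
    ("Junior", "Tissue"): (0.73, 0.56),
    ("Senior", "CXR"): (0.74, 0.51),
    ("Senior", "Tissue"): (0.75, 0.75),
}

for (reader, image), (p, r) in reported.items():
    print(f"{reader:6s} {image:6s} F1 = {f1_score(p, r):.2f}")
```

Rounded to two decimals, the results are 0.51, 0.63, 0.60, and 0.75, in agreement with the table.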
- Linux
- Python>=3.7
- NVIDIA GPU (memory>=6G) + CUDA cuDNN
- VQGAN - SZCH-X-Rays
- UNet - SZCH-X-Rays
- VQGAN - JSRT
- UNet - JSRT
The original JSRT dataset and the processed JSRT dataset are available at https://drive.google.com/file/d/1RkiU85FFfouWuKQbpD7Pc7o3aZ7KrpYf/view?usp=sharing and https://drive.google.com/file/d/1o-T5l2RKdT5J75eBsqajqAuHPfZnzPhj/view?usp=sharing, respectively.
Three pairs of CXR and DES soft-tissue images from SZCH-X-Rays for testing are located at
└─BS-LDM
    ├─ CXR
    │   ├─ 0.png
    │   ├─ 1.png
    │   └─ 2.png
    └─ BS
        ├─ 0.png
        ├─ 1.png
        └─ 2.png
pip install -r requirements.txt
To evaluate the VQGAN, run the following command:
python vq-gan_eval.py
To evaluate the conditional latent diffusion model, run the following command:
python ldm_eval.py
To train the model yourself, first split the whole dataset into training, validation, and test sets by running the following command:
python dataSegmentation.py
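For orientation, a paired dataset like this one is typically split by file name so that each CXR stays with its soft tissue counterpart. The sketch below is not the contents of `dataSegmentation.py`: the function name `split_pairs`, the 10% validation and test fractions, and the fixed seed are assumptions chosen for illustration.

```python
import random

def split_pairs(names, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle file names once and cut them into train/val/test lists.

    The fractions and the seed are hypothetical defaults.
    """
    names = sorted(names)                # deterministic starting order
    random.Random(seed).shuffle(names)   # reproducible shuffle
    n_val = int(len(names) * val_frac)
    n_test = int(len(names) * test_frac)
    test = names[:n_test]
    val = names[n_test:n_test + n_val]
    train = names[n_test + n_val:]
    return train, val, test

# The same file name is used in both CXR/ and BS/, so splitting the names
# keeps every image pair in a single subset.
names = [f"{i}.png" for i in range(818)]
train, val, test = split_pairs(names)
print(len(train), len(val), len(test))
```

Splitting on names rather than on full paths avoids leaking one half of a pair into a different subset.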
Then run the following command to train the VQGAN model:
python vq-gan_train.py
Once the VQGAN has finished training, use the saved VQGAN model as the decoder when training the conditional latent diffusion model by running the following command:
python ldm_train.py
To compute the evaluation metrics used in our experiments (BSR, MSE, PSNR, and LPIPS), run the following command:
python metrics.py
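As a rough illustration of two of the metrics named above, MSE and PSNR can be computed as follows. This is a standalone NumPy sketch using the standard definitions, not the code in `metrics.py`; the function names and the assumption that images are scaled to `[0, 1]` (`data_range=1.0`) are illustrative. BSR and LPIPS are omitted here.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error between two images of the same shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB; data_range is the maximum possible pixel value."""
    err = mse(pred, target)
    if err == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / err))

# Toy example: a constant error of 0.1 on a [0, 1] image.
target = np.zeros((4, 4))
pred = np.full((4, 4), 0.1)
print(mse(pred, target), psnr(pred, target))
```

For 8-bit images left in the `[0, 255]` range, pass `data_range=255.0` instead.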
Sun Y, Chen Z, Zheng H, et al. BS-LDM: Effective Bone Suppression in High-Resolution Chest X-Ray Images with Conditional Latent Diffusion Models[J]. arXiv preprint arXiv:2412.15670, 2024.