Related papers: MInDI-3D: Iterative Deep Learning in 3D for Sparse-view Cone Beam Computed Tomography

MInDI-3D: Iterative Deep Learning in 3D for Sparse-view Cone Beam Computed Tomography

URL: http://arxiv.org/abs/2508.09616v2
Date: Thu, 09 Oct 2025 07:53:47 GMT
Title: MInDI-3D: Iterative Deep Learning in 3D for Sparse-view Cone Beam Computed Tomography
Authors: Daniel Barco, Marc Stadelmann, Martin Oswald, Ivo Herzig, Lukas Lichtensteiger, Pascal Paysan, Igor Peterlik, Michal Walczak, Bjoern Menze, Frank-Peter Schilling,
Abstract summary: MInDI-3D is the first 3D conditional diffusion-based model for real-world sparse-view Cone Beam Computed Tomography (CBCT) artefact removal.<n>Our results show MInDI-3D's effectiveness, achieving a 12.96 (6.10) dB PSNR gain over uncorrected scans.<n> Clinicians rated our model as sufficient for patient positioning across all anatomical sites.
Score: 0.023856101317606432
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present MInDI-3D (Medical Inversion by Direct Iteration in 3D), the first 3D conditional diffusion-based model for real-world sparse-view Cone Beam Computed Tomography (CBCT) artefact removal, aiming to reduce imaging radiation exposure. A key contribution is extending the "InDI" concept from 2D to a full 3D volumetric approach for medical images, implementing an iterative denoising process that refines the CBCT volume directly from sparse-view input. A further contribution is the generation of a large pseudo-CBCT dataset (16,182) from chest CT volumes of the CT-RATE public dataset to robustly train MInDI-3D. We performed a comprehensive evaluation, including quantitative metrics, scalability analysis, generalisation tests, and a clinical assessment by 11 clinicians. Our results show MInDI-3D's effectiveness, achieving a 12.96 (6.10) dB PSNR gain over uncorrected scans with only 50 projections on the CT-RATE pseudo-CBCT (independent real-world) test set and enabling an 8x reduction in imaging radiation exposure. We demonstrate its scalability by showing that performance improves with more training data. Importantly, MInDI-3D matches the performance of a 3D U-Net on real-world scans from 16 cancer patients across distortion and task-based metrics. It also generalises to new CBCT scanner geometries. Clinicians rated our model as sufficient for patient positioning across all anatomical sites and found it preserved lung tumour boundaries well.

Related papers

3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans. Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z)
DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays [15.83915012691264]
DIFR3CT is a 3D latent diffusion model that generates plausible CT volumes from one or few planar x-ray observations. We conduct experiments demonstrating that DIFR3CT is better than recent sparse CT reconstruction baselines in terms of standard pixel-level. We also show that DIFR3CT supports uncertainty quantification via Monte Carlo sampling, which provides an opportunity to measure reconstruction reliability.
arXiv Detail & Related papers (2024-08-27T14:58:08Z)
Generative Enhancement for 3D Medical Images [74.17066529847546]
We propose GEM-3D, a novel generative approach to the synthesis of 3D medical images. Our method begins with a 2D slice, noted as the informed slice to serve the patient prior, and propagates the generation process using a 3D segmentation mask. By decomposing the 3D medical images into masks and patient prior information, GEM-3D offers a flexible yet effective solution for generating versatile 3D images.
arXiv Detail & Related papers (2024-03-19T15:57:04Z)
Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction [53.93674177236367]
Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging. Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image. This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses. We introduce a novel geometry-aware encoder-decoder framework to solve this problem.
arXiv Detail & Related papers (2023-03-26T14:38:42Z)
Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models [52.529394863331326]
We propose a novel approach using two perpendicular pre-trained 2D diffusion models to solve the 3D inverse problem. Our method is highly effective for 3D medical image reconstruction tasks, including MRI Z-axis super-resolution, compressed sensing MRI, and sparse-view CT.
arXiv Detail & Related papers (2023-03-15T08:28:06Z)
MedNeRF: Medical Neural Radiance Fields for Reconstructing 3D-aware CT-Projections from a Single X-ray [14.10611608681131]
Excessive ionising radiation can lead to deterministic and harmful effects on the body. This paper proposes a Deep Learning model that learns to reconstruct CT projections from a few or even a single-view X-ray.
arXiv Detail & Related papers (2022-02-02T13:25:23Z)
COVID-19 identification from volumetric chest CT scans using a progressively resized 3D-CNN incorporating segmentation, augmentation, and class-rebalancing [4.446085353384894]
COVID-19 is a global pandemic disease overgrowing worldwide. Computer-aided screening tools with greater sensitivity is imperative for disease diagnosis and prognosis. This article proposes a 3D Convolutional Neural Network (CNN)-based classification approach.
arXiv Detail & Related papers (2021-02-11T18:16:18Z)
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans [72.04652116817238]
We propose a differentiable neural architecture search (DNAS) framework to automatically search for the 3D DL models for 3D chest CT scans classification. We also exploit the Class Activation Mapping (CAM) technique on our models to provide the interpretability of the results.
arXiv Detail & Related papers (2021-01-14T03:45:01Z)
Revisiting 3D Context Modeling with Supervised Pre-training for Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices. With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset. The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z)
3D Probabilistic Segmentation and Volumetry from 2D projection images [10.32519161805588]
X-Ray imaging is quick, cheap and useful for front-line care assessment and intra-operative real-time imaging. It suffers from projective information loss and lacks vital information on which many essential diagnostic biomarkers are based. In this paper we explore probabilistic methods to reconstruct 3D volumetric images from 2D imaging modalities.
arXiv Detail & Related papers (2020-06-23T08:00:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.