SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model
- URL: http://arxiv.org/abs/2507.05148v1
- Date: Mon, 07 Jul 2025 15:58:11 GMT
- Title: SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model
- Authors: Chun Xie, Yuichi Yoshii, Itaru Kitahara
- Abstract summary: We propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images from a single view. Our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation. Experimental results demonstrate that our method generates higher-resolution outputs with improved control over viewing angles.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: X-ray imaging is a rapid and cost-effective tool for visualizing internal human anatomy. While multi-view X-ray imaging provides complementary information that enhances diagnosis, intervention, and education, acquiring images from multiple angles increases radiation exposure and complicates clinical workflows. To address these challenges, we propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images from a single view. Unlike prior methods, which are limited in angular range, resolution, and image quality, our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation. Experimental results demonstrate that our method generates higher-resolution outputs with improved control over viewing angles. This capability has significant implications not only for clinical applications but also for medical education and data extension, enabling the creation of diverse, high-quality datasets for training and analysis. Our code is available at GitHub.
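The abstract names a weak-to-strong training strategy for stable high-resolution generation but gives no implementation details. One plausible reading is a progressive-resolution curriculum, sketched below as a minimal scheduler; the stage resolutions and the even step split are assumptions for illustration, not the paper's actual schedule:

```python
def weak_to_strong_schedule(total_steps, resolutions=(64, 128, 256, 512)):
    """Split training steps evenly across increasing resolutions
    (one plausible reading of a weak-to-strong curriculum)."""
    per_stage = total_steps // len(resolutions)
    schedule = []
    for i, res in enumerate(resolutions):
        start = i * per_stage
        # Last stage absorbs any remainder from integer division.
        end = total_steps if i == len(resolutions) - 1 else start + per_stage
        schedule.append((start, end, res))
    return schedule

def resolution_at(step, schedule):
    """Training resolution in effect at a given global step."""
    for start, end, res in schedule:
        if start <= step < end:
            return res
    return schedule[-1][2]
```

The idea is simply that early ("weak") stages train at low resolution, where diffusion training is stable, before later ("strong") stages move to the full target resolution.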
Related papers
- Taming Stable Diffusion for Computed Tomography Blind Super-Resolution [20.195025131749944]
High-resolution computed tomography (CT) imaging is essential for medical diagnosis but requires increased radiation exposure. While deep learning methods have shown promise in CT super-resolution, they face challenges with complex degradations and limited medical training data. We propose a novel framework that adapts Stable Diffusion for CT blind super-resolution.
arXiv Detail & Related papers (2025-06-13T06:45:05Z)
- PixCell: A generative foundation model for digital histopathology images [49.00921097924924]
We introduce PixCell, the first diffusion-based generative foundation model for histopathology. We train PixCell on PanCan-30M, a vast, diverse dataset derived from 69,184 H&E-stained whole slide images covering various cancer types.
arXiv Detail & Related papers (2025-06-05T15:14:32Z)
- Novel-view X-ray Projection Synthesis through Geometry-Integrated Deep Learning [3.4916237834391874]
The DL-GIPS model synthesizes X-ray projections from new viewpoints by leveraging a single existing projection. The model strategically manipulates geometry and texture features extracted from an initial projection to match new viewing angles. It then synthesizes the final projection by merging these modified geometry features with consistent texture information through an advanced image generation process.
arXiv Detail & Related papers (2025-04-16T10:30:08Z)
- Random Token Fusion for Multi-View Medical Diagnosis [2.3458652461211935]
In multi-view medical datasets, deep learning models often fuse information from different imaging perspectives to improve diagnosis performance.
Existing approaches are prone to overfitting and rely heavily on view-specific features, which can lead to trivial solutions.
In this work, we introduce a novel technique designed to enhance image analysis using multi-view medical transformers.
arXiv Detail & Related papers (2024-10-21T10:19:45Z)
- A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
We propose a unified MRI reconstruction model robust to various measurement undersampling patterns and image resolutions. Our model improves SSIM by 11% and PSNR by 4 dB over a state-of-the-art CNN (End-to-End VarNet), with 600× faster inference than diffusion methods.
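The PSNR figure quoted above is easy to ground: PSNR is a pixel-wise fidelity metric in decibels. A minimal sketch over flat pixel lists follows; real evaluations operate on 2-D images, typically via a library such as scikit-image:

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(reference, reconstruction, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    err = mse(reference, reconstruction)
    if err == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / err)
```

For context, a 4 dB gain corresponds to roughly a 2.5× reduction in mean squared error, since 10·log10(2.51) ≈ 4.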
arXiv Detail & Related papers (2024-10-05T20:03:57Z)
- MLVICX: Multi-Level Variance-Covariance Exploration for Chest X-ray Self-Supervised Representation Learning [6.4136876268620115]
MLVICX is an approach to capture rich representations in the form of embeddings from chest X-ray images.
We demonstrate the performance of MLVICX in advancing self-supervised chest X-ray representation learning.
arXiv Detail & Related papers (2024-03-18T06:19:37Z)
- MVC: A Multi-Task Vision Transformer Network for COVID-19 Diagnosis from Chest X-ray Images [10.616065108433798]
We propose a new method, namely Multi-task Vision Transformer (MVC) for simultaneously classifying chest X-ray images and identifying affected regions from the input data.
Our method is built upon the Vision Transformer but extends its learning capability in a multi-task setting.
arXiv Detail & Related papers (2023-09-30T15:52:18Z)
- XVertNet: Unsupervised Contrast Enhancement of Vertebral Structures with Dynamic Self-Tuning Guidance and Multi-Stage Analysis [1.3584858315758948]
Chest X-rays remain the primary diagnostic tool in emergency medicine, yet their limited ability to capture fine anatomical details can result in missed or delayed diagnoses. We introduce XVertNet, a novel deep-learning framework designed to significantly enhance vertebral structure visualization in X-ray images.
arXiv Detail & Related papers (2023-06-06T19:36:11Z)
- Fine-tuned Generative Adversarial Network-based Model for Medical Image Super-Resolution [2.647302105102753]
Real-Enhanced Super-Resolution Generative Adversarial Network (Real-ESRGAN) is a practical model for recovering HR images from real-world LR images.
We employ the high-order degradation model of Real-ESRGAN, which better simulates real-world image degradations.
The proposed model achieves superior perceptual quality compared to the Real-ESRGAN model, effectively preserving fine details and generating images with more realistic textures.
arXiv Detail & Related papers (2022-11-01T16:48:04Z)
- Image Synthesis with Disentangled Attributes for Chest X-Ray Nodule Augmentation and Detection [52.93342510469636]
Lung nodule detection in chest X-ray (CXR) images is common in early screening for lung cancer.
Deep-learning-based Computer-Assisted Diagnosis (CAD) systems can support radiologists for nodule screening in CXR.
To alleviate the limited availability of such datasets, lung nodule synthesis methods have been proposed for data augmentation.
arXiv Detail & Related papers (2022-07-19T16:38:48Z)
- Generative Residual Attention Network for Disease Detection [51.60842580044539]
We present a novel approach for disease generation in X-rays using conditional generative adversarial learning.
We generate a corresponding radiology image in a target domain while preserving the identity of the patient.
We then use the generated X-ray image in the target domain to augment our training to improve the detection performance.
arXiv Detail & Related papers (2021-10-25T14:15:57Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
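The KNNS component is described only at this high level. One common form of K-nearest-neighbor label smoothing averages each sample's label vector with those of its nearest neighbors in feature space; a minimal sketch follows, where the Euclidean metric and plain averaging are assumptions for illustration, not the paper's exact formulation:

```python
def knn_smooth_labels(features, labels, k=3):
    """Smooth each sample's label vector by averaging it with the label
    vectors of its k nearest neighbors (Euclidean distance in feature space)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    smoothed = []
    for i, f in enumerate(features):
        # Rank all other samples by distance to sample i.
        order = sorted((j for j in range(len(features)) if j != i),
                       key=lambda j: dist(f, features[j]))
        neighbors = order[:k] + [i]  # include the sample itself
        avg = [sum(labels[j][d] for j in neighbors) / len(neighbors)
               for d in range(len(labels[0]))]
        smoothed.append(avg)
    return smoothed
```

Smoothing of this kind softens noisy one-hot annotations, which is the stated motivation: chest X-ray labels are expensive and error-prone to produce at scale.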
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
- Generative Adversarial U-Net for Domain-free Medical Image Augmentation [49.72048151146307]
The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing.
In this paper, we develop a novel generative method named generative adversarial U-Net.
Our newly designed model is domain-free and generalizable to various medical images.
arXiv Detail & Related papers (2021-01-12T23:02:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.