Taming Stable Diffusion for Computed Tomography Blind Super-Resolution
- URL: http://arxiv.org/abs/2506.11496v1
- Date: Fri, 13 Jun 2025 06:45:05 GMT
- Title: Taming Stable Diffusion for Computed Tomography Blind Super-Resolution
- Authors: Chunlei Li, Yilei Shi, Haoxi Hu, Jingliang Hu, Xiao Xiang Zhu, Lichao Mou
- Abstract summary: High-resolution computed tomography (CT) imaging is essential for medical diagnosis but requires increased radiation exposure. While deep learning methods have shown promise in CT super-resolution, they face challenges with complex degradations and limited medical training data. We propose a novel framework that adapts Stable Diffusion for CT blind super-resolution.
- Score: 20.195025131749944
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-resolution computed tomography (CT) imaging is essential for medical diagnosis but requires increased radiation exposure, creating a critical trade-off between image quality and patient safety. While deep learning methods have shown promise in CT super-resolution, they face challenges with complex degradations and limited medical training data. Meanwhile, large-scale pre-trained diffusion models, particularly Stable Diffusion, have demonstrated remarkable capabilities in synthesizing fine details across various vision tasks. Motivated by this, we propose a novel framework that adapts Stable Diffusion for CT blind super-resolution. We employ a practical degradation model to synthesize realistic low-quality images and leverage a pre-trained vision-language model to generate corresponding descriptions. Subsequently, we perform super-resolution using Stable Diffusion with a specialized controlling strategy, conditioned on both low-resolution inputs and the generated text descriptions. Extensive experiments show that our method outperforms existing approaches, demonstrating its potential for achieving high-quality CT imaging at reduced radiation doses. Our code will be made publicly available.
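A minimal sketch of the two-stage inference flow described in the abstract, assuming off-the-shelf components: BLIP as the vision-language captioner and a generic Stable Diffusion 1.5 tile ControlNet as a stand-in for the paper's specialized controlling strategy. The checkpoint names, file paths, and sampling settings below are illustrative assumptions, not the authors' released setup.

```python
# Hypothetical sketch: (1) caption the low-resolution slice with a
# vision-language model, (2) run Stable Diffusion conditioned on the
# LR image (via a ControlNet branch) and on the generated caption.
# All checkpoints are generic stand-ins, not the paper's models.
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stage 1: generate a text description of the low-resolution input.
blip_proc = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
blip = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base").to(device)

lr_image = Image.open("lr_ct_slice.png").convert("RGB")   # assumed input path
inputs = blip_proc(lr_image, return_tensors="pt").to(device)
caption = blip_proc.decode(
    blip.generate(**inputs, max_new_tokens=40)[0], skip_special_tokens=True)

# Stage 2: super-resolve with Stable Diffusion, conditioned on the LR image
# through a ControlNet and on the generated caption as the text prompt.
controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11f1e_sd15_tile")
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet).to(device)

control = lr_image.resize((512, 512), Image.BICUBIC)      # control signal
sr_image = pipe(prompt=caption, image=control,
                num_inference_steps=30, guidance_scale=5.0).images[0]
sr_image.save("sr_ct_slice.png")
```

Per the abstract, the paper replaces the off-the-shelf control branch shown here with a specialized controlling strategy trained for CT degradations, conditioned jointly on the low-resolution input and the generated text.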
Related papers
- SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model [0.8670827427401335]
We propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images from a single view. Our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation. Experimental results demonstrate that our method generates higher-resolution outputs with improved control over viewing angles.
arXiv Detail & Related papers (2025-07-07T15:58:11Z)
- Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Latent Drifting enables diffusion models to be conditioned for medical images, fitting them for the complex task of counterfactual image generation. We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation.
arXiv Detail & Related papers (2024-12-30T01:59:34Z)
- Multiscale Latent Diffusion Model for Enhanced Feature Extraction from Medical Images [5.395912799904941]
Variations in CT scanner models and acquisition protocols introduce significant variability in the extracted radiomic features. LTDiff++ is a multiscale latent diffusion model designed to enhance feature extraction in medical imaging.
arXiv Detail & Related papers (2024-10-05T02:13:57Z)
- Histo-Diffusion: A Diffusion Super-Resolution Method for Digital Pathology with Comprehensive Quality Assessment [6.350679043444348]
Histo-Diffusion is a novel diffusion-based method specially designed for generating and evaluating super-resolution images in digital pathology.
It includes a restoration module for histopathology prior and a controllable diffusion module for generating high-quality images.
arXiv Detail & Related papers (2024-08-27T17:31:00Z)
- Timestep-Aware Diffusion Model for Extreme Image Rescaling [47.89362819768323]
We propose a novel framework called Timestep-Aware Diffusion Model (TADM) for extreme image rescaling. TADM performs rescaling operations in the latent space of a pre-trained autoencoder. It effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model.
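The latent-space rescaling idea can be illustrated with a pre-trained Stable Diffusion VAE: encode the image, resize the latent, decode. This is only the basic encode-resize-decode skeleton under assumed settings (checkpoint name, scale factor); TADM's timestep-aware diffusion prior is not shown.

```python
# Rough sketch of rescaling in the latent space of a pre-trained autoencoder.
# Checkpoint name and scale factor are assumptions for illustration only.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").eval()

x = torch.rand(1, 3, 256, 256) * 2 - 1                 # image in [-1, 1]
with torch.no_grad():
    z = vae.encode(x).latent_dist.sample()             # (1, 4, 32, 32) latent
    z_up = F.interpolate(z, scale_factor=2, mode="bicubic")  # rescale in latent space
    y = vae.decode(z_up).sample                        # decoded (1, 3, 512, 512) image
```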
arXiv Detail & Related papers (2024-08-17T09:51:42Z)
- Memory-efficient High-resolution OCT Volume Synthesis with Cascaded Amortized Latent Diffusion Models [48.87160158792048]
We introduce a cascaded amortized latent diffusion model (CA-LDM) that can synthesize high-resolution OCT volumes in a memory-efficient way.
Experiments on a public high-resolution OCT dataset show that our synthetic data have realistic high-resolution detail and global features, surpassing the capabilities of existing methods.
arXiv Detail & Related papers (2024-05-26T10:58:22Z)
- Low-Resolution Chest X-ray Classification via Knowledge Distillation and Multi-task Learning [46.75992018094998]
This research addresses the challenges of diagnosing chest X-rays (CXRs) at low resolutions.
High-resolution CXR imaging is crucial for identifying small but critical anomalies, such as nodules or opacities.
This paper presents the Multilevel Collaborative Attention Knowledge (MLCAK) method.
arXiv Detail & Related papers (2024-05-22T06:10:54Z)
- Simultaneous Tri-Modal Medical Image Fusion and Super-Resolution using Conditional Diffusion Model [2.507050016527729]
Tri-modal medical image fusion can provide a more comprehensive view of the disease's shape, location, and biological activity.
Due to the limitations of imaging equipment and considerations for patient safety, the quality of medical images is usually limited.
There is an urgent need for a technology that can both enhance image resolution and integrate multi-modal information.
arXiv Detail & Related papers (2024-04-26T12:13:41Z)
- On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input.
DL models are sensitive to varying artifacts, as these lead to changes in the input data distribution between the training and testing phases.
We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
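As a concrete illustration of that normalization swap, a minimal PyTorch block using Group Normalization in place of Batch Normalization; the block itself is an arbitrary example, not the paper's diagnosis network.

```python
# Toy example: use GroupNorm instead of BatchNorm2d so statistics are computed
# per sample, which tends to be less sensitive to input-distribution shifts.
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int, groups: int = 8) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.GroupNorm(num_groups=groups, num_channels=out_ch),  # instead of nn.BatchNorm2d(out_ch)
        nn.ReLU(inplace=True),
    )

x = torch.randn(2, 1, 128, 128)   # e.g. a batch of single-channel MR slices
y = conv_block(1, 32)(x)          # normalization does not depend on batch statistics
print(y.shape)                    # torch.Size([2, 32, 128, 128])
```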
arXiv Detail & Related papers (2023-06-23T03:09:03Z)
- Fine-tuned Generative Adversarial Network-based Model for Medical Image Super-Resolution [2.647302105102753]
Real-Enhanced Super-Resolution Generative Adversarial Network (Real-ESRGAN) is a practical model for recovering high-resolution (HR) images from real-world low-resolution (LR) images.
We employ the high-order degradation model of Real-ESRGAN, which better simulates real-world image degradations.
The proposed model achieves superior perceptual quality compared to the Real-ESRGAN model, effectively preserving fine details and generating images with more realistic textures.
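For context, a rough sketch of a Real-ESRGAN-style high-order (second-order) degradation chain: blur, resize, noise, and JPEG compression applied twice. The kernel sizes, noise levels, and quality ranges below are illustrative assumptions rather than the settings used in this paper.

```python
# Illustrative second-order ("high-order") degradation pipeline in the spirit
# of Real-ESRGAN: (blur -> resize -> noise -> JPEG) applied twice.
# All ranges are assumed for illustration, not the paper's configuration.
import random
import cv2
import numpy as np

def degrade_once(img: np.ndarray, scale: float) -> np.ndarray:
    # Gaussian blur with a random kernel size and sigma.
    k = random.choice([3, 5, 7])
    img = cv2.GaussianBlur(img, (k, k), sigmaX=random.uniform(0.2, 2.0))
    # Downsample with a randomly chosen interpolation method.
    h, w = img.shape[:2]
    interp = random.choice([cv2.INTER_AREA, cv2.INTER_LINEAR, cv2.INTER_CUBIC])
    img = cv2.resize(img, (int(w * scale), int(h * scale)), interpolation=interp)
    # Additive Gaussian noise.
    noise = np.random.normal(0, random.uniform(1, 10), img.shape)
    img = np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    # JPEG compression artifacts.
    quality = random.randint(40, 95)
    _, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, quality])
    return cv2.imdecode(buf, cv2.IMREAD_UNCHANGED)

def high_order_degradation(hr: np.ndarray, total_scale: float = 0.25) -> np.ndarray:
    # Two passes, each shrinking by sqrt(total_scale), e.g. 4x total downscaling.
    per_pass = total_scale ** 0.5
    lr = degrade_once(hr, per_pass)
    return degrade_once(lr, per_pass)

hr = cv2.imread("hr_ct_slice.png", cv2.IMREAD_GRAYSCALE)   # assumed input path
cv2.imwrite("lr_ct_slice.png", high_order_degradation(hr))
```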
arXiv Detail & Related papers (2022-11-01T16:48:04Z)
- Incremental Cross-view Mutual Distillation for Self-supervised Medical CT Synthesis [88.39466012709205]
This paper builds a novel medical slice synthesis method to increase the between-slice resolution.
Considering that the ground-truth intermediate medical slices are always absent in clinical practice, we introduce the incremental cross-view mutual distillation strategy.
Our method outperforms state-of-the-art algorithms by clear margins.
arXiv Detail & Related papers (2021-12-20T03:38:37Z)