Super-Resolution Enhancement of Medical Images Based on Diffusion Model: An Optimization Scheme for Low-Resolution Gastric Images
- URL: http://arxiv.org/abs/2512.22209v1
- Date: Mon, 22 Dec 2025 03:37:47 GMT
- Title: Super-Resolution Enhancement of Medical Images Based on Diffusion Model: An Optimization Scheme for Low-Resolution Gastric Images
- Authors: Haozhe Jia,
- Abstract summary: Capsule endoscopy has enabled minimally invasive gastrointestinal imaging, but its clinical utility is limited by the inherently low resolution of captured images. This work investigates a diffusion-based super-resolution framework to enhance capsule endoscopy images in a data-driven manner.
- Score: 1.416562653124415
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Capsule endoscopy has enabled minimally invasive gastrointestinal imaging, but its clinical utility is limited by the inherently low resolution of captured images due to hardware, power, and transmission constraints. This limitation hampers the identification of fine-grained mucosal textures and subtle pathological features essential for early diagnosis. This work investigates a diffusion-based super-resolution framework to enhance capsule endoscopy images in a data-driven and anatomically consistent manner. We adopt the SR3 (Super-Resolution via Repeated Refinement) framework built upon Denoising Diffusion Probabilistic Models (DDPMs) to learn a probabilistic mapping from low-resolution to high-resolution images. Unlike GAN-based approaches that often suffer from training instability and hallucination artifacts, diffusion models provide stable likelihood-based training and improved structural fidelity. The HyperKvasir dataset, a large-scale publicly available gastrointestinal endoscopy dataset, is used for training and evaluation. Quantitative results demonstrate that the proposed method significantly outperforms bicubic interpolation and GAN-based super-resolution methods such as ESRGAN, achieving PSNR of 27.5 dB and SSIM of 0.65 for a baseline model, and improving to 29.3 dB and 0.71 with architectural enhancements including attention mechanisms. Qualitative results show improved preservation of anatomical boundaries, vascular patterns, and lesion structures. These findings indicate that diffusion-based super-resolution is a promising approach for enhancing non-invasive medical imaging, particularly in capsule endoscopy where image resolution is fundamentally constrained.
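The PSNR figures quoted in the abstract (27.5 dB and 29.3 dB) follow the standard definition PSNR = 10 log10(MAX^2 / MSE). The sketch below is illustrative only and is not taken from the paper: it assumes 8-bit pixel values (peak 255) passed as flat sequences, and the function names `psnr` and `mse_ratio_for_gain` are hypothetical.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumes both sequences hold pixel intensities on the same scale,
    with max_value as the largest possible intensity (255 for 8-bit).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE changes for a PSNR gain of gain_db at fixed peak value."""
    return 10.0 ** (-gain_db / 10.0)
```

Under this definition, the reported gain of 1.8 dB (27.5 dB to 29.3 dB) corresponds to `mse_ratio_for_gain(1.8)`, about 0.66, that is, roughly one third lower mean squared error. SSIM is a separate, structure-based metric and is not covered by this sketch.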
Related papers
- DSA-SRGS: Super-Resolution Gaussian Splatting for Dynamic Sparse-View DSA Reconstruction [67.42242016220122]
Digital subtraction angiography is a key imaging technique for the auxiliary diagnosis and treatment of cerebrovascular diseases. Recent advancements in Gaussian splatting and dynamic neural representations have enabled robust 3D vessel reconstruction from sparse dynamic inputs. This paper proposes DSA-SRGS, the first super-resolution Gaussian splatting framework for dynamic sparse-view DSA reconstruction.
arXiv Detail & Related papers (2026-03-05T03:41:08Z)
- Structure-constrained Language-informed Diffusion Model for Unpaired Low-dose Computed Tomography Angiography Reconstruction [72.80209358480424]
Overdose of iodinated contrast media (ICM) can cause kidney damage and life-threatening allergic reactions. Deep learning methods can generate CT images of normal-dose ICM from low-dose ICM, reducing the required dose. We propose a Structure-constrained Language-informed Diffusion Model (SLDM) that integrates structural synergy and spatial intelligence.
arXiv Detail & Related papers (2026-01-28T06:54:06Z)
- Lesion-Aware Post-Training of Latent Diffusion Models for Synthesizing Diffusion MRI from CT Perfusion [12.777937532267105]
Latent diffusion models (LDMs) leverage efficient learning in compressed latent space. This efficiency comes with a trade-off, potentially compromising crucial pixel-level detail. We propose a novel post-training framework for LDMs in medical image-to-image translation by incorporating lesion-aware medical pixel space objectives.
arXiv Detail & Related papers (2025-10-10T06:51:27Z)
- DGSSA: Domain generalization with structural and stylistic augmentation for retinal vessel segmentation [17.396365010722423]
Retinal vascular morphology is crucial for diagnosing diseases such as diabetes, glaucoma, and hypertension. Traditional segmentation methods assume that training and testing data share similar distributions, which can lead to poor performance on unseen domains. This paper presents a novel approach, DGSSA, for retinal vessel image segmentation that enhances model generalization by combining structural and style augmentation strategies.
arXiv Detail & Related papers (2025-01-07T01:47:57Z)
- A Flow-based Truncated Denoising Diffusion Model for Super-resolution Magnetic Resonance Spectroscopic Imaging [34.32290273033808]
This work introduces a Flow-based Truncated Denoising Diffusion Model for super-resolution MRSI.
It shortens the diffusion process by truncating the diffusion chain, and the truncated steps are estimated using a normalizing flow-based network.
We demonstrate that FTDDM outperforms existing generative models while speeding up the sampling process by over 9-fold.
arXiv Detail & Related papers (2024-10-25T03:42:35Z)
- A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
We propose a unified MRI reconstruction model robust to various measurement undersampling patterns and image resolutions. Our model improves SSIM by 11% and PSNR by 4 dB over a state-of-the-art CNN (End-to-End VarNet) with 600$\times$ faster inference than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z)
- Timestep-Aware Diffusion Model for Extreme Image Rescaling [47.89362819768323]
We propose a novel framework called Timestep-Aware Diffusion Model (TADM) for extreme image rescaling. TADM performs rescaling operations in the latent space of a pre-trained autoencoder. It effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model.
arXiv Detail & Related papers (2024-08-17T09:51:42Z)
- Super-resolution of biomedical volumes with 2D supervision [84.5255884646906]
Masked slice diffusion for super-resolution (SliceR) exploits the inherent equivalence in the data-generating distribution across all spatial dimensions of biological specimens.
We focus on the application of SliceR to stimulated Raman histology (SRH), characterized by its rapid acquisition of high-resolution 2D images but slow and costly optical z-sectioning.
arXiv Detail & Related papers (2024-04-15T02:41:55Z)
- LLCaps: Learning to Illuminate Low-Light Capsule Endoscopy with Curved Wavelet Attention and Reverse Diffusion [24.560417980602928]
Wireless capsule endoscopy (WCE) is a painless and non-invasive diagnostic tool for gastrointestinal (GI) diseases.
Deep learning-based low-light image enhancement (LLIE) in the medical field is attracting growing research interest.
We introduce a WCE LLIE framework based on the multi-scale convolutional neural network (CNN) and reverse diffusion process.
arXiv Detail & Related papers (2023-07-05T17:23:42Z)
- Does image resolution impact chest X-ray based fine-grained Tuberculosis-consistent lesion segmentation? [3.3086274755158325]
Deep learning models are reportedly trained on reduced image resolutions, with a lack of computational resources cited as the reason.
This study investigated performance gains achieved through training an Inception-V3-based UNet model using various image/mask resolutions.
arXiv Detail & Related papers (2023-01-10T15:34:39Z)
- Malignancy Prediction and Lesion Identification from Clinical Dermatological Images [65.1629311281062]
We consider machine-learning-based malignancy prediction and lesion identification from clinical dermatological images.
Our method first identifies all lesions present in the image regardless of sub-type or likelihood of malignancy, then estimates their likelihood of malignancy, and through aggregation also generates an image-level likelihood of malignancy.
arXiv Detail & Related papers (2021-04-02T20:52:05Z)
- Modeling and Enhancing Low-quality Retinal Fundus Images [167.02325845822276]
Low-quality fundus images increase uncertainty in clinical observation and lead to the risk of misdiagnosis.
We propose a clinically oriented fundus enhancement network (cofe-Net) to suppress global degradation factors.
Experiments on both synthetic and real images demonstrate that our algorithm effectively corrects low-quality fundus images without losing retinal details.
arXiv Detail & Related papers (2020-05-12T08:01:16Z)