EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy
- URL: http://arxiv.org/abs/2406.13705v2
- Date: Mon, 8 Jul 2024 15:51:29 GMT
- Authors: Long Bai, Tong Chen, Qiaozhi Tan, Wan Jun Nah, Yanheng Li, Zhicheng He, Sishen Yuan, Zhen Chen, Jinlin Wu, Mobarakol Islam, Zhen Li, Hongbin Liu, Hongliang Ren
- Abstract summary: We introduce EndoUIC, a WCE unified illumination correction solution using an end-to-end promptable diffusion transformer (DiT) model.
In our work, the illumination prompt module guides the model to adapt to different exposure levels and perform targeted image enhancement.
We present a novel Capsule-endoscopy Exposure Correction dataset, including ground-truth and corrupted image pairs annotated by expert photographers.
- Score: 17.075996698542035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wireless Capsule Endoscopy (WCE) is highly valued for its non-invasive and painless approach, though its effectiveness is compromised by uneven illumination from hardware constraints and complex internal dynamics, leading to overexposed or underexposed images. While researchers have discussed the challenges of low-light enhancement in WCE, the issue of correcting for different exposure levels remains underexplored. To tackle this, we introduce EndoUIC, a WCE unified illumination correction solution using an end-to-end promptable diffusion transformer (DiT) model. In our work, the illumination prompt module guides the model to adapt to different exposure levels and perform targeted image enhancement, in which the Adaptive Prompt Integration (API) and Global Prompt Scanner (GPS) modules further boost the concurrent representation learning between the prompt parameters and features. In addition, the U-shaped restoration DiT model captures the long-range dependencies and contextual information for unified illumination restoration. Moreover, we present a novel Capsule-endoscopy Exposure Correction (CEC) dataset, including ground-truth and corrupted image pairs annotated by expert photographers. Extensive experiments against a variety of state-of-the-art (SOTA) methods on four datasets showcase the effectiveness of our proposed method and components in WCE illumination restoration, and the additional downstream experiments further demonstrate its utility for clinical diagnosis and surgical assistance.
Related papers
- ECMamba: Consolidating Selective State Space Model with Retinex Guidance for Efficient Multiple Exposure Correction [48.77198487543991]
We introduce a novel framework based on Mamba for Exposure Correction (ECMamba) with dual pathways, each dedicated to the restoration of reflectance and illumination map.
Specifically, we derive the Retinex theory and we train a Retinex estimator capable of mapping inputs into two intermediary spaces.
We develop a novel 2D Selective State-space layer guided by Retinex information (Retinex-SS2D) as the core operator of ECMM.
arXiv Detail & Related papers (2024-10-28T21:02:46Z)
- LighTDiff: Surgical Endoscopic Image Low-Light Enhancement with T-Diffusion [23.729378821117123]
The Denoising Diffusion Probabilistic Model (DDPM) holds promise for low-light image enhancement in the medical field.
DDPMs are computationally demanding and slow, limiting their practical medical applications.
We propose a lightweight DDPM, dubbed LighTDiff, to capture global structural information using low-resolution images.
arXiv Detail & Related papers (2024-05-17T05:31:19Z)
- Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model [59.08821399652483]
Illumination degradation image restoration (IDIR) techniques aim to improve the visibility of degraded images and mitigate the adverse effects of deteriorated illumination.
Among these algorithms, diffusion model (DM)-based methods have shown promising performance but are often burdened by heavy computational demands and pixel misalignment issues when predicting the image-level distribution.
We propose to leverage DM within a compact latent space to generate concise guidance priors and introduce a novel solution called Reti-Diff for the IDIR task.
Reti-Diff comprises two key components: the Retinex-based latent DM (RLDM) and the Retinex-guided transformer (RG
arXiv Detail & Related papers (2023-11-20T09:55:06Z)
- Improving Lens Flare Removal with General Purpose Pipeline and Multiple Light Sources Recovery [69.71080926778413]
Flare artifacts can affect image visual quality and downstream computer vision tasks.
Current methods do not consider automatic exposure and tone mapping in the image signal processing (ISP) pipeline.
We propose a solution to improve the performance of lens flare removal by revisiting the ISP and designing a more reliable light sources recovery strategy.
arXiv Detail & Related papers (2023-08-31T04:58:17Z)
- Enhancing Low-light Light Field Images with A Deep Compensation Unfolding Network [52.77569396659629]
This paper presents the deep compensation unfolding network (DCUNet) for restoring light field (LF) images captured under low-light conditions.
The framework uses the intermediate enhanced result to estimate the illumination map, which is then employed in the unfolding process to produce a new enhanced result.
To properly leverage the unique characteristics of LF images, this paper proposes a pseudo-explicit feature interaction module.
arXiv Detail & Related papers (2023-08-10T07:53:06Z)
- LLCaps: Learning to Illuminate Low-Light Capsule Endoscopy with Curved Wavelet Attention and Reverse Diffusion [24.560417980602928]
Wireless capsule endoscopy (WCE) is a painless and non-invasive diagnostic tool for gastrointestinal (GI) diseases.
Deep learning-based low-light image enhancement (LLIE) in the medical field is gradually attracting researchers' attention.
We introduce a WCE LLIE framework based on the multi-scale convolutional neural network (CNN) and reverse diffusion process.
arXiv Detail & Related papers (2023-07-05T17:23:42Z)
- This Intestine Does Not Exist: Multiscale Residual Variational Autoencoder for Realistic Wireless Capsule Endoscopy Image Generation [7.430724826764835]
A novel Variational Autoencoder architecture is proposed, namely "This Intestine Does not Exist" (TIDE).
The proposed architecture comprises multiscale feature extraction convolutional blocks and residual connections, which enable the generation of high-quality and diverse datasets.
Contrary to the current approaches, which are oriented towards the augmentation of the available datasets, this study demonstrates that using TIDE, real WCE datasets can be fully substituted.
arXiv Detail & Related papers (2023-02-04T11:49:38Z)
- Multi-Scale Structural-aware Exposure Correction for Endoscopic Imaging [0.879504058268139]
This contribution presents an extension to the objective function of LMSPEC, a method originally introduced to enhance images from natural scenes.
It is used here for the exposure correction in endoscopic imaging and the preservation of structural information.
Tested on the Endo4IE dataset, the proposed implementation yielded an SSIM increase of 4.40% and 4.21% for over- and underexposed images, respectively.
arXiv Detail & Related papers (2022-10-26T21:04:54Z)
- A Novel Hybrid Endoscopic Dataset for Evaluating Machine Learning-based Photometric Image Enhancement Models [0.9236074230806579]
This work introduces a new synthetic dataset produced by generative adversarial techniques.
It also explores both shallow and deep learning-based image-enhancement methods in overexposed and underexposed lighting conditions.
arXiv Detail & Related papers (2022-07-06T01:47:17Z)
- OADAT: Experimental and Synthetic Clinical Optoacoustic Data for Standardized Image Processing [62.993663757843464]
Optoacoustic (OA) imaging is based on excitation of biological tissues with nanosecond-duration laser pulses followed by detection of ultrasound waves generated via light-absorption-mediated thermoelastic expansion.
OA imaging features a powerful combination of rich optical contrast and high resolution in deep tissues.
No standardized datasets generated with different types of experimental set-up and associated processing methods are available to facilitate advances in broader applications of OA in clinical settings.
arXiv Detail & Related papers (2022-06-17T08:11:26Z)
- NuI-Go: Recursive Non-Local Encoder-Decoder Network for Retinal Image Non-Uniform Illumination Removal [96.12120000492962]
The quality of retinal images is often clinically unsatisfactory due to eye lesions and an imperfect imaging process.
One of the most challenging quality degradation issues in retinal images is non-uniform illumination.
We propose a non-uniform illumination removal network for retinal images, called NuI-Go.
arXiv Detail & Related papers (2020-08-07T04:31:33Z)