EndoCaver: Handling Fog, Blur and Glare in Endoscopic Images via Joint Deblurring-Segmentation
- URL: http://arxiv.org/abs/2601.22537v1
- Date: Fri, 30 Jan 2026 04:18:04 GMT
- Title: EndoCaver: Handling Fog, Blur and Glare in Endoscopic Images via Joint Deblurring-Segmentation
- Authors: Zhuoyu Wu, Wenhui Ou, Pei-Sze Tan, Jiayan Yang, Wenqi Fang, Zheng Wang, Raphaël C.-W. Phan
- Abstract summary: EndoCaver is a lightweight transformer with a unidirectional-guided dual-decoder architecture. It enables joint multi-task capability for image deblurring and segmentation. It is well-suited for on-device clinical deployment.
- Score: 9.574713877383244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Endoscopic image analysis is vital for colorectal cancer screening, yet real-world conditions often suffer from lens fogging, motion blur, and specular highlights, which severely compromise automated polyp detection. We propose EndoCaver, a lightweight transformer with a unidirectional-guided dual-decoder architecture, enabling joint multi-task capability for image deblurring and segmentation while significantly reducing computational complexity and model parameters. Specifically, it integrates a Global Attention Module (GAM) for cross-scale aggregation, a Deblurring-Segmentation Aligner (DSA) to transfer restoration cues, and a cosine-based scheduler (LoCoS) for stable multi-task optimisation. Experiments on the Kvasir-SEG dataset show that EndoCaver achieves 0.922 Dice on clean data and 0.889 under severe image degradation, surpassing state-of-the-art methods while reducing model parameters by 90%. These results demonstrate its efficiency and robustness, making it well-suited for on-device clinical deployment. Code is available at https://github.com/ReaganWu/EndoCaver.
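The Dice scores quoted in the abstract measure the overlap between a predicted polyp mask and the ground-truth mask. As a minimal sketch of the standard Dice coefficient only (not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask format, and the empty-mask convention are assumptions made for illustration):

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two binary masks.

    `pred` and `target` are hypothetical inputs: nested lists of 0/1
    values (rows of pixels) with the same shape.
    """
    # Flatten both masks and coerce every pixel to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as foreground in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted pixels, 3 true pixels, 2 of them shared.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
score = dice_coefficient(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground pixels, so the reported drop from 0.922 to 0.889 corresponds to a modest loss of overlap under severe degradation.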
Related papers
- Cancer-Net PCa-MultiSeg: Multimodal Enhancement of Prostate Cancer Lesion Segmentation Using Synthetic Correlated Diffusion Imaging [55.62977326180104]
Current deep learning approaches for prostate cancer lesion segmentation achieve limited performance. We investigate synthetic correlated diffusion imaging (CDI$s$) as an enhancement to standard diffusion-based protocols. Our results establish validated integration pathways for CDI$s$ as a practical drop-in enhancement for PCa lesion segmentation tasks.
arXiv Detail & Related papers (2025-11-11T04:16:12Z) - Few-Step Diffusion via Score identity Distillation [67.07985339442703]
Diffusion distillation has emerged as a promising strategy for accelerating text-to-image (T2I) diffusion models. Existing methods rely on real or teacher-synthesized images to perform well when distilling high-resolution T2I diffusion models. We propose two new guidance strategies: Zero-CFG, which disables CFG in the teacher and removes text conditioning in the fake score network, and Anti-CFG, which applies negative CFG in the fake score network.
arXiv Detail & Related papers (2025-05-19T03:45:16Z) - FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [92.4205087439928]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability. We propose the Self-supervised Transfer (PST) and the Frequency-Decoupled Fusion module (FreDF). PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models, effectively mitigating data scarcity. FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches. This combined approach enables FUSE to construct a universal image-event that only requires lightweight decoder adaptation for target datasets.
arXiv Detail & Related papers (2025-03-25T15:04:53Z) - FusionLungNet: Multi-scale Fusion Convolution with Refinement Network for Lung CT Image Segmentation [1.3124513975412255]
Early detection of lung cancer increases the chances of successful treatment.
New lung segmentation methods face difficulties in identifying long-range relationships between image components.
We propose a hybrid approach using the FusionLungNet network, which has a multi-level structure with key components.
arXiv Detail & Related papers (2024-10-21T09:27:51Z) - Light-weight Retinal Layer Segmentation with Global Reasoning [14.558920359236572]
We propose LightReSeg for retinal layer segmentation which can be applied to OCT images.
Our approach achieves a better segmentation performance compared to the current state-of-the-art method TransUnet.
arXiv Detail & Related papers (2024-04-25T05:42:41Z) - CDSE-UNet: Enhancing COVID-19 CT Image Segmentation with Canny Edge Detection and Dual-Path SENet Feature Fusion [10.831487161893305]
CDSE-UNet is a novel UNet-based segmentation model that integrates Canny operator edge detection and a dual-path SENet feature fusion mechanism.
We have developed a Multiscale Convolution approach, replacing the standard Convolution in UNet, to adapt to the varied lesion sizes and shapes.
Our evaluations on public datasets demonstrate CDSE-UNet's superior performance over other leading models.
arXiv Detail & Related papers (2024-03-03T13:36:07Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - An Attentive-based Generative Model for Medical Image Synthesis [18.94900480135376]
We propose an attention-based dual contrast generative model, called ADC-cycleGAN, which can synthesize medical images from unpaired data with multiple slices.
The model integrates a dual contrast loss term with the CycleGAN loss to ensure that the synthesized images are distinguishable from the source domain.
Experimental results demonstrate that the proposed ADC-cycleGAN model produces comparable samples to other state-of-the-art generative models.
arXiv Detail & Related papers (2023-06-02T14:17:37Z) - Enhanced Sharp-GAN For Histopathology Image Synthesis [63.845552349914186]
Histopathology image synthesis aims to address the data shortage issue in training deep learning approaches for accurate cancer detection.
We propose a novel approach that enhances the quality of synthetic images by using nuclei topology and contour regularization.
The proposed approach outperforms Sharp-GAN in all four image quality metrics on two datasets.
arXiv Detail & Related papers (2023-01-24T17:54:01Z) - Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation [82.61182037130406]
This work deals with medical image segmentation and in particular with accurate polyp detection and segmentation during colonoscopy examinations.
The basic architecture in image segmentation consists of an encoder and a decoder.
We compare several variants of the DeepLab architecture obtained by varying the decoder backbone.
arXiv Detail & Related papers (2021-04-02T02:07:37Z) - D2A U-Net: Automatic Segmentation of COVID-19 Lesions from CT Slices with Dilated Convolution and Dual Attention Mechanism [9.84838467721235]
We propose a dilated dual attention U-Net (D2A U-Net) for COVID-19 lesion segmentation in CT slices based on dilated convolution and a novel dual attention mechanism.
Our experimental results show that introducing dilated convolution and the dual attention mechanism significantly reduces the number of false positives.
arXiv Detail & Related papers (2021-02-10T01:21:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.