Underwater Diffusion Attention Network with Contrastive Language-Image Joint Learning for Underwater Image Enhancement
- URL: http://arxiv.org/abs/2505.19895v1
- Date: Mon, 26 May 2025 12:24:56 GMT
- Title: Underwater Diffusion Attention Network with Contrastive Language-Image Joint Learning for Underwater Image Enhancement
- Authors: Afrah Shaahid, Muzammil Behzad
- Abstract summary: UDAN-CLIP is an image-to-image diffusion framework pre-trained on synthetic underwater datasets. It is enhanced with a customized classifier based on a vision-language model, a spatial attention module, and a novel CLIP-Diffusion loss. These contributions enable UDAN-CLIP to perform more effective underwater image enhancement.
- Score: 0.8747606955991707
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Underwater images are often affected by complex degradations such as light absorption, scattering, color casts, and artifacts, making enhancement critical for effective object detection, recognition, and scene understanding in aquatic environments. Existing methods, especially diffusion-based approaches, typically rely on synthetic paired datasets due to the scarcity of real underwater references, introducing bias and limiting generalization. Furthermore, fine-tuning these models can degrade learned priors, resulting in unrealistic enhancements due to domain shifts. To address these challenges, we propose UDAN-CLIP, an image-to-image diffusion framework pre-trained on synthetic underwater datasets and enhanced with a customized classifier based on a vision-language model, a spatial attention module, and a novel CLIP-Diffusion loss. The classifier preserves natural in-air priors and semantically guides the diffusion process, while the spatial attention module focuses on correcting localized degradations such as haze and low contrast. The proposed CLIP-Diffusion loss further strengthens visual-textual alignment and helps maintain semantic consistency during enhancement. The proposed contributions empower our UDAN-CLIP model to perform more effective underwater image enhancement, producing results that are not only visually compelling but also more realistic and detail-preserving. These improvements are consistently validated through both quantitative metrics and qualitative visual comparisons, demonstrating the model's ability to correct distortions and restore natural appearance in challenging underwater conditions.
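The abstract does not spell out the CLIP-Diffusion loss, but a reasonable minimal sketch combines the standard denoising objective with a CLIP image-text alignment term. Everything below (the function names, the prompt, the weight `lambda_clip`) is illustrative, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def clip_diffusion_loss(eps_pred, eps_true, x0_pred, text_emb,
                        clip_image_encoder, lambda_clip=0.1):
    """Hedged sketch of a combined diffusion + CLIP alignment loss.

    eps_pred / eps_true: predicted vs. sampled noise (epsilon-prediction DDPM).
    x0_pred:  the model's current estimate of the clean image.
    text_emb: CLIP text embedding of an enhancement prompt, e.g.
              "a clear, well-lit underwater photo" (assumed prompt).
    """
    # Standard denoising objective.
    l_diff = F.mse_loss(eps_pred, eps_true)

    # CLIP alignment: pull the predicted clean image toward the text prompt.
    img_emb = F.normalize(clip_image_encoder(x0_pred), dim=-1)
    txt_emb = F.normalize(text_emb, dim=-1)
    l_clip = 1.0 - (img_emb * txt_emb).sum(dim=-1).mean()  # cosine distance

    return l_diff + lambda_clip * l_clip
```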
Related papers
- DACA-Net: A Degradation-Aware Conditional Diffusion Network for Underwater Image Enhancement [16.719513778795367]
Underwater images typically suffer from severe colour distortions, low visibility, and reduced structural clarity due to complex optical effects such as scattering and absorption. Existing enhancement methods often struggle to adaptively handle diverse degradation conditions and fail to leverage underwater-specific physical priors effectively. We propose a degradation-aware conditional diffusion model to enhance underwater images adaptively and robustly.
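The abstract does not specify DACA-Net's architecture, but one plausible reading of "degradation-aware conditional diffusion" is to estimate a per-pixel degradation map and feed it to the denoiser as extra conditioning channels, as in this minimal sketch (`denoiser` and `degradation_net` are placeholder modules):

```python
import torch
import torch.nn as nn

class DegradationConditionedDenoiser(nn.Module):
    """Sketch: condition a diffusion denoiser on an estimated degradation map."""

    def __init__(self, denoiser: nn.Module, degradation_net: nn.Module):
        super().__init__()
        self.denoiser = denoiser                 # UNet over concatenated channels
        self.degradation_net = degradation_net   # predicts per-pixel degradation

    def forward(self, x_noisy, t, x_degraded):
        # Estimate degradation (e.g., colour attenuation, haze) from the input.
        deg_map = self.degradation_net(x_degraded)
        # One plausible design: concatenate everything as conditioning channels.
        return self.denoiser(torch.cat([x_noisy, x_degraded, deg_map], dim=1), t)
```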
arXiv Detail & Related papers (2025-07-30T09:16:07Z)
- Phaseformer: Phase-based Attention Mechanism for Underwater Image Restoration and Beyond [25.975859029063585]
We propose a lightweight phase-based transformer network with 1.77M parameters for underwater image restoration (UIR). Our approach focuses on effectively extracting non-contaminated features using a phase-based self-attention mechanism. We demonstrate its effectiveness for low-light image enhancement using the LOL dataset.
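As a minimal sketch of phase-based self-attention, assume tokens are built from the phase of a 2D FFT of the feature map (phase tends to preserve structure while degradations mostly corrupt amplitude); the actual Phaseformer will differ in detail:

```python
import torch
import torch.nn as nn

class PhaseSelfAttention(nn.Module):
    """Sketch: self-attention over Fourier-phase tokens (assumed design)."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                                     # x: (B, C, H, W)
        b, c, h, w = x.shape
        phase = torch.angle(torch.fft.fft2(x, norm="ortho"))  # structure-carrying phase
        tokens = phase.flatten(2).transpose(1, 2)             # (B, H*W, C)
        out, _ = self.attn(tokens, tokens, tokens)
        return out.transpose(1, 2).reshape(b, c, h, w)
```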
arXiv Detail & Related papers (2024-12-02T12:48:19Z)
- HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning [62.264673293638175]
Existing underwater image enhancement methods primarily focus on improving visual quality while overlooking practical implications. We propose an invertible network for underwater perceptual enhancement, dubbed HUPE, which enhances visual quality and demonstrates flexibility in handling other downstream tasks.
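Invertible networks of this kind are typically built from coupling blocks; the sketch below shows one standard affine coupling layer, an assumption about the building block rather than HUPE's actual code:

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Sketch of one affine coupling block; exactly invertible by construction."""

    def __init__(self, channels, hidden=64):
        super().__init__()
        self.net = nn.Sequential(                 # maps x1 -> (log_s, t)
            nn.Conv2d(channels // 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)                # split channels in half
        log_s, t = self.net(x1).chunk(2, dim=1)
        return torch.cat([x1, x2 * torch.exp(log_s) + t], dim=1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        log_s, t = self.net(y1).chunk(2, dim=1)
        return torch.cat([y1, (y2 - t) * torch.exp(-log_s)], dim=1)
```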
arXiv Detail & Related papers (2024-11-27T12:37:03Z)
- Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration [64.84134880709625]
We show that it is possible to perform domain adaptation via the noise space using diffusion models. In particular, by leveraging the unique property of how auxiliary conditional inputs influence the multi-step denoising process, we derive a meaningful diffusion loss. We present crucial strategies such as a channel-shuffling layer and residual-swapping contrastive learning in the diffusion model.
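The channel-shuffling layer is plausibly a ShuffleNet-style group permutation; a minimal sketch under that assumption:

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Sketch of a channel-shuffling layer: permute channels across groups
    so that grouped branches exchange information (assumed ShuffleNet-style)."""
    b, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by groups"
    return (x.view(b, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(b, c, h, w))
```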
arXiv Detail & Related papers (2024-06-26T17:40:30Z)
- Underwater Image Enhancement by Diffusion Model with Customized CLIP-Classifier [5.352081564604589]
Underwater Image Enhancement (UIE) aims to improve the visual quality of a low-quality input.
We propose CLIP-UIE, a novel framework that leverages the potential of Contrastive Language-Image Pretraining (CLIP) for the UIE task.
Specifically, we propose employing color transfer to yield synthetic images by degrading in-air natural images into corresponding underwater images, guided by the real underwater domain.
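One simple way to realize such a color transfer is Reinhard-style per-channel statistics matching, sketched below; CLIP-UIE's actual transfer (and its choice of color space) may differ:

```python
import numpy as np

def color_transfer(src: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Sketch: degrade an in-air image `src` toward the colour statistics of a
    real underwater reference `ref` by matching per-channel mean and std."""
    src = src.astype(np.float64)
    ref = ref.astype(np.float64)
    out = np.empty_like(src)
    for ch in range(3):
        s_mu, s_sd = src[..., ch].mean(), src[..., ch].std() + 1e-6
        r_mu, r_sd = ref[..., ch].mean(), ref[..., ch].std()
        out[..., ch] = (src[..., ch] - s_mu) / s_sd * r_sd + r_mu
    return np.clip(out, 0, 255).astype(np.uint8)
```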
arXiv Detail & Related papers (2024-05-25T12:56:15Z)
- Global Structure-Aware Diffusion Process for Low-Light Image Enhancement [64.69154776202694]
This paper studies a diffusion-based framework to address the low-light image enhancement problem.
We advocate for the regularization of its inherent ODE-trajectory.
Experimental evaluations reveal that the proposed framework attains distinguished performance in low-light enhancement.
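For context, the ODE-trajectory in question is the probability-flow ODE of a score-based diffusion model; the regularizer shown is one plausible (assumed) form penalizing trajectory curvature, not necessarily the paper's exact term:

```latex
% Probability-flow ODE underlying a diffusion sampler (standard result):
\frac{\mathrm{d}x_t}{\mathrm{d}t}
  = f(x_t, t) - \tfrac{1}{2}\, g(t)^2\, \nabla_{x}\log p_t(x_t)

% One plausible trajectory regularizer (assumed form): penalize curvature so
% that sampling paths stay near-straight and preserve global structure.
\mathcal{L}_{\mathrm{reg}}
  = \mathbb{E}_t \left[ \Big\| \frac{\mathrm{d}^2 x_t}{\mathrm{d}t^2} \Big\|_2^2 \right]
```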
arXiv Detail & Related papers (2023-10-26T17:01:52Z)
- WaterFlow: Heuristic Normalizing Flow for Underwater Image Enhancement and Beyond [52.27796682972484]
Existing underwater image enhancement methods mainly focus on image quality improvement, ignoring their effect on practical downstream applications.
We propose a normalizing flow for detection-driven underwater image enhancement, dubbed WaterFlow.
Considering differentiability and interpretability, we incorporate the heuristic prior into the data-driven mapping procedure.
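A detection-driven objective can be sketched as an image-quality term plus the loss of a frozen downstream detector evaluated on the enhanced output; the form below is an assumption, not WaterFlow's exact objective:

```python
def detection_driven_loss(enhanced, reference, detector, targets,
                          image_loss_fn, lam=0.5):
    """Sketch (assumed form): an image-quality term plus a frozen detector's
    loss on the enhanced image, so enhancement also aids detection.
    `detector` is assumed to return its training loss given targets; its
    parameters are frozen elsewhere, so gradients flow only through `enhanced`."""
    l_img = image_loss_fn(enhanced, reference)
    l_det = detector(enhanced, targets)
    return l_img + lam * l_det
```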
arXiv Detail & Related papers (2023-08-02T04:17:35Z)
- LLDiffusion: Learning Degradation Representations in Diffusion Models for Low-Light Image Enhancement [118.83316133601319]
Current deep learning methods for low-light image enhancement (LLIE) typically rely on pixel-wise mapping learned from paired data.
We propose a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process.
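One plausible way to integrate a learned degradation representation into the diffusion process is feature-wise modulation (FiLM) of denoiser features; `deg_emb` below is assumed to come from a separate degradation encoder:

```python
import torch.nn as nn

class FiLMConditioning(nn.Module):
    """Sketch: modulate denoiser features with a degradation embedding."""

    def __init__(self, embed_dim, channels):
        super().__init__()
        self.to_scale_shift = nn.Linear(embed_dim, 2 * channels)

    def forward(self, feat, deg_emb):        # feat: (B, C, H, W); deg_emb: (B, E)
        scale, shift = self.to_scale_shift(deg_emb).chunk(2, dim=-1)
        return feat * (1 + scale[..., None, None]) + shift[..., None, None]
```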
arXiv Detail & Related papers (2023-07-27T07:22:51Z)
- PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators [120.06891448820447]
Obtaining clear and visually pleasing underwater images has become a widespread concern, and the task of underwater image enhancement (UIE) has emerged in response.
In this paper, we propose a physical model-guided GAN model for UIE, referred to as PUGAN.
Our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative evaluations.
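A dual-discriminator adversarial term can be sketched as below, assuming one discriminator judges global style/colour and the other local content; PUGAN's actual discriminator roles and loss weights may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dual_discriminator_g_loss(fake, d_style: nn.Module, d_content: nn.Module,
                              w_style=1.0, w_content=1.0):
    """Sketch: generator loss against two discriminators (assumed roles)."""
    logits_s = d_style(fake)      # global style / colour realism
    logits_c = d_content(fake)    # local content / detail realism
    g_style = F.binary_cross_entropy_with_logits(
        logits_s, torch.ones_like(logits_s))
    g_content = F.binary_cross_entropy_with_logits(
        logits_c, torch.ones_like(logits_c))
    return w_style * g_style + w_content * g_content
```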
arXiv Detail & Related papers (2023-06-15T07:41:12Z)
- Domain Adaptive Adversarial Learning Based on Physics Model Feedback for Underwater Image Enhancement [10.143025577499039]
We propose a new robust adversarial learning framework that combines physics-model-based feedback control with a domain adaptation mechanism to enhance underwater images.
A new method is proposed for simulating an underwater-like training dataset from RGB-D data via the underwater image formation model (sketched below).
Final enhanced results on synthetic and real underwater images demonstrate the superiority of the proposed method.
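The underwater image formation model mentioned above is the standard one, I_c = J_c * t_c + B_c * (1 - t_c) with per-channel transmission t_c = exp(-beta_c * d); the sketch below turns an RGB-D pair into an underwater-like image using illustrative attenuation coefficients (red attenuates fastest underwater):

```python
import numpy as np

def synthesize_underwater(rgb: np.ndarray, depth: np.ndarray,
                          beta=(0.7, 0.3, 0.1), background=(0.1, 0.3, 0.4)):
    """Sketch of I_c = J_c*t_c + B_c*(1 - t_c), with t_c = exp(-beta_c * d).
    `beta` (per-channel attenuation, RGB order) and `background` (veiling
    light B) are illustrative values, not the paper's calibrated ones."""
    j = rgb.astype(np.float64) / 255.0                # clean in-air image J
    t = np.exp(-np.asarray(beta) * depth[..., None])  # per-channel transmission
    i = j * t + np.asarray(background) * (1.0 - t)
    return (np.clip(i, 0.0, 1.0) * 255).astype(np.uint8)
```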
arXiv Detail & Related papers (2020-02-20T07:50:00Z)