VL-UR: Vision-Language-guided Universal Restoration of Images Degraded by Adverse Weather Conditions
- URL: http://arxiv.org/abs/2504.08219v1
- Date: Fri, 11 Apr 2025 02:59:06 GMT
- Title: VL-UR: Vision-Language-guided Universal Restoration of Images Degraded by Adverse Weather Conditions
- Authors: Ziyan Liu, Yuxu Lu, Huashan Yu, Dong Yang
- Abstract summary: We propose a vision-language-guided universal restoration framework (VL-UR) that enhances image restoration by integrating visual and semantic information. Experiments across eleven diverse degradation settings demonstrate VL-UR's state-of-the-art performance, robustness, and adaptability.
- Score: 3.0133850348026936
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image restoration is critical for improving the quality of degraded images, which is vital for applications like autonomous driving, security surveillance, and digital content enhancement. However, existing methods are often tailored to specific degradation scenarios, limiting their adaptability to the diverse and complex challenges in real-world environments. Moreover, real-world degradations are typically non-uniform, highlighting the need for adaptive and intelligent solutions. To address these issues, we propose a novel vision-language-guided universal restoration (VL-UR) framework. VL-UR leverages a zero-shot contrastive language-image pre-training (CLIP) model to enhance image restoration by integrating visual and semantic information. A scene classifier is introduced to adapt CLIP, generating high-quality language embeddings aligned with degraded images while predicting degraded types for complex scenarios. Extensive experiments across eleven diverse degradation settings demonstrate VL-UR's state-of-the-art performance, robustness, and adaptability. This positions VL-UR as a transformative solution for modern image restoration challenges in dynamic, real-world environments.
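The abstract describes the mechanism only at a high level. A minimal PyTorch sketch of the recipe it outlines — a scene classifier that adapts frozen CLIP image features to predict the degradation type and emit a language-aligned embedding, which then conditions a restoration backbone — might look like the following; every module name, dimension, and the FiLM-style conditioning is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SceneClassifier(nn.Module):
    """Illustrative stand-in: maps frozen CLIP image features to a
    degradation-type distribution and a language-aligned embedding."""
    def __init__(self, clip_dim=512, num_degradations=11):
        super().__init__()
        self.type_head = nn.Linear(clip_dim, num_degradations)  # degradation-type logits
        self.embed_head = nn.Linear(clip_dim, clip_dim)         # language-aligned embedding

    def forward(self, clip_feat):
        return self.type_head(clip_feat), self.embed_head(clip_feat)

class ConditionedRestorer(nn.Module):
    """Toy restoration backbone conditioned on the embedding via
    feature-wise scale/shift (FiLM-style) modulation."""
    def __init__(self, clip_dim=512, channels=64):
        super().__init__()
        self.encode = nn.Conv2d(3, channels, 3, padding=1)
        self.film = nn.Linear(clip_dim, 2 * channels)           # produces scale and shift
        self.decode = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, x, text_embed):
        feat = torch.relu(self.encode(x))
        scale, shift = self.film(text_embed).chunk(2, dim=1)
        feat = feat * (1 + scale[..., None, None]) + shift[..., None, None]
        return x + self.decode(feat)                            # residual restoration

# Dummy CLIP feature; a real pipeline would use a frozen CLIP image encoder.
clip_feat = torch.randn(2, 512)
degraded = torch.randn(2, 3, 64, 64)
logits, embed = SceneClassifier()(clip_feat)
restored = ConditionedRestorer()(degraded, embed)
```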
Related papers
- Beyond Degradation Redundancy: Contrastive Prompt Learning for All-in-One Image Restoration [109.38288333994407]
Contrastive Prompt Learning (CPL) is a novel framework that fundamentally enhances prompt-task alignment.
Our framework establishes new state-of-the-art performance while maintaining parameter efficiency, offering a principled solution for unified image restoration.
arXiv Detail & Related papers (2025-04-14T08:24:57Z)
- ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts [58.99648692413168]
Current image fusion methods struggle to address the composite degradations encountered in real-world imaging scenarios.
We propose ControlFusion, which adaptively neutralizes composite degradations.
In experiments, ControlFusion outperforms SOTA fusion methods in fusion quality and degradation handling.
arXiv Detail & Related papers (2025-03-30T08:18:53Z)
- Vision-Language Gradient Descent-driven All-in-One Deep Unfolding Networks [14.180694577459425]
Vision-Language-guided Unfolding Network (VLU-Net) is a unified DUN framework for handling multiple degradation types simultaneously.
VLU-Net is the first all-in-one DUN framework and outperforms current leading one-by-one and all-in-one end-to-end methods by 3.74 dB on the SOTS dehazing dataset and 1.70 dB on the Rain100L deraining dataset.
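For readers unfamiliar with deep unfolding networks (DUNs), the underlying idea is to unroll an iterative optimization scheme into trainable stages. Below is a minimal sketch of one unrolled proximal-gradient step with a learned denoiser standing in for the proximal operator — the generic DUN pattern, not VLU-Net's actual architecture; the degradation operator `A` and its adjoint `At` are placeholder identities.

```python
import torch
import torch.nn as nn

class UnfoldingStage(nn.Module):
    """One unrolled proximal-gradient step:
    x <- denoise(x - eta * A^T(A x - y)), with a learnable step size eta."""
    def __init__(self, channels=3):
        super().__init__()
        self.eta = nn.Parameter(torch.tensor(0.1))   # learned step size
        self.denoiser = nn.Sequential(               # tiny learned prior
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x, y, A, At):
        grad = At(A(x) - y)                          # gradient of 0.5 * ||Ax - y||^2
        return self.denoiser(x - self.eta * grad)

class DeepUnfoldingNet(nn.Module):
    """A stack of K unrolled stages; A/At model the degradation process."""
    def __init__(self, stages=5):
        super().__init__()
        self.stages = nn.ModuleList([UnfoldingStage() for _ in range(stages)])

    def forward(self, y, A=lambda v: v, At=lambda v: v):
        x = y                                        # initialize with the observation
        for stage in self.stages:
            x = stage(x, y, A, At)
        return x

restored = DeepUnfoldingNet()(torch.randn(1, 3, 32, 32))
```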
arXiv Detail & Related papers (2025-03-21T08:02:48Z)
- Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding [67.57487747508179]
Multiple-in-one image restoration (IR) has made significant progress, aiming to handle all types of single degraded image restoration with a single model.
In this paper, we propose a novel multiple-in-one IR model that can effectively restore images with both single and mixed degradations.
arXiv Detail & Related papers (2024-11-25T09:26:34Z)
- PromptHSI: Universal Hyperspectral Image Restoration with Vision-Language Modulated Frequency Adaptation [28.105125164852367]
We propose PromptHSI, the first universal all-in-one (AiO) hyperspectral image (HSI) restoration framework.
Our approach decomposes text prompts into intensity and bias controllers that effectively guide the restoration process.
Our architecture excels at both fine-grained recovery and global information restoration across diverse degradation scenarios.
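One speculative reading of the "intensity and bias controllers" combined with the title's "frequency adaptation": a text embedding yields a multiplicative gain and an additive shift applied to the amplitude spectrum of feature maps. The sketch below is a guess at that mechanism under assumed dimensions, not the paper's code.

```python
import torch
import torch.nn as nn

class FrequencyPromptModulator(nn.Module):
    """Hypothetical module: a text embedding scales (intensity) and
    shifts (bias) the amplitude spectrum of feature maps."""
    def __init__(self, text_dim=512, channels=64):
        super().__init__()
        self.intensity = nn.Linear(text_dim, channels)  # multiplicative controller
        self.bias = nn.Linear(text_dim, channels)       # additive controller

    def forward(self, feat, text_embed):
        # feat: (B, C, H, W); modulate amplitudes in the frequency domain
        spec = torch.fft.rfft2(feat, norm="ortho")
        gain = self.intensity(text_embed)[..., None, None]
        shift = self.bias(text_embed)[..., None, None]
        amp = spec.abs() * (1 + gain) + shift
        spec = torch.polar(amp, spec.angle())           # keep phase, adjust amplitude
        return torch.fft.irfft2(spec, s=feat.shape[-2:], norm="ortho")

out = FrequencyPromptModulator()(torch.randn(2, 64, 32, 32), torch.randn(2, 512))
```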
arXiv Detail & Related papers (2024-11-24T17:08:58Z)
- Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality [69.76121008898677]
Fine-grained Selective Calibrated CLIP (FSC-CLIP) integrates a local hard negative loss and selective calibrated regularization.
Our evaluations show that FSC-CLIP not only achieves compositionality on par with state-of-the-art models but also retains strong multi-modal capabilities.
arXiv Detail & Related papers (2024-10-07T17:16:20Z)
- Multi-Scale Representation Learning for Image Restoration with State-Space Model [13.622411683295686]
We propose a novel Multi-Scale State-Space Model-based network (MS-Mamba) for efficient image restoration.
Our proposed method achieves new state-of-the-art performance while maintaining low computational complexity.
arXiv Detail & Related papers (2024-08-19T16:42:58Z)
- DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution [19.33582308829547]
This paper proposes to leverage degradation-aligned language prompt for accurate, fine-grained, and high-fidelity image restoration.
The proposed method achieves a new state-of-the-art perceptual quality level.
arXiv Detail & Related papers (2024-06-24T09:30:36Z)
- DeeDSR: Towards Real-World Image Super-Resolution via Degradation-Aware Stable Diffusion [27.52552274944687]
We introduce a novel two-stage, degradation-aware framework that enhances the diffusion model's ability to recognize content and degradation in low-resolution images.
In the first stage, we employ unsupervised contrastive learning to obtain representations of image degradations.
In the second stage, we integrate a degradation-aware module into a simplified ControlNet, enabling flexible adaptation to various degradations.
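The unsupervised contrastive learning in the first stage is typically an InfoNCE-style objective in which two views of the same degraded image are positives and the rest of the batch are negatives; the abstract does not spell out the exact formulation, so the following minimal sketch assumes that standard form.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss: two views of the same degraded image are positives,
    all other images in the batch serve as negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature     # (B, B) cosine-similarity matrix
    targets = torch.arange(z1.size(0))     # positive pairs sit on the diagonal
    return F.cross_entropy(logits, targets)

# Dummy degradation embeddings from an encoder applied to two views of a batch.
loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```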
arXiv Detail & Related papers (2024-03-31T12:07:04Z)
- Controlling Vision-Language Models for Multi-Task Image Restoration [6.239038964461397]
We present a degradation-aware vision-language model (DA-CLIP) to better transfer pretrained vision-language models to low-level vision tasks.
Our approach advances state-of-the-art performance on both degradation-specific and unified image restoration tasks.
arXiv Detail & Related papers (2023-10-02T09:10:16Z)
- Cross-Consistent Deep Unfolding Network for Adaptive All-In-One Video Restoration [78.14941737723501]
We propose a Cross-consistent Deep Unfolding Network (CDUN) for All-In-One VR.
By orchestrating two cascading procedures, CDUN achieves adaptive processing for diverse degradations.
In addition, we introduce a window-based inter-frame fusion strategy to utilize information from more adjacent frames.
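Window-based inter-frame fusion generally aggregates features from a fixed temporal window around each frame; one plausible minimal form (illustrative only, not CDUN's actual module) concatenates the windowed features and fuses them with a 1x1 convolution.

```python
import torch
import torch.nn as nn

class WindowFrameFusion(nn.Module):
    """Hypothetical fusion: for each frame t, concatenate features of the
    frames in a temporal window around t and fuse them with a 1x1 conv."""
    def __init__(self, channels=64, window=5):
        super().__init__()
        self.window = window
        self.fuse = nn.Conv2d(window * channels, channels, 1)

    def forward(self, feats):
        # feats: (B, T, C, H, W); replicate edge frames to pad the time axis
        B, T, C, H, W = feats.shape
        r = self.window // 2
        padded = torch.cat([feats[:, :1].expand(B, r, C, H, W),
                            feats,
                            feats[:, -1:].expand(B, r, C, H, W)], dim=1)
        out = []
        for t in range(T):
            win = padded[:, t:t + self.window]          # (B, window, C, H, W)
            out.append(self.fuse(win.reshape(B, -1, H, W)))
        return torch.stack(out, dim=1)                  # (B, T, C, H, W)

fused = WindowFrameFusion()(torch.randn(2, 7, 64, 16, 16))
```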
arXiv Detail & Related papers (2023-09-04T14:18:00Z)
- Real-world Person Re-Identification via Degradation Invariance Learning [111.86722193694462]
Person re-identification (Re-ID) in real-world scenarios usually suffers from various degradation factors, e.g., low resolution, weak illumination, blurring, and adverse weather.
We propose a degradation invariance learning framework for real-world person Re-ID.
By introducing a self-supervised disentangled representation learning strategy, our method extracts robust identity-related features disentangled from degradation factors.
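Disentangled representation learning in this setting usually means splitting the latent code into an identity-related part and a degradation-related part; a minimal illustrative encoder head (assumed structure, not the paper's network) is shown below.

```python
import torch
import torch.nn as nn

class DisentangledEncoder(nn.Module):
    """Hypothetical encoder: one backbone, two heads that split the latent
    code into identity-related and degradation-related features."""
    def __init__(self, latent=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, latent))
        self.id_head = nn.Linear(latent, 128)   # identity-related features
        self.deg_head = nn.Linear(latent, 128)  # degradation-related features

    def forward(self, x):
        h = self.backbone(x)
        return self.id_head(h), self.deg_head(h)

z_id, z_deg = DisentangledEncoder()(torch.randn(4, 3, 128, 64))
```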
arXiv Detail & Related papers (2020-04-10T07:58:50Z)