Coupled Degradation Modeling and Fusion: A VLM-Guided Degradation-Coupled Network for Degradation-Aware Infrared and Visible Image Fusion
- URL: http://arxiv.org/abs/2510.11456v1
- Date: Mon, 13 Oct 2025 14:26:33 GMT
- Title: Coupled Degradation Modeling and Fusion: A VLM-Guided Degradation-Coupled Network for Degradation-Aware Infrared and Visible Image Fusion
- Authors: Tianpei Zhang, Jufeng Zhao, Yiming Zhu, Guangmang Cui
- Abstract summary: We propose a novel VLM-Guided Degradation-Coupled Fusion network (VGDCFusion). Our VGDCFusion significantly outperforms existing state-of-the-art fusion approaches under various degraded image scenarios.
- Score: 9.915632806109555
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing Infrared and Visible Image Fusion (IVIF) methods typically assume high-quality inputs. However, when handling degraded images, these methods heavily rely on manually switching between different pre-processing techniques. This decoupling of degradation handling and image fusion leads to significant performance degradation. In this paper, we propose a novel VLM-Guided Degradation-Coupled Fusion network (VGDCFusion), which tightly couples degradation modeling with the fusion process and leverages vision-language models (VLMs) for degradation-aware perception and guided suppression. Specifically, the proposed Specific-Prompt Degradation-Coupled Extractor (SPDCE) enables modality-specific degradation awareness and establishes a joint modeling of degradation suppression and intra-modal feature extraction. In parallel, the Joint-Prompt Degradation-Coupled Fusion (JPDCF) facilitates cross-modal degradation perception and couples residual degradation filtering with complementary cross-modal feature fusion. Extensive experiments demonstrate that our VGDCFusion significantly outperforms existing state-of-the-art fusion approaches under various degraded image scenarios. Our code is available at https://github.com/Lmmh058/VGDCFusion.
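The abstract describes two prompt-guided modules: a per-modality extractor (SPDCE) that couples degradation suppression with feature extraction, and a joint fusion block (JPDCF) that filters residual degradation while fusing cross-modal features. As a rough illustration only (the actual architecture is in the linked repository), the coupling idea can be sketched as prompt embeddings gating feature channels; all module internals, channel sizes, and the gating mechanism below are assumptions, not the authors' design:

```python
import torch
import torch.nn as nn

class SPDCE(nn.Module):
    """Illustrative stand-in for the Specific-Prompt Degradation-Coupled
    Extractor: a modality-specific prompt embedding (which a VLM would
    produce in the real pipeline) gates feature channels, so degradation
    suppression and intra-modal feature extraction share one pathway."""
    def __init__(self, channels=32, prompt_dim=16):
        super().__init__()
        self.extract = nn.Conv2d(1, channels, 3, padding=1)
        # Prompt embedding -> per-channel gate in [0, 1].
        self.gate = nn.Sequential(nn.Linear(prompt_dim, channels), nn.Sigmoid())

    def forward(self, img, prompt_emb):
        feat = torch.relu(self.extract(img))
        g = self.gate(prompt_emb).unsqueeze(-1).unsqueeze(-1)
        return feat * g  # down-weight degradation-related channels

class JPDCF(nn.Module):
    """Illustrative stand-in for the Joint-Prompt Degradation-Coupled
    Fusion: a joint prompt gates the concatenated cross-modal features
    (residual degradation filtering) before fusing them into one image."""
    def __init__(self, channels=32, prompt_dim=16):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(prompt_dim, 2 * channels), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * channels, 1, 3, padding=1)

    def forward(self, feat_ir, feat_vis, joint_prompt_emb):
        feats = torch.cat([feat_ir, feat_vis], dim=1)
        g = self.gate(joint_prompt_emb).unsqueeze(-1).unsqueeze(-1)
        return torch.sigmoid(self.fuse(feats * g))

# Toy forward pass; real prompt embeddings would come from the VLM.
ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
p_ir, p_vis, p_joint = (torch.rand(1, 16) for _ in range(3))
spdce = SPDCE()
fused = JPDCF()(spdce(ir, p_ir), spdce(vis, p_vis), p_joint)
print(fused.shape)  # torch.Size([1, 1, 64, 64])
```

The point of the sketch is the coupling: degradation handling is not a separate pre-processing stage but a gate applied inside extraction and fusion, which is the decoupling problem the abstract identifies.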
Related papers
- Reversible Efficient Diffusion for Image Fusion [66.35113261837469]
Multi-modal image fusion aims to consolidate complementary information from diverse source images into a unified representation. While diffusion models have demonstrated impressive generative capabilities in image generation, they often suffer from detail loss when applied to image fusion tasks. This issue arises from the accumulation of noise errors inherent in the Markov process, leading to inconsistency and degradation in the fused results. We propose the Reversible Efficient Diffusion (RED) model, an explicitly supervised training framework that inherits the powerful generative capability of diffusion models while avoiding the distribution estimation.
arXiv Detail & Related papers (2026-01-28T05:14:55Z) - MdaIF: Robust One-Stop Multi-Degradation-Aware Image Fusion with Language-Driven Semantics [8.783211177601045]
Infrared and visible image fusion aims to integrate complementary multi-modal information into a single fused result. We propose a one-stop degradation-aware image fusion framework for multi-degradation scenarios driven by a large language model (MdaIF). To adaptively extract diverse weather-aware degradation knowledge and scene feature representations, we employ a pre-trained vision-language model (VLM) in our framework.
arXiv Detail & Related papers (2025-11-16T09:43:12Z) - Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark [58.61079960074608]
Existing infrared image enhancement methods focus on tackling individual degradations. All-in-one enhancement methods, commonly applied to RGB sensors, often demonstrate limited effectiveness.
arXiv Detail & Related papers (2025-10-10T12:55:54Z) - Dual-Domain Perspective on Degradation-Aware Fusion: A VLM-Guided Robust Infrared and Visible Image Fusion Framework [9.915632806109555]
GD2Fusion is a novel framework that integrates vision-language models for degradation perception with dual-domain (frequency/spatial) joint optimization. It achieves superior fusion performance compared with existing algorithms and strategies in dual-source degraded scenarios.
arXiv Detail & Related papers (2025-09-05T10:48:46Z) - SGDFuse: SAM-Guided Diffusion for High-Fidelity Infrared and Visible Image Fusion [65.80051636480836]
This paper proposes a conditional diffusion model guided by the Segment Anything Model (SAM) to achieve high-fidelity and semantically-aware image fusion. The framework operates in a two-stage process: it first performs a preliminary fusion of multi-modal features, and then utilizes the semantic masks as a condition to drive the diffusion model's coarse-to-fine denoising generation. Extensive experiments demonstrate that SGDFuse achieves state-of-the-art performance in both subjective and objective evaluations.
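The two-stage process summarized above (preliminary fusion, then mask-conditioned refinement) can be sketched in miniature. This is a hedged toy, not SGDFuse: the diffusion model is replaced by a simple iterative refiner, stage one is mocked as an average of modalities, and the SAM mask is mocked as a random binary map; only the conditioning structure is illustrated:

```python
import torch
import torch.nn as nn

class MaskConditionedRefiner(nn.Module):
    """Toy stand-in for stage two: the preliminary fused image is refined
    iteratively, conditioned on a semantic mask, mimicking the shape of
    mask-conditioned coarse-to-fine denoising (SGDFuse uses a real
    conditional diffusion model here)."""
    def __init__(self, channels=16):
        super().__init__()
        # Input: preliminary fusion + semantic mask, stacked as 2 channels.
        self.net = nn.Sequential(
            nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, prelim, mask, steps=3):
        x = prelim
        for _ in range(steps):  # iterative refinement, coarse to fine
            x = x + self.net(torch.cat([x, mask], dim=1))
        return torch.sigmoid(x)

# Stage 1 (mocked): preliminary fusion as a plain average of modalities.
ir, vis = torch.rand(1, 1, 32, 32), torch.rand(1, 1, 32, 32)
prelim = 0.5 * (ir + vis)
# SAM's semantic mask mocked by a random binary map.
mask = (torch.rand(1, 1, 32, 32) > 0.5).float()
out = MaskConditionedRefiner()(prelim, mask)
print(out.shape)  # torch.Size([1, 1, 32, 32])
```

The design point carried over from the abstract is that semantics enter as a *condition* on generation rather than as a post-hoc loss term.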
arXiv Detail & Related papers (2025-08-07T10:58:52Z) - UniLDiff: Unlocking the Power of Diffusion Priors for All-in-One Image Restoration [16.493990086330985]
UniLDiff is a unified framework enhanced with degradation- and detail-aware mechanisms. We introduce a Degradation-Aware Feature Fusion (DAFF) to dynamically inject low-quality features into each denoising step. We also design a Detail-Aware Expert Module (DAEM) in the decoder to enhance texture and fine-structure recovery.
arXiv Detail & Related papers (2025-07-31T16:02:00Z) - DDFusion: Degradation-Decoupled Fusion Framework for Robust Infrared and Visible Images Fusion [9.242363983469346]
We propose a Degradation-Decoupled Fusion (DDFusion) framework. DDFusion achieves superior fusion performance under both clean and degraded conditions.
arXiv Detail & Related papers (2025-04-15T05:02:49Z) - ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts [58.99648692413168]
Current image fusion methods struggle to address the composite degradations encountered in real-world imaging scenarios. We propose ControlFusion, which adaptively neutralizes composite degradations. In experiments, ControlFusion outperforms SOTA fusion methods in fusion quality and degradation handling.
arXiv Detail & Related papers (2025-03-30T08:18:53Z) - DSPFusion: Image Fusion via Degradation and Semantic Dual-Prior Guidance [48.84182709640984]
Existing fusion methods are tailored for high-quality images but struggle with degraded images captured under harsh circumstances. This work presents a Degradation and Semantic Prior dual-guided framework for degraded image Fusion (DSPFusion).
arXiv Detail & Related papers (2025-03-30T08:18:50Z) - LLDiffusion: Learning Degradation Representations in Diffusion Models for Low-Light Image Enhancement [118.83316133601319]
Current deep learning methods for low-light image enhancement (LLIE) typically rely on pixel-wise mapping learned from paired data.
We propose a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process.
arXiv Detail & Related papers (2023-07-27T07:22:51Z) - DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion [144.9653045465908]
We propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).
Our approach yields promising fusion results in infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2023-03-13T04:06:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.