Coupled Degradation Modeling and Fusion: A VLM-Guided Degradation-Coupled Network for Degradation-Aware Infrared and Visible Image Fusion
- URL: http://arxiv.org/abs/2510.11456v1
- Date: Mon, 13 Oct 2025 14:26:33 GMT
- Title: Coupled Degradation Modeling and Fusion: A VLM-Guided Degradation-Coupled Network for Degradation-Aware Infrared and Visible Image Fusion
- Authors: Tianpei Zhang, Jufeng Zhao, Yiming Zhu, Guangmang Cui
- Abstract summary: We propose a novel VLM-Guided Degradation-Coupled Fusion network (VGDCFusion). Our VGDCFusion significantly outperforms existing state-of-the-art fusion approaches under various degraded image scenarios.
- Score: 9.915632806109555
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing Infrared and Visible Image Fusion (IVIF) methods typically assume high-quality inputs. However, when handling degraded images, these methods heavily rely on manually switching between different pre-processing techniques. This decoupling of degradation handling and image fusion leads to significant performance degradation. In this paper, we propose a novel VLM-Guided Degradation-Coupled Fusion network (VGDCFusion), which tightly couples degradation modeling with the fusion process and leverages vision-language models (VLMs) for degradation-aware perception and guided suppression. Specifically, the proposed Specific-Prompt Degradation-Coupled Extractor (SPDCE) enables modality-specific degradation awareness and establishes a joint modeling of degradation suppression and intra-modal feature extraction. In parallel, the Joint-Prompt Degradation-Coupled Fusion (JPDCF) facilitates cross-modal degradation perception and couples residual degradation filtering with complementary cross-modal feature fusion. Extensive experiments demonstrate that our VGDCFusion significantly outperforms existing state-of-the-art fusion approaches under various degraded image scenarios. Our code is available at https://github.com/Lmmh058/VGDCFusion.
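The abstract describes two prompt-guided modules: a per-modality extractor (SPDCE) that couples degradation suppression with feature extraction, and a joint fusion block (JPDCF) that filters residual degradation while fusing cross-modal features. As a rough illustration only (the actual architecture is in the linked repository), the coupling idea can be sketched as prompt embeddings gating feature channels; all module internals, channel sizes, and the gating mechanism below are assumptions, not the authors' design:

```python
import torch
import torch.nn as nn

class SPDCE(nn.Module):
    """Illustrative stand-in for the Specific-Prompt Degradation-Coupled
    Extractor: a modality-specific prompt embedding (which a VLM would
    produce in the real pipeline) gates feature channels, so degradation
    suppression and intra-modal feature extraction share one pathway."""
    def __init__(self, channels=32, prompt_dim=16):
        super().__init__()
        self.extract = nn.Conv2d(1, channels, 3, padding=1)
        # Prompt embedding -> per-channel gate in [0, 1].
        self.gate = nn.Sequential(nn.Linear(prompt_dim, channels), nn.Sigmoid())

    def forward(self, img, prompt_emb):
        feat = torch.relu(self.extract(img))
        g = self.gate(prompt_emb).unsqueeze(-1).unsqueeze(-1)
        return feat * g  # down-weight degradation-related channels

class JPDCF(nn.Module):
    """Illustrative stand-in for the Joint-Prompt Degradation-Coupled
    Fusion: a joint prompt gates the concatenated cross-modal features
    (residual degradation filtering) before fusing them into one image."""
    def __init__(self, channels=32, prompt_dim=16):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(prompt_dim, 2 * channels), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * channels, 1, 3, padding=1)

    def forward(self, feat_ir, feat_vis, joint_prompt_emb):
        feats = torch.cat([feat_ir, feat_vis], dim=1)
        g = self.gate(joint_prompt_emb).unsqueeze(-1).unsqueeze(-1)
        return torch.sigmoid(self.fuse(feats * g))

# Toy forward pass; real prompt embeddings would come from the VLM.
ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
p_ir, p_vis, p_joint = (torch.rand(1, 16) for _ in range(3))
spdce = SPDCE()
fused = JPDCF()(spdce(ir, p_ir), spdce(vis, p_vis), p_joint)
print(fused.shape)  # torch.Size([1, 1, 64, 64])
```

The point of the sketch is the coupling: degradation handling is not a separate pre-processing stage but a gate applied inside extraction and fusion, which is the decoupling problem the abstract identifies.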
Related papers
- Reversible Efficient Diffusion for Image Fusion [66.35113261837469]
Multi-modal image fusion aims to consolidate complementary information from diverse source images into a unified representation. While diffusion models have demonstrated impressive generative capabilities in image generation, they often suffer from detail loss when applied to image fusion tasks. This issue arises from the accumulation of noise errors inherent in the Markov process, leading to inconsistency and degradation in the fused results. We propose the Reversible Efficient Diffusion (RED) model, an explicitly supervised training framework that inherits the powerful generative capability of diffusion models while avoiding the distribution estimation.
arXiv Detail & Related papers (2026-01-28T05:14:55Z) - MdaIF: Robust One-Stop Multi-Degradation-Aware Image Fusion with Language-Driven Semantics [8.783211177601045]
Infrared and visible image fusion aims to integrate complementary multi-modal information into a single fused result. We propose a one-stop degradation-aware image fusion framework for multi-degradation scenarios driven by a large language model (MdaIF). To adaptively extract diverse weather-aware degradation knowledge and scene feature representations, we employ a pre-trained vision-language model (VLM) in our framework.
arXiv Detail & Related papers (2025-11-16T09:43:12Z) - Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark [58.61079960074608]
Existing infrared image enhancement methods focus on tackling individual degradations. All-in-one enhancement methods, commonly applied to RGB sensors, often demonstrate limited effectiveness.
arXiv Detail & Related papers (2025-10-10T12:55:54Z) - Dual-Domain Perspective on Degradation-Aware Fusion: A VLM-Guided Robust Infrared and Visible Image Fusion Framework [9.915632806109555]
GD2Fusion is a novel framework that integrates vision-language models for degradation perception with dual-domain (frequency/spatial) joint optimization. It achieves superior fusion performance compared with existing algorithms and strategies in dual-source degraded scenarios.
arXiv Detail & Related papers (2025-09-05T10:48:46Z) - SGDFuse: SAM-Guided Diffusion for High-Fidelity Infrared and Visible Image Fusion [65.80051636480836]
This paper proposes a conditional diffusion model guided by the Segment Anything Model (SAM) to achieve high-fidelity and semantically-aware image fusion. The framework operates in a two-stage process: it first performs a preliminary fusion of multi-modal features, and then utilizes the semantic masks as a condition to drive the diffusion model's coarse-to-fine denoising generation. Extensive experiments demonstrate that SGDFuse achieves state-of-the-art performance in both subjective and objective evaluations.
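The two-stage process summarized above (preliminary fusion, then mask-conditioned refinement) can be sketched in miniature. This is a hedged toy, not SGDFuse: the diffusion model is replaced by a simple iterative refiner, stage one is mocked as an average of modalities, and the SAM mask is mocked as a random binary map; only the conditioning structure is illustrated:

```python
import torch
import torch.nn as nn

class MaskConditionedRefiner(nn.Module):
    """Toy stand-in for stage two: the preliminary fused image is refined
    iteratively, conditioned on a semantic mask, mimicking the shape of
    mask-conditioned coarse-to-fine denoising (SGDFuse uses a real
    conditional diffusion model here)."""
    def __init__(self, channels=16):
        super().__init__()
        # Input: preliminary fusion + semantic mask, stacked as 2 channels.
        self.net = nn.Sequential(
            nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, prelim, mask, steps=3):
        x = prelim
        for _ in range(steps):  # iterative refinement, coarse to fine
            x = x + self.net(torch.cat([x, mask], dim=1))
        return torch.sigmoid(x)

# Stage 1 (mocked): preliminary fusion as a plain average of modalities.
ir, vis = torch.rand(1, 1, 32, 32), torch.rand(1, 1, 32, 32)
prelim = 0.5 * (ir + vis)
# SAM's semantic mask mocked by a random binary map.
mask = (torch.rand(1, 1, 32, 32) > 0.5).float()
out = MaskConditionedRefiner()(prelim, mask)
print(out.shape)  # torch.Size([1, 1, 32, 32])
```

The design point carried over from the abstract is that semantics enter as a *condition* on generation rather than as a post-hoc loss term.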
arXiv Detail & Related papers (2025-08-07T10:58:52Z) - UniLDiff: Unlocking the Power of Diffusion Priors for All-in-One Image Restoration [16.493990086330985]
UniLDiff is a unified framework enhanced with degradation- and detail-aware mechanisms. We introduce a Degradation-Aware Feature Fusion (DAFF) to dynamically inject low-quality features into each denoising step. We also design a Detail-Aware Expert Module (DAEM) in the decoder to enhance texture and fine-structure recovery.
arXiv Detail & Related papers (2025-07-31T16:02:00Z) - DDFusion: Degradation-Decoupled Fusion Framework for Robust Infrared and Visible Images Fusion [9.242363983469346]
We propose a Degradation-Decoupled Fusion (DDFusion) framework. DDFusion achieves superior fusion performance under both clean and degraded conditions.
arXiv Detail & Related papers (2025-04-15T05:02:49Z) - ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts [58.99648692413168]
Current image fusion methods struggle to address the composite degradations encountered in real-world imaging scenarios. We propose ControlFusion, which adaptively neutralizes composite degradations. In experiments, ControlFusion outperforms SOTA fusion methods in fusion quality and degradation handling.
arXiv Detail & Related papers (2025-03-30T08:18:53Z) - DSPFusion: Image Fusion via Degradation and Semantic Dual-Prior Guidance [48.84182709640984]
Existing fusion methods are tailored for high-quality images but struggle with degraded images captured under harsh circumstances. This work presents a Degradation and Semantic Prior dual-guided framework for degraded image Fusion (DSPFusion).
arXiv Detail & Related papers (2025-03-30T08:18:50Z) - LLDiffusion: Learning Degradation Representations in Diffusion Models for Low-Light Image Enhancement [118.83316133601319]
Current deep learning methods for low-light image enhancement (LLIE) typically rely on pixel-wise mapping learned from paired data.
We propose a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process.
arXiv Detail & Related papers (2023-07-27T07:22:51Z) - DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion [144.9653045465908]
We propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).
Our approach yields promising fusion results in infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2023-03-13T04:06:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.