Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws
- URL: http://arxiv.org/abs/2510.26268v1
- Date: Thu, 30 Oct 2025 08:53:13 GMT
- Title: Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws
- Authors: Lin Guo, Xiaoqing Luo, Wei Xie, Zhancheng Zhang, Hui Li, Rui Wang, Zhenhua Feng, Xiaoning Song
- Abstract summary: Existing infrared and visible image fusion methods often face the dilemma of balancing modal information. This manuscript revisits the essence of generative image fusion under the inspiration of human cognitive laws. It proposes a novel infrared and visible image fusion method, termed HCLFuse.
- Score: 25.147481162887136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing infrared and visible image fusion methods often face the dilemma of balancing modal information. Generative fusion methods reconstruct fused images by learning from data distributions, but their generative capabilities remain limited. Moreover, the lack of interpretability in modal information selection further affects the reliability and consistency of fusion results in complex scenarios. This manuscript revisits the essence of generative image fusion under the inspiration of human cognitive laws and proposes a novel infrared and visible image fusion method, termed HCLFuse. First, HCLFuse investigates the quantification theory of information mapping in unsupervised fusion networks, which leads to the design of a multi-scale mask-regulated variational bottleneck encoder. This encoder applies posterior probability modeling and information decomposition to extract accurate and concise low-level modal information, thereby supporting the generation of high-fidelity structural details. Furthermore, the probabilistic generative capability of the diffusion model is integrated with physical laws, forming a time-varying physical guidance mechanism that adaptively regulates the generation process at different stages, thereby enhancing the ability of the model to perceive the intrinsic structure of data and reducing dependence on data quality. Experimental results show that the proposed method achieves state-of-the-art fusion performance in qualitative and quantitative evaluations across multiple datasets and significantly improves semantic segmentation metrics. This fully demonstrates the advantages of this generative image fusion method, drawing inspiration from human cognition, in enhancing structural consistency and detail quality.
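The variational bottleneck encoder described above rests on standard variational mechanics: model a posterior over the latent, sample it with the reparameterization trick, and regularize with a KL term so only concise information survives. As a minimal numpy sketch (not the authors' implementation; the linear mean/log-variance heads and the tensor shapes are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def variational_bottleneck(features, rng):
    """Map features to a posterior N(mu, sigma^2), draw a sample via the
    reparameterization trick, and return the KL term against N(0, I).

    Hypothetical heads: in HCLFuse these would be learned, multi-scale,
    mask-regulated networks; here mu/log_var are toy projections.
    """
    mu = 0.5 * features                  # illustrative mean head
    log_var = -np.abs(features)          # illustrative log-variance head
    eps = rng.standard_normal(features.shape)
    z = mu + np.exp(0.5 * log_var) * eps     # reparameterized sample
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over all dimensions
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return z, kl

features = rng.standard_normal((4, 8))
z, kl = variational_bottleneck(features, rng)
```

The KL term is what makes the bottleneck "concise": it penalizes latent dimensions that carry information, so the encoder keeps only what reconstruction demands. The mask regulation and multi-scale structure of the actual encoder are beyond this generic sketch.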
Related papers
- Reversible Efficient Diffusion for Image Fusion [66.35113261837469]
Multi-modal image fusion aims to consolidate complementary information from diverse source images into a unified representation. While diffusion models have demonstrated impressive generative capabilities in image generation, they often suffer from detail loss when applied to image fusion tasks. This issue arises from the accumulation of noise errors inherent in the Markov process, leading to inconsistency and degradation in the fused results. We propose the Reversible Efficient Diffusion (RED) model - an explicitly supervised training framework that inherits the powerful generative capability of diffusion models while avoiding the distribution estimation.
arXiv Detail & Related papers (2026-01-28T05:14:55Z)
- Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach [99.80480649258557]
DiTFuse is an instruction-driven framework that performs semantics-aware fusion within a single model. Experiments on public IVIF, MFF, and MEF benchmarks confirm superior quantitative and qualitative performance, sharper textures, and better semantic retention.
arXiv Detail & Related papers (2025-12-08T05:04:54Z)
- Efficient Rectified Flow for Image Fusion [48.330480065862474]
We propose RFfusion, an efficient one-step diffusion model for image fusion based on Rectified Flow. We also propose a task-specific variational autoencoder architecture tailored for image fusion. Our method outperforms other state-of-the-art methods in terms of both inference speed and fusion quality.
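Rectified Flow enables one-step sampling because it trains the velocity field along (near-)straight paths between noise and data, and on a straight path a single Euler step is exact. A toy numpy sketch with an oracle velocity (illustrative only; RFfusion uses a trained network, not the true velocity) shows why:

```python
import numpy as np

def rectified_flow_step(x0, v, t0=0.0, t1=1.0):
    """One Euler step along a rectified-flow path.

    On the straight path x_t = (1 - t) * x0 + t * x1 the velocity
    v = x1 - x0 is constant in t, so one step from t0=0 to t1=1
    lands exactly on x1 - the basis of one-step generation.
    """
    return x0 + (t1 - t0) * v

rng = np.random.default_rng(1)
x0 = rng.standard_normal((8, 8))   # e.g. a noise latent
x1 = rng.standard_normal((8, 8))   # e.g. the fused-image latent
v = x1 - x0                        # oracle velocity (learned in practice)
x1_hat = rectified_flow_step(x0, v)
```

In practice the learned field only approximates these straight paths, which is why reflow/distillation steps matter; the sketch captures only the geometric intuition.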
arXiv Detail & Related papers (2025-09-20T06:21:00Z)
- Causality-Driven Infrared and Visible Image Fusion [7.454657847653563]
This paper re-examines the image fusion task from the causality perspective. It disentangles the model from the impact of bias by constructing a tailored causal graph. A Back-door Adjustment based Feature Fusion Module (BAFFM) is proposed to eliminate confounder interference.
arXiv Detail & Related papers (2025-05-27T07:48:52Z)
- Fusion of Infrared and Visible Images based on Spatial-Channel Attentional Mechanism [3.388001684915793]
We present AMFusionNet, an innovative approach to infrared and visible image fusion (IVIF).
By assimilating thermal details from infrared images with texture features from visible sources, our method produces images enriched with comprehensive information.
Our method outperforms state-of-the-art algorithms in both qualitative and quantitative terms.
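Spatial-channel attention of the kind this entry names typically pairs a channel-wise gate with a spatial gate. As a hedged sketch of the channel half only (an SE-style gate; the weight matrix `w` stands in for AMFusionNet's learned layers, whose details are not given in the summary):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w):
    """SE-style channel attention on a (C, H, W) feature map:
    global average pooling -> linear gate -> sigmoid -> rescale.
    Channels carrying salient content (e.g. thermal targets) get
    gates near 1; uninformative channels are suppressed."""
    c = feat.shape[0]
    pooled = feat.reshape(c, -1).mean(axis=1)   # (C,) channel descriptor
    gate = sigmoid(w @ pooled)                  # (C,) weights in (0, 1)
    return feat * gate[:, None, None]

rng = np.random.default_rng(2)
feat = rng.standard_normal((16, 8, 8))   # hypothetical fused feature map
w = rng.standard_normal((16, 16)) * 0.1  # hypothetical gate weights
out = channel_attention(feat, w)
```

A spatial gate works analogously but pools over channels and rescales per pixel; the two are usually applied in sequence.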
arXiv Detail & Related papers (2023-08-25T21:05:11Z)
- DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion [144.9653045465908]
We propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).
Our approach yields promising fusion results in infrared-visible image fusion and medical image fusion.
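DDPM-based fusion builds on the closed-form forward process x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps and its inversion given a noise estimate. A minimal numpy sketch with oracle noise (DDFM instead uses a trained denoiser and a fusion-specific likelihood; this only shows the shared DDPM algebra):

```python
import numpy as np

def ddpm_forward(x0, eps, alpha_bar_t):
    """Closed-form DDPM forward sample at timestep t:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

def predict_x0(xt, eps_hat, alpha_bar_t):
    """Invert the forward step from a noise estimate eps_hat - the
    x0-prediction used inside every reverse diffusion step."""
    return (xt - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)

rng = np.random.default_rng(3)
x0 = rng.standard_normal((8, 8))    # hypothetical clean image patch
eps = rng.standard_normal((8, 8))
xt = ddpm_forward(x0, eps, alpha_bar_t=0.5)
x0_hat = predict_x0(xt, eps, alpha_bar_t=0.5)   # exact with oracle noise
```

With a learned noise predictor the recovered x0 is only approximate, and fusion methods steer this estimate toward consistency with both source modalities.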
arXiv Detail & Related papers (2023-03-13T04:06:42Z)
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [68.78897015832113]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion. Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- Interactive Feature Embedding for Infrared and Visible Image Fusion [94.77188069479155]
General deep learning-based methods for infrared and visible image fusion rely on the unsupervised mechanism for vital information retention.
We propose a novel interactive feature embedding in self-supervised learning framework for infrared and visible image fusion.
arXiv Detail & Related papers (2022-11-09T13:34:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.