Direction-aware multi-scale gradient loss for infrared and visible image fusion
- URL: http://arxiv.org/abs/2510.13067v1
- Date: Wed, 15 Oct 2025 01:26:39 GMT
- Title: Direction-aware multi-scale gradient loss for infrared and visible image fusion
- Authors: Kaixuan Yang, Wei Xiang, Zhenshuai Chen, Tong Jin, Yunpeng Liu,
- Abstract summary: Infrared and visible image fusion aims to integrate complementary information from co-registered source images to produce a single, informative result. We introduce a direction-aware, multi-scale gradient loss that supervises horizontal and vertical components separately and preserves their sign across scales.
- Score: 11.688147476759566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Infrared and visible image fusion aims to integrate complementary information from co-registered source images to produce a single, informative result. Most learning-based approaches train with a combination of structural similarity loss, intensity reconstruction loss, and a gradient-magnitude term. However, collapsing gradients to their magnitude removes directional information, yielding ambiguous supervision and suboptimal edge fidelity. We introduce a direction-aware, multi-scale gradient loss that supervises horizontal and vertical components separately and preserves their sign across scales. This axis-wise, sign-preserving objective provides clear directional guidance at both fine and coarse resolutions, promoting sharper, better-aligned edges and richer texture preservation without changing model architectures or training protocols. Experiments on open-source models and multiple public benchmarks demonstrate the effectiveness of our approach.
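The axis-wise, sign-preserving objective described above could be sketched as follows. Note that the per-pixel gradient target (keeping whichever source gradient has the larger magnitude) and the 2x2 average-pooling pyramid are assumptions for illustration; the abstract does not specify these details.

```python
import numpy as np

def axis_gradients(img):
    """Signed forward differences along horizontal (x) and vertical (y) axes."""
    gx = img[:, 1:] - img[:, :-1]
    gy = img[1:, :] - img[:-1, :]
    return gx, gy

def avg_pool2(img):
    """2x2 average pooling to build the next coarser scale."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def direction_aware_grad_loss(fused, ir, vis, num_scales=3):
    """L1 loss on signed horizontal/vertical gradients at several scales.
    The target keeps the source gradient with the larger magnitude per
    pixel, sign included (this target rule is an assumption)."""
    loss = 0.0
    for _ in range(num_scales):
        fx, fy = axis_gradients(fused)
        ix, iy = axis_gradients(ir)
        vx, vy = axis_gradients(vis)
        # signed per-pixel targets: larger-magnitude source gradient wins
        tx = np.where(np.abs(ix) >= np.abs(vx), ix, vx)
        ty = np.where(np.abs(iy) >= np.abs(vy), iy, vy)
        # axis-wise supervision: horizontal and vertical terms kept separate
        loss += np.mean(np.abs(fx - tx)) + np.mean(np.abs(fy - ty))
        fused, ir, vis = avg_pool2(fused), avg_pool2(ir), avg_pool2(vis)
    return loss / num_scales
```

Because the horizontal and vertical components are never collapsed into a magnitude, the loss penalizes edges whose orientation (sign) disagrees with the sources, not just edges whose strength differs.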
Related papers
- AngularFuse: A Closer Look at Angle-based Perception for Spatial-Sensitive Multi-Modality Image Fusion [54.84069863008752]
This paper proposes an angle-based perception framework for spatial-sensitive image fusion (AngularFuse). By combining Laplacian edge enhancement with an adaptive histogram, reference images with richer details and more balanced brightness are generated. Experiments on the MSRS, RoadScene, and M3FD public datasets show that AngularFuse outperforms existing mainstream methods by a clear margin.
arXiv Detail & Related papers (2025-10-14T08:13:15Z) - Self-Supervised Enhancement of Forward-Looking Sonar Images: Bridging Cross-Modal Degradation Gaps through Feature Space Transformation and Multi-Frame Fusion [17.384482405769567]
Enhancing forward-looking sonar images is critical for accurate underwater target detection. We propose a feature-space transformation that maps sonar images from the pixel domain to a robust feature domain. Our method significantly outperforms existing approaches, effectively suppressing noise, preserving detailed edges, and substantially improving brightness.
arXiv Detail & Related papers (2025-04-15T08:34:56Z) - WTCL-Dehaze: Rethinking Real-world Image Dehazing via Wavelet Transform and Contrastive Learning [17.129068060454255]
Single image dehazing is essential for applications such as autonomous driving and surveillance.
We propose an enhanced semi-supervised dehazing network that integrates Contrastive Loss and Discrete Wavelet Transform.
Our proposed algorithm achieves superior performance and improved robustness compared to state-of-the-art single image dehazing methods.
arXiv Detail & Related papers (2024-10-07T05:36:11Z) - Efficient Real-world Image Super-Resolution Via Adaptive Directional Gradient Convolution [80.85121353651554]
We introduce kernel-wise differential operations within the convolutional kernel and develop several learnable directional gradient convolutions.
These convolutions are integrated in parallel with a novel linear weighting mechanism to form an Adaptive Directional Gradient Convolution (DGConv).
We further devise an Adaptive Information Interaction Block (AIIBlock) to balance the enhancement of texture and contrast while investigating their interdependencies; simple stacking of these components yields DGPNet for real-world super-resolution.
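A minimal single-channel sketch of the directional-gradient-convolution idea described above: kernel-wise difference variants of a base kernel run in parallel and are fused by scalar weights. The particular branch set and fusion rule here are assumptions for illustration, not the paper's exact DGConv.

```python
import numpy as np

def conv2d(img, k):
    """Valid-mode 2D correlation for a single-channel image."""
    kh, kw = k.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

def dgconv(img, kernel, weights):
    """Parallel branches: the vanilla kernel plus horizontal and vertical
    kernel-wise difference variants, fused by scalar weights (which would
    be learnable in a trained network). Branch set is an assumption."""
    kx = np.diff(kernel, axis=1, prepend=0)  # horizontal kernel difference
    ky = np.diff(kernel, axis=0, prepend=0)  # vertical kernel difference
    branches = [kernel, kx, ky]
    # linearity lets the branches be merged into one equivalent kernel
    fused_k = sum(w * k for w, k in zip(weights, branches))
    return conv2d(img, fused_k)
```

Because convolution is linear, the weighted parallel branches collapse into a single equivalent kernel at inference time, so the directional branches add no extra cost after training.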
arXiv Detail & Related papers (2024-05-11T14:21:40Z) - PAIF: Perception-Aware Infrared-Visible Image Fusion for Attack-Tolerant
Semantic Segmentation [50.556961575275345]
We propose a perception-aware fusion framework to promote segmentation robustness in adversarial scenes.
We show that our scheme substantially enhances the robustness, with gains of 15.3% mIoU, compared with advanced competitors.
arXiv Detail & Related papers (2023-08-08T01:55:44Z) - CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [68.78897015832113]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion. Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z) - Interactive Feature Embedding for Infrared and Visible Image Fusion [94.77188069479155]
General deep learning-based methods for infrared and visible image fusion rely on the unsupervised mechanism for vital information retention.
We propose a novel interactive feature embedding in self-supervised learning framework for infrared and visible image fusion.
arXiv Detail & Related papers (2022-11-09T13:34:42Z) - PC-GANs: Progressive Compensation Generative Adversarial Networks for Pan-sharpening [50.943080184828524]
We propose a novel two-step model for pan-sharpening that sharpens the MS image through the progressive compensation of the spatial and spectral information.
The whole model is composed of triple GANs, and based on the specific architecture, a joint compensation loss function is designed to enable the triple GANs to be trained simultaneously.
arXiv Detail & Related papers (2022-07-29T03:09:21Z) - Infrared and Visible Image Fusion via Interactive Compensatory Attention Adversarial Learning [7.995162257955025]
We propose a novel end-to-end model based on generative adversarial training to achieve a better fusion balance.
In particular, in the generator, we construct a multi-level encoder-decoder network with a triple path, and adopt infrared and visible paths to provide additional intensity and gradient information.
In addition, dual discriminators are designed to identify the distributional similarity between the fused result and the source images, and the generator is optimized to produce a more balanced result.
arXiv Detail & Related papers (2022-03-29T08:28:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.