DCEvo: Discriminative Cross-Dimensional Evolutionary Learning for Infrared and Visible Image Fusion
- URL: http://arxiv.org/abs/2503.17673v1
- Date: Sat, 22 Mar 2025 07:01:58 GMT
- Title: DCEvo: Discriminative Cross-Dimensional Evolutionary Learning for Infrared and Visible Image Fusion
- Authors: Jinyuan Liu, Bowei Zhang, Qingyun Mei, Xingyuan Li, Yang Zou, Zhiying Jiang, Long Ma, Risheng Liu, Xin Fan
- Abstract summary: Infrared and visible image fusion integrates information from distinct spectral bands to enhance image quality. Existing approaches treat image fusion and subsequent high-level tasks as separate processes. We propose a Discriminative Cross-Dimensional Evolutionary Learning Framework, termed DCEvo, which simultaneously enhances visual quality and perception accuracy.
- Score: 58.36400052566673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Infrared and visible image fusion integrates information from distinct spectral bands to enhance image quality by leveraging the strengths and mitigating the limitations of each modality. Existing approaches typically treat image fusion and subsequent high-level tasks as separate processes, resulting in fused images that offer only marginal gains in task performance and fail to provide constructive feedback for optimizing the fusion process. To overcome these limitations, we propose a Discriminative Cross-Dimensional Evolutionary Learning Framework, termed DCEvo, which simultaneously enhances visual quality and perception accuracy. Leveraging the robust search capabilities of Evolutionary Learning, our approach formulates the optimization of dual tasks as a multi-objective problem by employing an Evolutionary Algorithm (EA) to dynamically balance loss function parameters. Inspired by visual neuroscience, we integrate a Discriminative Enhancer (DE) within both the encoder and decoder, enabling the effective learning of complementary features from different modalities. Additionally, our Cross-Dimensional Embedding (CDE) block facilitates mutual enhancement between high-dimensional task features and low-dimensional fusion features, ensuring a cohesive and efficient feature integration process. Experimental results on three benchmarks demonstrate that our method significantly outperforms state-of-the-art approaches, achieving an average improvement of 9.32% in visual quality while also enhancing subsequent high-level tasks. The code is available at https://github.com/Beate-Suy-Zhang/DCEvo.
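- Note: The abstract describes the evolutionary weight balancing only at a high level. Below is a minimal, hypothetical Python sketch of that idea, not the authors' code: a population of loss-weight vectors is scored on two objectives (visual quality and task accuracy), and the Pareto-optimal survivors are kept and mutated. All names here (train_briefly, dominates, mutate) are placeholders assumed for illustration.

```python
import random

POP_SIZE, GENERATIONS, N_WEIGHTS = 8, 5, 3  # e.g. weights for intensity,
                                            # gradient, and task-loss terms

def train_briefly(weights):
    # Toy surrogate for a short training run. In the real setting this
    # would train the fusion network with the weighted loss and return
    # (visual_quality, task_accuracy) measured on a validation split.
    visual = 1.0 / (1.0 + abs(weights[0] - 0.5) + abs(weights[1] - 0.5))
    task = 1.0 / (1.0 + abs(weights[2] - 0.8))
    return visual, task

def dominates(a, b):
    # Pareto dominance when maximizing both objectives.
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def mutate(w, sigma=0.1):
    # Gaussian perturbation, keeping every weight positive.
    return [max(1e-3, wi + random.gauss(0, sigma)) for wi in w]

population = [[random.uniform(0.1, 1.0) for _ in range(N_WEIGHTS)]
              for _ in range(POP_SIZE)]

for gen in range(GENERATIONS):
    scored = [(w, train_briefly(w)) for w in population]
    # Keep the non-dominated (Pareto-front) weight vectors ...
    front = [w for w, s in scored
             if not any(dominates(s2, s) for w2, s2 in scored if w2 is not w)]
    # ... and refill the population by mutating random survivors.
    population = front + [mutate(random.choice(front))
                          for _ in range(POP_SIZE - len(front))]

print("candidate loss weights:", population[0])
```

In the actual framework, the two scores would presumably come from fusion-quality metrics and detector/segmenter accuracy rather than the toy surrogate used above.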
Related papers
- OCCO: LVM-guided Infrared and Visible Image Fusion Framework based on Object-aware and Contextual COntrastive Learning [19.22887628187884]
A novel LVM-guided fusion framework with Object-aware and Contextual COntrastive learning is proposed. A novel feature interaction fusion network is also designed to resolve information conflicts in fused images caused by modality differences. The effectiveness of the proposed method is validated, and exceptional performance is also demonstrated on downstream visual tasks.
arXiv Detail & Related papers (2025-03-24T12:57:23Z)
- Every SAM Drop Counts: Embracing Semantic Priors for Multi-Modality Image Fusion and Beyond [52.486290612938895]
Multi-modality image fusion plays a crucial role in integrating diverse modalities to enhance scene understanding. Recent approaches have shifted towards task-specific design, but struggle to achieve the "Best of Both Worlds" due to inconsistent optimization goals. We propose a novel method that leverages the semantic knowledge from the Segment Anything Model (SAM) to Grow the quality of fusion results and Establish downstream task adaptability, namely SAGE.
arXiv Detail & Related papers (2025-03-03T06:16:31Z)
- Improving Neural Surface Reconstruction with Feature Priors from Multi-View Image [87.00660347447494]
Recent advancements in Neural Surface Reconstruction (NSR) have significantly improved multi-view reconstruction when coupled with volume rendering.
We investigate feature-level consistency losses, aiming to harness valuable feature priors from diverse pretext visual tasks.
Our results, analyzed on DTU and EPFL, reveal that feature priors from image matching and multi-view stereo datasets outperform other pretext tasks.
arXiv Detail & Related papers (2024-08-04T16:09:46Z)
- From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion [66.33467192279514]
We introduce a text-guided multi-modality image fusion method that leverages the high-level semantics from textual descriptions to integrate semantics from infrared and visible images.
Our method not only produces visually superior fusion results but also achieves a higher detection mAP than existing methods, setting a new state of the art.
arXiv Detail & Related papers (2023-12-31T08:13:47Z)
- PAIF: Perception-Aware Infrared-Visible Image Fusion for Attack-Tolerant Semantic Segmentation [50.556961575275345]
We propose a perception-aware fusion framework to promote segmentation robustness in adversarial scenes.
We show that our scheme substantially enhances robustness, with gains of 15.3% mIoU compared with advanced competitors.
arXiv Detail & Related papers (2023-08-08T01:55:44Z)
- MLF-DET: Multi-Level Fusion for Cross-Modal 3D Object Detection [54.52102265418295]
We propose a novel and effective Multi-Level Fusion network, named MLF-DET, for high-performance cross-modal 3D object DETection.
For the feature-level fusion, we present the Multi-scale Voxel Image fusion (MVI) module, which densely aligns multi-scale voxel features with image features.
For the decision-level fusion, we propose the lightweight Feature-cued Confidence Rectification (FCR) module, which exploits image semantics to rectify the confidence of detection candidates.
arXiv Detail & Related papers (2023-07-18T11:26:02Z)
- Unsupervised Image Fusion Method based on Feature Mutual Mapping [16.64607158983448]
We propose an unsupervised adaptive image fusion method to address the above issues.
We construct a global map to measure the connections between pixels of the input source images.
Our method achieves superior performance in both visual perception and objective evaluation.
arXiv Detail & Related papers (2022-01-25T07:50:14Z)
- Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space.
In this study, we aim to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z)
- AE-Net: Autonomous Evolution Image Fusion Method Inspired by Human Cognitive Mechanism [34.57055312296812]
We propose a robust and general image fusion method with autonomous evolution ability, denoted AE-Net.
By collaboratively optimizing multiple image fusion methods to simulate the cognitive process of the human brain, the unsupervised image fusion task can be transformed into a semi-supervised or fully supervised one (see the sketch after this list).
Our method effectively unifies cross-modal and same-modal image fusion tasks, and overcomes differences in data distribution across datasets.
arXiv Detail & Related papers (2020-07-17T05:19:51Z)
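Note: the AE-Net entry above only hints at how unsupervised fusion becomes supervised. A hypothetical Python sketch of that pseudo-labeling idea follows; the quality() heuristic and the stand-in fusion functions are illustrative assumptions, not the paper's method.

```python
import numpy as np

def quality(fused, ir, vis):
    # Toy no-reference score: correlation of the fused image with both
    # inputs (a stand-in for real fusion metrics such as Qabf or MI).
    corr = lambda a, b: float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
    return corr(fused, ir) + corr(fused, vis)

def make_pseudo_label(ir, vis, fuse_fns):
    # Run every existing fusion method and keep the best-scoring output
    # as the supervision target for a learned fusion network.
    candidates = [f(ir, vis) for f in fuse_fns]
    return max(candidates, key=lambda c: quality(c, ir, vis))

# Example with two trivial stand-in "fusion methods":
ir, vis = np.random.rand(64, 64), np.random.rand(64, 64)
fuse_fns = [lambda a, b: 0.5 * (a + b), lambda a, b: np.maximum(a, b)]
label = make_pseudo_label(ir, vis, fuse_fns)  # training target
```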
This list is automatically generated from the titles and abstracts of the papers on this site.