DFVO: Learning Darkness-free Visible and Infrared Image Disentanglement and Fusion All at Once
- URL: http://arxiv.org/abs/2505.04526v1
- Date: Wed, 07 May 2025 15:59:45 GMT
- Title: DFVO: Learning Darkness-free Visible and Infrared Image Disentanglement and Fusion All at Once
- Authors: Qi Zhou, Yukai Shi, Xiaojun Yang, Xiaoyu Xian, Lunjia Liao, Ruimao Zhang, Liang Lin
- Abstract summary: A Darkness-Free network is proposed to handle Visible and infrared image disentanglement and fusion all at Once (DFVO). DFVO employs a cascaded multi-task approach to replace the traditional two-stage cascaded training (enhancement and fusion). Our proposed approach outperforms state-of-the-art alternatives in both qualitative and quantitative evaluations.
- Score: 57.15043822199561
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visible and infrared image fusion is one of the most crucial tasks in the field of image fusion, aiming to generate fused images with clear structural information and high-quality texture features for high-level vision tasks. However, when visible images suffer severe illumination degradation, the results of existing fusion methods often appear blurry and dim, posing major challenges for autonomous driving. To this end, a Darkness-Free network is proposed to handle Visible and infrared image disentanglement and fusion all at Once (DFVO), which employs a cascaded multi-task approach to replace the traditional two-stage cascaded training (enhancement followed by fusion), addressing the information-entropy loss caused by hierarchical data transmission. Specifically, we construct a latent-common feature extractor (LCFE) to obtain latent features for the cascaded-task strategy. First, a details-extraction module (DEM) is devised to acquire high-frequency semantic information. Second, we design a hyper cross-attention module (HCAM) to extract low-frequency information and preserve texture features from the source images. Finally, a relevant loss function is designed to guide the holistic network learning, thereby achieving better image fusion. Extensive experiments demonstrate that our approach outperforms state-of-the-art alternatives in both qualitative and quantitative evaluations. In particular, DFVO generates clearer, more informative, and more evenly illuminated fusion results in dark environments, achieving the best performance on the LLVIP dataset with 63.258 dB PSNR and 0.724 CC, and providing more effective information for high-level vision tasks. Our code is publicly accessible at https://github.com/DaVin-Qi530/DFVO.
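To make the cascaded single-stage design concrete, here is a minimal PyTorch sketch of the pipeline the abstract describes; the internals of `LCFE`, `DEM`, and `HCAM` below are simplified placeholders chosen for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LCFE(nn.Module):
    """Latent-common feature extractor shared across the cascaded tasks."""
    def __init__(self, in_ch=1, dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)

class DEM(nn.Module):
    """Details-extraction module: emphasizes the high-frequency residual."""
    def __init__(self, dim=32):
        super().__init__()
        self.blur = nn.AvgPool2d(3, stride=1, padding=1)
        self.conv = nn.Conv2d(dim, dim, 3, padding=1)

    def forward(self, f):
        return self.conv(f - self.blur(f))  # feature minus its local average

class HCAM(nn.Module):
    """Cross-attention stand-in: visible features query infrared features."""
    def __init__(self, dim=32, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, f_vis, f_ir):
        b, c, h, w = f_vis.shape
        q = f_vis.flatten(2).transpose(1, 2)   # (B, HW, C) tokens
        kv = f_ir.flatten(2).transpose(1, 2)
        out, _ = self.attn(q, kv, kv)
        return out.transpose(1, 2).reshape(b, c, h, w)

class DFVO(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.lcfe_vis, self.lcfe_ir = LCFE(dim=dim), LCFE(dim=dim)
        self.dem, self.hcam = DEM(dim), HCAM(dim)
        self.head = nn.Conv2d(2 * dim, 1, 3, padding=1)

    def forward(self, vis, ir):
        f_vis, f_ir = self.lcfe_vis(vis), self.lcfe_ir(ir)
        details = self.dem(f_vis + f_ir)    # high-frequency branch
        texture = self.hcam(f_vis, f_ir)    # low-frequency / texture branch
        return torch.sigmoid(self.head(torch.cat([details, texture], dim=1)))

fused = DFVO()(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))  # (1, 1, 64, 64)
```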
Related papers
- SGDFuse: SAM-Guided Diffusion for High-Fidelity Infrared and Visible Image Fusion [38.09521879556221]
This paper proposes a conditional diffusion model guided by the Segment Anything Model (SAM) to achieve high-fidelity and semantically aware image fusion. The framework operates in two stages: it first performs a preliminary fusion of multi-modal features, and then uses the semantic masks as a condition to drive the diffusion model's coarse-to-fine denoising generation. Extensive experiments demonstrate that SGDFuse achieves state-of-the-art performance in both subjective and objective evaluations.
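As an illustration of the second stage, the sketch below conditions a denoiser on a SAM mask by channel concatenation; the conditioning mechanism and module shapes are assumptions for illustration, not SGDFuse's actual architecture.

```python
import torch
import torch.nn as nn

class CondDenoiser(nn.Module):
    """Toy noise predictor conditioned on a preliminary fusion and a SAM mask."""
    def __init__(self, dim=32):
        super().__init__()
        # inputs: noisy fused image (1) + preliminary fusion (1) + SAM mask (1)
        self.net = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, 1, 3, padding=1),
        )

    def forward(self, x_t, prelim, mask):
        return self.net(torch.cat([x_t, prelim, mask], dim=1))  # predicted noise

denoiser = CondDenoiser()
x_t = torch.randn(1, 1, 64, 64)    # current noisy sample
prelim = torch.rand(1, 1, 64, 64)  # stage-1 preliminary fusion
mask = torch.rand(1, 1, 64, 64)    # soft SAM semantic mask
eps = denoiser(x_t, prelim, mask)  # the sampler would use this to take one
                                   # coarse-to-fine denoising step
```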
arXiv Detail & Related papers (2025-08-07T10:58:52Z)
- Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond [74.96466744512992]
The essence of image fusion is to integrate complementary information from source images.
DeFusion++ produces versatile fused representations that can enhance the quality of image fusion and the effectiveness of downstream high-level vision tasks.
arXiv Detail & Related papers (2024-10-16T06:28:49Z)
- DAF-Net: A Dual-Branch Feature Decomposition Fusion Network with Domain Adaptive for Infrared and Visible Image Fusion [21.64382683858586]
Infrared and visible image fusion aims to combine complementary information from both modalities to provide a more comprehensive scene understanding.
We propose a dual-branch feature decomposition fusion network (DAF-Net) with a domain-adaptive branch based on the multi-kernel maximum mean discrepancy (MK-MMD).
By incorporating MK-MMD, the DAF-Net effectively aligns the latent feature spaces of visible and infrared images, thereby improving the quality of the fused images.
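MK-MMD has a standard closed form; the sketch below shows one way such an alignment term could be computed over visible/infrared latent features (the kernel bandwidths here are arbitrary choices, not DAF-Net's).

```python
import torch

def mk_mmd(x, y, sigmas=(1.0, 2.0, 4.0)):
    """Biased MMD^2 between feature batches x, y of shape (N, D),
    using a sum of RBF kernels with bandwidths `sigmas`."""
    def kernel(a, b):
        d2 = torch.cdist(a, b).pow(2)  # pairwise squared distances
        return sum(torch.exp(-d2 / (2 * s ** 2)) for s in sigmas)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

f_vis, f_ir = torch.randn(8, 128), torch.randn(8, 128)  # latent features
loss_align = mk_mmd(f_vis, f_ir)  # added to the fusion loss to pull the
                                  # two latent distributions together
```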
arXiv Detail & Related papers (2024-09-18T02:14:08Z)
- A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion [41.34335755315773]
This paper focuses on how to model correlation-driven decomposed features and reason over high-level graph representations. We propose a three-branch encoder-decoder architecture along with corresponding fusion layers as the fusion strategy. Experiments demonstrate competitive results compared with state-of-the-art methods on visible/infrared image fusion and medical image fusion tasks.
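A possible skeleton for such a three-branch encoder-decoder is sketched below; the branch roles and the fusion rule are guesses for illustration, not the paper's design.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))

class ThreeBranchFusion(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.enc_vis, self.enc_ir = block(1, dim), block(1, dim)  # modality-specific
        self.enc_shared = block(2, dim)                           # joint branch
        self.fuse = block(3 * dim, dim)                           # fusion layer
        self.dec = nn.Conv2d(dim, 1, 3, padding=1)                # decoder

    def forward(self, vis, ir):
        f = torch.cat([self.enc_vis(vis), self.enc_ir(ir),
                       self.enc_shared(torch.cat([vis, ir], dim=1))], dim=1)
        return torch.sigmoid(self.dec(self.fuse(f)))

out = ThreeBranchFusion()(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
```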
arXiv Detail & Related papers (2024-06-11T09:32:40Z)
- VIFNet: An End-to-end Visible-Infrared Fusion Network for Image Dehazing [13.777195433138179]
This study aims to design a visible-infrared fusion network for image dehazing.
In particular, we propose a multi-scale Deep Structure Feature Extraction (DSFE) module to restore more spatial and marginal information.
To validate this, we construct a visible-infrared multimodal dataset called AirSim-VID based on the AirSim simulation platform.
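One common way to realize a multi-scale structure-feature extractor like DSFE is a bank of parallel dilated convolutions, as in this hypothetical sketch (the module's real internals may differ):

```python
import torch
import torch.nn as nn

class MultiScaleDSFE(nn.Module):
    """Parallel dilated convolutions with growing receptive fields."""
    def __init__(self, cin=32, cout=32):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(cin, cout, 3, padding=d, dilation=d) for d in (1, 2, 4)
        ])
        self.merge = nn.Conv2d(3 * cout, cout, 1)

    def forward(self, x):
        feats = [torch.relu(b(x)) for b in self.branches]  # three scales
        return self.merge(torch.cat(feats, dim=1))         # merged multi-scale feature

y = MultiScaleDSFE()(torch.rand(1, 32, 64, 64))
```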
arXiv Detail & Related papers (2024-04-11T14:31:11Z)
- Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder development: pixel misalignment and inefficient inference.
This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion.
The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
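The correlation-driven idea can be expressed as a loss that raises the correlation of base (low-frequency) features across modalities while suppressing it for detail features; the sketch below is a hedged approximation of such an objective, not CDDFuse's exact formulation.

```python
import torch

def cc(a, b, eps=1e-6):
    """Pearson correlation between two flattened feature maps."""
    a, b = a.flatten() - a.mean(), b.flatten() - b.mean()
    return (a * b).sum() / (a.norm() * b.norm() + eps)

def decomp_loss(base_v, base_i, det_v, det_i, eps=1e-6):
    # Push detail correlation toward zero, base correlation up.
    # (+1 keeps the denominator positive since cc ranges over [-1, 1].)
    return cc(det_v, det_i) ** 2 / (cc(base_v, base_i) + 1 + eps)

loss = decomp_loss(*(torch.randn(1, 32, 16, 16) for _ in range(4)))
```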
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- Breaking Free from Fusion Rule: A Fully Semantic-driven Infrared and Visible Image Fusion [51.22863068854784]
Infrared and visible image fusion plays a vital role in the field of computer vision.
Previous approaches strive to design various fusion rules in their loss functions.
We develop a semantic-level fusion network to fully exploit semantic guidance.
arXiv Detail & Related papers (2022-11-22T13:59:59Z)
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [68.78897015832113]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion. Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
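A generic InfoNCE-style term of the kind such a network might couple across modalities is sketched below; the choice of anchors, positives, and negatives here is illustrative only.

```python
import torch
import torch.nn.functional as F

def contrastive(anchor, positive, negatives, tau=0.1):
    """anchor/positive: (N, D); negatives: (N, K, D); returns InfoNCE loss.
    Pulls fused features toward source features, pushes them from negatives
    (e.g., features of blurred or degraded sources)."""
    a = F.normalize(anchor, dim=-1)
    pos = (a * F.normalize(positive, dim=-1)).sum(-1, keepdim=True)      # (N, 1)
    neg = torch.einsum('nd,nkd->nk', a, F.normalize(negatives, dim=-1))  # (N, K)
    logits = torch.cat([pos, neg], dim=1) / tau
    return F.cross_entropy(logits, torch.zeros(len(a), dtype=torch.long))

loss = contrastive(torch.randn(8, 64), torch.randn(8, 64), torch.randn(8, 4, 64))
```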
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- Degrade is Upgrade: Learning Degradation for Low-light Image Enhancement [52.49231695707198]
We investigate the intrinsic degradation and relight the low-light image while refining the details and color in two steps.
Inspired by the color image formulation, we first estimate the degradation from low-light inputs to simulate the distortion of environment illumination color, and then refine the content to recover the loss of diffuse illumination color.
Our proposed method surpasses the SOTA by 0.95 dB in PSNR on the LOL1000 dataset and by 3.18% in mAP on the ExDark dataset.
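The two-step idea reads as: estimate an illumination-degradation map, invert it to relight, then refine the content. A minimal sketch under those assumptions follows (both subnets are simplified stand-ins, not the paper's networks):

```python
import torch
import torch.nn as nn

class Relight(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        # step-1 subnet: per-pixel illumination-degradation map in (0, 1)
        self.estimate = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, 3, 3, padding=1), nn.Sigmoid())
        # step-2 subnet: residual content/color refinement
        self.refine = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, 3, 3, padding=1))

    def forward(self, low, eps=1e-3):
        illum = self.estimate(low)           # step 1: degradation map
        coarse = low / (illum + eps)         # invert the illumination distortion
        return (coarse + self.refine(coarse)).clamp(0, 1)  # step 2: refinement

out = Relight()(torch.rand(1, 3, 64, 64))
```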
arXiv Detail & Related papers (2021-03-19T04:00:27Z)