VIFNet: An End-to-end Visible-Infrared Fusion Network for Image Dehazing
- URL: http://arxiv.org/abs/2404.07790v1
- Date: Thu, 11 Apr 2024 14:31:11 GMT
- Title: VIFNet: An End-to-end Visible-Infrared Fusion Network for Image Dehazing
- Authors: Meng Yu, Te Cui, Haoyang Lu, Yufeng Yue
- Abstract summary: This study aims to design a visible-infrared fusion network for image dehazing.
In particular, we propose a multi-scale Deep Structure Feature Extraction (DSFE) module to restore more spatial and marginal information.
To validate this, we construct a visible-infrared multimodal dataset called AirSim-VID based on the AirSim simulation platform.
- Score: 13.777195433138179
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image dehazing poses significant challenges in environmental perception. Recent research mainly focuses on deep learning-based methods with a single modality, which may result in severe information loss, especially in dense-haze scenarios. The infrared image is robust to haze; however, existing methods have primarily treated the infrared modality as auxiliary information, failing to fully exploit its rich information for dehazing. To address this challenge, the key insight of this study is to design a visible-infrared fusion network for image dehazing. In particular, we propose a multi-scale Deep Structure Feature Extraction (DSFE) module, which incorporates the Channel-Pixel Attention Block (CPAB) to restore more spatial and marginal information within the deep structural features. Additionally, we introduce an inconsistency-weighted fusion strategy that merges the two modalities by leveraging the more reliable information. To validate this, we construct a visible-infrared multimodal dataset called AirSim-VID based on the AirSim simulation platform. Extensive experiments on challenging real and simulated image datasets demonstrate that VIFNet outperforms many state-of-the-art competing methods. The code and dataset are available at https://github.com/mengyu212/VIFNet_dehazing.
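To make the abstract's two main components more concrete, below is a minimal PyTorch sketch of what a Channel-Pixel Attention Block and an inconsistency-weighted fusion step could look like. The names follow the abstract, but every design choice here (squeeze-and-excitation-style channel gating, the reduction ratio, and the L1 feature distance as the inconsistency measure) is an illustrative assumption rather than the authors' implementation; the actual code is in the linked repository.

```python
import torch
import torch.nn as nn

class CPAB(nn.Module):
    """Illustrative sketch of a Channel-Pixel Attention Block (not the
    paper's implementation). Channel attention reweights feature maps
    globally; pixel attention then reweights each spatial location."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: global pooling -> bottleneck -> sigmoid gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Pixel attention: 1x1 convs produce a single-channel spatial gate.
        self.pixel_gate = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)   # emphasize informative channels
        return x * self.pixel_gate(x)  # emphasize informative pixels


def inconsistency_weighted_fusion(f_vis: torch.Tensor,
                                  f_ir: torch.Tensor) -> torch.Tensor:
    """Fuse visible/infrared feature maps, shifting weight toward the
    infrared branch where the two modalities disagree (e.g., dense haze).
    Using the L1 distance as the inconsistency measure is an assumption."""
    # Per-pixel disagreement between the two feature maps, shape (B,1,H,W).
    inconsistency = (f_vis - f_ir).abs().mean(dim=1, keepdim=True)
    w_vis = torch.sigmoid(-inconsistency)  # more disagreement -> trust vis less
    return w_vis * f_vis + (1.0 - w_vis) * f_ir
```

As a usage note, `CPAB(64)(torch.randn(1, 64, 32, 32))` returns a tensor of the same shape, and the fusion function can be applied independently at each scale of a multi-scale encoder.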
Related papers
- InfMAE: A Foundation Model in the Infrared Modality [38.23685358198649]
In this paper, we propose InfMAE, a foundation model in the infrared modality.
We release an infrared dataset, called Inf30, to address the lack of large-scale data for self-supervised learning.
We also design an information-aware masking strategy suited to infrared images; a generic sketch of one such strategy appears below.
arXiv Detail & Related papers (2024-02-01T08:02:10Z)
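The summary above does not say how InfMAE's masking is made "information-aware"; one plausible reading is that patches are masked with probability proportional to an information score. The sketch below follows that assumption, scoring patches by local intensity variance; both the score and the sampling scheme are guesses for illustration, not InfMAE's published strategy.

```python
import torch

def information_aware_mask(img: torch.Tensor, patch: int = 16,
                           mask_ratio: float = 0.75) -> torch.Tensor:
    """Boolean patch mask biased by per-patch variance (True = masked).

    Illustrative only: using variance as the information score and
    biasing masking probability by it are assumptions, not InfMAE's
    published strategy. img is (B, 1, H, W), H and W divisible by patch.
    """
    # Cut the image into non-overlapping patches: (B, 1, h, w, p, p).
    patches = img.unfold(2, patch, patch).unfold(3, patch, patch)
    # Per-patch variance as a crude information score, flattened to (B, N).
    scores = patches.var(dim=(-2, -1)).flatten(1)
    probs = (scores + 1e-8) / (scores + 1e-8).sum(dim=1, keepdim=True)
    n_mask = int(mask_ratio * probs.shape[1])
    # Sample patches to mask without replacement, biased toward high scores.
    idx = torch.multinomial(probs, n_mask)
    mask = torch.zeros_like(probs, dtype=torch.bool)
    mask.scatter_(1, idx, True)
    return mask
```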
- SSPFusion: A Semantic Structure-Preserving Approach for Infrared and Visible Image Fusion [30.55433673796615]
Most existing learning-based infrared and visible image fusion (IVIF) methods exhibit massive redundant information in the fused images.
We propose a semantic structure-preserving approach for IVIF, namely SSPFusion.
Our method is able to generate high-quality fusion images from pairs of infrared and visible images, which can boost the performance of downstream computer-vision tasks.
arXiv Detail & Related papers (2023-09-26T08:13:32Z)
- An Interactively Reinforced Paradigm for Joint Infrared-Visible Image Fusion and Saliency Object Detection [59.02821429555375]
This research focuses on discovering and localizing hidden objects in the wild, in service of unmanned systems.
Through empirical analysis, infrared and visible image fusion (IVIF) makes hard-to-find objects apparent.
Multimodal salient object detection (SOD) accurately delineates the precise spatial location of objects within the picture.
arXiv Detail & Related papers (2023-05-17T06:48:35Z)
- Local Contrast and Global Contextual Information Make Infrared Small Object Salient Again [5.324958606516871]
Infrared small object detection (ISOS) aims to segment small objects, covered by only a few pixels, from the cluttered background of infrared images.
It is highly challenging because: 1) small objects lack sufficient intensity, shape, and texture information; 2) small objects are easily lost as detection models, such as deep neural networks, extract high-level semantic features and image-level receptive fields through successive downsampling.
This paper proposes a reliable detection model for ISOS, dubbed UCFNet, which handles both issues well.
Experiments on several public datasets demonstrate that our method significantly outperforms state-of-the-art methods (a generic local-contrast sketch appears below).
arXiv Detail & Related papers (2023-01-28T05:18:13Z)
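UCFNet's actual design is not given in the summary, but the local-contrast idea it builds on has a simple classical form: compare each pixel's local peak to the mean of a wider surrounding window, so dim small targets stand out from the background. The sketch below computes such a map with pooling operations; the window sizes and the max-minus-mean form are generic choices, not UCFNet's formulation.

```python
import torch
import torch.nn.functional as F

def local_contrast_map(img: torch.Tensor, cell: int = 3) -> torch.Tensor:
    """Generic local-contrast map for small-target enhancement.

    Subtracts a wide local mean (background estimate) from a narrow
    local max (target peak). Illustrative only; not UCFNet's design.
    img: (B, 1, H, W) infrared image; cell should be odd.
    """
    center = F.max_pool2d(img, cell, stride=1, padding=cell // 2)
    background = F.avg_pool2d(img, 3 * cell, stride=1,
                              padding=(3 * cell) // 2)
    # Bright small targets yield large positive contrast; clamp negatives.
    return (center - background).clamp_min(0.0)
```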
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [72.8898811120795]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion.
Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration [59.02821429555375]
We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared and visible images, we present a feature Interaction Fusion Module (IFM).
arXiv Detail & Related papers (2022-05-24T07:51:57Z)
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the problem of fusing infrared and visible images, which appear differently, for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse in that common space, either by iterative optimization or by deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, then unrolls it into a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network (a minimal sketch of such fusion-detection coupling appears below).
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
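The bilevel idea, fusion in service of detection, can be made concrete with a training step in which the fusion network also receives gradients from a detection loss. The sketch below shows only that coupling; the networks, loss terms, `det_net.loss` API, and weighting are hypothetical placeholders, and TarDAL's actual scheme additionally uses adversarial discriminators that this sketch omits.

```python
import torch

def joint_fusion_detection_step(fuse_net, det_net, optimizer,
                                ir, vis, targets, lam: float = 0.5):
    """One illustrative training step coupling fusion and detection.

    fuse_net, det_net, and det_net.loss are hypothetical stand-ins;
    TarDAL unrolls a bilevel problem with adversarial learning, which
    this sketch does not implement.
    """
    fused = fuse_net(ir, vis)
    # Fidelity term: keep the fused image close to both source images.
    fusion_loss = (fused - ir).abs().mean() + (fused - vis).abs().mean()
    # Task term: detection loss on the fused image steers fusion toward
    # features the detector needs.
    det_loss = det_net.loss(fused, targets)  # hypothetical API
    total = fusion_loss + lam * det_loss
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return total.item()
```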
- Deep Burst Super-Resolution [165.90445859851448]
We propose a novel architecture for the burst super-resolution task.
Our network takes multiple noisy RAW images as input, and generates a denoised, super-resolved RGB image as output.
In order to enable training and evaluation on real-world data, we additionally introduce the BurstSR dataset.
arXiv Detail & Related papers (2021-01-26T18:57:21Z)
- Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning [59.19469551774703]
Drone-based vehicle detection aims at finding the vehicle locations and categories in an aerial image.
We construct a large-scale drone-based RGB-Infrared vehicle detection dataset, termed DroneVehicle.
Our DroneVehicle dataset contains 28,439 RGB-Infrared image pairs, covering urban roads, residential areas, parking lots, and other scenarios from day to night.
arXiv Detail & Related papers (2020-03-05T05:29:44Z)