An Interactively Reinforced Paradigm for Joint Infrared-Visible Image
Fusion and Saliency Object Detection
- URL: http://arxiv.org/abs/2305.09999v1
- Date: Wed, 17 May 2023 06:48:35 GMT
- Title: An Interactively Reinforced Paradigm for Joint Infrared-Visible Image
Fusion and Saliency Object Detection
- Authors: Di Wang, Jinyuan Liu, Risheng Liu, Xin Fan
- Abstract summary: This research focuses on the discovery and localization of hidden objects in the wild and serves unmanned systems.
Through empirical analysis, infrared and visible image fusion (IVIF) makes hard-to-find objects apparent, whereas
multimodal salient object detection (SOD) accurately delineates the precise spatial location of objects within the picture.
- Score: 59.02821429555375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This research focuses on the discovery and localization of hidden objects in
the wild and serves unmanned systems. Through empirical analysis, infrared and
visible image fusion (IVIF) makes hard-to-find objects apparent, whereas
multimodal salient object detection (SOD) accurately delineates the precise
spatial location of objects within the picture. Their common characteristic of
seeking complementary cues from different source images motivates us to explore
the collaborative relationship between Fusion and Salient object detection
tasks on infrared and visible images via an Interactively Reinforced multi-task
paradigm for the first time, termed IRFS. To seamlessly bridge the multimodal
image fusion and SOD tasks, we specifically develop a Feature Screening-based
Fusion subnetwork (FSFNet) to screen out interfering features from source
images, thereby preserving saliency-related features. The fused image generated
by FSFNet is then fed into the subsequent Fusion-Guided
Cross-Complementary SOD subnetwork (FC$^2$Net) as the third modality to drive
the precise prediction of the saliency map by leveraging the complementary
information derived from the fused image. In addition, we develop an
interactive loop learning strategy to achieve the mutual reinforcement of IVIF
and SOD tasks with a shorter training period and fewer network parameters.
Comprehensive experimental results demonstrate that seamlessly bridging IVIF
and SOD mutually enhances their performance and highlights the superiority of the joint paradigm.
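The abstract describes a fuse-then-detect pipeline: FSFNet screens interfering features while fusing the infrared and visible inputs, and the resulting fused image is passed to FC$^2$Net as a third modality for saliency prediction. Below is a minimal PyTorch sketch of that data flow; the layer choices, channel widths, and single-channel (luminance) inputs are illustrative assumptions, not the authors' released IRFS implementation.

```python
# Minimal sketch of the fuse-then-detect data flow described in the abstract.
# FSFNet/FC2Net here are simplified stand-ins, not the authors' code.
import torch
import torch.nn as nn


class FSFNet(nn.Module):
    """Feature screening-based fusion subnetwork (simplified sketch).

    Encodes each source image, screens the concatenated features with a
    learned sigmoid gate to suppress interfering cues, and decodes a
    single-channel fused image.
    """

    def __init__(self, ch: int = 32):
        super().__init__()
        self.enc_ir = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.enc_vis = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.screen = nn.Sequential(nn.Conv2d(2 * ch, 2 * ch, 1), nn.Sigmoid())
        self.dec = nn.Conv2d(2 * ch, 1, 3, padding=1)

    def forward(self, ir: torch.Tensor, vis: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([self.enc_ir(ir), self.enc_vis(vis)], dim=1)
        screened = self.screen(feats) * feats  # keep saliency-related features
        return torch.sigmoid(self.dec(screened))


class FC2Net(nn.Module):
    """Fusion-guided SOD subnetwork (simplified sketch).

    Treats the fused image as a third modality alongside the infrared and
    visible inputs and predicts a saliency map.
    """

    def __init__(self, ch: int = 32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(ch, 1, 1)

    def forward(self, ir, vis, fused):
        x = torch.cat([ir, vis, fused], dim=1)  # fused image acts as the third modality
        return torch.sigmoid(self.head(self.backbone(x)))


if __name__ == "__main__":
    fsf, fc2 = FSFNet(), FC2Net()
    ir = torch.rand(1, 1, 256, 256)   # infrared image
    vis = torch.rand(1, 1, 256, 256)  # visible image (luminance channel)
    fused = fsf(ir, vis)              # step 1: image fusion
    saliency = fc2(ir, vis, fused)    # step 2: fusion-guided saliency detection
    print(fused.shape, saliency.shape)  # both (1, 1, 256, 256)
```

In the full method, an interactive loop learning strategy alternately optimizes the two subnetworks so that fusion and SOD reinforce each other; the sketch above covers only a single forward pass.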
Related papers
- From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion [66.33467192279514]
We introduce a text-guided multi-modality image fusion method that leverages the high-level semantics from textual descriptions to integrate semantics from infrared and visible images.
Our method not only produces visually superior fusion results but also achieves a higher detection mAP than existing methods, reaching state-of-the-art results.
arXiv Detail & Related papers (2023-12-31T08:13:47Z)
- Multimodal Transformer Using Cross-Channel attention for Object Detection in Remote Sensing Images [1.662438436885552]
Multi-modal fusion has been shown to enhance accuracy by combining data from multiple modalities.
We propose a novel multi-modal fusion strategy for mapping relationships between different channels at the early stage.
By performing fusion at the early stage, as opposed to mid- or late-stage methods, our approach achieves competitive and even superior performance compared to existing techniques.
arXiv Detail & Related papers (2023-10-21T00:56:11Z)
- SSPFusion: A Semantic Structure-Preserving Approach for Infrared and Visible Image Fusion [30.55433673796615]
Most existing learning-based infrared and visible image fusion (IVIF) methods exhibit massive redundant information in the fused images.
We propose a semantic structure-preserving approach for IVIF, namely SSPFusion.
Our method is able to generate high-quality fusion images from pairs of infrared and visible images, which can boost the performance of downstream computer-vision tasks.
arXiv Detail & Related papers (2023-09-26T08:13:32Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [72.8898811120795]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion.
Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse in the common space either by iterative optimization or deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls to a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
- Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection.
The whole architecture facilitates two-stage fusion.
Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
arXiv Detail & Related papers (2020-08-16T11:01:20Z)