TGFuse: An Infrared and Visible Image Fusion Approach Based on
Transformer and Generative Adversarial Network
- URL: http://arxiv.org/abs/2201.10147v1
- Date: Tue, 25 Jan 2022 07:43:30 GMT
- Title: TGFuse: An Infrared and Visible Image Fusion Approach Based on
Transformer and Generative Adversarial Network
- Authors: Dongyu Rao, Xiao-Jun Wu, Tianyang Xu
- Abstract summary: We propose an infrared and visible image fusion algorithm based on a lightweight transformer module and adversarial learning.
Motivated by the transformer's capacity for global interaction, we use it to learn effective global fusion relations.
Experiments demonstrate the effectiveness of the proposed modules, with clear improvements over the state of the art.
- Score: 15.541268697843037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The end-to-end image fusion framework has achieved promising
performance, with dedicated convolutional networks aggregating multi-modal
local appearance. However, existing CNN fusion approaches neglect long-range
dependencies, which hinders balanced, image-level perception in complex
fusion scenarios. In this paper, we therefore propose an infrared and visible
image fusion algorithm based on a lightweight transformer module and
adversarial learning. Motivated by the transformer's capacity for global
interaction, we use it to learn effective global fusion relations. In
particular, shallow features extracted by a CNN interact within the proposed
transformer fusion module, which refines the fusion relationship within the
spatial scope and across channels simultaneously. In addition, adversarial
learning is introduced during training to improve output discrimination by
imposing competitive consistency with the inputs, reflecting the specific
characteristics of infrared and visible images. Experiments demonstrate the
effectiveness of the proposed modules, with clear improvements over the
state of the art, establishing a new paradigm for the fusion task via
transformers and adversarial learning.
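To make the pipeline concrete, below is a minimal PyTorch-style sketch of what the abstract describes: shallow CNN features from the two modalities are refined by a lightweight transformer block that attends over the spatial scope and applies per-channel gating. Every class name, layer choice, and size here is an illustrative assumption, not the authors' released code.

```python
# Illustrative sketch only -- names, sizes, and wiring are assumptions based
# on the abstract, not the official TGFuse implementation.
import torch
import torch.nn as nn

class TransformerFusionBlock(nn.Module):
    """Refines fused features over the spatial scope and across channels."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        # Spatial attention: tokens are pixel positions, embeddings are channels.
        self.spatial_attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        # Cross-channel interaction approximated by a squeeze-and-excitation-style gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(channels, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = self.norm(x.flatten(2).transpose(1, 2))   # (B, H*W, C)
        attn, _ = self.spatial_attn(tokens, tokens, tokens)
        gate = self.channel_gate(attn.transpose(1, 2))     # (B, C) channel weights
        fused = attn * gate.unsqueeze(1)                   # reweight channels
        return fused.transpose(1, 2).reshape(b, c, h, w) + x  # residual connection

class TGFuseSketch(nn.Module):
    """Shallow CNN encoder -> transformer fusion -> single-channel fused image."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.encoder = nn.Conv2d(2, channels, 3, padding=1)  # stacked IR + visible
        self.fusion = TransformerFusionBlock(channels)
        self.decoder = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, ir: torch.Tensor, vis: torch.Tensor) -> torch.Tensor:
        feats = self.encoder(torch.cat([ir, vis], dim=1))    # shallow local features
        return torch.sigmoid(self.decoder(self.fusion(feats)))

# Usage: fuse a pair of single-channel 128x128 images.
# out = TGFuseSketch()(torch.rand(1, 1, 128, 128), torch.rand(1, 1, 128, 128))
```

Note that full self-attention over all H*W pixel tokens is quadratic in image size, which is presumably why the paper stresses a lightweight transformer module; a practical variant would attend within windows or over downsampled features.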
Related papers
- From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion [66.33467192279514]
We introduce a text-guided multi-modality image fusion method that leverages the high-level semantics from textual descriptions to integrate semantics from infrared and visible images.
Our method not only produces visually superior fusion results but also achieves a higher detection mAP than existing methods, setting a new state of the art.
arXiv Detail & Related papers (2023-12-31T08:13:47Z)
- Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network for Remote Sensing Image Super-Resolution [13.894645293832044]
Transformer-based models have shown competitive performance in remote sensing image super-resolution (RSISR).
We propose a novel transformer architecture called Cross-Spatial Pixel Integration and Cross-Stage Feature Fusion Based Transformer Network (SPIFFNet) for RSISR.
Our proposed model effectively enhances global cognition and understanding of the entire image, facilitating efficient integration of features across stages.
arXiv Detail & Related papers (2023-07-06T13:19:06Z)
- Equivariant Multi-Modality Image Fusion [124.11300001864579]
We propose the Equivariant Multi-Modality imAge fusion paradigm (EMMA) for end-to-end self-supervised learning.
Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations.
Experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images.
arXiv Detail & Related papers (2023-05-19T05:50:24Z)
- Breaking Free from Fusion Rule: A Fully Semantic-driven Infrared and Visible Image Fusion [51.22863068854784]
Infrared and visible image fusion plays a vital role in the field of computer vision.
Previous approaches devote effort to designing various fusion rules in their loss functions.
We develop a semantic-level fusion network to fully exploit semantic guidance.
arXiv Detail & Related papers (2022-11-22T13:59:59Z)
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [72.8898811120795]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion.
Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
However, the need to incorporate contextual information when extracting features dynamically is often neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) consisting of a cascade of CT Blocks that combine CNN and Transformer components.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
- Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration [59.02821429555375]
We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared and visible images, we present a feature Interaction Fusion Module (IFM).
arXiv Detail & Related papers (2022-05-24T07:51:57Z)
- Infrared and Visible Image Fusion via Interactive Compensatory Attention Adversarial Learning [7.995162257955025]
We propose a novel end-to-end model based on generative adversarial training to achieve a better fusion balance.
In particular, in the generator we construct a multi-level encoder-decoder network with a triple path, adopting infrared and visible paths to provide additional intensity and gradient information.
In addition, dual discriminators are designed to assess the distributional similarity between the fused result and the source images, and the generator is optimized to produce a more balanced result; a sketch of such a dual-discriminator objective appears after this list.
arXiv Detail & Related papers (2022-03-29T08:28:14Z)
- Unsupervised Image Fusion Method based on Feature Mutual Mapping [16.64607158983448]
We propose an unsupervised adaptive image fusion method to address these issues.
We construct a global map that measures the connections between pixels of the input source images.
Our method achieves superior performance in both visual perception and objective evaluation.
arXiv Detail & Related papers (2022-01-25T07:50:14Z)
- Cross-Modality Fusion Transformer for Multispectral Object Detection [0.0]
Multispectral image pairs provide complementary information, making object detection applications more reliable and robust.
We present a simple yet effective cross-modality feature fusion approach named Cross-Modality Fusion Transformer (CFT).
arXiv Detail & Related papers (2021-10-30T15:34:12Z)
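Two entries above, TGFuse itself and the interactive compensatory attention paper, train the fusion network adversarially against one discriminator per source modality. Below is a minimal sketch of such a dual-discriminator objective; the discriminator callables d_ir and d_vis, the binary cross-entropy loss form, and the weight w_adv are illustrative assumptions rather than the exact losses of any paper listed here.

```python
# Hypothetical dual-discriminator fusion-GAN losses; forms and weights are
# illustrative, not taken verbatim from TGFuse or the papers above.
import torch
import torch.nn.functional as F

def discriminator_loss(d, real, fused):
    """Each discriminator learns to tell its own source modality from the fusion."""
    real_logits = d(real)
    fake_logits = d(fused.detach())  # detach: do not update the generator here
    ones, zeros = torch.ones_like(real_logits), torch.zeros_like(fake_logits)
    return (F.binary_cross_entropy_with_logits(real_logits, ones)
            + F.binary_cross_entropy_with_logits(fake_logits, zeros))

def generator_loss(d_ir, d_vis, ir, vis, fused, w_adv=0.1):
    """The generator is pulled toward both modalities at once, so that neither
    the infrared input nor the visible input dominates the fused result."""
    adv = sum(
        F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
        for logits in (d_ir(fused), d_vis(fused))
    )
    content = F.l1_loss(fused, ir) + F.l1_loss(fused, vis)  # simple pixel term
    return content + w_adv * adv
```

Because each discriminator pulls the fused image toward the statistics of one modality, the generator can only satisfy both by preserving the distinctive characteristics of each input, which is the "competitive consistency" the TGFuse abstract refers to.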