Unsupervised Misaligned Infrared and Visible Image Fusion via
Cross-Modality Image Generation and Registration
- URL: http://arxiv.org/abs/2205.11876v1
- Date: Tue, 24 May 2022 07:51:57 GMT
- Authors: Di Wang, Jinyuan Liu, Xin Fan, Risheng Liu
- Abstract summary: We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared images and visible images, we present a feature Interaction Fusion Module (IFM).
- Score: 59.02821429555375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent learning-based image fusion methods have made substantial progress on
pre-registered multi-modality data, but suffer from severe ghosting artifacts when dealing
with misaligned multi-modality data, owing to spatial deformation and the
difficulty of narrowing the cross-modality discrepancy. To overcome these obstacles, in
this paper we present a robust cross-modality generation-registration paradigm
for unsupervised misaligned infrared and visible image fusion (IVIF).
Specifically, we propose a Cross-modality Perceptual Style Transfer Network
(CPSTN) to generate a pseudo infrared image taking a visible image as input.
Benefiting from the favorable geometry-preservation ability of the CPSTN, the
generated pseudo infrared image has a sharp structure, which, together with the
structure sensitivity of infrared images, makes it more conducive to transforming
cross-modality image alignment into mono-modality registration. On this basis,
we introduce a Multi-level Refinement Registration Network (MRRN) to
predict the displacement vector field between the distorted and pseudo infrared
images and to reconstruct the registered infrared image under the mono-modality
setting. Moreover, to better fuse the registered infrared and visible
images, we present a feature Interaction Fusion Module (IFM) that adaptively
selects more meaningful features for fusion within the Dual-path Interaction Fusion
Network (DIFN). Extensive experimental results suggest that the proposed method
demonstrates superior capability on misaligned cross-modality image fusion.
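The registration stage described above predicts a displacement vector field between the distorted and pseudo infrared images and warps one onto the other. As a rough illustration of what applying such a field means (this is not the paper's MRRN: the network-predicted field is replaced by a hand-written toy field, and sampling is nearest-neighbor backward warping in plain Python):

```python
def warp(image, field):
    """Backward-warp a 2D image by a displacement vector field.

    image: H x W list of lists (intensity values)
    field: H x W list of (dy, dx) displacements; output pixel (y, x)
           samples the input at (y + dy, x + dx), nearest neighbor,
           clamped to the image borders.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = field[y][x]
            sy = min(max(int(round(y + dy)), 0), h - 1)
            sx = min(max(int(round(x + dx)), 0), w - 1)
            out[y][x] = image[sy][sx]
    return out

# Toy example: a field of (0, -1) makes every output pixel sample its
# left neighbor, shifting the image content one pixel to the right
# (the left edge is duplicated by the border clamp).
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
shift_right = [[(0, -1)] * 3 for _ in range(3)]
print(warp(img, shift_right))  # → [[1, 1, 2], [4, 4, 5], [7, 7, 8]]
```

In the actual method the field comes from the MRRN and sampling is differentiable (e.g. bilinear), so the warp can be trained end-to-end; the sketch only shows the geometry of the operation.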
Related papers
- BusReF: Infrared-Visible images registration and fusion focus on
reconstructible area using one set of features [39.575353043949725]
In a scenario where multi-modal cameras are operating together, the problem of working with non-aligned images cannot be avoided.
Existing image fusion algorithms rely heavily on strictly registered input image pairs to produce more precise fusion results.
This paper aims to address the problems of image registration and fusion in a single framework, called BusReF.
arXiv Detail & Related papers (2023-12-30T17:32:44Z)
- Multimodal Transformer Using Cross-Channel Attention for Object Detection in Remote Sensing Images [1.662438436885552]
Multi-modal fusion has been shown to enhance accuracy by combining data from multiple modalities.
We propose a novel multi-modal fusion strategy for mapping relationships between different channels at the early stage.
By addressing fusion in the early stage, as opposed to mid or late-stage methods, our method achieves competitive and even superior performance compared to existing techniques.
arXiv Detail & Related papers (2023-10-21T00:56:11Z)
- Improving Misaligned Multi-modality Image Fusion with One-stage Progressive Dense Registration [67.23451452670282]
Misalignments between multi-modality images pose challenges in image fusion.
We propose a Cross-modality Multi-scale Progressive Dense Registration scheme.
This scheme accomplishes the coarse-to-fine registration exclusively using a one-stage optimization.
arXiv Detail & Related papers (2023-08-22T03:46:24Z)
- Breaking Modality Disparity: Harmonized Representation for Infrared and Visible Image Registration [66.33746403815283]
We propose a scene-adaptive infrared and visible image registration method.
We employ homography to simulate the deformation between different planes.
We propose the first misaligned infrared and visible image dataset with available ground truth.
arXiv Detail & Related papers (2023-04-12T06:49:56Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion [72.8898811120795]
We propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion.
Our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation.
arXiv Detail & Related papers (2022-11-20T12:02:07Z)
- SA-DNet: An on-demand semantic object registration network adapting to non-rigid deformation [3.3843451892622576]
We propose a Semantic-Aware on-Demand registration network (SA-DNet) to confine the feature matching process to the semantic region of interest.
Our method adapts better to the presence of non-rigid distortions in the images and provides semantically well-registered images.
arXiv Detail & Related papers (2022-10-18T14:41:28Z)
- Towards Homogeneous Modality Learning and Multi-Granularity Information Exploration for Visible-Infrared Person Re-Identification [16.22986967958162]
Visible-infrared person re-identification (VI-ReID) is a challenging and essential task, which aims to retrieve a set of person images over visible and infrared camera views.
Previous methods attempt to apply generative adversarial networks (GANs) to generate modality-consistent data.
In this work, we address the cross-modality matching problem with Aligned Grayscale Modality (AGM), a unified dark-line spectrum that reformulates visible-infrared dual-mode learning as a gray-gray single-mode learning problem.
arXiv Detail & Related papers (2022-04-11T03:03:19Z)
- Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared Person Re-Identification [84.32086702849338]
We propose a novel modality-adaptive mixup and invariant decomposition (MID) approach for RGB-infrared person re-identification.
MID designs a modality-adaptive mixup scheme to generate suitable mixed modality images between RGB and infrared images.
Experiments on two challenging benchmarks demonstrate superior performance of MID over state-of-the-art methods.
arXiv Detail & Related papers (2022-03-03T14:26:49Z)
- TGFuse: An Infrared and Visible Image Fusion Approach Based on Transformer and Generative Adversarial Network [15.541268697843037]
We propose an infrared and visible image fusion algorithm based on a lightweight transformer module and adversarial learning.
Inspired by the global interaction power, we use the transformer technique to learn the effective global fusion relations.
The experimental results demonstrate the effectiveness of the proposed modules, with superior improvement over the state of the art.
arXiv Detail & Related papers (2022-01-25T07:43:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.