Improving Misaligned Multi-modality Image Fusion with One-stage
Progressive Dense Registration
- URL: http://arxiv.org/abs/2308.11165v1
- Date: Tue, 22 Aug 2023 03:46:24 GMT
- Title: Improving Misaligned Multi-modality Image Fusion with One-stage
Progressive Dense Registration
- Authors: Di Wang, Jinyuan Liu, Long Ma, Risheng Liu, Xin Fan
- Abstract summary: Misalignments between multi-modality images pose challenges in image fusion.
We propose a Cross-modality Multi-scale Progressive Dense Registration scheme.
This scheme accomplishes the coarse-to-fine registration exclusively using a one-stage optimization.
- Score: 67.23451452670282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Misalignments between multi-modality images pose challenges in image fusion,
manifesting as structural distortions and edge ghosts. Existing efforts
commonly resort to registering first and fusing later, typically employing two
cascaded stages for registration, i.e., coarse registration and fine
registration. Both stages directly estimate the respective target deformation
fields. In this paper, we argue that the separated two-stage registration is
not compact, and the direct estimation of the target deformation fields is not
accurate enough. To address these challenges, we propose a Cross-modality
Multi-scale Progressive Dense Registration (C-MPDR) scheme, which accomplishes
the coarse-to-fine registration exclusively using a one-stage optimization,
thus improving the fusion performance of misaligned multi-modality images.
Specifically, two pivotal components are involved, a dense Deformation Field
Fusion (DFF) module and a Progressive Feature Fine (PFF) module. The DFF
aggregates the predicted multi-scale deformation sub-fields at the current
scale, while the PFF progressively refines the remaining misaligned features.
Both work together to accurately estimate the final deformation fields. In
addition, we develop a Transformer-Conv-based Fusion (TCF) subnetwork that
considers local and long-range feature dependencies, allowing us to capture
more informative features from the registered infrared and visible images for
the generation of high-quality fused images. Extensive experimental analysis
demonstrates the superiority of the proposed method in the fusion of misaligned
cross-modality images.
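The abstract describes the DFF module as aggregating predicted multi-scale deformation sub-fields at the current scale so that coarse-to-fine registration happens in a single optimization stage. A minimal sketch of that coarse-to-fine aggregation idea, in plain Python; the function names, the nearest-neighbour upsampling, and the simple additive composition are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of coarse-to-fine deformation-field aggregation in the
# spirit of the paper's DFF module. A 2-D displacement field is represented as
# a grid of (dy, dx) tuples measured in pixels of that field's own scale.

def upsample2x(field):
    """Upsample a displacement field by 2x (nearest neighbour).
    Displacements are doubled because they are expressed in pixels,
    and the pixel grid is now twice as dense."""
    h, w = len(field), len(field[0])
    out = [[None] * (2 * w) for _ in range(2 * h)]
    for i in range(2 * h):
        for j in range(2 * w):
            dy, dx = field[i // 2][j // 2]
            out[i][j] = (2 * dy, 2 * dx)
    return out

def aggregate(sub_fields):
    """Aggregate multi-scale sub-fields, coarsest first: upsample the
    running estimate to the next scale, then add that scale's residual
    sub-field (an assumed additive composition for illustration)."""
    total = sub_fields[0]
    for residual in sub_fields[1:]:
        total = upsample2x(total)
        total = [[(ty + ry, tx + rx)
                  for (ty, tx), (ry, rx) in zip(trow, rrow)]
                 for trow, rrow in zip(total, residual)]
    return total

# Tiny example: a 1x1 coarse field refined by a 2x2 residual field.
coarse = [[(1.0, 0.0)]]                     # 1-pixel vertical shift, coarse scale
fine = [[(0.5, 0.0), (0.0, 0.0)],
        [(0.0, 0.0), (0.0, -0.5)]]          # per-pixel corrections, fine scale
final = aggregate([coarse, fine])            # final[0][0] == (2.5, 0.0)
```

In this toy setting the coarse 1-pixel shift becomes a 2-pixel shift at the finer resolution, and the residual field nudges individual pixels, mirroring (in spirit) how one-stage progressive estimation folds coarse and fine corrections into a single final deformation field.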
Related papers
- Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation [54.96563068182733]
We propose Modality Adaptation with text-to-image Diffusion Models (MADM) for semantic segmentation task.
MADM utilizes text-to-image diffusion models pre-trained on extensive image-text pairs to enhance the model's cross-modality capabilities.
We show that MADM achieves state-of-the-art adaptation performance across various modality tasks, including images to depth, infrared, and event modalities.
arXiv Detail & Related papers (2024-10-29T03:49:40Z) - SRIF: Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation [2.336821026049481]
We propose SRIF, a novel Semantic shape Registration framework based on diffusion-based Image morphing and flow estimation.
SRIF not only achieves high-quality dense correspondences on challenging shape pairs, but also delivers smooth, semantically meaningful interpolations in between.
arXiv Detail & Related papers (2024-09-18T03:47:24Z) - BusReF: Infrared-Visible images registration and fusion focus on
reconstructible area using one set of features [39.575353043949725]
In a scenario where multi-modal cameras are operating together, the problem of working with non-aligned images cannot be avoided.
Existing image fusion algorithms rely heavily on strictly registered input image pairs to produce more precise fusion results.
This paper aims to address the problem of image registration and fusion in a single framework, called BusReF.
arXiv Detail & Related papers (2023-12-30T17:32:44Z) - Unified Frequency-Assisted Transformer Framework for Detecting and
Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
arXiv Detail & Related papers (2023-09-18T11:06:42Z) - Breaking Modality Disparity: Harmonized Representation for Infrared and
Visible Image Registration [66.33746403815283]
We propose a scene-adaptive infrared and visible image registration method.
We employ homography to simulate the deformation between different planes.
We propose the first ground truth available misaligned infrared and visible image dataset.
arXiv Detail & Related papers (2023-04-12T06:49:56Z) - Unsupervised Misaligned Infrared and Visible Image Fusion via
Cross-Modality Image Generation and Registration [59.02821429555375]
We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared images and visible images, we present a feature Interaction Fusion Module (IFM).
arXiv Detail & Related papers (2022-05-24T07:51:57Z) - Joint Progressive and Coarse-to-fine Registration of Brain MRI via
Deformation Field Integration and Non-Rigid Feature Fusion [9.19672265043614]
We propose a unified framework for robust brain MRI registration in both progressive and coarse-to-fine manners.
Specifically, building on a dual-encoder U-Net, the fixed-moving MRI pair is encoded and decoded into multi-scale deformation sub-fields.
arXiv Detail & Related papers (2021-09-25T15:20:52Z) - TFill: Image Completion via a Transformer-Based Architecture [69.62228639870114]
We propose treating image completion as a directionless sequence-to-sequence prediction task.
We employ a restrictive CNN with small and non-overlapping RF for token representation.
In a second phase, to improve appearance consistency between visible and generated regions, a novel attention-aware layer (AAL) is introduced.
arXiv Detail & Related papers (2021-04-02T01:42:01Z) - Unsupervised Multimodal Image Registration with Adaptative Gradient
Guidance [23.461130560414805]
Unsupervised learning-based methods have demonstrated promising performance over accuracy and efficiency in deformable image registration.
The estimated deformation fields of the existing methods fully rely on the to-be-registered image pair.
We propose a novel multimodal registration framework, which leverages the deformation fields estimated from both.
arXiv Detail & Related papers (2020-11-12T05:47:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.