SA-DNet: An on-demand semantic object registration network adapting to
non-rigid deformation
- URL: http://arxiv.org/abs/2210.09900v1
- Date: Tue, 18 Oct 2022 14:41:28 GMT
- Title: SA-DNet: An on-demand semantic object registration network adapting to
non-rigid deformation
- Authors: Housheng Xie and Junhui Qiu and Yang Yang and Yukuan Zhang
- Abstract summary: We propose a Semantic-Aware on-Demand registration network (SA-DNet) to confine the feature matching process to the semantic region of interest.
Our method adapts better to the presence of non-rigid distortions in the images and provides semantically well-registered images.
- Score: 3.3843451892622576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As an essential processing step before the fusion of infrared and
visible images, the performance of image registration determines whether the
two images can be fused at the correct spatial positions. In practice, the use
of varied imaging devices may introduce perspective changes or time gaps
between shots, producing significant non-rigid spatial discrepancies between
the infrared and visible images. Even if a large number of feature points are
matched, the registration accuracy may still be inadequate, degrading the
result of image fusion and other vision tasks. To alleviate this problem, we
propose a Semantic-Aware on-Demand registration network (SA-DNet), whose main
purpose is to confine the feature matching process to the semantic region of
interest (sROI) by designing a semantic-aware module (SAM) and an HOL-Deep
hybrid matching module (HDM). After utilizing a thin-plate spline (TPS) to
transform the infrared and visible images based on the corresponding feature
points in the sROI, the registered images are fused by an image fusion module
(IFM) to form a fully functional registration and fusion network. Moreover, we
point out that this type of approach allows semantic objects to be selected
for feature matching as needed, accomplishing task-specific registration for
different demands. To demonstrate the robustness of SA-DNet to non-rigid
distortions, we conduct extensive experiments comparing SA-DNet with five
state-of-the-art infrared and visible image feature matching methods; the
results show that our method adapts better to non-rigid distortions in the
images and provides semantically well-registered images.
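
The pipeline described above lends itself to a short illustration. The
following is a minimal sketch, not the authors' implementation: the SAM and
HDM modules are stood in for by a precomputed binary sROI mask and an
already-matched set of point pairs, and OpenCV's contrib shape module
supplies the thin-plate spline warp. The helper names
(filter_matches_by_sroi, tps_register) are hypothetical.

    # Requires opencv-contrib-python and numpy.
    import cv2
    import numpy as np

    def filter_matches_by_sroi(ir_pts, vis_pts, sroi_mask):
        """Keep only point pairs, given as (N, 2) arrays, whose
        visible-image location falls inside the binary sROI mask
        (a stand-in for SA-DNet's semantic-aware module)."""
        keep = [i for i, (x, y) in enumerate(vis_pts)
                if sroi_mask[int(round(y)), int(round(x))] > 0]
        return ir_pts[keep], vis_pts[keep]

    def tps_register(ir_img, ir_pts, vis_pts):
        """Warp ir_img so that ir_pts land on vis_pts via a thin-plate spline."""
        tps = cv2.createThinPlateSplineShapeTransformer()
        src = np.asarray(ir_pts, dtype=np.float32).reshape(1, -1, 2)
        dst = np.asarray(vis_pts, dtype=np.float32).reshape(1, -1, 2)
        matches = [cv2.DMatch(i, i, 0.0) for i in range(src.shape[1])]
        # warpImage applies the backward map, so the transformation is
        # estimated with the target (visible) points first and the source
        # (infrared) points second.
        tps.estimateTransformation(dst, src, matches)
        return tps.warpImage(ir_img)

A fused output would then be produced from the visible image and the warped
infrared image by whatever fusion step stands in for the IFM.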
Related papers
- Frequency Domain Modality-invariant Feature Learning for
Visible-infrared Person Re-Identification [79.9402521412239]
We propose a novel Frequency Domain modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from the frequency domain perspective.
Our framework introduces two novel modules, namely the Instance-Adaptive Amplitude Filter (IAF) and the Phase-Preserving Normalization (PPNorm).
arXiv Detail & Related papers (2024-01-03T17:11:27Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Modular Anti-noise Deep Learning Network for Robotic Grasp Detection Based on RGB Images [2.759223695383734]
This paper introduces an approach to detecting grasping poses from a single RGB image.
We propose a modular learning network augmented with grasp detection and semantic segmentation.
We demonstrate the feasibility and accuracy of our proposed approach through practical experiments and evaluations.
arXiv Detail & Related papers (2023-10-30T02:01:49Z)
- An Interactively Reinforced Paradigm for Joint Infrared-Visible Image Fusion and Saliency Object Detection [59.02821429555375]
This research focuses on the discovery and localization of hidden objects in the wild and serves unmanned systems.
Through empirical analysis, infrared and visible image fusion (IVIF) makes hard-to-find objects apparent.
Multimodal salient object detection (SOD) accurately delineates the precise spatial location of objects within the picture.
arXiv Detail & Related papers (2023-05-17T06:48:35Z)
- Breaking Modality Disparity: Harmonized Representation for Infrared and Visible Image Registration [66.33746403815283]
We propose a scene-adaptive infrared and visible image registration method.
We employ homography to simulate the deformation between different planes.
We present the first misaligned infrared and visible image dataset with available ground truth.
arXiv Detail & Related papers (2023-04-12T06:49:56Z)
- Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration [59.02821429555375]
We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared and visible images, we present a feature Interaction Fusion Module (IFM).
arXiv Detail & Related papers (2022-05-24T07:51:57Z)
- Towards Homogeneous Modality Learning and Multi-Granularity Information Exploration for Visible-Infrared Person Re-Identification [16.22986967958162]
Visible-infrared person re-identification (VI-ReID) is a challenging and essential task, which aims to retrieve a set of person images over visible and infrared camera views.
Previous methods attempt to apply generative adversarial networks (GANs) to generate modality-consistent data.
In this work, we address the cross-modality matching problem with Aligned Grayscale Modality (AGM), a unified dark-line spectrum that reformulates visible-infrared dual-mode learning as a gray-gray single-mode learning problem.
arXiv Detail & Related papers (2022-04-11T03:03:19Z)
- Efficient and Accurate Multi-scale Topological Network for Single Image Dehazing [31.543771270803056]
In this paper, we focus on extracting and utilizing the features of the input image itself.
We propose a Multi-scale Topological Network (MSTN) to fully explore the features at different scales.
Meanwhile, we design a Multi-scale Feature Fusion Module (MFFM) and an Adaptive Feature Selection Module (AFSM) to achieve the selection and fusion of features at different scales.
arXiv Detail & Related papers (2021-02-24T08:53:14Z)
- RGB-D Salient Object Detection with Cross-Modality Modulation and Selection [126.4462739820643]
We present an effective method to progressively integrate and refine the cross-modality complementarities for RGB-D salient object detection (SOD).
The proposed network mainly solves two challenging issues: 1) how to effectively integrate the complementary information from RGB image and its corresponding depth map, and 2) how to adaptively select more saliency-related features.
arXiv Detail & Related papers (2020-07-14T14:22:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.