RoRD: Rotation-Robust Descriptors and Orthographic Views for Local
Feature Matching
- URL: http://arxiv.org/abs/2103.08573v1
- Date: Mon, 15 Mar 2021 17:40:25 GMT
- Title: RoRD: Rotation-Robust Descriptors and Orthographic Views for Local
Feature Matching
- Authors: Udit Singh Parihar, Aniket Gujarathi, Kinal Mehta, Satyajit Tourani,
Sourav Garg, Michael Milford and K. Madhava Krishna
- Abstract summary: We present a novel framework that combines learning of invariant descriptors through data augmentation and viewpoint projection.
We evaluate the effectiveness of the proposed approach on key tasks including pose estimation and visual place recognition.
- Score: 32.10261486751993
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of local detectors and descriptors in typical computer vision
pipelines works well until variations in viewpoint and appearance change become
extreme. Past research in this area has typically focused on one of two
approaches to this challenge: the use of projections into spaces more suitable
for feature matching under extreme viewpoint changes, and attempting to learn
features that are inherently more robust to viewpoint change. In this paper, we
present a novel framework that combines learning of invariant descriptors
through data augmentation and orthographic viewpoint projection. We propose
rotation-robust local descriptors, learnt through training data augmentation
based on rotation homographies, and a correspondence ensemble technique that
combines vanilla feature correspondences with those obtained through
rotation-robust features. Using a range of benchmark datasets as well as
contributing a new bespoke dataset for this research domain, we evaluate the
effectiveness of the proposed approach on key tasks including pose estimation
and visual place recognition. Our system outperforms a range of baseline and
state-of-the-art techniques, enabling higher levels of place
recognition precision across opposing place viewpoints, and achieves
practically useful performance levels even under extreme viewpoint changes.
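The two ingredients named in the abstract, rotation-homography augmentation and a correspondence ensemble, can be illustrated with a minimal sketch. It assumes an in-plane rotation about the image centre and a simple de-duplicated union of two match sets; the function names and the `(x1, y1, x2, y2)` match layout are hypothetical and are not taken from the RoRD code, which may differ in its details.

```python
import numpy as np


def rotation_homography(angle_deg, width, height):
    """Return a 3x3 homography that rotates pixel coordinates about the image centre.

    Assumption: in-plane rotation only; the paper's augmentation may be more general.
    """
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=float)
    rotate = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)
    back = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=float)
    return back @ rotate @ to_origin


def warp_points(H, pts):
    """Apply homography H to an (N, 2) array of pixel coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def ensemble_correspondences(matches_a, matches_b):
    """Combine two (N, 4) arrays of matches (x1, y1, x2, y2), dropping exact duplicates.

    Assumption: a plain union; a real pipeline would likely filter the result
    with a robust estimator afterwards.
    """
    stacked = np.vstack([np.asarray(matches_a, dtype=float),
                         np.asarray(matches_b, dtype=float)])
    return np.unique(stacked, axis=0)
```

During training, ground-truth correspondences between an image and its rotated copy follow directly from `warp_points`, which is what makes this kind of augmentation cheap to supervise.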
Related papers
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- RADA: Robust and Accurate Feature Learning with Domain Adaptation [7.905594146253435]
We introduce a multi-level feature aggregation network that incorporates two pivotal components to facilitate the learning of robust and accurate features.
Our method, RADA, achieves excellent results in image matching, camera pose estimation, and visual localization tasks.
arXiv Detail & Related papers (2024-07-22T16:49:58Z)
- GRA: Detecting Oriented Objects through Group-wise Rotating and Attention [64.21917568525764]
Group-wise Rotating and Attention (GRA) module is proposed to replace the convolution operations in backbone networks for oriented object detection.
GRA can adaptively capture fine-grained features of objects with diverse orientations, comprising two key components: Group-wise Rotating and Group-wise Attention.
GRA achieves a new state-of-the-art (SOTA) on the DOTA-v2.0 benchmark, while reducing the parameter count by nearly 50% compared to the previous SOTA method.
arXiv Detail & Related papers (2024-03-17T07:29:32Z)
- Local Feature Matching Using Deep Learning: A Survey [19.322545965903608]
Local feature matching enjoys wide-ranging applications in the realm of computer vision, encompassing domains such as image retrieval, 3D reconstruction, and object recognition.
In recent years, the introduction of deep learning models has sparked widespread exploration into local feature matching techniques.
The paper also explores the practical application of local feature matching in diverse domains such as Structure from Motion, Remote Sensing Image Registration, and Medical Image Registration.
arXiv Detail & Related papers (2024-01-31T04:32:41Z)
- Enhancing Deformable Local Features by Jointly Learning to Detect and Describe Keypoints [8.390939268280235]
Local feature extraction is a standard approach in computer vision for tackling important tasks such as image matching and retrieval.
We propose DALF, a novel deformation-aware network for jointly detecting and describing keypoints.
Our approach also enhances the performance of two real-world applications: deformable object retrieval and non-rigid 3D surface registration.
arXiv Detail & Related papers (2023-04-02T18:01:51Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art performance.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- ReF -- Rotation Equivariant Features for Local Feature Matching [30.459559206664427]
We propose an alternative, complementary approach that centers on inducing bias in the model architecture itself to generate 'rotation-specific' features.
We demonstrate that this high performance, rotation-specific coverage from the steerable CNNs can be expanded to all rotation angles.
We present a detailed analysis of the performance effects of ensembling, robust estimation, network architecture variations, and the use of rotation priors.
arXiv Detail & Related papers (2022-03-10T07:36:09Z)
- Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
Our approach performs contrastive learning by directly sampling individual point pairs from different regions.
Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
arXiv Detail & Related papers (2022-02-09T18:56:41Z)
- Looking Beyond Corners: Contrastive Learning of Visual Representations for Keypoint Detection and Description Extraction [1.5749416770494706]
Learnable keypoint detectors and descriptors are beginning to outperform classical hand-crafted feature extraction methods.
Recent studies on self-supervised learning of visual representations have driven the increasing performance of learnable models based on deep networks.
We propose the Correspondence Network (CorrNet) that learns to detect repeatable keypoints and to extract discriminative descriptions.
arXiv Detail & Related papers (2021-12-22T16:27:11Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.