Early Bird: Loop Closures from Opposing Viewpoints for
Perceptually-Aliased Indoor Environments
- URL: http://arxiv.org/abs/2010.01421v3
- Date: Sun, 20 Dec 2020 13:41:24 GMT
- Title: Early Bird: Loop Closures from Opposing Viewpoints for
Perceptually-Aliased Indoor Environments
- Authors: Satyajit Tourani, Dhagash Desai, Udit Singh Parihar, Sourav Garg, Ravi
Kiran Sarvadevabhatla, Michael Milford, K. Madhava Krishna
- Abstract summary: We present novel research that simultaneously addresses viewpoint change and perceptual aliasing.
We show that our integration of VPR with SLAM significantly boosts the performance of VPR, feature correspondence, and pose graph submodules.
For the first time, we demonstrate a localization system capable of state-of-the-art performance despite perceptual aliasing and extreme 180-degree-rotated viewpoint change.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Significant advances have been made recently in Visual Place Recognition
(VPR), feature correspondence, and localization due to the proliferation of
deep-learning-based methods. However, existing approaches tend to address,
partially or fully, only one of two key challenges: viewpoint change and
perceptual aliasing. In this paper, we present novel research that
simultaneously addresses both challenges by combining deep-learned features
with geometric transformations based on reasonable domain assumptions about
navigation on a ground-plane, whilst also removing the requirement for
specialized hardware setups (e.g., lighting, downward-facing cameras). In
particular, our integration of VPR with SLAM by leveraging the robustness of
deep-learned features and our homography-based extreme viewpoint invariance
significantly boosts the performance of VPR, feature correspondence, and pose
graph submodules of the SLAM pipeline. For the first time, we demonstrate a
localization system capable of state-of-the-art performance despite perceptual
aliasing and extreme 180-degree-rotated viewpoint change in a range of
real-world and simulated experiments. Our system is able to achieve early loop
closures that prevent significant drift in SLAM trajectories. We also
extensively compare several deep architectures for VPR and descriptor matching, and we
show that superior place recognition and descriptor matching across opposite
views results in a similar performance gain in back-end pose graph
optimization.
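The abstract's key geometric idea can be illustrated with a small sketch: if a robot navigates on a ground plane, the same set of planar points seen before and after a 180-degree turn is related by a single homography, so a homography fit plus a reprojection-error check can verify candidate loop closures across opposing viewpoints. This is a hedged, minimal illustration using a plain DLT fit on synthetic points, not the authors' pipeline; the point set, rotation, offset, and error threshold are all made up for the example.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct Linear Transform: fit H such that dst ~ H @ src (homogeneous)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # Solution is the right singular vector for the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 3)

def reprojection_errors(H, src, dst):
    """Per-point distance between H-projected src points and dst points."""
    src_h = np.c_[src, np.ones(len(src))]
    proj = src_h @ H.T
    proj = proj[:, :2] / proj[:, 2:]
    return np.linalg.norm(proj - dst, axis=1)

# Synthetic ground-plane points, revisited after a 180-degree turn plus a
# small planar translation (hypothetical numbers for illustration only).
rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(8, 2))
theta = np.pi  # opposing viewpoint
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
pts_rot = pts @ R.T + np.array([0.3, -0.2])

H = estimate_homography(pts, pts_rot)
errs = reprojection_errors(H, pts, pts_rot)
# Planar motion is exactly explained by a homography, so errors are ~0;
# on real correspondences one would instead threshold RANSAC inliers.
print(errs.max() < 1e-6)
```

In a real system one would fit the homography robustly (e.g., RANSAC over deep feature correspondences) and accept the loop-closure candidate only when enough inliers survive the reprojection check; the exact fitting and thresholding here are illustrative assumptions.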
Related papers
- A Refreshed Similarity-based Upsampler for Direct High-Ratio Feature Upsampling [54.05517338122698]
We propose an explicitly controllable query-key feature alignment from both semantic-aware and detail-aware perspectives.
We also develop a fine-grained neighbor selection strategy on HR features, which is simple yet effective for alleviating mosaic artifacts.
Our proposed ReSFU framework consistently achieves satisfactory performance on different segmentation applications.
arXiv Detail & Related papers (2024-07-02T14:12:21Z)
- EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition [6.996304653818122]
We propose a simple yet powerful approach to better exploit the potential of a foundation model for Visual Place Recognition.
We first demonstrate that features extracted from self-attention layers can serve as a powerful re-ranker for VPR.
We then demonstrate that a single-stage method leveraging internal ViT layers for pooling can generate global features that achieve state-of-the-art results.
arXiv Detail & Related papers (2024-05-28T11:24:41Z)
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- RD-VIO: Robust Visual-Inertial Odometry for Mobile Augmented Reality in Dynamic Environments [55.864869961717424]
It is typically challenging for visual or visual-inertial odometry systems to handle the problems of dynamic scenes and pure rotation.
We design a novel visual-inertial odometry (VIO) system called RD-VIO to handle both of these problems.
arXiv Detail & Related papers (2023-10-23T16:30:39Z)
- DH-PTAM: A Deep Hybrid Stereo Events-Frames Parallel Tracking And Mapping System [1.443696537295348]
This paper presents a robust approach for a visual parallel tracking and mapping (PTAM) system that excels in challenging environments.
Our proposed method combines the strengths of heterogeneous multi-modal visual sensors in a unified reference frame.
Our implementation's research-based Python API is publicly available on GitHub.
arXiv Detail & Related papers (2023-06-02T19:52:13Z)
- Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z)
- Consistency Regularization for Deep Face Anti-Spoofing [69.70647782777051]
Face anti-spoofing (FAS) plays a crucial role in securing face recognition systems.
Motivated by this exciting observation, we conjecture that encouraging feature consistency of different views may be a promising way to boost FAS models.
We enhance both Embedding-level and Prediction-level Consistency Regularization (EPCR) in FAS.
arXiv Detail & Related papers (2021-11-24T08:03:48Z)
- PIT: Position-Invariant Transform for Cross-FoV Domain Adaptation [53.428312630479816]
We observe that the Field of View (FoV) gap induces noticeable instance appearance differences between the source and target domains.
Motivated by these observations, we propose the Position-Invariant Transform (PIT) to better align images in different domains.
arXiv Detail & Related papers (2021-08-16T15:16:47Z)
- RoRD: Rotation-Robust Descriptors and Orthographic Views for Local Feature Matching [32.10261486751993]
We present a novel framework that combines learning of invariant descriptors through data augmentation and viewpoint projection.
We evaluate the effectiveness of the proposed approach on key tasks including pose estimation and visual place recognition.
arXiv Detail & Related papers (2021-03-15T17:40:25Z)
- ConvSequential-SLAM: A Sequence-based, Training-less Visual Place Recognition Technique for Changing Environments [19.437998213418446]
Visual Place Recognition (VPR) is the ability to correctly recall a previously visited place under changing viewpoints and appearances.
We present a new handcrafted VPR technique that achieves state-of-the-art place matching performance under challenging conditions.
arXiv Detail & Related papers (2020-09-28T16:31:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.