Improving Worst Case Visual Localization Coverage via Place-specific
Sub-selection in Multi-camera Systems
- URL: http://arxiv.org/abs/2206.13883v1
- Date: Tue, 28 Jun 2022 10:59:39 GMT
- Title: Improving Worst Case Visual Localization Coverage via Place-specific
Sub-selection in Multi-camera Systems
- Authors: Stephen Hausler, Ming Xu, Sourav Garg, Punarjay Chakravarty, Shubham
Shrivastava, Ankit Vora, Michael Milford
- Abstract summary: 6-DoF visual localization systems utilize principled approaches rooted in 3D geometry to perform accurate camera pose estimation of images to a map.
We demonstrate substantially improved worst-case localization performance compared to using off-the-shelf pipelines.
Our proposed approach is particularly applicable to the crowdsharing model of autonomous vehicle deployment.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 6-DoF visual localization systems utilize principled approaches rooted in 3D
geometry to perform accurate camera pose estimation of images to a map. Current
techniques use hierarchical pipelines and learned 2D feature extractors to
improve scalability and increase performance. However, despite gains in typical
recall@0.25m-type metrics, these systems still have limited utility for
real-world applications like autonomous vehicles because of their 'worst' areas
of performance - the locations where they provide insufficient recall at a
certain required error tolerance. Here we investigate the utility of using
'place-specific configurations', where a map is segmented into a number of
places, each with its own configuration for modulating the pose estimation
step, in this case selecting a camera within a multi-camera system. On the Ford
AV benchmark dataset, we demonstrate substantially improved worst-case
localization performance compared to using off-the-shelf pipelines - minimizing
the percentage of the dataset which has low recall at a certain error
tolerance, as well as improved overall localization performance. Our proposed
approach is particularly applicable to the crowdsharing model of autonomous
vehicle deployment, where a fleet of AVs regularly traverses a known route.
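The place-specific sub-selection idea in the abstract can be sketched as follows. This is a minimal, hypothetical illustration (the names `best_camera_per_place` and the camera identifiers are assumptions, not from the paper): an offline traverse records, for each map place and each camera, whether localization succeeded within the error tolerance, and the camera with the highest per-place recall is selected for that place at deployment time.

```python
from collections import defaultdict

def best_camera_per_place(calibration_results):
    """calibration_results: iterable of (place_id, camera_id, success_bool)
    gathered from an offline traverse of the known route.
    Returns {place_id: camera_id with the highest localization recall}."""
    stats = defaultdict(lambda: [0, 0])  # (place, camera) -> [successes, attempts]
    for place, cam, ok in calibration_results:
        stats[(place, cam)][1] += 1
        if ok:
            stats[(place, cam)][0] += 1
    best = {}
    for (place, cam), (succ, total) in stats.items():
        recall = succ / total
        if place not in best or recall > best[place][1]:
            best[place] = (cam, recall)
    return {place: cam for place, (cam, _) in best.items()}

# Toy example: in place 0 the rear camera localizes more reliably,
# in place 1 the front camera does.
results = [
    (0, "front", True), (0, "front", False),
    (0, "rear", True), (0, "rear", True),
    (1, "front", True), (1, "rear", False),
]
print(best_camera_per_place(results))  # {0: 'rear', 1: 'front'}
```

At query time, a vehicle in place `p` would localize using only the camera chosen for `p`, which is how a per-place configuration can lift worst-case recall without changing the underlying hierarchical pipeline.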
Related papers
- Neural Semantic Map-Learning for Autonomous Vehicles [85.8425492858912]
We present a mapping system that fuses local submaps gathered from a fleet of vehicles at a central instance to produce a coherent map of the road environment.
Our method jointly aligns and merges the noisy and incomplete local submaps using a scene-specific Neural Signed Distance Field.
We leverage memory-efficient sparse feature-grids to scale to large areas and introduce a confidence score to model uncertainty in scene reconstruction.
arXiv Detail & Related papers (2024-10-10T10:10:03Z)
- SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality [50.179377002092416]
We propose an efficient visual localization method capable of high-quality rendering with fewer parameters.
Our method achieves superior or comparable rendering and localization performance to state-of-the-art implicit-based visual localization approaches.
arXiv Detail & Related papers (2024-09-21T08:46:16Z)
- Structured Pruning for Efficient Visual Place Recognition [24.433604332415204]
Visual Place Recognition (VPR) is fundamental for the global re-localization of robots and devices.
Our work introduces a novel structured pruning method to streamline common VPR architectures.
This dual focus significantly enhances the efficiency of the system, reducing both map and model memory requirements and decreasing feature extraction and retrieval latencies.
arXiv Detail & Related papers (2024-09-12T08:32:25Z)
- FaVoR: Features via Voxel Rendering for Camera Relocalization [23.7893950095252]
Camera relocalization methods range from dense image alignment to direct camera pose regression from a query image.
We propose a novel approach that leverages a globally sparse yet locally dense 3D representation of 2D features.
By tracking and triangulating landmarks over a sequence of frames, we construct a sparse voxel map optimized to render image patch descriptors observed during tracking.
arXiv Detail & Related papers (2024-09-11T18:58:16Z)
- LoLep: Single-View View Synthesis with Locally-Learned Planes and Self-Attention Occlusion Inference [66.45326873274908]
We propose a novel method, LoLep, which regresses Locally-Learned planes from a single RGB image to represent scenes accurately.
Compared to MINE, our approach has an LPIPS reduction of 4.8%-9.0% and an RV reduction of 73.9%-83.5%.
arXiv Detail & Related papers (2023-07-23T03:38:55Z)
- Fast and Lightweight Scene Regressor for Camera Relocalization [1.6708069984516967]
Estimating the camera pose directly with respect to pre-built 3D models can be prohibitively expensive for several applications.
This study proposes a simple scene regression method that requires only a multi-layer perceptron network for mapping scene coordinates.
The proposed approach uses sparse descriptors to regress the scene coordinates, instead of a dense RGB image.
arXiv Detail & Related papers (2022-12-04T14:41:20Z)
- Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
- A comparison of uncertainty estimation approaches for DNN-based camera localization [6.053739577423792]
This work compares the performance of three uncertainty estimation methods: Monte Carlo Dropout (MCD), Deep Ensemble (DE), and Deep Evidential Regression (DER).
We achieve accurate camera localization and calibrated uncertainty, to the point that some methods can be used for detecting localization failures.
arXiv Detail & Related papers (2022-11-02T16:15:28Z)
- Coarse-to-fine Semantic Localization with HD Map for Autonomous Driving in Structural Scenes [1.1024591739346292]
We propose a cost-effective vehicle localization system with HD map for autonomous driving using cameras as primary sensors.
We formulate vision-based localization as a data association problem that maps visual semantics to landmarks in HD map.
We evaluate our method on two datasets and demonstrate that the proposed approach yields promising localization results in different driving scenarios.
arXiv Detail & Related papers (2021-07-06T11:58:55Z)
- Learning Condition Invariant Features for Retrieval-Based Localization from 1M Images [85.81073893916414]
We develop a novel method for learning more accurate and better generalizing localization features.
On the challenging Oxford RobotCar night condition, our method outperforms the well-known triplet loss by 24.4% in localization accuracy within 5m.
arXiv Detail & Related papers (2020-08-27T14:46:22Z)
- Multi-View Optimization of Local Feature Geometry [70.18863787469805]
We address the problem of refining the geometry of local image features from multiple views without known scene or camera geometry.
Our proposed method naturally complements the traditional feature extraction and matching paradigm.
We show that our method consistently improves the triangulation and camera localization performance for both hand-crafted and learned local features.
arXiv Detail & Related papers (2020-03-18T17:22:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.