Lazy Visual Localization via Motion Averaging
- URL: http://arxiv.org/abs/2307.09981v1
- Date: Wed, 19 Jul 2023 13:40:45 GMT
- Title: Lazy Visual Localization via Motion Averaging
- Authors: Siyan Dong, Shaohui Liu, Hengkai Guo, Baoquan Chen, Marc Pollefeys
- Abstract summary: We show that it is possible to achieve high localization accuracy without reconstructing the scene from the database.
Experiments show that our visual localization proposal, LazyLoc, achieves performance comparable to state-of-the-art structure-based methods.
- Score: 89.8709956317671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual (re)localization is critical for various applications in computer
vision and robotics. Its goal is to estimate the 6 degrees of freedom (DoF)
camera pose for each query image, based on a set of posed database images.
Currently, all leading solutions are structure-based: they either explicitly
construct 3D metric maps from the database with structure-from-motion, or
implicitly encode the 3D information with scene coordinate regression models.
In contrast, visual localization without reconstructing the scene in 3D
offers clear benefits: it makes deployment more convenient by reducing database
pre-processing time, relaxing storage requirements, and remaining robust to
imperfect reconstruction. In this technical report, we demonstrate that
it is possible to achieve high localization accuracy without reconstructing the
scene from the database. The key is a tailored motion-averaging scheme over
database-query pairs. Experiments show that our visual localization proposal,
LazyLoc, achieves performance comparable to state-of-the-art structure-based
methods. Furthermore, we showcase the
versatility of LazyLoc, which can be easily extended to handle complex
configurations such as multi-query co-localization and camera rigs.
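The abstract's key idea, fusing many database-query relative poses into one absolute query pose by motion averaging, can be illustrated with two standard building blocks: a chordal L2 rotation mean and a least-squares camera center from bearing rays. The Python/NumPy sketch below is illustrative only, not LazyLoc's actual algorithm; the function names, the pair-to-candidate conventions, and the toy inputs are all assumptions.

```python
import numpy as np

def chordal_mean(rotations):
    """L2 chordal mean of rotations: average the 3x3 matrices in
    Euclidean space, then project back onto SO(3) with an SVD."""
    M = np.mean(np.asarray(rotations, dtype=float), axis=0)
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # keep det = +1
    return U @ D @ Vt

def center_from_bearings(centers, directions):
    """Least-squares camera center from rays: each database-query pair
    constrains the query center to the line c_i + s * d_i. Minimizing the
    summed squared point-to-line distance gives a 3x3 linear system."""
    A, b = np.zeros((3, 3)), np.zeros(3)
    for c, d in zip(centers, directions):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector orthogonal to the ray
        A += P
        b += P @ np.asarray(c, dtype=float)
    return np.linalg.solve(A, b)

def Rz(a):
    """Rotation about the z-axis by angle a (radians), for the toy demo."""
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

# Toy usage: three pairs with slightly disagreeing rotation estimates,
# and rays from three database centers intersecting at the true center.
print(chordal_mean([Rz(0.1), Rz(0.0), Rz(-0.1)]))    # ~ identity
c_true = np.array([1.0, 2.0, 3.0])
cs = [np.zeros(3), np.array([5.0, 0.0, 0.0]), np.array([0.0, 5.0, 0.0])]
ds = [c_true - c for c in cs]                        # bearing toward query
print(center_from_bearings(cs, ds))                  # -> [1. 2. 3.]
```

In practice each candidate rotation would come from a two-view relative pose (e.g., an essential matrix) composed with the known database pose, and robust weighting would down-weight outlier pairs.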
Related papers
- KRONC: Keypoint-based Robust Camera Optimization for 3D Car Reconstruction [58.04846444985808]
This paper introduces KRONC, a novel approach that infers view poses by leveraging prior knowledge about the object being reconstructed, represented through semantic keypoints.
With a focus on vehicle scenes, KRONC estimates the positions of the views as the solution to a lightweight optimization problem that drives the keypoints' back-projections to converge at a single point.
arXiv Detail & Related papers (2024-09-09T08:08:05Z)
- SACReg: Scene-Agnostic Coordinate Regression for Visual Localization [16.866303169903237]
We propose a generalized scene coordinate regression (SCR) model trained once and then deployed in new test scenes, regardless of their scale, without any finetuning.
Instead of encoding the scene coordinates into the network weights, our model takes as input a database image with some sparse 2D pixel to 3D coordinate annotations.
We show that the database representation of images and their 2D-3D annotations can be highly compressed with negligible loss of localization performance.
arXiv Detail & Related papers (2023-07-21T16:56:36Z)
- Visual Localization using Imperfect 3D Models from the Internet [54.731309449883284]
This paper studies how imperfections in 3D models affect localization accuracy.
We find that 3D models from the Internet show promise as an easy-to-obtain scene representation.
arXiv Detail & Related papers (2023-04-12T16:15:05Z)
- Map-free Visual Relocalization: Metric Pose Relative to a Single Image [21.28513803531557]
We propose Map-free Relocalization, which uses only one photo of a scene to enable instant, metric-scaled relocalization.
Existing datasets are not suitable to benchmark map-free relocalization, due to their focus on large scenes or their limited variability.
We have constructed a new dataset of 655 small places of interest, such as sculptures, murals and fountains, collected worldwide.
arXiv Detail & Related papers (2022-10-11T14:49:49Z)
- Visual Localization via Few-Shot Scene Region Classification [84.34083435501094]
Visual (re)localization addresses the problem of estimating the 6-DoF camera pose of a query image captured in a known scene.
Recent advances in structure-based localization solve this problem by memorizing the mapping from image pixels to scene coordinates (a minimal sketch of the final pose-solving step in this pipeline follows the list below).
We propose a scene region classification approach to achieve fast and effective scene memorization with few-shot images.
arXiv Detail & Related papers (2022-08-14T22:39:02Z)
- RenderNet: Visual Relocalization Using Virtual Viewpoints in Large-Scale Indoor Environments [36.91498676137178]
We propose a virtual view synthesis-based approach, RenderNet, to enrich the database and refine poses in this setting.
The proposed method substantially improves performance in large-scale indoor environments, achieving improvements of 7.1% and 12.2% on the InLoc dataset.
arXiv Detail & Related papers (2022-07-26T00:08:43Z)
- MeshLoc: Mesh-Based Visual Localization [54.731309449883284]
We explore a more flexible alternative based on dense 3D meshes that does not require feature matching between database images to build the scene representation.
Surprisingly competitive results can be obtained when extracting features on renderings of these meshes, without any neural rendering stage.
Our results show that dense 3D model-based representations are a promising alternative to existing representations and point to interesting and challenging directions for future research.
arXiv Detail & Related papers (2022-07-21T21:21:10Z)
- SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion [86.77318031029404]
We propose a framework that performs scene reconstruction and semantic scene completion jointly in an incremental and real-time manner.
Our framework relies on a novel neural architecture designed to process occupancy maps and leverages voxel states to accurately and efficiently fuse semantic completion with the 3D global model.
arXiv Detail & Related papers (2020-10-26T15:31:52Z)
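Several entries above (SACReg, the few-shot scene region classifier) share the same final step: regressed 2D-to-3D correspondences are handed to a robust PnP solver to recover the 6-DoF pose. Below is a minimal sketch of that step with OpenCV; the synthetic correspondences, intrinsics, and variable names are illustrative assumptions, not taken from any of the papers.

```python
import numpy as np
import cv2

# Assumed inputs: 3D scene coordinates predicted per pixel by some scene
# coordinate regression model, plus the matching 2D pixel locations.
# Here both are fabricated from a known ground-truth pose for the demo.
rng = np.random.default_rng(0)
points_3d = rng.uniform(0.0, 4.0, size=(200, 3))       # synthetic scene points
K = np.array([[600.0,   0.0, 320.0],
              [  0.0, 600.0, 240.0],
              [  0.0,   0.0,   1.0]])                   # assumed intrinsics
rvec_gt = np.array([0.1, -0.2, 0.05])                   # ground truth, used only
tvec_gt = np.array([0.3, 0.1, 2.0])                     # to fabricate the 2D data
points_2d, _ = cv2.projectPoints(points_3d, rvec_gt, tvec_gt, K, None)

# Robust 6-DoF pose from 2D-3D matches; RANSAC discards bad regressions.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    points_3d, points_2d, K, distCoeffs=None,
    iterationsCount=1000, reprojectionError=2.0)
R, _ = cv2.Rodrigues(rvec)                 # world-to-camera rotation
camera_center = -R.T @ tvec.ravel()        # camera position in scene frame
```

The same pattern applies whether the 2D-3D matches come from a regression network or from feature matching against an SfM model; only the source of the correspondences changes.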