3D Surfel Map-Aided Visual Relocalization with Learned Descriptors
- URL: http://arxiv.org/abs/2104.03856v1
- Date: Thu, 8 Apr 2021 15:59:57 GMT
- Title: 3D Surfel Map-Aided Visual Relocalization with Learned Descriptors
- Authors: Haoyang Ye, Huaiyang Huang, Marco Hutter, Timothy Sandy, Ming Liu
- Abstract summary: We introduce a method for visual relocalization using the geometric information from a 3D surfel map.
A visual database is first built from global-index maps rendered from the 3D surfel map, which provide associations between image points and 3D surfels.
A hierarchical camera relocalization algorithm then utilizes the visual database to estimate 6-DoF camera poses.
- Score: 15.608529165143718
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce a method for visual relocalization using the
geometric information from a 3D surfel map. A visual database is first built
from global-index maps rendered from the 3D surfel map, which provide
associations between image points and 3D surfels. Surfel reprojection
constraints are utilized to optimize the keyframe poses and map points in the
visual database. A hierarchical camera relocalization algorithm then utilizes
the visual database to estimate 6-DoF camera poses. Learned descriptors are
further used to improve performance in challenging cases. We present
evaluations under real-world and simulated conditions to demonstrate the
effectiveness and efficiency of our method, which keeps the final camera poses
consistently well aligned with the 3D environment.
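The surfel reprojection constraint mentioned in the abstract can be read as a standard reprojection residual: a surfel center, transformed into a keyframe's camera frame and projected through the intrinsics, should land on the image point associatedated with it. Below is a minimal sketch of a single residual under the usual pinhole assumptions; the paper's actual refinement jointly optimizes keyframe poses and map points, which amounts to stacking these residuals in a nonlinear least-squares solver.

```python
# Sketch of one surfel reprojection residual (illustrative, not the paper's
# code). T_cw: 4x4 world-to-camera keyframe pose; K: 3x3 intrinsics;
# surfel_w: 3D surfel center in the world frame; uv: associated 2D image point.
import numpy as np


def surfel_reprojection_residual(T_cw, K, surfel_w, uv):
    p_cam = (T_cw @ np.append(surfel_w, 1.0))[:3]  # surfel in the camera frame
    proj = K @ p_cam
    return uv - proj[:2] / proj[2]                 # 2D reprojection error in pixels
```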
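The hierarchical relocalization itself is a coarse-to-fine query against that database: retrieve candidate keyframes with a learned global descriptor, match learned local descriptors to inherit the keyframes' 2D-3D surfel associations, then solve a PnP problem. The sketch below assumes this standard structure; the class and function names and the use of OpenCV's `solvePnPRansac` are illustrative choices, not the paper's implementation.

```python
# Minimal sketch of hierarchical relocalization against a pre-built visual
# database whose keyframes store a learned global descriptor, learned local
# descriptors, and the 3D surfel center associated with each keypoint.
from dataclasses import dataclass

import cv2
import numpy as np


@dataclass
class Keyframe:
    global_desc: np.ndarray  # (D,)   learned global descriptor
    local_descs: np.ndarray  # (N, d) learned local descriptors
    surfel_xyz: np.ndarray   # (N, 3) surfel centers, one per keypoint


def retrieve(query_desc, database, top_k=5):
    """Coarse step: rank keyframes by cosine similarity of global descriptors."""
    sims = [kf.global_desc @ query_desc
            / (np.linalg.norm(kf.global_desc) * np.linalg.norm(query_desc))
            for kf in database]
    return [database[i] for i in np.argsort(sims)[::-1][:top_k]]


def mutual_nn_matches(d1, d2):
    """Fine step: mutual nearest-neighbor matching of local descriptors."""
    dist = np.linalg.norm(d1[:, None] - d2[None, :], axis=2)
    nn12, nn21 = dist.argmin(1), dist.argmin(0)
    return [(i, j) for i, j in enumerate(nn12) if nn21[j] == i]


def relocalize(query_kpts, query_local, query_global, database, K):
    """Estimate a 6-DoF camera pose for a query image from 2D-3D matches."""
    pts2d, pts3d = [], []
    for kf in retrieve(query_global, database):
        for i, j in mutual_nn_matches(query_local, kf.local_descs):
            pts2d.append(query_kpts[i])     # 2D point in the query image
            pts3d.append(kf.surfel_xyz[j])  # 3D surfel inherited from the database
    if len(pts3d) < 4:                      # PnP needs at least 4 correspondences
        return None
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(pts3d, np.float32), np.asarray(pts2d, np.float32), K, None)
    return (rvec, tvec) if ok else None
```

The coarse retrieval keeps matching tractable in large maps, while the learned descriptors are what carry the method through the challenging cases mentioned in the abstract.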
Related papers
- Visual Localization in 3D Maps: Comparing Point Cloud, Mesh, and NeRF Representations [8.522160106746478]
We present a global visual localization system capable of localizing a single camera image across various 3D map representations.
Our system generates a database by synthesizing novel views of the scene, creating RGB and depth image pairs (a minimal sketch of this construction follows the list).
NeRF-synthesized images show superior performance, localizing query images at an average success rate of 72%.
arXiv Detail & Related papers (2024-08-21T19:37:17Z)
- 3DGS-ReLoc: 3D Gaussian Splatting for Map Representation and Visual ReLocalization [13.868258945395326]
This paper presents a novel system designed for 3D mapping and visual relocalization using 3D Gaussian Splatting.
Our proposed method uses LiDAR and camera data to create accurate and visually plausible representations of the environment.
arXiv Detail & Related papers (2024-03-17T23:06:12Z)
- Neural Voting Field for Camera-Space 3D Hand Pose Estimation [106.34750803910714]
We present a unified framework for camera-space 3D hand pose estimation from a single RGB image based on 3D implicit representation.
We propose a novel unified 3D dense regression scheme to estimate camera-space 3D hand pose via dense 3D point-wise voting in camera frustum.
arXiv Detail & Related papers (2023-05-07T16:51:34Z)
- Tracking by 3D Model Estimation of Unknown Objects in Videos [122.56499878291916]
We argue that representing the tracked object by a 2D segmentation or bounding box in each frame is limited, and instead propose to guide and improve 2D tracking with an explicit object representation.
Our representation tackles a complex long-term dense correspondence problem between all 3D points on the object for all video frames.
The proposed optimization minimizes a novel loss function to estimate the best 3D shape, texture, and 6DoF pose.
arXiv Detail & Related papers (2023-04-13T11:32:36Z)
- CAPE: Camera View Position Embedding for Multi-View 3D Object Detection [100.02565745233247]
Current query-based methods rely on global 3D position embeddings to learn the geometric correspondence between images and 3D space.
We propose a novel method based on CAmera view Position Embedding, called CAPE.
CAPE achieves state-of-the-art performance (61.0% NDS and 52.5% mAP) among all LiDAR-free methods on the nuScenes dataset.
arXiv Detail & Related papers (2023-03-17T18:59:54Z)
- Neural Correspondence Field for Object Pose Estimation [67.96767010122633]
We propose a method for estimating the 6DoF pose of a rigid object with an available 3D model from a single RGB image.
Unlike classical correspondence-based methods, which predict 3D object coordinates at pixels of the input image, the proposed method predicts 3D object coordinates at 3D query points sampled in the camera frustum (a sketch of this idea follows the list).
arXiv Detail & Related papers (2022-07-30T01:48:23Z)
- Improved Modeling of 3D Shapes with Multi-view Depth Maps [48.8309897766904]
We present a general-purpose framework for modeling 3D shapes using CNNs.
Using just a single depth image of the object, we can output a dense multi-view depth map representation of 3D objects.
arXiv Detail & Related papers (2020-09-07T17:58:27Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
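Two of the entries above lend themselves to short sketches. First, the database construction from the visual-localization comparison paper: render RGB and depth pairs at sampled poses, extract learned descriptors, and back-project keypoints through the rendered depth to obtain world-frame 3D points. In this hypothetical sketch, `render_rgbd` and `extract_descriptors` stand in for the chosen renderer (point cloud, mesh, or NeRF) and feature extractor; neither name comes from the paper.

```python
# Hypothetical sketch of building a localization database from synthesized
# views. render_rgbd and extract_descriptors are stand-ins, not a real API.
import numpy as np


def build_database(map_model, poses, K, render_rgbd, extract_descriptors):
    """Render RGB-depth pairs at sampled poses and index them for retrieval."""
    database = []
    for T_wc in poses:  # 4x4 camera-to-world pose of a synthesized view
        rgb, depth = render_rgbd(map_model, T_wc, K)
        kpts, local_descs, global_desc = extract_descriptors(rgb)
        # Back-project each keypoint through the rendered depth to obtain the
        # world-frame 3D point that later supplies 2D-3D matches for PnP.
        uv1 = np.concatenate([kpts, np.ones((len(kpts), 1))], axis=1)
        z = depth[kpts[:, 1].astype(int), kpts[:, 0].astype(int)]
        xyz_cam = (np.linalg.inv(K) @ uv1.T).T * z[:, None]
        xyz_h = np.concatenate([xyz_cam, np.ones((len(kpts), 1))], axis=1)
        xyz_world = (T_wc @ xyz_h.T).T[:, :3]
        database.append({"global_desc": global_desc,
                         "local_descs": local_descs, "xyz": xyz_world})
    return database
```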
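Second, the Neural Correspondence Field entry predicts 3D object coordinates at query points sampled in the camera frustum, so the pose can be recovered from the resulting 3D-3D correspondences via Kabsch-style fitting. In the sketch below, `field` is a hypothetical stand-in for the trained network, and the confidence gate is a crude simplification of the method's handling of visibility and robustness.

```python
# Hypothetical sketch of frustum-sampled correspondence prediction and 3D-3D
# pose fitting (Kabsch); `field` stands in for the trained network.
import numpy as np


def sample_frustum(K, shape, depths, n=2048, rng=np.random.default_rng(0)):
    """Random 3D query points inside the camera frustum."""
    h, w = shape
    uv1 = np.stack([rng.uniform(0, w, n), rng.uniform(0, h, n), np.ones(n)], 1)
    z = rng.uniform(*depths, n)
    return (np.linalg.inv(K) @ uv1.T).T * z[:, None]  # (n, 3) camera-frame points


def kabsch(obj, cam):
    """Rigid transform (R, t) with cam ~ R @ obj + t."""
    oc, cc = obj.mean(0), cam.mean(0)
    H = (obj - oc).T @ (cam - cc)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # keep a proper rotation (no reflection)
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cc - R @ oc


def estimate_pose(field, image, K, shape, depths=(0.2, 2.0)):
    cam_pts = sample_frustum(K, shape, depths)
    obj_pts, conf = field(image, cam_pts)  # predicted object coords + confidence
    keep = conf > 0.5                      # crude inlier gate for the sketch
    return kabsch(obj_pts[keep], cam_pts[keep])
```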
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.