Improved Scene Landmark Detection for Camera Localization
- URL: http://arxiv.org/abs/2401.18083v1
- Date: Wed, 31 Jan 2024 18:59:12 GMT
- Title: Improved Scene Landmark Detection for Camera Localization
- Authors: Tien Do and Sudipta N. Sinha
- Abstract summary: A method based on scene landmark detection (SLD) was recently proposed to address the storage, speed, and privacy limitations of structure-based camera localization.
It involves training a convolutional neural network (CNN) to detect a few predetermined, salient, scene-specific 3D points or landmarks.
We show that SLD's accuracy gap relative to 3D structure-based methods was due to insufficient model capacity and noisy labels during training.
- Score: 11.56648898250606
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Camera localization methods based on retrieval, local feature matching, and
3D structure-based pose estimation are accurate but require high storage, are
slow, and are not privacy-preserving. A method based on scene landmark
detection (SLD) was recently proposed to address these limitations. It involves
training a convolutional neural network (CNN) to detect a few predetermined,
salient, scene-specific 3D points or landmarks and computing camera pose from
the associated 2D-3D correspondences. Although SLD outperformed existing
learning-based approaches, it was notably less accurate than 3D structure-based
methods. In this paper, we show that the accuracy gap was due to insufficient
model capacity and noisy labels during training. To mitigate the capacity
issue, we propose to split the landmarks into subgroups and train a separate
network for each subgroup. To generate better training labels, we propose using
dense reconstructions to estimate visibility of scene landmarks. Finally, we
present a compact architecture to improve memory efficiency. Accuracy-wise, our
approach is on par with state-of-the-art structure-based methods on the
INDOOR-6 dataset but runs significantly faster and uses less storage. Code and
models can be found at https://github.com/microsoft/SceneLandmarkLocalization.
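The abstract's two central steps (splitting the landmarks into subgroups with one detector per subgroup, then collecting 2D-3D correspondences for pose estimation) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the authors' repository: `split_landmarks`, `gather_correspondences`, `make_stub_detector`, the random partitioning strategy, and the `min_conf` threshold are assumptions made for illustration, and the stub detector stands in for a trained CNN.

```python
import random
from typing import Callable, Dict, List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]
Detection = Tuple[float, float, float]  # (u, v, confidence)
# A detector maps an image to {landmark_id: detection} for its own subgroup.
Detector = Callable[[object], Dict[int, Detection]]


def split_landmarks(landmark_ids: List[int], num_groups: int, seed: int = 0) -> List[List[int]]:
    """Partition landmark ids into disjoint, near-equal subgroups.

    Assumption: a seeded random partition; the paper may group differently.
    """
    ids = list(landmark_ids)
    random.Random(seed).shuffle(ids)
    return [sorted(ids[g::num_groups]) for g in range(num_groups)]


def gather_correspondences(
    image: object,
    detectors: List[Detector],
    landmarks_3d: Dict[int, Point3D],
    min_conf: float = 0.5,
) -> List[Tuple[Point2D, Point3D]]:
    """Run every subgroup detector and keep confident 2D-3D pairs."""
    pairs = []
    for detect in detectors:
        for lid, (u, v, conf) in detect(image).items():
            if conf >= min_conf:
                pairs.append(((u, v), landmarks_3d[lid]))
    return pairs


def make_stub_detector(group: List[int]) -> Detector:
    """Stand-in for a trained network: even ids are 'seen', odd ids are not."""
    def detect(_image: object) -> Dict[int, Detection]:
        return {i: (float(i), 2.0 * i, 0.9 if i % 2 == 0 else 0.1) for i in group}
    return detect


# Toy scene with 10 landmarks split across 3 subgroup detectors.
landmarks_3d = {i: (float(i), 0.0, 1.0) for i in range(10)}
groups = split_landmarks(list(landmarks_3d), num_groups=3)
detectors = [make_stub_detector(g) for g in groups]
pairs = gather_correspondences(None, detectors, landmarks_3d)
print(len(groups), "subgroups,", len(pairs), "correspondences")
```

The resulting `pairs` list is what a pose solver would consume; a perspective-n-point solver inside a RANSAC loop is a common choice for that step, though the sketch deliberately stops before it.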
Related papers
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D Signals [9.201550006194994]
Learnable matchers often underperform when there exist only small regions of co-visibility between image pairs.
We propose LFM-3D, a Learnable Feature Matching framework that uses models based on graph neural networks.
We show that the resulting improved correspondences lead to much higher relative posing accuracy for in-the-wild image pairs.
arXiv Detail & Related papers (2023-03-22T17:46:27Z)
- Fast and Lightweight Scene Regressor for Camera Relocalization [1.6708069984516967]
Estimating the camera pose directly with respect to pre-built 3D models can be prohibitively expensive for several applications.
This study proposes a simple scene regression method that requires only a multi-layer perceptron network for mapping scene coordinates.
The proposed approach uses sparse descriptors to regress the scene coordinates, instead of a dense RGB image.
arXiv Detail & Related papers (2022-12-04T14:41:20Z)
- Visual Localization via Few-Shot Scene Region Classification [84.34083435501094]
Visual (re)localization addresses the problem of estimating the 6-DoF camera pose of a query image captured in a known scene.
Recent advances in structure-based localization solve this problem by memorizing the mapping from image pixels to scene coordinates.
We propose a scene region classification approach to achieve fast and effective scene memorization with few-shot images.
arXiv Detail & Related papers (2022-08-14T22:39:02Z)
- Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z)
- Soft Expectation and Deep Maximization for Image Feature Detection [68.8204255655161]
We propose SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space.
Our results show that this new model trained using SEDM is able to better localize the underlying 3D points in a scene.
arXiv Detail & Related papers (2021-04-21T00:35:32Z)
- Learning Camera Localization via Dense Scene Matching [45.0957383562443]
Camera localization aims to estimate 6 DoF camera poses from RGB images.
Recent learning-based approaches encode structures into a specific convolutional neural network (CNN).
We present a new method for camera localization using dense scene matching (DSM).
arXiv Detail & Related papers (2021-03-31T03:47:42Z)
- DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization [56.15308829924527]
We propose a Siamese network that jointly learns 3D local feature detection and description directly from raw 3D points.
For detecting 3D keypoints we predict the discriminativeness of the local descriptors in an unsupervised manner.
Experiments on various benchmarks demonstrate that our method achieves competitive results for both global point cloud retrieval and local point cloud registration.
arXiv Detail & Related papers (2020-07-17T20:21:22Z)
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features [51.04841465193678]
We leverage a 3D fully convolutional network for 3D point clouds.
We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point.
Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2020-03-06T12:51:09Z)