Visual Localization via Few-Shot Scene Region Classification
- URL: http://arxiv.org/abs/2208.06933v1
- Date: Sun, 14 Aug 2022 22:39:02 GMT
- Title: Visual Localization via Few-Shot Scene Region Classification
- Authors: Siyan Dong, Shuzhe Wang, Yixin Zhuang, Juho Kannala, Marc Pollefeys,
Baoquan Chen
- Abstract summary: Visual (re)localization addresses the problem of estimating the 6-DoF camera pose of a query image captured in a known scene.
Recent advances in structure-based localization solve this problem by memorizing the mapping from image pixels to scene coordinates.
We propose a scene region classification approach to achieve fast and effective scene memorization with few-shot images.
- Score: 84.34083435501094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual (re)localization addresses the problem of estimating the 6-DoF (Degrees
of Freedom) camera pose of a query image captured in a known scene, which is a
key building block of many computer vision and robotics applications. Recent
advances in structure-based localization solve this problem by memorizing the
mapping from image pixels to scene coordinates with neural networks to build
2D-3D correspondences for camera pose optimization. However, such memorization
requires training on large amounts of posed images in each scene, which is heavy and
inefficient. In contrast, a few-shot set of images is usually sufficient to cover
the main regions of a scene for a human operator to perform visual
localization. In this paper, we propose a scene region classification approach
to achieve fast and effective scene memorization with few-shot images. Our
insight is to leverage a) a pre-learned feature extractor, b) a scene region
classifier, and c) a meta-learning strategy to accelerate training while
mitigating overfitting. We evaluate our method on both indoor and outdoor
benchmarks. The experiments validate the effectiveness of our method in the
few-shot setting, and the training time is significantly reduced to only a few
minutes. Code available at: \url{https://github.com/siyandong/SRC}
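The abstract's pipeline (classify pixels into scene regions, then use the regions' known 3D positions to build 2D-3D correspondences for pose optimization) can be sketched in miniature. Everything below is a toy stand-in, not the paper's actual model: the classifier is a trivial rule rather than a trained CNN, the region centroids are made-up coordinates, and a real system would feed the resulting correspondences to a PnP-RANSAC solver.

```python
# Toy sketch of region-classification-based correspondence building.
# Hypothetical values throughout; the SRC method uses a learned
# hierarchical classifier, not this nearest-integer rule.

# Assume the scene was partitioned into regions, each with a known 3D centroid.
REGION_CENTROIDS = {
    0: (1.0, 0.0, 2.0),
    1: (0.5, 1.2, 1.8),
    2: (-0.3, 0.7, 2.5),
}

def classify_pixel(pixel_feature):
    """Stand-in for the learned region classifier: maps a 1-D toy
    feature to a region id by rounding and clamping into [0, 2]."""
    return max(0, min(2, round(pixel_feature)))

def build_correspondences(pixels):
    """Map each (u, v, feature) pixel to a 2D-3D correspondence.

    Returns a list of ((u, v), (x, y, z)) pairs that a PnP solver
    (e.g. RANSAC-based) would consume to optimize the camera pose.
    """
    corr = []
    for u, v, feat in pixels:
        region = classify_pixel(feat)
        corr.append(((u, v), REGION_CENTROIDS[region]))
    return corr

pixels = [(10, 20, 0.1), (30, 40, 1.2), (50, 60, 1.9)]
print(build_correspondences(pixels))
```

Predicting a coarse region per pixel (instead of regressing a full 3D coordinate) is what makes few-shot training tractable: a classifier over a small label set needs far fewer posed images than a dense coordinate regressor.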
Related papers
- Self-supervised Learning of Neural Implicit Feature Fields for Camera Pose Refinement [32.335953514942474]
This paper proposes to jointly learn the scene representation along with a 3D dense feature field and a 2D feature extractor.
We learn the underlying geometry of the scene with an implicit field through volumetric rendering and design our feature field to leverage intermediate geometric information encoded in the implicit field.
Visual localization is then achieved by aligning the image-based features and the rendered volumetric features.
arXiv Detail & Related papers (2024-06-12T17:51:53Z) - Improved Scene Landmark Detection for Camera Localization [11.56648898250606]
A method based on scene landmark detection (SLD) was recently proposed to address the limitations of prior localization approaches.
It involves training a convolutional neural network (CNN) to detect a few predetermined, salient, scene-specific 3D points or landmarks.
We show that the accuracy gap was due to insufficient model capacity and noisy labels during training.
arXiv Detail & Related papers (2024-01-31T18:59:12Z) - Lazy Visual Localization via Motion Averaging [89.8709956317671]
We show that it is possible to achieve high localization accuracy without reconstructing the scene from the database.
Experiments show that our visual localization proposal, LazyLoc, achieves comparable performance against state-of-the-art structure-based methods.
arXiv Detail & Related papers (2023-07-19T13:40:45Z) - PixSelect: Less but Reliable Pixels for Accurate and Efficient
Localization [0.0]
We address the problem of estimating the global 6 DoF camera pose from a single RGB image in a given environment.
Our work exceeds state-of-the-art methods on the outdoor Cambridge Landmarks dataset.
arXiv Detail & Related papers (2022-06-08T09:46:03Z) - Continual Learning for Image-Based Camera Localization [14.47046413243358]
We study the problem of visual localization in a continual learning setup.
Our results show that similar to the classification domain, non-stationary data induces catastrophic forgetting in deep networks for visual localization.
We propose a new sampling method based on coverage score (Buff-CS) that adapts the existing sampling strategies in the buffering process to the problem of visual localization.
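The buffering idea behind a coverage score can be illustrated with a greedy farthest-point selection over camera positions: keep the samples that spread over the scene rather than a random subset. This is a hypothetical sketch of the general coverage notion, not the paper's actual Buff-CS scoring; the 2-D positions and capacity below are invented for illustration.

```python
# Hedged sketch: greedy coverage-maximizing buffer selection.
# The real Buff-CS method may score coverage differently.
import math

def dist(a, b):
    """Euclidean distance between two camera positions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def coverage_buffer(positions, capacity):
    """Greedy farthest-point selection: repeatedly add the candidate
    farthest from everything already buffered, so the kept samples
    cover the scene instead of clustering in one region."""
    if not positions:
        return []
    buffer = [positions[0]]
    while len(buffer) < min(capacity, len(positions)):
        best = max(
            (p for p in positions if p not in buffer),
            key=lambda p: min(dist(p, q) for q in buffer),
        )
        buffer.append(best)
    return buffer

# Three clusters of camera positions; a capacity-3 buffer keeps one per cluster.
poses = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 0)]
print(coverage_buffer(poses, 3))
```

A random or FIFO buffer could easily keep two near-duplicate views and forget a whole scene region, which is exactly the catastrophic-forgetting failure the summary describes.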
arXiv Detail & Related papers (2021-08-20T11:18:05Z) - VS-Net: Voting with Segmentation for Visual Localization [72.8165619061249]
We propose a novel visual localization framework that establishes 2D-to-3D correspondences between the query image and the 3D map with a series of learnable scene-specific landmarks.
Our proposed VS-Net is extensively tested on multiple public benchmarks and can outperform state-of-the-art visual localization methods.
arXiv Detail & Related papers (2021-05-23T08:44:11Z) - Learning Camera Localization via Dense Scene Matching [45.0957383562443]
Camera localization aims to estimate 6-DoF camera poses from RGB images.
Recent learning-based approaches encode scene structures into a specific convolutional neural network (CNN).
We present a new method for camera localization using dense scene matching (DSM).
arXiv Detail & Related papers (2021-03-31T03:47:42Z) - Back to the Feature: Learning Robust Camera Localization from Pixels to
Pose [114.89389528198738]
We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model.
The system can localize in large environments given coarse pose priors but also improve the accuracy of sparse feature matching.
arXiv Detail & Related papers (2021-03-16T17:40:12Z) - TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization
Tasks [79.01176229586855]
We propose a novel supervised pretraining paradigm for clip features that considers background clips and global video information to improve temporal sensitivity.
Extensive experiments show that using features trained with our novel pretraining strategy significantly improves the performance of recent state-of-the-art methods on three tasks.
arXiv Detail & Related papers (2020-11-23T15:40:15Z) - Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision.
We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.