Related papers: Learning Camera Localization via Dense Scene Matching

Learning Camera Localization via Dense Scene Matching

URL: http://arxiv.org/abs/2103.16792v1
Date: Wed, 31 Mar 2021 03:47:42 GMT
Title: Learning Camera Localization via Dense Scene Matching
Authors: Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu, Ping Tan
Abstract summary: Camera localization aims to estimate 6 DoF camera poses from RGB images. Recent learning-based approaches encode structures into a specific convolutional neural network (CNN) We present a new method for camera localization using dense matching (DSM)
Score: 45.0957383562443
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Camera localization aims to estimate 6 DoF camera poses from RGB images. Traditional methods detect and match interest points between a query image and a pre-built 3D model. Recent learning-based approaches encode scene structures into a specific convolutional neural network (CNN) and thus are able to predict dense coordinates from RGB images. However, most of them require re-training or re-adaption for a new scene and have difficulties in handling large-scale scenes due to limited network capacity. We present a new method for scene agnostic camera localization using dense scene matching (DSM), where a cost volume is constructed between a query image and a scene. The cost volume and the corresponding coordinates are processed by a CNN to predict dense coordinates. Camera poses can then be solved by PnP algorithms. In addition, our method can be extended to temporal domain, which leads to extra performance boost during testing time. Our scene-agnostic approach achieves comparable accuracy as the existing scene-specific approaches, such as KFNet, on the 7scenes and Cambridge benchmark. This approach also remarkably outperforms state-of-the-art scene-agnostic dense coordinate regression network SANet. The Code is available at https://github.com/Tangshitao/Dense-Scene-Matching.

Related papers

GLACE: Global Local Accelerated Coordinate Encoding [66.87005863868181]
Scene coordinate regression methods are effective in small-scale scenes but face significant challenges in large-scale scenes. We propose GLACE, which integrates pre-trained global and local encodings and enables SCR to scale to large scenes with only a single small-sized network. Our method achieves state-of-the-art results on large-scale scenes with a low-map-size model.
arXiv Detail & Related papers (2024-06-06T17:59:50Z)
Improved Scene Landmark Detection for Camera Localization [11.56648898250606]
Method based on scene landmark detection (SLD) was recently proposed to address these limitations. It involves training a convolutional neural network (CNN) to detect a few predetermined, salient, scene-specific 3D points or landmarks. We show that the accuracy gap was due to insufficient model capacity and noisy labels during training.
arXiv Detail & Related papers (2024-01-31T18:59:12Z)
SACReg: Scene-Agnostic Coordinate Regression for Visual Localization [16.866303169903237]
We propose a generalized SCR model trained once in new test scenes, regardless of their scale, without any finetuning. Instead of encoding the scene coordinates into the network weights, our model takes as input a database image with some sparse 2D pixel to 3D coordinate annotations. We show that the database representation of images and their 2D-3D annotations can be highly compressed with negligible loss of localization performance.
arXiv Detail & Related papers (2023-07-21T16:56:36Z)
HSCNet++: Hierarchical Scene Coordinate Classification and Regression for Visual Localization with Transformer [23.920690073252636]
We present a new hierarchical scene coordinate network to predict pixel scene coordinates in a coarse-to-fine manner from a single RGB image. The proposed method, which is an extension of HSCNet, allows us to train compact models which scale robustly to large environments.
arXiv Detail & Related papers (2023-05-05T15:00:14Z)
NeuMap: Neural Coordinate Mapping by Auto-Transdecoder for Camera Localization [60.73541222862195]
NeuMap is an end-to-end neural mapping method for camera localization. It encodes a whole scene into a grid of latent codes, with which a Transformer-based auto-decoder regresses 3D coordinates of query pixels.
arXiv Detail & Related papers (2022-11-21T04:46:22Z)
Graph Attention Network for Camera Relocalization on Dynamic Scenes [1.0398909602421018]
We devise a graph attention network-based approach for learning a scene triangle mesh representation in order to estimate an image camera position in a dynamic environment. Our approach significantly improves the camera pose accuracy of the state-of-the-art method from $0.358$ to $0.506$ on the RIO10 benchmark for dynamic indoor camera relocalization.
arXiv Detail & Related papers (2022-09-29T18:57:52Z)
Visual Localization via Few-Shot Scene Region Classification [84.34083435501094]
Visual (re)localization addresses the problem of estimating the 6-DoF camera pose of a query image captured in a known scene. Recent advances in structure-based localization solve this problem by memorizing the mapping from image pixels to scene coordinates. We propose a scene region classification approach to achieve fast and effective scene memorization with few-shot images.
arXiv Detail & Related papers (2022-08-14T22:39:02Z)
VS-Net: Voting with Segmentation for Visual Localization [72.8165619061249]
We propose a novel visual localization framework that establishes 2D-to-3D correspondences between the query image and the 3D map with a series of learnable scene-specific landmarks. Our proposed VS-Net is extensively tested on multiple public benchmarks and can outperform state-of-the-art visual localization methods.
arXiv Detail & Related papers (2021-05-23T08:44:11Z)
Back to the Feature: Learning Robust Camera Localization from Pixels to Pose [114.89389528198738]
We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model. The system can localize in large environments given coarse pose priors but also improve the accuracy of sparse feature matching.
arXiv Detail & Related papers (2021-03-16T17:40:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.