SC-wLS: Towards Interpretable Feed-forward Camera Re-localization
- URL: http://arxiv.org/abs/2210.12748v1
- Date: Sun, 23 Oct 2022 15:15:48 GMT
- Title: SC-wLS: Towards Interpretable Feed-forward Camera Re-localization
- Authors: Xin Wu, Hao Zhao, Shunkai Li, Yingdian Cao, Hongbin Zha
- Abstract summary: Visual re-localization aims to recover camera poses in a known environment, which is vital for applications like robotics or augmented reality.
Feed-forward absolute camera pose regression methods directly output poses by a network, but suffer from low accuracy.
We propose a feed-forward method termed SC-wLS that exploits all scene coordinate estimates for weighted least squares pose regression.
- Score: 29.332038781334443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual re-localization aims to recover camera poses in a known environment,
which is vital for applications like robotics or augmented reality.
Feed-forward absolute camera pose regression methods directly output poses by a
network, but suffer from low accuracy. Meanwhile, scene coordinate based
methods are accurate, but need iterative RANSAC post-processing, which brings
challenges to efficient end-to-end training and inference. In order to have the
best of both worlds, we propose a feed-forward method termed SC-wLS that
exploits all scene coordinate estimates for weighted least squares pose
regression. This differentiable formulation exploits a weight network imposed
on 2D-3D correspondences, and requires pose supervision only. Qualitative
results demonstrate the interpretability of learned weights. Evaluations on
7Scenes and Cambridge datasets show significantly improved performance when
compared with former feed-forward counterparts. Moreover, our SC-wLS method
enables a new capability: self-supervised test-time adaptation on the weight
network. Codes and models are publicly available.
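To make the weighted least-squares idea concrete, the following is a minimal sketch of pose estimation from 2D-3D correspondences with per-correspondence weights (which, in SC-wLS, would come from the weight network). It uses a DLT-style algebraic solver; the function name, solver choice, and rotation projection step are assumptions for illustration, not the authors' released implementation.

```python
import numpy as np

def weighted_dlt_pose(pts2d, pts3d, weights, K):
    """Sketch: weighted least-squares pose from N >= 6 2D-3D correspondences.

    pts2d:   (N, 2) pixel coordinates
    pts3d:   (N, 3) predicted scene coordinates
    weights: (N,)   per-correspondence weights (e.g. from a weight network)
    K:       (3, 3) camera intrinsics
    """
    # Normalize pixel coordinates with the intrinsics.
    ones = np.ones((pts2d.shape[0], 1))
    xy = (np.linalg.inv(K) @ np.hstack([pts2d, ones]).T).T[:, :2]

    # Each correspondence contributes two weighted rows to A p = 0,
    # where p holds the 12 entries of the 3x4 pose matrix [R | t].
    rows = []
    for (u, v), X, w in zip(xy, pts3d, weights):
        Xh = np.append(X, 1.0)  # homogeneous 3D point
        rows.append(w * np.hstack([Xh, np.zeros(4), -u * Xh]))
        rows.append(w * np.hstack([np.zeros(4), Xh, -v * Xh]))
    A = np.stack(rows)

    # Weighted least-squares solution: right singular vector of the
    # smallest singular value of A.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)

    # Project the 3x3 block onto a rotation and undo the unknown scale/sign.
    U, S, Vt2 = np.linalg.svd(P[:, :3])
    R, scale = U @ Vt2, S.mean()
    if np.linalg.det(R) < 0:
        R, scale = -R, -scale
    return R, P[:, 3] / scale
```

In a framework with a differentiable SVD, gradients from a pose loss can flow back into the weights, which is what allows the weight network to be trained with pose supervision only.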
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
- SRPose: Two-view Relative Pose Estimation with Sparse Keypoints [51.49105161103385]
SRPose is a sparse keypoint-based framework for two-view relative pose estimation in camera-to-world and object-to-camera scenarios.
It achieves competitive or superior performance compared to state-of-the-art methods in terms of accuracy and speed.
It is robust to different image sizes and camera intrinsics, and can be deployed with low computing resources.
arXiv Detail & Related papers (2024-07-11T05:46:35Z)
- SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning [17.99904937160487]
We introduce SCIPaD, a novel approach that incorporates spatial clues for unsupervised depth-pose joint learning.
SCIPaD achieves a reduction of 22.2% in average translation error and 34.8% in average angular error for the camera pose estimation task on the KITTI Odometry dataset.
arXiv Detail & Related papers (2024-07-07T06:52:51Z)
- WSCLoc: Weakly-Supervised Sparse-View Camera Relocalization [42.85368902409545]
WSCLoc is a system capable of being customized to various deep learning-based relocalization models.
In the initial stage, WSCLoc employs a multilayer perceptron-based structure called WFT-NeRF to co-optimize image reconstruction quality.
In the second stage, we co-optimize the pre-trained WFT-NeRF and WFT-Pose.
arXiv Detail & Related papers (2024-03-22T15:15:44Z)
- Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning [9.33753001494221]
Network pruning is a widely used technique to reduce computation cost and model size for deep neural networks.
In this paper, we develop a systematic weight-pruning optimization approach based on Surrogate Lagrangian relaxation.
arXiv Detail & Related papers (2023-04-08T22:48:30Z)
- Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
In this work, a simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z)
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel architecture named SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, the inference is more efficient than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- Domain Adaptation of Networks for Camera Pose Estimation: Learning Camera Pose Estimation Without Pose Labels [8.409695277909421]
One of the key criticisms of deep learning is that large amounts of expensive and difficult-to-acquire training data are required to train models.
DANCE enables the training of models without access to any labels on the target task.
It renders labeled synthetic images from the 3D model and bridges the inevitable domain gap between synthetic and real images.
arXiv Detail & Related papers (2021-11-29T17:45:38Z)
- LENS: Localization enhanced by NeRF synthesis [3.4386226615580107]
We demonstrate improved camera pose regression thanks to an additional synthetic dataset rendered by the NeRF class of algorithms.
We further improve the localization accuracy of pose regressors by using synthesized realistic and geometry-consistent images as data augmentation during training.
arXiv Detail & Related papers (2021-10-13T08:15:08Z)
- Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnP(L)) aims at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
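For the point-only special case this summary refers to, a hedged illustration with OpenCV's solvePnP is shown below; the correspondences and intrinsics are fabricated for the example, and line features and uncertainty modeling (the paper's actual contributions) are not covered.

```python
import numpy as np
import cv2

# Hypothetical intrinsics and six non-coplanar 3D model points.
K = np.array([[800., 0., 320.],
              [0., 800., 240.],
              [0., 0., 1.]])
pts3d = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.],
                  [0., 0., 1.], [1., 1., 0.], [1., 0., 1.]])

# Fabricate pixel observations from a known ground-truth pose so the
# recovered pose can be checked against it.
R_gt, _ = cv2.Rodrigues(np.array([0.1, -0.2, 0.05]))
t_gt = np.array([[0.3], [0.1], [4.0]])
proj = (K @ (R_gt @ pts3d.T + t_gt)).T
pts2d = proj[:, :2] / proj[:, 2:]

# Point-only PnP from 2D-3D coordinates.
ok, rvec, tvec = cv2.solvePnP(pts3d, pts2d, K, None)
R_est, _ = cv2.Rodrigues(rvec)
print(ok, np.allclose(R_est, R_gt, atol=1e-4), np.allclose(tvec, t_gt, atol=1e-4))
```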
arXiv Detail & Related papers (2021-07-08T15:19:36Z)
- Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration [67.69257782645789]
We propose piecewise transformation fields that learn 3D translation vectors to map any query point in posed space to its corresponding position in rest-pose space.
We show that fitting parametric models with poses by our network results in much better registration quality, especially for extreme poses.
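As a loose sketch (not the paper's architecture), one such transformation field could be a small MLP that maps a posed-space query point, together with a per-part code, to a 3D translation toward rest-pose space; the class name, layer sizes, and conditioning below are hypothetical.

```python
import torch
import torch.nn as nn

class TranslationField(nn.Module):
    """Hypothetical piecewise transformation field: maps posed-space points
    (plus a per-part code) to predicted rest-pose positions via a 3D offset."""

    def __init__(self, part_dim: int = 16, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + part_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # predicted 3D translation vector
        )

    def forward(self, query_xyz: torch.Tensor, part_code: torch.Tensor) -> torch.Tensor:
        # query_xyz: (N, 3) posed-space points; part_code: (N, part_dim)
        offset = self.mlp(torch.cat([query_xyz, part_code], dim=-1))
        return query_xyz + offset  # estimated rest-pose positions

# Usage sketch: map 1024 random posed-space points with a shared part code.
field = TranslationField()
rest_pts = field(torch.randn(1024, 3), torch.zeros(1024, 16))  # (1024, 3)
```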
arXiv Detail & Related papers (2021-04-16T15:16:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.