Related papers: KS-APR: Keyframe Selection for Robust Absolute Pose Regression

URL: http://arxiv.org/abs/2308.05459v2
Date: Sun, 28 Apr 2024 22:11:48 GMT
Title: KS-APR: Keyframe Selection for Robust Absolute Pose Regression
Authors: Changkun Liu, Yukun Zhao, Tristan Braud
Abstract summary: Markerless Mobile Augmented Reality (AR) aims to anchor digital content in the physical world without using specific 2D or 3D objects. End-to-end machine learning solutions infer the device's pose from a single monocular image. APR methods tend to yield significant inaccuracies for input images that are too distant from the training set. This paper introduces KS-APR, a pipeline that assesses the reliability of an estimated pose with minimal overhead.
Score: 2.541264438930729
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Markerless Mobile Augmented Reality (AR) aims to anchor digital content in the physical world without using specific 2D or 3D objects. Absolute Pose Regressors (APR) are end-to-end machine learning solutions that infer the device's pose from a single monocular image. Thanks to their low computation cost, they can be directly executed on the constrained hardware of mobile AR devices. However, APR methods tend to yield significant inaccuracies for input images that are too distant from the training set. This paper introduces KS-APR, a pipeline that assesses the reliability of an estimated pose with minimal overhead by combining the inference results of the APR and the prior images in the training set. Mobile AR systems tend to rely upon visual-inertial odometry to track the relative pose of the device during the experience. As such, KS-APR favours reliability over frequency, discarding unreliable poses. This pipeline can integrate most existing APR methods to improve accuracy by filtering unreliable images with their pose estimates. We implement the pipeline on three types of APR models on indoor and outdoor datasets. The median error on position and orientation is reduced for all models, and the proportion of large errors is minimized across datasets. Our method enables state-of-the-art APRs such as DFNet_dm to outperform single-image and sequential APR methods. These results demonstrate the scalability and effectiveness of KS-APR for visual localization tasks that do not require one-shot decisions.
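
The filtering idea in the abstract (keep a pose estimate only when it agrees with prior images in the training set) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the APR output and the training images are combined, so the nearest-keyframe lookup by position, the cosine-similarity check, the threshold value, and all names (`nearest_keyframe`, `is_reliable`, `sim_threshold`) are assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_keyframe(pred_position, train_set):
    """Return the training entry whose stored position is closest to the prediction.

    train_set is a list of (position, feature) tuples (hypothetical layout).
    """
    return min(train_set, key=lambda kf: math.dist(pred_position, kf[0]))


def is_reliable(pred_position, query_feature, train_set, sim_threshold=0.9):
    """Accept the predicted pose only if the query resembles the nearest keyframe.

    The threshold is an assumed, illustrative value, not one from the paper.
    """
    _, kf_feature = nearest_keyframe(pred_position, train_set)
    return cosine_similarity(query_feature, kf_feature) >= sim_threshold


# Toy data: two training keyframes with 3D positions and 2D features.
train_set = [((0.0, 0.0, 0.0), [1.0, 0.0]), ((5.0, 0.0, 0.0), [0.0, 1.0])]

# Prediction lands near the first keyframe and the query looks like it: kept.
kept = is_reliable((0.2, 0.0, 0.0), [0.9, 0.1], train_set)

# Prediction lands near the second keyframe but the query does not match: discarded.
discarded = is_reliable((4.8, 0.0, 0.0), [0.9, 0.1], train_set)
```

In a mobile AR setting as described above, a discarded frame would simply be skipped while visual-inertial odometry continues to track the relative pose until a reliable estimate arrives.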

Related papers

Scene-agnostic Pose Regression for Visual Localization [38.653251516665804]
We introduce a new task, Scene-agnostic Pose Regression (SPR), which can achieve accurate pose regression in a flexible way. In the unknown scenes of both 360SPR and 360Loc datasets, our method consistently outperforms APR, RPR and VO.
arXiv Detail & Related papers (2025-03-25T10:58:40Z)
UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler [62.06785782635153]
We propose a new model, UniDepthV2, capable of reconstructing metric 3D scenes from solely single images across domains. UniDepthV2 directly predicts metric 3D points from the input image at inference time without any additional information. Our model exploits a pseudo-spherical output representation, which disentangles the camera and depth representations.
arXiv Detail & Related papers (2025-02-27T14:03:15Z)
Cameras as Rays: Pose Estimation via Ray Diffusion [54.098613859015856]
Estimating camera poses is a fundamental task for 3D reconstruction and remains challenging given sparsely sampled views. We propose a distributed representation of camera pose that treats a camera as a bundle of rays. Our proposed methods, both regression- and diffusion-based, demonstrate state-of-the-art performance on camera pose estimation on CO3D.
arXiv Detail & Related papers (2024-02-22T18:59:56Z)
HR-APR: APR-agnostic Framework with Uncertainty Estimation and Hierarchical Refinement for Camera Relocalisation [12.333674270678552]
Absolute Pose Regressors (APRs) directly estimate camera poses from monocular images, but their accuracy is unstable for different queries. Uncertainty-aware APRs provide uncertainty information on the estimated pose, alleviating the impact of these unreliable predictions. This work introduces a novel APR-agnostic framework, HR-APR, that formulates uncertainty estimation as cosine similarity estimation between the query and database features.
arXiv Detail & Related papers (2024-02-22T08:21:46Z)
MobileARLoc: On-device Robust Absolute Localisation for Pervasive Markerless Mobile AR [2.856126556871729]
This paper introduces MobileARLoc, a new framework for on-device large-scale markerless mobile AR. MobileARLoc combines an absolute pose regressor (APR) with a local VIO tracking system. We show that MobileARLoc halves the error compared to the underlying APR and achieves fast (80 ms) on-device inference speed.
arXiv Detail & Related papers (2024-01-21T14:48:38Z)
Robust Localization with Visual-Inertial Odometry Constraints for Markerless Mobile AR [2.856126556871729]
This paper introduces VIO-APR, a new framework for markerless mobile AR that combines an absolute pose regressor with a local VIO tracking system. VIO-APR uses VIO to assess the reliability of the APR and the APR to identify and compensate for VIO drift. We implement VIO-APR into a mobile AR application using Unity to demonstrate its capabilities.
arXiv Detail & Related papers (2023-08-10T07:21:35Z)
Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls within the metric learning paradigm, yet directly optimizes for the L2 metric without the need to generate pairs. We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
Neural Refinement for Absolute Pose Regression with Feature Synthesis [33.2608395824548]
Absolute Pose Regression (APR) methods use deep neural networks to directly regress camera poses from RGB images. In this work, we propose a test-time refinement pipeline that leverages implicit geometric constraints. We also introduce a novel Neural Feature Synthesizer (NeFeS) model, which encodes 3D geometric features during training and directly renders dense novel view features at test time to refine APR methods.
arXiv Detail & Related papers (2023-03-17T16:10:50Z)
DiffIR: Efficient Diffusion Model for Image Restoration [108.82579440308267]
Diffusion models (DMs) have achieved SOTA performance by modeling the image synthesis process as a sequential application of a denoising network. Running traditional DMs for massive iterations on a large model to estimate whole images or feature maps is inefficient for image restoration. We propose DiffIR, which consists of a compact IR prior extraction network (CPEN), a dynamic IR transformer (DIRformer), and a denoising network.
arXiv Detail & Related papers (2023-03-16T16:47:14Z)
Benchmarking Visual-Inertial Deep Multimodal Fusion for Relative Pose Regression and Odometry-aided Absolute Pose Regression [6.557612703872671]
Visual-inertial localization is a key problem in computer vision and robotics applications such as virtual reality, self-driving cars, and aerial vehicles. In this work, we conduct a benchmark to evaluate deep multimodal fusion based on pose graph optimization and attention networks. We show improvements for the APR-RPR task and for the RPR-RPR task for aerial vehicles and handheld devices.
arXiv Detail & Related papers (2022-08-01T15:05:26Z)
DeepRM: Deep Recurrent Matching for 6D Pose Refinement [77.34726150561087]
DeepRM is a novel recurrent network architecture for 6D pose refinement. The architecture incorporates LSTM units to propagate information through each refinement step. DeepRM achieves state-of-the-art performance on two widely accepted challenging datasets.
arXiv Detail & Related papers (2022-05-28T16:18:08Z)
Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
arXiv Detail & Related papers (2021-07-08T15:19:36Z)
FasterPose: A Faster Simple Baseline for Human Pose Estimation [65.8413964785972]
We propose a design paradigm for a cost-effective network with LR representation for efficient pose estimation, named FasterPose. We study the training behavior of FasterPose and formulate a novel regressive cross-entropy (RCE) loss function to accelerate convergence. Compared with the previously dominant pose estimation network, our method reduces FLOPs by 58% while improving accuracy by 1.3%.
arXiv Detail & Related papers (2021-07-07T13:39:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.