Related papers: MobileARLoc: On-device Robust Absolute Localisation for Pervasive Markerless Mobile AR

URL: http://arxiv.org/abs/2401.11511v3
Date: Sun, 4 Feb 2024 18:26:50 GMT
Title: MobileARLoc: On-device Robust Absolute Localisation for Pervasive Markerless Mobile AR
Authors: Changkun Liu, Yukun Zhao, Tristan Braud
Abstract summary: This paper introduces MobileARLoc, a new framework for on-device large-scale markerless mobile AR. MobileARLoc combines an absolute pose regressor (APR) with a local VIO tracking system. We show that MobileARLoc halves the error compared to the underlying APR and achieves fast (80 ms) on-device inference speed.
Score: 2.856126556871729
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent years have seen significant improvement in absolute camera pose estimation, paving the way for pervasive markerless Augmented Reality (AR). However, accurate absolute pose estimation techniques are computation- and storage-heavy, requiring computation offloading. As such, AR systems rely on visual-inertial odometry (VIO) to track the device's relative pose between requests to the server. However, VIO suffers from drift, requiring frequent absolute repositioning. This paper introduces MobileARLoc, a new framework for on-device large-scale markerless mobile AR that combines an absolute pose regressor (APR) with a local VIO tracking system. Absolute pose regressors (APRs) provide fast on-device pose estimation at the cost of reduced accuracy. To address APR accuracy and reduce VIO drift, MobileARLoc creates a feedback loop where VIO pose estimations refine the APR predictions. The VIO system identifies reliable predictions of APR, which are then used to compensate for the VIO drift. We comprehensively evaluate MobileARLoc through dataset simulations. MobileARLoc halves the error compared to the underlying APR and achieves fast (80 ms) on-device inference speed.
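
Illustration (not from the paper): the feedback loop described in the abstract can be sketched roughly as follows. This is a minimal, translation-only sketch under stated assumptions, not the authors' implementation. The function names `is_reliable` and `fuse`, the tolerance `tol_m`, and the rule "an APR prediction is reliable when the distance it implies since the previous frame matches the distance VIO reports" are all hypothetical choices made for this example. The sketch also assumes the VIO frame and the world frame share the same axis orientation, so only a translation offset is needed.

```python
import math

Vec3 = tuple[float, float, float]


def is_reliable(apr_prev: Vec3, apr_curr: Vec3,
                vio_prev: Vec3, vio_curr: Vec3,
                tol_m: float = 0.3) -> bool:
    """Hypothetical reliability test: the step length between two APR
    predictions should agree with the step length VIO measured.
    Step lengths are comparable even though the two frames differ."""
    apr_step = math.dist(apr_prev, apr_curr)
    vio_step = math.dist(vio_prev, vio_curr)
    return abs(apr_step - vio_step) <= tol_m


def fuse(apr_positions: list[Vec3], vio_positions: list[Vec3],
         tol_m: float = 0.3) -> list[Vec3]:
    """Return one absolute position per frame.

    The latest APR prediction judged reliable becomes an anchor; later
    frames are placed by adding the VIO displacement since that anchor
    (assumption: VIO and world axes are aligned)."""
    anchor = None  # (APR position, VIO position) at the last reliable frame
    fused = []
    for i, (apr, vio) in enumerate(zip(apr_positions, vio_positions)):
        if i > 0 and is_reliable(apr_positions[i - 1], apr,
                                 vio_positions[i - 1], vio, tol_m):
            anchor = (apr, vio)
        if anchor is None:
            fused.append(apr)  # nothing validated yet: fall back to raw APR
        else:
            a_apr, a_vio = anchor
            fused.append(tuple(a + (v - b)
                               for a, v, b in zip(a_apr, vio, a_vio)))
    return fused


# Frame 2 is an APR outlier (a 4 m jump while VIO reports 1 m).
apr = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
vio = [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0), (12.0, 0.0, 0.0), (13.0, 0.0, 0.0)]
result = fuse(apr, vio)
```

In this toy run the outlier at frame 2 never becomes an anchor, so frames 2 and 3 are positioned from the anchor set at frame 1 plus the VIO displacement. A real system would also handle rotation and re-anchor often enough to bound VIO drift.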

Related papers

NOVA: Navigation via Object-Centric Visual Autonomy for High-Speed Target Tracking in Unstructured GPS-Denied Environments [56.35569661650558]
We introduce NOVA, a fully onboard, object-centric framework that enables robust target tracking and collision-aware navigation. Rather than constructing a global map, NOVA formulates perception, estimation, and control entirely in the target's reference frame. We validate NOVA across challenging real-world scenarios, including urban mazes, forest trails, and repeated transitions through buildings with intermittent GPS loss.
arXiv Detail & Related papers (2025-06-23T14:28:30Z)
SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer [62.11796778482088]
We present a novel model-agnostic sparse vision transformer, dubbed SparseFormer, to bridge the gap of object detection between close-up and HRW shots. The proposed SparseFormer selectively uses attentive tokens to scrutinize the sparsely distributed windows that may contain objects. Experiments on two HRW benchmarks, PANDA and DOTA-v1.0, demonstrate that the proposed SparseFormer significantly improves detection accuracy (up to 5.8%) and speed (up to 3x) over the state-of-the-art approaches.
arXiv Detail & Related papers (2025-02-11T03:21:25Z)
RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark [61.987291551925516]
We introduce the Unit Cycle Resolver (UCR), which incorporates a unit circle constraint loss to improve angle prediction accuracy. Our approach can effectively improve the performance of existing state-of-the-art weakly supervised methods. With the aid of UCR, we further annotate and introduce RSAR, the largest multi-class rotated SAR object detection dataset to date.
arXiv Detail & Related papers (2025-01-08T11:41:47Z)
Mobile Augmented Reality Framework with Fusional Localization and Pose Estimation [9.73202312695815]
GPS-based mobile AR systems usually perform poorly indoors because of inaccurate positioning. This paper first conducts a comprehensive study of the state-of-the-art AR and localization systems on mobile platforms. Then, we propose an effective indoor mobile AR framework. In the framework, a fusional localization method and a new pose estimation implementation are developed to increase the overall matching rate and thus improve AR display accuracy.
arXiv Detail & Related papers (2025-01-06T19:02:39Z)
ReLoc-PDR: Visual Relocalization Enhanced Pedestrian Dead Reckoning via Graph Optimization [4.188058836787458]
This work proposes ReLoc-PDR, a fusion framework combining pedestrian dead reckoning and visual relocalization. A graph optimization-based fusion mechanism with the Tukey kernel effectively corrects cumulative errors and mitigates the impact of abnormal visual observations. Real-world experiments demonstrate that our ReLoc-PDR surpasses representative methods in accuracy and robustness.
arXiv Detail & Related papers (2023-09-04T14:54:47Z)
KS-APR: Keyframe Selection for Robust Absolute Pose Regression [2.541264438930729]
Markerless Mobile Augmented Reality (AR) aims to anchor digital content in the physical world without using specific 2D or 3D objects. End-to-end machine learning solutions infer the device's pose from a single monocular image. APR methods tend to yield significant inaccuracies for input images that are too distant from the training set. This paper introduces KS-APR, a pipeline that assesses the reliability of an estimated pose with minimal overhead.
arXiv Detail & Related papers (2023-08-10T09:32:20Z)
Robust Localization with Visual-Inertial Odometry Constraints for Markerless Mobile AR [2.856126556871729]
This paper introduces VIO-APR, a new framework for markerless mobile AR that combines an absolute pose regressor with a local VIO tracking system. VIO-APR uses VIO to assess the reliability of the APR and the APR to identify and compensate for VIO drift. We implement VIO-APR into a mobile AR application using Unity to demonstrate its capabilities.
arXiv Detail & Related papers (2023-08-10T07:21:35Z)
A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for Mobile Devices [3.4836209951879957]
We propose a flexible-frame-rate object pose estimation and tracking system for mobile devices. Inertial measurement unit (IMU) pose propagation is performed on the client side for high speed tracking, and RGB image-based 3D pose estimation is performed on the server side. Our system supports flexible frame rates up to 120 FPS and guarantees high precision and real-time tracking on low-end devices.
arXiv Detail & Related papers (2022-10-22T15:26:50Z)
LaMAR: Benchmarking Localization and Mapping for Augmented Reality [80.23361950062302]
We introduce LaMAR, a new benchmark with a comprehensive capture and GT pipeline that co-registers realistic trajectories and sensor streams captured by heterogeneous AR devices. We publish a benchmark dataset of diverse and large-scale scenes recorded with head-mounted and hand-held AR devices.
arXiv Detail & Related papers (2022-10-19T17:58:17Z)
Benchmarking Visual-Inertial Deep Multimodal Fusion for Relative Pose Regression and Odometry-aided Absolute Pose Regression [6.557612703872671]
Visual-inertial localization is a key problem in computer vision and robotics applications such as virtual reality, self-driving cars, and aerial vehicles. In this work, we conduct a benchmark to evaluate deep multimodal fusion based on pose graph optimization and attention networks. We show improvements for the APR-RPR task and for the RPR-RPR task for aerial vehicles and handheld devices.
arXiv Detail & Related papers (2022-08-01T15:05:26Z)
iSDF: Real-Time Neural Signed Distance Fields for Robot Perception [64.80458128766254]
iSDF is a continuous learning system for real-time signed distance field reconstruction. It produces more accurate reconstructions and better approximations of collision costs and gradients.
arXiv Detail & Related papers (2022-04-05T15:48:39Z)
Rethinking Drone-Based Search and Rescue with Aerial Person Detection [79.76669658740902]
The visual inspection of aerial drone footage is an integral part of land search and rescue (SAR) operations today. We propose a novel deep learning algorithm to automate this aerial person detection (APD) task. We present the novel Aerial Inspection RetinaNet (AIR) algorithm as the combination of these contributions.
arXiv Detail & Related papers (2021-11-17T21:48:31Z)
FasterPose: A Faster Simple Baseline for Human Pose Estimation [65.8413964785972]
We propose a design paradigm for a cost-effective network with low-resolution (LR) representation for efficient pose estimation, named FasterPose. We study the training behavior of FasterPose, and formulate a novel regressive cross-entropy (RCE) loss function for accelerating the convergence. Compared with the previously dominant network for pose estimation, our method reduces FLOPs by 58% and simultaneously improves accuracy by 1.3%.
arXiv Detail & Related papers (2021-07-07T13:39:08Z)
VSAC: Efficient and Accurate Estimator for H and F [68.65610177368617]
VSAC is a RANSAC-type robust estimator with a number of novelties. It is significantly faster than all its predecessors and runs on average in 1-2 ms, on a CPU. It is two orders of magnitude faster and yet as precise as MAGSAC++, the currently most accurate estimator of two-view geometry.
arXiv Detail & Related papers (2021-06-18T17:04:57Z)
Confidence Adaptive Anytime Pixel-Level Recognition [86.75784498879354]
Anytime inference requires a model to make a progression of predictions which might be halted at any time. We propose the first unified and end-to-end model approach for anytime pixel-level recognition.
arXiv Detail & Related papers (2021-04-01T20:01:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.