Robust Localization with Visual-Inertial Odometry Constraints for
Markerless Mobile AR
- URL: http://arxiv.org/abs/2308.05394v2
- Date: Fri, 15 Sep 2023 07:20:43 GMT
- Title: Robust Localization with Visual-Inertial Odometry Constraints for
Markerless Mobile AR
- Authors: Changkun Liu, Yukun Zhao, Tristan Braud
- Abstract summary: This paper introduces VIO-APR, a new framework for markerless mobile AR that combines an absolute pose regressor (APR) with a local VIO tracking system.
VIO-APR uses VIO to assess the reliability of the APR and the APR to identify and compensate for VIO drift.
We implement VIO-APR in a mobile AR application using Unity to demonstrate its capabilities.
- Score: 2.856126556871729
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual Inertial Odometry (VIO) is an essential component of modern Augmented
Reality (AR) applications. However, VIO only tracks the relative pose of the
device, leading to drift over time. Absolute pose estimation methods infer the
device's absolute pose, but their accuracy depends on the input quality. This
paper introduces VIO-APR, a new framework for markerless mobile AR that
combines an absolute pose regressor (APR) with a local VIO tracking system.
VIO-APR uses VIO to assess the reliability of the APR and the APR to identify
and compensate for VIO drift. This feedback loop results in more accurate
positioning and more stable AR experiences. To evaluate VIO-APR, we created a
dataset that combines camera images with ARKit's VIO system output for six
indoor and outdoor scenes of various scales. Over this dataset, VIO-APR
improves the median accuracy of popular APRs by up to 36\% in position and 29\%
in orientation, increases the percentage of frames in the high ($0.25 m,
2^{\circ}$) accuracy level by up to 112\%, and greatly reduces the percentage of
frames predicted below the low ($5 m, 10^\circ$) accuracy level. We implement
VIO-APR in a mobile AR application using Unity to demonstrate its
capabilities. VIO-APR results in noticeably more accurate localization and a
more stable overall experience.
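
The abstract describes the VIO-APR mechanism only at a high level: VIO is used to gate the APR's predictions, and accepted predictions re-anchor the drifting VIO trajectory to the absolute map frame. The snippet below is a minimal sketch of such a feedback loop, not the authors' implementation; the pose conventions, helper names, and consistency thresholds (0.5 m, 5°) are illustrative assumptions.

```python
# Sketch of a VIO/APR feedback loop (assumed structure, not the paper's code).
import numpy as np

POSITION_GATE_M = 0.5     # assumed consistency threshold (metres)
ROTATION_GATE_DEG = 5.0   # assumed consistency threshold (degrees)

def rotation_angle_deg(R_a: np.ndarray, R_b: np.ndarray) -> float:
    """Geodesic angle between two 3x3 rotation matrices, in degrees."""
    cos_theta = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

def fuse_step(apr_pose, vio_pose, world_from_vio):
    """One frame of the feedback loop.

    * VIO checks the APR: an APR prediction is accepted only if it agrees
      with the current VIO pose mapped through the VIO-to-world alignment.
    * The APR corrects VIO: when the prediction is accepted, the alignment
      is re-estimated, compensating accumulated VIO drift.

    Poses are (R, t) tuples with R a 3x3 rotation and t a 3-vector.
    """
    R_apr, t_apr = apr_pose
    R_align, t_align = world_from_vio
    R_vio, t_vio = vio_pose

    # Current VIO pose expressed in the world (map) frame.
    R_pred = R_align @ R_vio
    t_pred = R_align @ t_vio + t_align

    pos_err = np.linalg.norm(t_apr - t_pred)
    rot_err = rotation_angle_deg(R_apr, R_pred)

    if pos_err < POSITION_GATE_M and rot_err < ROTATION_GATE_DEG:
        # APR deemed reliable: re-anchor the VIO trajectory to the
        # absolute prediction by updating the alignment transform.
        R_align = R_apr @ R_vio.T
        t_align = t_apr - R_align @ t_vio
        world_from_vio = (R_align, t_align)
        fused = (R_apr, t_apr)
    else:
        # APR deemed unreliable: keep relying on the aligned VIO pose.
        fused = (R_pred, t_pred)

    return fused, world_from_vio
```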
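
For reference, the accuracy levels cited above are simple threshold recalls over per-frame position and orientation errors. The helper below is a generic sketch of how such percentages can be computed; the intermediate (0.5 m, 5°) level is a common benchmark convention and is not mentioned in this abstract.

```python
import numpy as np

def accuracy_level_percentages(pos_err_m, rot_err_deg,
                               levels=((0.25, 2.0), (0.5, 5.0), (5.0, 10.0))):
    """Percentage of frames whose position error (m) and orientation error
    (deg) both fall within each (metres, degrees) accuracy level."""
    pos = np.asarray(pos_err_m, dtype=float)
    rot = np.asarray(rot_err_deg, dtype=float)
    return {
        level: 100.0 * float(np.mean((pos <= level[0]) & (rot <= level[1])))
        for level in levels
    }

# Example: of 3 frames, one falls in the high level, one only in the low level.
print(accuracy_level_percentages([0.1, 1.2, 7.0], [1.0, 4.0, 3.0]))
```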
Related papers
- SRPose: Two-view Relative Pose Estimation with Sparse Keypoints [51.49105161103385]
SRPose is a sparse keypoint-based framework for two-view relative pose estimation in camera-to-world and object-to-camera scenarios.
It achieves competitive or superior performance compared to state-of-the-art methods in terms of accuracy and speed.
It is robust to different image sizes and camera intrinsics, and can be deployed with low computing resources.
arXiv Detail & Related papers (2024-07-11T05:46:35Z)
- MobileARLoc: On-device Robust Absolute Localisation for Pervasive Markerless Mobile AR [2.856126556871729]
This paper introduces MobileARLoc, a new framework for on-device large-scale markerless mobile AR.
MobileARLoc combines an absolute pose regressor (APR) with a local VIO tracking system.
We show that MobileARLoc halves the error compared to the underlying APR and achieves fast (80 ms) on-device inference speed.
arXiv Detail & Related papers (2024-01-21T14:48:38Z)
- DINO-Mix: Enhancing Visual Place Recognition with Foundational Vision Model and Feature Mixing [4.053793612295086]
We propose a novel VPR architecture called DINO-Mix, which combines a foundational vision model with feature aggregation.
We experimentally demonstrate that the proposed DINO-Mix architecture significantly outperforms current state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2023-11-01T02:22:17Z)
- RD-VIO: Robust Visual-Inertial Odometry for Mobile Augmented Reality in Dynamic Environments [55.864869961717424]
It is typically challenging for visual or visual-inertial odometry systems to handle the problems of dynamic scenes and pure rotation.
We design a novel visual-inertial odometry (VIO) system called RD-VIO to handle both of these problems.
arXiv Detail & Related papers (2023-10-23T16:30:39Z)
- KS-APR: Keyframe Selection for Robust Absolute Pose Regression [2.541264438930729]
Markerless Mobile Augmented Reality (AR) aims to anchor digital content in the physical world without using specific 2D or 3D objects.
End-to-end machine learning solutions infer the device's pose from a single monocular image.
APR methods tend to yield significant inaccuracies for input images that are too distant from the training set.
This paper introduces KS-APR, a pipeline that assesses the reliability of an estimated pose with minimal overhead.
arXiv Detail & Related papers (2023-08-10T09:32:20Z)
- Enhanced Stable View Synthesis [86.69338893753886]
We introduce an approach to enhance novel view synthesis from images taken by a freely moving camera.
The introduced approach focuses on outdoor scenes, where recovering an accurate geometric scaffold and camera poses is challenging.
arXiv Detail & Related papers (2023-03-30T01:53:14Z)
- LaMAR: Benchmarking Localization and Mapping for Augmented Reality [80.23361950062302]
We introduce LaMAR, a new benchmark with a comprehensive capture and GT pipeline that co-registers realistic trajectories and sensor streams captured by heterogeneous AR devices.
We publish a benchmark dataset of diverse and large-scale scenes recorded with head-mounted and hand-held AR devices.
arXiv Detail & Related papers (2022-10-19T17:58:17Z)
- Benchmarking Visual-Inertial Deep Multimodal Fusion for Relative Pose Regression and Odometry-aided Absolute Pose Regression [6.557612703872671]
Visual-inertial localization is a key problem in computer vision and robotics applications such as virtual reality, self-driving cars, and aerial vehicles.
In this work, we conduct a benchmark to evaluate deep multimodal fusion based on pose graph optimization and attention networks.
We show improvements for the APR-RPR task and for the RPR-RPR task for aerial vehicles and handheld devices.
arXiv Detail & Related papers (2022-08-01T15:05:26Z)
- Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
- Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [69.75559390700887]
This paper explores a relatively less-studied methodology based on classification.
We propose new techniques to push its frontier in two aspects.
Experiments and visual analysis on large-scale public datasets for aerial images show the effectiveness of our approach.
arXiv Detail & Related papers (2020-11-19T05:42:02Z)
- Object Detection in the Context of Mobile Augmented Reality [16.49070406578342]
We propose a novel approach that combines the geometric information from VIO with semantic information from object detectors to improve the performance of object detection on mobile devices.
Our approach includes three components: (1) an image orientation correction method, (2) a scale-based filtering approach, and (3) an online semantic map.
The results show that our approach can improve the accuracy of generic object detectors by 12% on our dataset.
arXiv Detail & Related papers (2020-08-15T05:15:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.