LaMAR: Benchmarking Localization and Mapping for Augmented Reality
- URL: http://arxiv.org/abs/2210.10770v1
- Date: Wed, 19 Oct 2022 17:58:17 GMT
- Title: LaMAR: Benchmarking Localization and Mapping for Augmented Reality
- Authors: Paul-Edouard Sarlin, Mihai Dusmanu, Johannes L. Schönberger, Pablo
Speciale, Lukas Gruber, Viktor Larsson, Ondrej Miksik, Marc Pollefeys
- Abstract summary: We introduce LaMAR, a new benchmark with a comprehensive capture and GT pipeline that co-registers realistic trajectories and sensor streams captured by heterogeneous AR devices.
We publish a benchmark dataset of diverse and large-scale scenes recorded with head-mounted and hand-held AR devices.
- Score: 80.23361950062302
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Localization and mapping is the foundational technology for augmented reality
(AR) that enables sharing and persistence of digital content in the real world.
While significant progress has been made, researchers are still mostly driven
by unrealistic benchmarks not representative of real-world AR scenarios. These
benchmarks are often based on small-scale datasets with low scene diversity,
captured from stationary cameras, and lack other sensor inputs like inertial,
radio, or depth data. Furthermore, their ground-truth (GT) accuracy is mostly
insufficient to satisfy AR requirements. To close this gap, we introduce LaMAR,
a new benchmark with a comprehensive capture and GT pipeline that co-registers
realistic trajectories and sensor streams captured by heterogeneous AR devices
in large, unconstrained scenes. To establish an accurate GT, our pipeline
robustly aligns the trajectories against laser scans in a fully automated
manner. As a result, we publish a benchmark dataset of diverse and large-scale
scenes recorded with head-mounted and hand-held AR devices. We extend several
state-of-the-art methods to take advantage of the AR-specific setup and
evaluate them on our benchmark. The results offer new insights on current
research and reveal promising avenues for future work in the field of
localization and mapping for AR.
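The abstract states that the pipeline "robustly aligns the trajectories against laser scans in a fully automated manner." One standard building block for such alignment is the closed-form Kabsch/Umeyama solution for rigid registration of corresponding 3D points. The sketch below is illustrative only, not the LaMAR pipeline: the function name and toy data are hypothetical, and a real system would also handle correspondence search, outliers, and scale.

```python
import numpy as np

def rigid_align(src, dst):
    """Closed-form rotation R and translation t minimizing ||R @ src_i + t - dst_i||."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)        # 3x3 cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Toy check: recover a known rotation about z and a translation.
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
pts = np.random.default_rng(0).normal(size=(50, 3))
R, t = rigid_align(pts, pts @ R_true.T + t_true)
assert np.allclose(R, R_true) and np.allclose(t, t_true)
```

In practice such a closed-form step is typically embedded in a robust loop (e.g. RANSAC or ICP iterations) rather than applied once, which is presumably what "robustly aligns" refers to.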
Related papers
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z)
- Mobile AR Depth Estimation: Challenges & Prospects -- Extended Version [12.887748044339913]
We investigate the challenges and opportunities of achieving accurate metric depth estimation in mobile AR.
We tested four different state-of-the-art monocular depth estimation models on a newly introduced dataset (ARKitScenes).
Our research provides promising future directions to explore and solve those challenges.
arXiv Detail & Related papers (2023-10-22T22:47:51Z)
- Cross-View Visual Geo-Localization for Outdoor Augmented Reality [11.214903134756888]
We address the problem of geo-pose estimation by cross-view matching of query ground images to a geo-referenced aerial satellite image database.
We propose a new transformer neural network-based model and a modified triplet ranking loss for joint location and orientation estimation.
Experiments on several benchmark cross-view geo-localization datasets show that our model achieves state-of-the-art performance.
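The entry above mentions a "modified triplet ranking loss" for joint location and orientation estimation. The paper's modification is not reproduced here; as background, the sketch below shows only the standard triplet ranking loss commonly used in cross-view retrieval, with illustrative embeddings and margin.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: max(0, ||a-p||^2 - ||a-n||^2 + margin), batch-averaged."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)   # anchor-to-match distances
    d_neg = np.sum((anchor - negative) ** 2, axis=1)   # anchor-to-non-match distances
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

# Hypothetical ground-image embeddings with matching / non-matching aerial embeddings.
a = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[0.9, 0.1], [0.1, 0.9]])    # close to anchors
n = np.array([[-1.0, 0.0], [0.0, -1.0]])  # far from anchors
loss = triplet_loss(a, p, n)
assert loss == 0.0  # negatives already beat positives by more than the margin
```

Minimizing this loss pulls ground-image embeddings toward their matching aerial views and pushes non-matching views at least `margin` further away, which is what makes nearest-neighbor retrieval over the aerial database work.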
arXiv Detail & Related papers (2023-03-28T01:58:03Z)
- Benchmark Dataset and Effective Inter-Frame Alignment for Real-World Video Super-Resolution [65.20905703823965]
Video super-resolution (VSR) aiming to reconstruct a high-resolution (HR) video from its low-resolution (LR) counterpart has made tremendous progress in recent years.
It remains challenging to deploy existing VSR methods to real-world data with complex degradations.
The proposed EAVSR method uses a multi-layer adaptive spatial transform network (MultiAdaSTN) to refine the offsets provided by a pre-trained optical flow estimation network.
arXiv Detail & Related papers (2022-12-10T17:41:46Z)
- Deep Learning for Real Time Satellite Pose Estimation on Low Power Edge TPU [58.720142291102135]
In this paper, we propose pose estimation software exploiting neural network architectures.
We show how low power machine learning accelerators could enable Artificial Intelligence exploitation in space.
arXiv Detail & Related papers (2022-04-07T08:53:18Z)
- AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration [16.032316550612336]
We present AirDet, which is free of fine-tuning by learning class relation with support images.
AirDet achieves comparable or even better results than exhaustively fine-tuned methods, with up to 40-60% improvement over the baseline.
We present evaluation results on real-world exploration tests from the DARPA Subterranean Challenge.
arXiv Detail & Related papers (2021-12-03T06:41:07Z)
- Rethinking Drone-Based Search and Rescue with Aerial Person Detection [79.76669658740902]
The visual inspection of aerial drone footage is an integral part of land search and rescue (SAR) operations today.
We propose a novel deep learning algorithm to automate this aerial person detection (APD) task.
We present the novel Aerial Inspection RetinaNet (AIR) algorithm as the combination of these contributions.
arXiv Detail & Related papers (2021-11-17T21:48:31Z)
- AR Mapping: Accurate and Efficient Mapping for Augmented Reality [35.420264042749146]
We introduce the AR Map for a specific scene, composed of 1) color images with 6-DOF poses, 2) dense depth maps for each image, and 3) a complete point cloud map.
For efficient data capture, a backpack scanning device with a unified calibration pipeline is presented. We then propose an AR mapping pipeline that takes the input from the scanning device and produces accurate AR Maps.
arXiv Detail & Related papers (2021-03-27T08:57:48Z)
- StrObe: Streaming Object Detection from LiDAR Packets [73.27333924964306]
Rolling-shutter LiDARs emit data as a stream of packets, each covering a sector of the 360° field of view.
Modern perception algorithms wait for the full sweep to be built before processing the data, which introduces an additional latency.
In this paper we propose StrObe, a novel approach that minimizes latency by ingesting LiDAR packets and emitting a stream of detections without waiting for the full sweep to be built.
arXiv Detail & Related papers (2020-11-12T14:57:44Z)
- LE-HGR: A Lightweight and Efficient RGB-based Online Gesture Recognition Network for Embedded AR Devices [8.509059894058947]
We propose a lightweight and computationally efficient HGR framework, namely LE-HGR, to enable real-time gesture recognition on embedded devices with low computing power.
We show that the proposed method is of high accuracy and robustness, which is able to reach high-end performance in a variety of complicated interaction environments.
arXiv Detail & Related papers (2020-01-16T05:23:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.