Benchmarking Egocentric Visual-Inertial SLAM at City Scale
- URL: http://arxiv.org/abs/2509.26639v1
- Date: Tue, 30 Sep 2025 17:59:31 GMT
- Title: Benchmarking Egocentric Visual-Inertial SLAM at City Scale
- Authors: Anusha Krishnan, Shaohui Liu, Paul-Edouard Sarlin, Oscar Gentilhomme, David Caruso, Maurizio Monge, Richard Newcombe, Jakob Engel, Marc Pollefeys
- Abstract summary: This paper introduces a new dataset and benchmark for visual-inertial SLAM with egocentric, multi-modal data. We record hours and kilometers of trajectories through a city center with glasses-like devices equipped with various sensors. We show that state-of-the-art systems developed by academia are not robust to these challenges and we identify components that are responsible for this.
- Score: 50.1245744173948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Precise 6-DoF simultaneous localization and mapping (SLAM) from onboard sensors is critical for wearable devices capturing egocentric data, which exhibits specific challenges, such as a wider diversity of motions and viewpoints, prevalent dynamic visual content, or long sessions affected by time-varying sensor calibration. While recent progress on SLAM has been swift, academic research is still driven by benchmarks that do not reflect these challenges or do not offer sufficiently accurate ground truth poses. In this paper, we introduce a new dataset and benchmark for visual-inertial SLAM with egocentric, multi-modal data. We record hours and kilometers of trajectories through a city center with glasses-like devices equipped with various sensors. We leverage surveying tools to obtain control points as indirect pose annotations that are metric, centimeter-accurate, and available at city scale. This makes it possible to evaluate extreme trajectories that involve walking at night or traveling in a vehicle. We show that state-of-the-art systems developed by academia are not robust to these challenges and we identify components that are responsible for this. In addition, we design tracks with different levels of difficulty to ease in-depth analysis and evaluation of less mature approaches. The dataset and benchmark are available at https://www.lamaria.ethz.ch.
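The benchmark's key design choice is to score trajectories against sparse, surveyed control points rather than a dense ground-truth trajectory. As a minimal sketch of what such an evaluation could look like (this is not the benchmark's actual evaluation code; the function name and the linear-interpolation scheme are assumptions), the per-control-point position error can be computed by sampling the estimated trajectory at each control-point timestamp:

```python
import numpy as np

def control_point_errors(est_times, est_positions, cp_times, cp_positions):
    """Per-control-point position error for an estimated trajectory.

    est_times:     (N,) sorted timestamps of estimated poses
    est_positions: (N, 3) estimated positions, already expressed in the
                   control-point (survey) frame
    cp_times:      (M,) timestamps at which control points were observed
    cp_positions:  (M, 3) surveyed, metric control-point coordinates
    """
    errors = []
    for t, gt in zip(cp_times, cp_positions):
        # Linearly interpolate each coordinate of the estimated
        # trajectory at the control-point timestamp.
        p = np.array([np.interp(t, est_times, est_positions[:, k])
                      for k in range(3)])
        errors.append(np.linalg.norm(p - gt))
    return np.asarray(errors)
```

A full evaluation would additionally align the estimated trajectory to the survey frame (e.g. with a rigid or similarity transform) and report aggregate statistics such as RMSE or recall at a centimeter threshold.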
Related papers
- The Monado SLAM Dataset for Egocentric Visual-Inertial Tracking [38.93284476165776]
Humanoid robots and mixed reality headsets benefit from the use of head-mounted sensors for tracking. We show that state-of-the-art tracking systems are still unable to gracefully handle many of the challenging settings presented in head-mounted use cases. We present the Monado SLAM dataset, a set of real sequences taken from multiple virtual reality headsets.
arXiv Detail & Related papers (2025-07-31T18:28:07Z)
- Addressing Data Annotation Challenges in Multiple Sensors: A Solution for Scania Collected Datasets [41.68378073302622]
Data annotation in autonomous vehicles is a critical step in the development of Deep Neural Network (DNN) based models.
This article focuses on addressing this challenge, primarily within the context of Scania collected datasets.
The proposed solution takes a track of an annotated object as input and uses the Moving Horizon Estimation (MHE) to robustly estimate its speed.
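Moving Horizon Estimation fits a state to a sliding window of recent measurements. As a toy illustration only (this is not Scania's estimator; a full MHE would optimize over process and measurement models with constraints), fitting a constant-velocity model to the last few positions of an annotated track yields a windowed speed estimate:

```python
import numpy as np

def window_speed(times, positions, horizon=5):
    """Estimate current speed from the last `horizon` samples of a track
    by least-squares fitting a constant-velocity model per dimension
    (a simplified stand-in for a full Moving Horizon Estimator)."""
    t = np.asarray(times[-horizon:], dtype=float)
    p = np.asarray(positions[-horizon:], dtype=float)  # shape (k, dims)
    # polyfit with degree 1 returns [slope, intercept]; the slope of
    # position over time is the velocity component in that dimension.
    v = np.array([np.polyfit(t, p[:, d], 1)[0] for d in range(p.shape[1])])
    return np.linalg.norm(v)
```

Using only a bounded window keeps the estimate responsive to speed changes while the least-squares fit smooths annotation jitter, which is the same trade-off a proper MHE formulation makes.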
arXiv Detail & Related papers (2024-03-27T14:56:44Z)
- Amirkabir campus dataset: Real-world challenges and scenarios of Visual Inertial Odometry (VIO) for visually impaired people [3.7998592843098336]
We introduce the Amirkabir campus dataset (AUT-VI) to address the mentioned problem and improve the navigation systems.
AUT-VI is a novel and super-challenging dataset with 126 diverse sequences in 17 different locations.
In support of ongoing development efforts, we have released the Android application for data capture to the public.
arXiv Detail & Related papers (2024-01-07T23:13:51Z)
- LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry [53.5449912019877]
We present the Long-term Effective Any Point Tracking (LEAP) module. LEAP innovatively combines visual, inter-track, and temporal cues with mindfully selected anchors for dynamic track estimation. Based on these traits, we develop LEAP-VO, a robust visual odometry system adept at handling occlusions and dynamic scenes.
arXiv Detail & Related papers (2024-01-03T18:57:27Z)
- Event-based Simultaneous Localization and Mapping: A Comprehensive Survey [52.73728442921428]
This survey reviews event-based vSLAM algorithms that exploit the benefits of asynchronous and irregular event streams for localization and mapping tasks.
It categorizes event-based vSLAM methods into four main categories: feature-based, direct, motion-compensation, and deep learning methods.
arXiv Detail & Related papers (2023-04-19T16:21:14Z)
- On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks [61.74608497496841]
Training on inaccurate or corrupt data induces model bias and hampers generalisation capabilities.
This paper investigates the effect of sensor errors for the dense 3D vision tasks of depth estimation and reconstruction.
arXiv Detail & Related papers (2023-03-26T22:32:44Z)
- 4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions [46.03430162297781]
We present a novel visual SLAM and long-term localization benchmark for autonomous driving in challenging conditions based on the large-scale 4Seasons dataset. The proposed benchmark provides drastic appearance variations caused by seasonal changes and diverse weather and illumination conditions. We introduce a new unified benchmark for jointly evaluating visual odometry, global place recognition, and map-based visual localization performance.
arXiv Detail & Related papers (2022-12-31T13:52:36Z)
- The Hilti SLAM Challenge Dataset [41.091844019181735]
Construction environments pose challenging problems for Simultaneous Localization and Mapping (SLAM) algorithms.
To support this research, we propose a new dataset, the Hilti SLAM Challenge dataset.
Each dataset includes accurate ground truth to allow direct testing of SLAM results.
arXiv Detail & Related papers (2021-09-23T12:02:40Z)
- Benchmarking high-fidelity pedestrian tracking systems for research, real-time monitoring and crowd control [55.41644538483948]
High-fidelity pedestrian tracking in real-life conditions has been an important tool in fundamental crowd dynamics research.
As this technology advances, it is becoming increasingly useful in society at large.
To successfully employ pedestrian tracking techniques in research and technology, it is crucial to validate and benchmark them for accuracy.
We present and discuss a benchmark suite for privacy-respectful pedestrian tracking techniques, as a step towards an open standard in the community.
arXiv Detail & Related papers (2021-08-26T11:45:26Z)
- Deep Soft Procrustes for Markerless Volumetric Sensor Alignment [81.13055566952221]
In this work, we improve markerless data-driven correspondence estimation to achieve more robust multi-sensor spatial alignment.
We incorporate geometric constraints in an end-to-end manner into a typical segmentation based model and bridge the intermediate dense classification task with the targeted pose estimation one.
Our model is experimentally shown to achieve results comparable to marker-based methods and to outperform markerless ones, while also being robust to pose variations of the calibration structure.
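The Procrustes alignment underlying this approach has a closed-form rigid solution. As a sketch of the classic SVD-based (Kabsch) alignment between corresponding 3D point sets (this is the generic textbook algorithm, not the paper's learned "soft" variant), assuming noise-free correspondences:

```python
import numpy as np

def rigid_align(src, dst):
    """Closed-form rigid (rotation + translation) alignment of corresponding
    point sets via SVD (Kabsch/Procrustes). src, dst: (N, 3) arrays with
    dst[i] corresponding to src[i]; returns R, t with dst ~ src @ R.T + t."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    # Correct the sign so the result is a proper rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

The sign-correction step matters when the point configuration is near-planar, where an unconstrained SVD solution can otherwise return a reflection.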
arXiv Detail & Related papers (2020-03-23T10:51:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.