Related papers: A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for Mobile Devices

A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for Mobile Devices

URL: http://arxiv.org/abs/2210.12476v1
Date: Sat, 22 Oct 2022 15:26:50 GMT
Title: A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for Mobile Devices
Authors: Yo-Chung Lau, Kuan-Wei Tseng, I-Ju Hsieh, Hsiao-Ching Tseng, Yi-Ping Hung
Abstract summary: We propose a flexible-frame-rate object pose estimation and tracking system for mobile devices. Inertial measurement unit (IMU) pose propagation is performed on the client side for high speed tracking, and RGB image-based 3D pose estimation is performed on the server side. Our system supports flexible frame rates up to 120 FPS and guarantees high precision and real-time tracking on low-end devices.
Score: 3.4836209951879957
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Real-time object pose estimation and tracking is challenging but essential for emerging augmented reality (AR) applications. In general, state-of-the-art methods address this problem using deep neural networks which indeed yield satisfactory results. Nevertheless, the high computational cost of these methods makes them unsuitable for mobile devices where real-world applications usually take place. In addition, head-mounted displays such as AR glasses require at least 90~FPS to avoid motion sickness, which further complicates the problem. We propose a flexible-frame-rate object pose estimation and tracking system for mobile devices. It is a monocular visual-inertial-based system with a client-server architecture. Inertial measurement unit (IMU) pose propagation is performed on the client side for high speed tracking, and RGB image-based 3D pose estimation is performed on the server side to obtain accurate poses, after which the pose is sent to the client side for visual-inertial fusion, where we propose a bias self-correction mechanism to reduce drift. We also propose a pose inspection algorithm to detect tracking failures and incorrect pose estimation. Connected by high-speed networking, our system supports flexible frame rates up to 120 FPS and guarantees high precision and real-time tracking on low-end devices. Both simulations and real world experiments show that our method achieves accurate and robust object tracking.

Related papers

NOVA: Navigation via Object-Centric Visual Autonomy for High-Speed Target Tracking in Unstructured GPS-Denied Environments [56.35569661650558]
We introduce NOVA, a fully onboard, object-centric framework that enables robust target tracking and collision-aware navigation.<n>Rather than constructing a global map, NOVA formulates perception, estimation, and control entirely in the target's reference frame.<n>We validate NOVA across challenging real-world scenarios, including urban mazes, forest trails, and repeated transitions through buildings with intermittent GPS loss.
arXiv Detail & Related papers (2025-06-23T14:28:30Z)
FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video [52.33896173943054]
Egocentric motion capture with a head-mounted body-facing stereo camera is crucial for VR and AR applications. Existing methods rely on synthetic pretraining and struggle to generate smooth and accurate predictions in real-world settings. We propose FRAME, a simple yet effective architecture that combines device pose and camera feeds for state-of-the-art body pose prediction.
arXiv Detail & Related papers (2025-03-29T14:26:06Z)
CorrDiff: Adaptive Delay-aware Detector with Temporal Cue Inputs for Real-time Object Detection [11.714072240331518]
CorrDiff is designed to tackle the challenge of delays in real-time detection systems. It is able to utilize runtime-estimated temporal cues to predict objects' locations for multiple future frames. It meets the stringent real-time processing requirements on all kinds of devices.
arXiv Detail & Related papers (2025-01-09T10:34:25Z)
Street Gaussians without 3D Object Tracker [86.62329193275916]
Existing methods rely on labor-intensive manual labeling of object poses to reconstruct dynamic objects in canonical space. We propose a stable object tracking module by leveraging associations from 2D deep trackers within a 3D object fusion strategy. We address inevitable tracking errors by further introducing a motion learning strategy in an implicit feature space that autonomously corrects trajectory errors and recovers missed detections.
arXiv Detail & Related papers (2024-12-07T05:49:42Z)
A Cross-Scene Benchmark for Open-World Drone Active Tracking [54.235808061746525]
Drone Visual Active Tracking aims to autonomously follow a target object by controlling the motion system based on visual observations. We propose a unified cross-scene cross-domain benchmark for open-world drone active tracking called DAT. We also propose a reinforcement learning-based drone tracking method called R-VAT.
arXiv Detail & Related papers (2024-12-01T09:37:46Z)
ESVO2: Direct Visual-Inertial Odometry with Stereo Event Cameras [33.81592783496106]
Event-based visual odometry aims at solving tracking and mapping sub-problems in parallel. We build an event-based stereo visual-inertial odometry system on top of our previous direct pipeline Event-based Stereo Visual Odometry.
arXiv Detail & Related papers (2024-10-12T05:35:27Z)
DragPoser: Motion Reconstruction from Variable Sparse Tracking Signals via Latent Space Optimization [1.5603779307797123]
DragPoser is a novel deep-learning-based motion reconstruction system. It accurately represents hard and dynamic on-the-fly constraints. It produces natural poses and temporally coherent motion.
arXiv Detail & Related papers (2024-04-29T15:00:50Z)
VICAN: Very Efficient Calibration Algorithm for Large Camera Networks [49.17165360280794]
We introduce a novel methodology that extends Pose Graph Optimization techniques. We consider the bipartite graph encompassing cameras, object poses evolving dynamically, and camera-object relative transformations at each time step. Our framework retains compatibility with traditional PGO solvers, but its efficacy benefits from a custom-tailored optimization scheme.
arXiv Detail & Related papers (2024-03-25T17:47:03Z)
PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search [64.28335667655129]
Multiple object tracking is a critical task in autonomous driving. As tracking accuracy improves, neural networks become increasingly complex, posing challenges for their practical application in real driving scenarios due to the high level of latency. In this paper, we explore the use of the neural architecture search (NAS) methods to search for efficient architectures for tracking, aiming for low real-time latency while maintaining relatively high accuracy.
arXiv Detail & Related papers (2024-03-23T04:18:49Z)
DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem. DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden. It is flexible and practical that can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild [57.37891682117178]
We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow. A novel neural network architecture is proposed for processing irregular point trajectory data. Experiments on MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
arXiv Detail & Related papers (2022-07-19T09:19:45Z)
Real-Time Human Pose Estimation on a Smart Walker using Convolutional Neural Networks [4.076099054649463]
We present a novel approach to patient monitoring and data-driven human-in-the-loop control in the context of smart walkers. It is able to extract a complete and compact body representation in real-time and from inexpensive sensors. Despite promising results, more data should be collected on users with impairments to assess its performance as a rehabilitation tool in real-world scenarios.
arXiv Detail & Related papers (2021-06-28T14:11:48Z)
Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving. We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
Unified Multi-Modal Landmark Tracking for Tightly Coupled Lidar-Visual-Inertial Odometry [5.131684964386192]
We present an efficient multi-sensor odometry system for mobile platforms that jointly optimize visual, lidar, and inertial information. New method to extract 3D line and planar primitives from lidar point clouds is presented. System has been tested on a variety of platforms and scenarios, including underground exploration with a legged robot and outdoor scanning with a dynamically moving handheld device.
arXiv Detail & Related papers (2020-11-13T09:54:03Z)
Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties. Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates. The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
Instant 3D Object Tracking with Applications in Augmented Reality [4.893345190925178]
Tracking object poses in 3D is a crucial building block for Augmented Reality applications. We propose an instant motion tracking system that tracks an object's pose in space in real-time on mobile devices.
arXiv Detail & Related papers (2020-06-23T17:48:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.