A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for
Mobile Devices
- URL: http://arxiv.org/abs/2210.12476v1
- Date: Sat, 22 Oct 2022 15:26:50 GMT
- Title: A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for
Mobile Devices
- Authors: Yo-Chung Lau, Kuan-Wei Tseng, I-Ju Hsieh, Hsiao-Ching Tseng, Yi-Ping
Hung
- Abstract summary: We propose a flexible-frame-rate object pose estimation and tracking system for mobile devices.
Inertial measurement unit (IMU) pose propagation is performed on the client side for high speed tracking, and RGB image-based 3D pose estimation is performed on the server side.
Our system supports flexible frame rates up to 120 FPS and guarantees high precision and real-time tracking on low-end devices.
- Score: 3.4836209951879957
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Real-time object pose estimation and tracking is challenging but essential
for emerging augmented reality (AR) applications. In general, state-of-the-art
methods address this problem using deep neural networks which indeed yield
satisfactory results. Nevertheless, the high computational cost of these
methods makes them unsuitable for mobile devices where real-world applications
usually take place. In addition, head-mounted displays such as AR glasses
require at least 90 FPS to avoid motion sickness, which further complicates the
problem. We propose a flexible-frame-rate object pose estimation and tracking
system for mobile devices. It is a monocular visual-inertial-based system with
a client-server architecture. Inertial measurement unit (IMU) pose propagation
is performed on the client side for high speed tracking, and RGB image-based 3D
pose estimation is performed on the server side to obtain accurate poses, after
which the pose is sent to the client side for visual-inertial fusion, where we
propose a bias self-correction mechanism to reduce drift. We also propose a
pose inspection algorithm to detect tracking failures and incorrect pose
estimation. Connected by high-speed networking, our system supports flexible
frame rates up to 120 FPS and guarantees high precision and real-time tracking
on low-end devices. Both simulations and real world experiments show that our
method achieves accurate and robust object tracking.
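The client-side step described in the abstract, dead-reckoning the pose from IMU samples between server updates after subtracting estimated biases, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the NumPy types, and the simple Rodrigues-formula integration are assumptions, and the visual-inertial fusion and bias self-correction steps are omitted for brevity.

```python
import numpy as np

def propagate_pose(p, v, R, acc, gyro, acc_bias, gyro_bias, dt,
                   g=np.array([0.0, 0.0, -9.81])):
    """One IMU dead-reckoning step (illustrative sketch, not the paper's code).

    p, v    : position and velocity in the world frame (3-vectors)
    R       : body-to-world rotation matrix (3x3)
    acc     : accelerometer reading (specific force, body frame)
    gyro    : gyroscope reading (angular rate, body frame, rad/s)
    *_bias  : current bias estimates, subtracted before integration
    """
    w = gyro - gyro_bias  # bias-corrected angular rate
    a = acc - acc_bias    # bias-corrected specific force

    # Rotation update via the exponential map (Rodrigues' formula).
    theta = np.linalg.norm(w) * dt
    if theta > 1e-9:
        k = w / np.linalg.norm(w)
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])
        dR = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    else:
        dR = np.eye(3)
    R_new = R @ dR

    # Rotate specific force to the world frame and remove gravity,
    # then integrate velocity and position.
    a_world = R @ a + g
    v_new = v + a_world * dt
    p_new = p + v * dt + 0.5 * a_world * dt ** 2
    return p_new, v_new, R_new
```

In a system like the one described, this step would run at the display rate (e.g. 120 Hz) on the client, while each accurate server pose, when it arrives, would be fused back into `p`, `v`, `R` and used to update the bias estimates.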
Related papers
- ESVO2: Direct Visual-Inertial Odometry with Stereo Event Cameras [33.81592783496106]
Event-based visual odometry aims at solving tracking and mapping sub-problems in parallel.
We build an event-based stereo visual-inertial odometry system on top of our previous direct pipeline Event-based Stereo Visual Odometry.
arXiv Detail & Related papers (2024-10-12T05:35:27Z)
- DragPoser: Motion Reconstruction from Variable Sparse Tracking Signals via Latent Space Optimization [1.5603779307797123]
DragPoser is a novel deep-learning-based motion reconstruction system.
It accurately represents hard and dynamic on-the-fly constraints.
It produces natural poses and temporally coherent motion.
arXiv Detail & Related papers (2024-04-29T15:00:50Z)
- VICAN: Very Efficient Calibration Algorithm for Large Camera Networks [49.17165360280794]
We introduce a novel methodology that extends Pose Graph Optimization techniques.
We consider the bipartite graph encompassing cameras, object poses evolving dynamically, and camera-object relative transformations at each time step.
Our framework retains compatibility with traditional PGO solvers, but its efficacy benefits from a custom-tailored optimization scheme.
arXiv Detail & Related papers (2024-03-25T17:47:03Z)
- PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search [64.28335667655129]
Multiple object tracking is a critical task in autonomous driving.
As tracking accuracy improves, neural networks become increasingly complex, and their high latency poses challenges for practical deployment in real driving scenarios.
In this paper, we explore the use of the neural architecture search (NAS) methods to search for efficient architectures for tracking, aiming for low real-time latency while maintaining relatively high accuracy.
arXiv Detail & Related papers (2024-03-23T04:18:49Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild [57.37891682117178]
We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow.
A novel neural network architecture is proposed for processing irregular point trajectory data.
Experiments on MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
arXiv Detail & Related papers (2022-07-19T09:19:45Z)
- Real-Time Human Pose Estimation on a Smart Walker using Convolutional Neural Networks [4.076099054649463]
We present a novel approach to patient monitoring and data-driven human-in-the-loop control in the context of smart walkers.
It is able to extract a complete and compact body representation in real-time and from inexpensive sensors.
Despite promising results, more data should be collected on users with impairments to assess its performance as a rehabilitation tool in real-world scenarios.
arXiv Detail & Related papers (2021-06-28T14:11:48Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Unified Multi-Modal Landmark Tracking for Tightly Coupled Lidar-Visual-Inertial Odometry [5.131684964386192]
We present an efficient multi-sensor odometry system for mobile platforms that jointly optimizes visual, lidar, and inertial information.
A new method to extract 3D line and planar primitives from lidar point clouds is presented.
The system has been tested on a variety of platforms and scenarios, including underground exploration with a legged robot and outdoor scanning with a dynamically moving handheld device.
arXiv Detail & Related papers (2020-11-13T09:54:03Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
- Instant 3D Object Tracking with Applications in Augmented Reality [4.893345190925178]
Tracking object poses in 3D is a crucial building block for Augmented Reality applications.
We propose an instant motion tracking system that tracks an object's pose in space in real-time on mobile devices.
arXiv Detail & Related papers (2020-06-23T17:48:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.