A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for
Mobile Devices
- URL: http://arxiv.org/abs/2210.12476v1
- Date: Sat, 22 Oct 2022 15:26:50 GMT
- Title: A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for
Mobile Devices
- Authors: Yo-Chung Lau, Kuan-Wei Tseng, I-Ju Hsieh, Hsiao-Ching Tseng, Yi-Ping
Hung
- Abstract summary: We propose a flexible-frame-rate object pose estimation and tracking system for mobile devices.
Inertial measurement unit (IMU) pose propagation is performed on the client side for high-speed tracking, and RGB image-based 3D pose estimation is performed on the server side.
Our system supports flexible frame rates up to 120 FPS and guarantees high precision and real-time tracking on low-end devices.
- Score: 3.4836209951879957
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Real-time object pose estimation and tracking is challenging but essential
for emerging augmented reality (AR) applications. In general, state-of-the-art
methods address this problem using deep neural networks which indeed yield
satisfactory results. Nevertheless, the high computational cost of these
methods makes them unsuitable for mobile devices where real-world applications
usually take place. In addition, head-mounted displays such as AR glasses
require at least 90 FPS to avoid motion sickness, which further complicates the
problem. We propose a flexible-frame-rate object pose estimation and tracking
system for mobile devices. It is a monocular visual-inertial-based system with
a client-server architecture. Inertial measurement unit (IMU) pose propagation
is performed on the client side for high-speed tracking, and RGB image-based 3D
pose estimation is performed on the server side to obtain accurate poses, after
which the pose is sent to the client side for visual-inertial fusion, where we
propose a bias self-correction mechanism to reduce drift. We also propose a
pose inspection algorithm to detect tracking failures and incorrect pose
estimation. Connected by high-speed networking, our system supports flexible
frame rates up to 120 FPS and guarantees high precision and real-time tracking
on low-end devices. Both simulations and real-world experiments show that our
method achieves accurate and robust object tracking.
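The client-server split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it tracks position only (no orientation), assumes accelerometer samples are already gravity-compensated and expressed in the world frame, and replaces the paper's bias self-correction mechanism with a simple constant-bias estimate. The names `ClientTracker`, `propagate`, `correct`, and `bias_gain` are hypothetical.

```python
from dataclasses import dataclass

Vec3 = tuple[float, float, float]


def add(a: Vec3, b: Vec3) -> Vec3:
    return tuple(x + y for x, y in zip(a, b))


def sub(a: Vec3, b: Vec3) -> Vec3:
    return tuple(x - y for x, y in zip(a, b))


def scale(a: Vec3, s: float) -> Vec3:
    return tuple(x * s for x in a)


@dataclass
class ClientTracker:
    """Hypothetical client-side tracker: IMU dead reckoning plus server corrections."""
    position: Vec3 = (0.0, 0.0, 0.0)
    velocity: Vec3 = (0.0, 0.0, 0.0)
    accel_bias: Vec3 = (0.0, 0.0, 0.0)
    bias_gain: float = 0.5   # assumed tuning constant, not from the paper
    elapsed: float = 0.0     # seconds since the last server correction

    def propagate(self, accel: Vec3, dt: float) -> Vec3:
        """Advance the state by one IMU sample (constant-acceleration model)."""
        a = sub(accel, self.accel_bias)
        step = add(scale(self.velocity, dt), scale(a, 0.5 * dt * dt))
        self.position = add(self.position, step)
        self.velocity = add(self.velocity, scale(a, dt))
        self.elapsed += dt
        return self.position

    def correct(self, server_position: Vec3) -> None:
        """Fuse a server pose: reset position and nudge the bias estimate.

        Assumes the drift since the last correction came from a constant
        residual bias b, which produces position error 0.5 * b * T^2 and
        velocity error b * T over T seconds.
        """
        if self.elapsed > 0.0:
            err = sub(self.position, server_position)
            bias_est = scale(err, 2.0 / self.elapsed ** 2)
            self.accel_bias = add(self.accel_bias, scale(bias_est, self.bias_gain))
            self.velocity = sub(self.velocity, scale(bias_est, self.elapsed))
        self.position = server_position
        self.elapsed = 0.0
```

In this sketch the client calls `propagate` at the display rate (for example with `dt = 1 / 120` for 120 FPS) and calls `correct` whenever a server pose arrives, so the output frame rate is decoupled from the slower image-based estimator.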
Related papers
- CorrDiff: Adaptive Delay-aware Detector with Temporal Cue Inputs for Real-time Object Detection [11.714072240331518]
CorrDiff is designed to tackle the challenge of delays in real-time detection systems.
It is able to utilize runtime-estimated temporal cues to predict objects' locations for multiple future frames.
It meets the stringent real-time processing requirements on all kinds of devices.
arXiv Detail & Related papers (2025-01-09T10:34:25Z)
- Street Gaussians without 3D Object Tracker [86.62329193275916]
Existing methods rely on labor-intensive manual labeling of object poses to reconstruct dynamic objects in canonical space and move them based on these poses during rendering.
We propose a stable object tracking module by leveraging associations from 2D deep trackers within a 3D object fusion strategy.
We address inevitable tracking errors by further introducing a motion learning strategy in an implicit feature space that autonomously corrects trajectory errors and recovers missed detections.
arXiv Detail & Related papers (2024-12-07T05:49:42Z)
- A Cross-Scene Benchmark for Open-World Drone Active Tracking [54.235808061746525]
Drone Visual Active Tracking aims to autonomously follow a target object by controlling the motion system based on visual observations.
We propose a unified cross-scene cross-domain benchmark for open-world drone active tracking called DAT.
We also propose a reinforcement learning-based drone tracking method called R-VAT.
arXiv Detail & Related papers (2024-12-01T09:37:46Z)
- ESVO2: Direct Visual-Inertial Odometry with Stereo Event Cameras [33.81592783496106]
Event-based visual odometry aims at solving tracking and mapping subproblems (typically in parallel).
We build an event-based stereo visual-inertial odometry system on top of a direct pipeline.
The resulting system scales well with modern high-resolution event cameras.
arXiv Detail & Related papers (2024-10-12T05:35:27Z)
- DragPoser: Motion Reconstruction from Variable Sparse Tracking Signals via Latent Space Optimization [1.5603779307797123]
DragPoser is a novel deep-learning-based motion reconstruction system.
It accurately represents hard and dynamic on-the-fly constraints.
It produces natural poses and temporally coherent motion.
arXiv Detail & Related papers (2024-04-29T15:00:50Z)
- VICAN: Very Efficient Calibration Algorithm for Large Camera Networks [49.17165360280794]
We introduce a novel methodology that extends Pose Graph Optimization techniques.
We consider the bipartite graph encompassing cameras, object poses evolving dynamically, and camera-object relative transformations at each time step.
Our framework retains compatibility with traditional PGO solvers, but its efficacy benefits from a custom-tailored optimization scheme.
arXiv Detail & Related papers (2024-03-25T17:47:03Z)
- PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search [64.28335667655129]
Multiple object tracking is a critical task in autonomous driving.
As tracking accuracy improves, neural networks become increasingly complex, posing challenges for their practical application in real driving scenarios due to the high level of latency.
In this paper, we explore the use of the neural architecture search (NAS) methods to search for efficient architectures for tracking, aiming for low real-time latency while maintaining relatively high accuracy.
arXiv Detail & Related papers (2024-03-23T04:18:49Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation, which also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- Real-Time Human Pose Estimation on a Smart Walker using Convolutional Neural Networks [4.076099054649463]
We present a novel approach to patient monitoring and data-driven human-in-the-loop control in the context of smart walkers.
It is able to extract a complete and compact body representation in real-time and from inexpensive sensors.
Despite promising results, more data should be collected on users with impairments to assess its performance as a rehabilitation tool in real-world scenarios.
arXiv Detail & Related papers (2021-06-28T14:11:48Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
- Instant 3D Object Tracking with Applications in Augmented Reality [4.893345190925178]
Tracking object poses in 3D is a crucial building block for Augmented Reality applications.
We propose an instant motion tracking system that tracks an object's pose in space in real-time on mobile devices.
arXiv Detail & Related papers (2020-06-23T17:48:29Z)