DivaTrack: Diverse Bodies and Motions from Acceleration-Enhanced
Three-Point Trackers
- URL: http://arxiv.org/abs/2402.09211v1
- Date: Wed, 14 Feb 2024 14:46:03 GMT
- Title: DivaTrack: Diverse Bodies and Motions from Acceleration-Enhanced
Three-Point Trackers
- Authors: Dongseok Yang, Jiho Kang, Lingni Ma, Joseph Greer, Yuting Ye and
Sung-Hee Lee
- Abstract summary: Full-body avatar presence is crucial for immersive social and environmental interactions in digital reality.
Current devices only provide three six-degree-of-freedom (DOF) poses from the headset and two controllers.
We propose a deep learning framework, DivaTrack, which outperforms existing methods when applied to diverse body sizes and activities.
- Score: 13.258923087528354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Full-body avatar presence is crucial for immersive social and environmental
interactions in digital reality. However, current devices only provide three
six-degree-of-freedom (DOF) poses from the headset and two controllers (i.e.,
three-point trackers). Because it is a highly under-constrained problem,
inferring full-body pose from these inputs is challenging, especially when
supporting the full range of body proportions and use cases represented by the
general population. In this paper, we propose a deep learning framework,
DivaTrack, which outperforms existing methods when applied to diverse body
sizes and activities. We augment the sparse three-point inputs with linear
accelerations from Inertial Measurement Units (IMU) to improve foot contact
prediction. We then condition the otherwise ambiguous lower-body pose with the
predictions of foot contact and upper-body pose in a two-stage model. We
further stabilize the inferred full-body pose in a wide range of configurations
by learning to blend predictions that are computed in two reference frames,
each of which is designed for different types of motions. We demonstrate the
effectiveness of our design on a large dataset that captures 22 subjects
performing motions that are challenging for three-point tracking, including lunges,
hula-hooping, and sitting. As shown in a live demo using the Meta VR headset
and Xsens IMUs, our method runs in real-time while accurately tracking a user's
motion when they perform a diverse set of movements.
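As a reading aid, here is a minimal PyTorch-style sketch of the pipeline the abstract describes. The two-stage structure, the acceleration-augmented inputs, and the blending of two reference-frame predictions come from the abstract; everything else (LSTM backbones, tensor sizes, the sigmoid-gated blend, and all names) is an illustrative assumption, not the authors' implementation.
```python
import torch
import torch.nn as nn

# Assumed sizes: 3 trackers x (3-D position + 6-D rotation) = 27 pose inputs,
# plus 3 IMUs x 3-axis linear acceleration = 9, per frame.
D_POSE, D_ACC = 27, 9
D_UPPER, D_LOWER = 96, 48  # hypothetical 6-D rotation outputs per body half

class StageOne(nn.Module):
    """Stage 1: predict foot contacts and upper-body pose from sparse inputs."""
    def __init__(self, d_hid=256):
        super().__init__()
        self.rnn = nn.LSTM(D_POSE + D_ACC, d_hid, batch_first=True)
        self.contact_head = nn.Linear(d_hid, 2)   # left/right foot contact logits
        self.upper_head = nn.Linear(d_hid, D_UPPER)

    def forward(self, x):                          # x: (B, T, 36)
        h, _ = self.rnn(x)
        return self.contact_head(h), self.upper_head(h)

class StageTwo(nn.Module):
    """Stage 2: condition the ambiguous lower body on contacts and upper body."""
    def __init__(self, d_hid=256):
        super().__init__()
        self.rnn = nn.LSTM(D_POSE + D_ACC + 2 + D_UPPER, d_hid, batch_first=True)
        self.lower_head = nn.Linear(d_hid, D_LOWER)

    def forward(self, x, contacts, upper):
        h, _ = self.rnn(torch.cat([x, contacts, upper], dim=-1))
        return self.lower_head(h)

class FrameBlender(nn.Module):
    """Blend full-body poses predicted in two reference frames via a learned gate."""
    def __init__(self, d_hid=256):
        super().__init__()
        d_pose = D_UPPER + D_LOWER
        self.gate = nn.Sequential(nn.Linear(2 * d_pose, d_hid), nn.ReLU(),
                                  nn.Linear(d_hid, 1), nn.Sigmoid())

    def forward(self, pose_a, pose_b):             # same pose in two frames
        w = self.gate(torch.cat([pose_a, pose_b], dim=-1))
        return w * pose_a + (1.0 - w) * pose_b
```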
Related papers
- Universal Humanoid Motion Representations for Physics-Based Control [71.46142106079292]
We present a universal motion representation that encompasses a comprehensive range of motor skills for physics-based humanoid control.
We first learn a motion imitator that can imitate all of the human motion in a large, unstructured motion dataset.
We then create our motion representation by distilling skills directly from the imitator.
arXiv Detail & Related papers (2023-10-06T20:48:43Z) - Realistic Full-Body Tracking from Sparse Observations via Joint-Level
Modeling [13.284947022380404]
We propose a two-stage framework that can obtain accurate and smooth full-body motions with three tracking signals of head and hands only.
Our framework explicitly models joint-level features in the first stage and uses them as temporal tokens for alternating spatial and temporal transformer blocks that capture joint-level correlations in the second stage (sketched after this entry).
With extensive experiments on the AMASS motion dataset and real-captured data, we show our proposed method can achieve more accurate and smooth motion compared to existing approaches.
arXiv Detail & Related papers (2023-08-17T08:27:55Z) - Physics-based Motion Retargeting from Sparse Inputs [73.94570049637717]
- Physics-based Motion Retargeting from Sparse Inputs [73.94570049637717]
Commercial AR/VR products consist only of a headset and controllers, providing very limited sensor data of the user's pose.
We introduce a method to retarget motions in real-time from sparse human sensor data to characters of various morphologies.
We show that the avatar poses often match the user surprisingly well, despite having no sensor information of the lower body available.
arXiv Detail & Related papers (2023-07-04T21:57:05Z) - Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking
Inputs with Diffusion Model [18.139630622759636]
We present AGRoL, a novel conditional diffusion model specifically designed to track full bodies given sparse upper-body tracking signals.
Our model is based on a simple multi-layer perceptron (MLP) architecture and a novel conditioning scheme for motion data.
Unlike common diffusion architectures, our compact architecture can run in real-time, making it suitable for online body-tracking applications.
arXiv Detail & Related papers (2023-04-17T19:35:13Z) - An Effective Motion-Centric Paradigm for 3D Single Object Tracking in
- An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds [50.19288542498838]
3D single object tracking in LiDAR point clouds (LiDAR SOT) plays a crucial role in autonomous driving.
Current approaches all follow the Siamese paradigm based on appearance matching.
We introduce a motion-centric paradigm to handle LiDAR SOT from a new perspective.
arXiv Detail & Related papers (2023-03-21T17:28:44Z) - MotionBERT: A Unified Perspective on Learning Human Motion
- MotionBERT: A Unified Perspective on Learning Human Motion Representations [46.67364057245364]
We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.
We propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations.
We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network.
arXiv Detail & Related papers (2022-10-12T19:46:25Z) - QuestSim: Human Motion Tracking from Sparse Sensors with Simulated
- QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars [80.05743236282564]
Real-time tracking of human body motion is crucial for immersive experiences in AR/VR.
We present a reinforcement learning framework that takes in sparse signals from an HMD and two controllers.
We show that a single policy can be robust to diverse locomotion styles, different body sizes, and novel environments.
arXiv Detail & Related papers (2022-09-20T00:25:54Z) - Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single
- Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds [39.41305358466479]
3D single object tracking in LiDAR point clouds plays a crucial role in autonomous driving.
Current approaches all follow the Siamese paradigm based on appearance matching.
We introduce a motion-centric paradigm to handle 3D SOT from a new perspective.
arXiv Detail & Related papers (2022-03-03T14:20:10Z) - Neural Monocular 3D Human Motion Capture with Physical Awareness [76.55971509794598]
We present a new trainable system for physically plausible markerless 3D human motion capture.
Unlike most neural methods for human motion capture, our approach is aware of physical and environmental constraints.
It produces smooth and physically principled 3D motions in an interactive frame rate in a wide variety of challenging scenes.
arXiv Detail & Related papers (2021-05-03T17:57:07Z) - Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)