Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model
- URL: http://arxiv.org/abs/2304.08577v1
- Date: Mon, 17 Apr 2023 19:35:13 GMT
- Title: Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model
- Authors: Yuming Du, Robin Kips, Albert Pumarola, Sebastian Starke, Ali Thabet, Artsiom Sanakoyeu
- Abstract summary: We present AGRoL, a novel conditional diffusion model specifically designed to track full bodies given sparse upper-body tracking signals.
Our model is based on a simple multi-layer perceptron (MLP) architecture and a novel conditioning scheme for motion data.
Unlike common diffusion architectures, our compact architecture can run in real-time, making it suitable for online body-tracking applications.
- Score: 18.139630622759636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the recent surge in popularity of AR/VR applications, realistic and
accurate control of 3D full-body avatars has become a highly demanded feature.
A particular challenge is that only a sparse tracking signal is available from
standalone HMDs (Head Mounted Devices), often limited to tracking the user's
head and wrists. While this signal is informative for reconstructing the upper
body motion, the lower body is not tracked and must be synthesized from the
limited information provided by the upper body joints. In this paper, we
present AGRoL, a novel conditional diffusion model specifically designed to
track full bodies given sparse upper-body tracking signals. Our model is based
on a simple multi-layer perceptron (MLP) architecture and a novel conditioning
scheme for motion data. It can predict accurate and smooth full-body motion,
particularly the challenging lower body movement. Unlike common diffusion
architectures, our compact architecture can run in real-time, making it
suitable for online body-tracking applications. We train and evaluate our model
on the AMASS motion capture dataset, and demonstrate that our approach outperforms
state-of-the-art methods in generated motion accuracy and smoothness. We
further justify our design choices through extensive experiments and ablation
studies.
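
The abstract's two central design points, a denoiser built only from MLP blocks and a conditioning scheme that injects the sparse head-and-wrist signal, can be made concrete with a short sketch. The PyTorch code below is a minimal illustration under stated assumptions, not the authors' released implementation: the feature sizes (132-D full-body motion, 54-D tracking features), the residual block layout, the timestep embedding, and the `ddpm_sample` helper are all hypothetical choices made for the example.

```python
import torch
import torch.nn as nn

class DenoiserMLP(nn.Module):
    """Toy MLP denoiser: predicts the clean full-body motion x0 from a
    noisy sequence, conditioned on sparse upper-body tracking features.
    All sizes below are illustrative, not the paper's exact values."""

    def __init__(self, motion_dim=132, cond_dim=54, hidden_dim=512, num_blocks=4):
        super().__init__()
        # Timestep embedding: scalar t -> hidden_dim vector.
        self.t_embed = nn.Sequential(
            nn.Linear(1, hidden_dim), nn.SiLU(), nn.Linear(hidden_dim, hidden_dim)
        )
        # Conditioning scheme (simplified): concatenate noisy motion,
        # tracking features, and timestep embedding at the input.
        self.in_proj = nn.Linear(motion_dim + cond_dim + hidden_dim, hidden_dim)
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.LayerNorm(hidden_dim),
                          nn.Linear(hidden_dim, hidden_dim),
                          nn.SiLU())
            for _ in range(num_blocks)
        )
        self.out_proj = nn.Linear(hidden_dim, motion_dim)

    def forward(self, x_t, t, cond):
        # x_t: (B, T, motion_dim) noisy motion; t: (B,) diffusion step;
        # cond: (B, T, cond_dim) head/wrist tracking signal.
        temb = self.t_embed(t.float().view(-1, 1, 1).expand(-1, x_t.shape[1], 1) / 1000.0)
        h = self.in_proj(torch.cat([x_t, cond, temb], dim=-1))
        for block in self.blocks:
            h = h + block(h)  # residual MLP blocks keep the model compact and fast
        return self.out_proj(h)  # predicted clean motion x0

@torch.no_grad()
def ddpm_sample(model, cond, motion_dim=132, steps=1000):
    """Standard DDPM ancestral sampling with an x0-prediction network."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    abar = torch.cumprod(alphas, dim=0)
    x = torch.randn(cond.shape[0], cond.shape[1], motion_dim)  # start from noise
    for t in reversed(range(steps)):
        x0 = model(x, torch.full((cond.shape[0],), t), cond)
        abar_prev = abar[t - 1] if t > 0 else torch.tensor(1.0)
        # Posterior mean of q(x_{t-1} | x_t, x0), Ho et al. 2020, Eq. 7.
        mean = (abar_prev.sqrt() * betas[t] / (1 - abar[t])) * x0 \
             + (alphas[t].sqrt() * (1 - abar_prev) / (1 - abar[t])) * x
        if t > 0:
            var = betas[t] * (1 - abar_prev) / (1 - abar[t])
            x = mean + var.sqrt() * torch.randn_like(x)
        else:
            x = mean
    return x
```

Concatenating the conditioning features at the input and keeping the network a short stack of residual MLP blocks keeps the per-step cost low, which is what makes a compact diffusion model plausible for the real-time, online body-tracking setting the abstract describes.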
Related papers
- WANDR: Intention-guided Human Motion Generation [67.07028110459787]
We introduce WANDR, a data-driven model that takes an avatar's initial pose and a goal's 3D position and generates natural human motions that place the end effector (wrist) on the goal location.
Intention guides the agent to the goal, and interactively adapts the generation to novel situations without needing to define sub-goals or the entire motion path.
We evaluate our method extensively and demonstrate its ability to generate natural, long-term motions that reach 3D goals and generalize to unseen goal locations.
arXiv Detail & Related papers (2024-04-23T10:20:17Z)
- LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment [59.320414108383055]
We present LiveHPS, a novel single-LiDAR-based approach for scene-level human pose and shape estimation.
We propose a huge human motion dataset, named FreeMotion, which is collected in various scenarios with diverse human poses.
arXiv Detail & Related papers (2024-02-27T03:08:44Z)
- DivaTrack: Diverse Bodies and Motions from Acceleration-Enhanced Three-Point Trackers [13.258923087528354]
Full-body avatar presence is crucial for immersive social and environmental interactions in digital reality.
Current devices provide only three six-degree-of-freedom (6DOF) poses, from the headset and two controllers.
We propose a deep learning framework, DivaTrack, which outperforms existing methods when applied to diverse body sizes and activities.
arXiv Detail & Related papers (2024-02-14T14:46:03Z)
- HMP: Hand Motion Priors for Pose and Shape Estimation from Video [52.39020275278984]
We develop a generative motion prior specific to hands, trained on the AMASS dataset, which features diverse and high-quality hand motions.
Our integration of a robust motion prior significantly enhances performance, especially in occluded scenarios.
We demonstrate our method's efficacy via qualitative and quantitative evaluations on the HO3D and DexYCB datasets.
arXiv Detail & Related papers (2023-12-27T22:35:33Z)
- SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data [1.494051815405093]
We introduce SparsePoser, a novel deep learning-based solution for reconstructing a full-body pose from sparse data.
Our system incorporates a convolutional-based autoencoder that synthesizes high-quality continuous human poses.
We show that our method outperforms state-of-the-art techniques using IMU sensors or 6-DoF tracking devices.
arXiv Detail & Related papers (2023-11-03T18:48:01Z)
- Realistic Full-Body Tracking from Sparse Observations via Joint-Level Modeling [13.284947022380404]
We propose a two-stage framework that can obtain accurate and smooth full-body motions from only three tracking signals: the head and the two hands.
Our framework explicitly models joint-level features in the first stage and utilizes them as temporal tokens for alternating spatial and temporal transformer blocks to capture joint-level correlations in the second stage.
With extensive experiments on the AMASS motion dataset and real-captured data, we show our proposed method can achieve more accurate and smooth motion compared to existing approaches.
arXiv Detail & Related papers (2023-08-17T08:27:55Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- BoDiffusion: Diffusing Sparse Observations for Full-Body Human Motion Synthesis [14.331548412833513]
Mixed reality applications require tracking the user's full-body motion to enable an immersive experience.
We propose BoDiffusion -- a generative diffusion model for motion synthesis to tackle this under-constrained reconstruction problem.
We present a time and space conditioning scheme that allows BoDiffusion to leverage sparse tracking inputs while generating smooth and realistic full-body motion sequences.
arXiv Detail & Related papers (2023-04-21T16:39:05Z)
- QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars [80.05743236282564]
Real-time tracking of human body motion is crucial for immersive experiences in AR/VR.
We present a reinforcement learning framework that takes in sparse signals from an HMD and two controllers.
We show that a single policy can be robust to diverse locomotion styles, different body sizes, and novel environments.
arXiv Detail & Related papers (2022-09-20T00:25:54Z)
- Transformer Inertial Poser: Attention-based Real-time Human Motion Reconstruction from Sparse IMUs [79.72586714047199]
We propose an attention-based deep learning method to reconstruct full-body motion from six IMU sensors in real-time.
Our method achieves new state-of-the-art results both quantitatively and qualitatively, while being simple to implement and smaller in size.
arXiv Detail & Related papers (2022-03-29T16:24:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.