BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
- URL: http://arxiv.org/abs/2211.14304v3
- Date: Wed, 2 Aug 2023 08:52:44 GMT
- Title: BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
- Authors: German Barquero, Sergio Escalera, and Cristina Palmero
- Abstract summary: We present BeLFusion, a model that leverages latent diffusion models in human motion prediction (HMP) to sample from a latent space where behavior is disentangled from pose and motion.
Thanks to our behavior coupler's ability to transfer sampled behavior to ongoing motion, BeLFusion's predictions display a variety of behaviors that are significantly more realistic than the state of the art.
- Score: 26.306489700180627
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Stochastic human motion prediction (HMP) has generally been tackled with
generative adversarial networks and variational autoencoders. Most prior works
aim at predicting highly diverse movements in terms of the skeleton joints'
dispersion. This has led to methods predicting fast and motion-divergent
movements, which are often unrealistic and incoherent with past motion. Such
methods also neglect contexts that need to anticipate diverse low-range
behaviors, or actions, with subtle joint displacements. To address these
issues, we present BeLFusion, a model that, for the first time, leverages
latent diffusion models in HMP to sample from a latent space where behavior is
disentangled from pose and motion. As a result, diversity is encouraged from a
behavioral perspective. Thanks to our behavior coupler's ability to transfer
sampled behavior to ongoing motion, BeLFusion's predictions display a variety
of behaviors that are significantly more realistic than the state of the art.
To support it, we introduce two metrics, the Area of the Cumulative Motion
Distribution, and the Average Pairwise Distance Error, which are correlated to
our definition of realism according to a qualitative study with 126
participants. Finally, we prove BeLFusion's generalization power in a new
cross-dataset scenario for stochastic HMP.
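The Average Pairwise Distance is the standard diversity statistic underlying the proposed Average Pairwise Distance Error. As a minimal sketch (NumPy, with hypothetical array shapes; the paper's APDE additionally compares this statistic against a multimodal ground truth, which is not reproduced here):

```python
import numpy as np

def average_pairwise_distance(samples):
    """Average L2 distance between all pairs of predicted motion samples.

    samples: array of shape (S, T, J, 3) -- S sampled futures,
    T frames, J joints, 3D joint coordinates.
    """
    S = samples.shape[0]
    flat = samples.reshape(S, -1)  # flatten each sample to one vector
    # Pairwise distance matrix of shape (S, S) via broadcasting.
    dists = np.linalg.norm(flat[:, None] - flat[None, :], axis=-1)
    # Average over the S*(S-1) ordered pairs, excluding self-distances.
    return dists.sum() / (S * (S - 1))
```

A higher value indicates more spread-out predictions; the abstract's point is that raw joint dispersion alone rewards unrealistic, motion-divergent futures, which motivates measuring diversity at the behavioral level instead.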
Related papers
- MDMP: Multi-modal Diffusion for supervised Motion Predictions with uncertainty [7.402769693163035]
This paper introduces a Multi-modal Diffusion model for Motion Prediction (MDMP).
It integrates skeletal data and textual descriptions of actions to generate refined long-term motion predictions with quantifiable uncertainty.
Our model consistently outperforms existing generative techniques in accurately predicting long-term motions.
arXiv Detail & Related papers (2024-10-04T18:49:00Z)
- Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction [25.965711897002016]
We introduce Semantic Latent Directions (SLD) as a solution to this challenge.
SLD constrains the latent space to learn meaningful motion semantics.
We showcase the superiority of our method in accurately predicting motions while maintaining a balance of realism and diversity.
arXiv Detail & Related papers (2024-07-16T08:31:59Z)
- Towards Generalizable and Interpretable Motion Prediction: A Deep Variational Bayes Approach [54.429396802848224]
This paper proposes an interpretable generative model for motion prediction with robust generalizability to out-of-distribution cases.
For interpretability, the model achieves the target-driven motion prediction by estimating the spatial distribution of long-term destinations.
Experiments on motion prediction datasets validate that the fitted model can be interpretable and generalizable.
arXiv Detail & Related papers (2024-03-10T04:16:04Z)
- Priority-Centric Human Motion Generation in Discrete Latent Space [59.401128190423535]
We introduce a Priority-Centric Motion Discrete Diffusion Model (M2DM) for text-to-motion generation.
M2DM incorporates a global self-attention mechanism and a regularization term to counteract code collapse.
We also present a motion discrete diffusion model that employs an innovative noise schedule, determined by the significance of each motion token.
arXiv Detail & Related papers (2023-08-28T10:40:16Z)
- TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction [1.8923948104852863]
We propose TransFusion, an innovative and practical diffusion-based model for 3D human motion prediction.
Our model leverages Transformer as the backbone with long skip connections between shallow and deep layers.
In contrast to prior diffusion-based models that utilize extra modules like cross-attention and adaptive layer normalization, we treat all inputs, including conditions, as tokens to create a more lightweight model.
arXiv Detail & Related papers (2023-07-30T01:52:07Z)
- Human Joint Kinematics Diffusion-Refinement for Stochastic Motion Prediction [22.354538952573158]
MotionDiff is a diffusion probabilistic model that treats the kinematics of human joints as heated particles.
MotionDiff consists of two parts: a spatial-temporal transformer-based diffusion network to generate diverse yet plausible motions, and a graph convolutional network to further refine the outputs.
arXiv Detail & Related papers (2022-10-12T07:38:33Z)
- Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion [88.45326906116165]
We present a new framework that formulates the trajectory prediction task as a reverse process of motion indeterminacy diffusion (MID).
We encode the history behavior information and the social interactions as a state embedding and devise a Transformer-based diffusion model to capture the temporal dependencies of trajectories.
Experiments on the human trajectory prediction benchmarks including the Stanford Drone and ETH/UCY datasets demonstrate the superiority of our method.
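Several of the diffusion-based entries in this list (MotionDiff, TransFusion, MID) share the same backbone idea: ancestral sampling from a denoising diffusion model conditioned on past motion. A minimal, runnable DDPM-style sketch, with a hypothetical linear noise schedule and a stand-in zero-predicting denoiser in place of the learned Transformer:

```python
import numpy as np

# Hypothetical linear beta schedule; each paper uses its own schedule.
T = 100
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def toy_denoiser(x, t, cond):
    # Stand-in for a learned network predicting the noise eps from the
    # noisy motion x, the timestep t, and a past-motion embedding cond.
    return np.zeros_like(x)

def reverse_diffusion(shape, cond, rng):
    """Standard DDPM ancestral sampling: start from Gaussian noise and
    iteratively denoise toward a plausible future motion."""
    x = rng.standard_normal(shape)
    for t in reversed(range(T)):
        eps = toy_denoiser(x, t, cond)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

rng = np.random.default_rng(0)
sample = reverse_diffusion((20, 2), cond=None, rng=rng)  # 20 frames, 2D positions
```

Rerunning the loop with fresh noise yields a different sample each time, which is how these models produce diverse motion hypotheses from the same observed past.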
arXiv Detail & Related papers (2022-03-25T16:59:08Z)
- Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction [63.62263239934777]
We conduct an in-depth study of various pose representations, focusing on their effects on the motion prediction task.
We propose a novel RNN architecture termed AHMR (Attentive Hierarchical Motion Recurrent network) for motion prediction.
Our approach outperforms the state-of-the-art methods in short-term prediction and achieves much enhanced long-term prediction proficiency.
arXiv Detail & Related papers (2021-12-30T10:45:22Z)
- Dyadic Human Motion Prediction [119.3376964777803]
We introduce a motion prediction framework that explicitly reasons about the interactions of two observed subjects.
Specifically, we achieve this by introducing a pairwise attention mechanism that models the mutual dependencies in the motion history of the two subjects.
This allows us to preserve the long-term motion dynamics in a more realistic way and more robustly predict unusual and fast-paced movements.
arXiv Detail & Related papers (2021-12-01T10:30:40Z)
- Learning to Predict Diverse Human Motions from a Single Image via Mixture Density Networks [9.06677862854201]
We propose a novel approach that predicts future human motions from a single image using mixture density network (MDN) modeling.
Contrary to most existing deep human motion prediction approaches, the multimodal nature of MDN enables the generation of diverse future motion hypotheses.
Our trained model directly takes an image as input and generates multiple plausible motions that satisfy the given condition.
arXiv Detail & Related papers (2021-09-13T08:49:33Z)
- Generating Smooth Pose Sequences for Diverse Human Motion Prediction [90.45823619796674]
We introduce a unified deep generative network for both diverse and controllable motion prediction.
Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy.
arXiv Detail & Related papers (2021-08-19T00:58:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.