Towards Consistent Stochastic Human Motion Prediction via Motion
Diffusion
- URL: http://arxiv.org/abs/2305.12554v2
- Date: Tue, 19 Dec 2023 23:52:51 GMT
- Title: Towards Consistent Stochastic Human Motion Prediction via Motion
Diffusion
- Authors: Jiarui Sun, Girish Chowdhary
- Abstract summary: We propose DiffMotion as an end-to-end diffusion-based Human Motion Prediction framework.
Our results on benchmark datasets show that DiffMotion significantly outperforms previous methods in terms of both accuracy and fidelity.
- Score: 8.10696589962658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic Human Motion Prediction (HMP) aims to predict multiple possible
upcoming pose sequences based on past human motion trajectories. Although
previous approaches have shown impressive performance, they face several
issues, including complex training processes and a tendency to generate
predictions that are often inconsistent with the provided history and
sometimes even entirely unreasonable. To overcome these issues, we
propose DiffMotion, an end-to-end diffusion-based stochastic HMP framework.
DiffMotion's motion predictor is composed of two modules, including (1) a
Transformer-based network for initial motion reconstruction from corrupted
motion, and (2) a Graph Convolutional Network (GCN) to refine the generated
motion considering past observations. Our method, facilitated by this novel
Transformer-GCN module design and a proposed variance scheduler, excels in
predicting accurate, realistic, and consistent motions, while maintaining an
appropriate level of diversity. Our results on benchmark datasets show that
DiffMotion significantly outperforms previous methods in terms of both accuracy
and fidelity, while demonstrating superior robustness.
Related papers
- Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction [25.965711897002016]
We introduce Semantic Latent Directions (SLD) as a solution to this challenge.
SLD constrains the latent space to learn meaningful motion semantics.
We showcase the superiority of our method in accurately predicting motions while maintaining a balance of realism and diversity.
arXiv Detail & Related papers (2024-07-16T08:31:59Z)
- Motion Flow Matching for Human Motion Synthesis and Editing [75.13665467944314]
We propose Motion Flow Matching, a novel generative model for human motion generation featuring efficient sampling and effectiveness in motion editing applications.
Our method reduces the sampling complexity from a thousand steps in previous diffusion models to just ten steps, while achieving comparable performance on text-to-motion and action-to-motion generation benchmarks.
arXiv Detail & Related papers (2023-12-14T12:57:35Z)
- TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction [1.8923948104852863]
We propose TransFusion, an innovative and practical diffusion-based model for 3D human motion prediction.
Our model leverages Transformer as the backbone with long skip connections between shallow and deep layers.
In contrast to prior diffusion-based models that utilize extra modules like cross-attention and adaptive layer normalization, we treat all inputs, including conditions, as tokens to create a more lightweight model.
arXiv Detail & Related papers (2023-07-30T01:52:07Z)
- STGlow: A Flow-based Generative Framework with Dual Graphormer for Pedestrian Trajectory Prediction [22.553356096143734]
We propose a novel generative flow-based framework with dual graphormer for pedestrian trajectory prediction (STGlow).
Our method can more precisely model the underlying data distribution by optimizing the exact log-likelihood of motion behaviors.
Experimental results on several benchmarks demonstrate that our method achieves much better performance compared to previous state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-21T07:29:24Z)
- Human Joint Kinematics Diffusion-Refinement for Stochastic Motion Prediction [22.354538952573158]
MotionDiff is a diffusion probabilistic model that treats the kinematics of human joints as heated particles.
MotionDiff consists of two parts: a spatial-temporal transformer-based diffusion network to generate diverse yet plausible motions, and a graph convolutional network to further refine the outputs.
arXiv Detail & Related papers (2022-10-12T07:38:33Z)
- Motion Transformer with Global Intention Localization and Local Movement Refinement [103.75625476231401]
Motion TRansformer (MTR) models motion prediction as the joint optimization of global intention localization and local movement refinement.
MTR achieves state-of-the-art performance on both the marginal and joint motion prediction challenges.
arXiv Detail & Related papers (2022-09-27T16:23:14Z)
- Weakly-supervised Action Transition Learning for Stochastic Human Motion Prediction [81.94175022575966]
We introduce the task of action-driven human motion prediction.
It aims to predict multiple plausible future motions given a sequence of action labels and a short motion history.
arXiv Detail & Related papers (2022-05-31T08:38:07Z)
- Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion [88.45326906116165]
We present a new framework that formulates the trajectory prediction task as a reverse process of motion indeterminacy diffusion (MID).
We encode the history behavior information and the social interactions as a state embedding and devise a Transformer-based diffusion model to capture the temporal dependencies of trajectories.
Experiments on the human trajectory prediction benchmarks including the Stanford Drone and ETH/UCY datasets demonstrate the superiority of our method.
arXiv Detail & Related papers (2022-03-25T16:59:08Z)
- Learning to Predict Diverse Human Motions from a Single Image via Mixture Density Networks [9.06677862854201]
We propose a novel approach to predict future human motions from a single image, with mixture density networks (MDN) modeling.
Contrary to most existing deep human motion prediction approaches, the multimodal nature of MDN enables the generation of diverse future motion hypotheses.
Our trained model directly takes an image as input and generates multiple plausible motions that satisfy the given condition.
arXiv Detail & Related papers (2021-09-13T08:49:33Z)
- Generating Smooth Pose Sequences for Diverse Human Motion Prediction [90.45823619796674]
We introduce a unified deep generative network for both diverse and controllable motion prediction.
Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy.
arXiv Detail & Related papers (2021-08-19T00:58:00Z)
- MotionRNN: A Flexible Model for Video Prediction with Spacetime-Varying Motions [70.30211294212603]
This paper tackles video prediction from a new dimension of predicting spacetime-varying motions that are incessantly changing across both space and time.
We propose the MotionRNN framework, which can capture the complex variations within motions and adapt to spacetime-varying scenarios.
arXiv Detail & Related papers (2021-03-03T08:11:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.