Learning to Predict Diverse Human Motions from a Single Image via
Mixture Density Networks
- URL: http://arxiv.org/abs/2109.05776v1
- Date: Mon, 13 Sep 2021 08:49:33 GMT
- Title: Learning to Predict Diverse Human Motions from a Single Image via
Mixture Density Networks
- Authors: Chunzhi Gu, Yan Zhao, Chao Zhang
- Abstract summary: We propose a novel approach to predict future human motions from a single image, with mixture density networks (MDN) modeling.
Contrary to most existing deep human motion prediction approaches, the multimodal nature of MDN enables the generation of diverse future motion hypotheses.
Our trained model directly takes an image as input and generates multiple plausible motions that satisfy the given condition.
- Score: 9.06677862854201
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human motion prediction, which plays a key role in computer vision, generally
requires a past motion sequence as input. However, in real applications, a
complete and correct past motion sequence can be too expensive to obtain. In
this paper, we propose a novel approach to predict future human motions from a
much weaker condition, i.e., a single image, with mixture density networks
(MDN) modeling. Contrary to most existing deep human motion prediction
approaches, the multimodal nature of MDN enables the generation of diverse
future motion hypotheses, which compensates well for the strong stochastic
ambiguity arising from the single-image input and the inherent uncertainty of
human motion. In designing the loss function, we further introduce an
energy-based prior over the learnable parameters of the MDN to maintain motion
coherence and improve prediction accuracy. Our trained model directly takes an image as input and
generates multiple plausible motions that satisfy the given condition.
Extensive experiments on two standard benchmark datasets demonstrate the
effectiveness of our method, in terms of prediction diversity and accuracy.
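As a rough, hedged illustration of the mixture-density idea in the abstract above, the sketch below maps an image feature to a K-component Gaussian mixture over a flattened future pose sequence and samples diverse hypotheses from it. All names, dimensions, and the plain negative log-likelihood loss are illustrative assumptions rather than the paper's actual design, and the energy-based prior on the MDN parameters is omitted.

```python
# Minimal mixture density network (MDN) head: image feature -> K-component Gaussian
# mixture over a flattened future pose sequence. Illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MotionMDN(nn.Module):
    def __init__(self, feat_dim=2048, num_comp=5, horizon=25, joint_dim=48):
        super().__init__()
        self.out_dim = horizon * joint_dim          # flattened future motion
        self.num_comp = num_comp
        self.pi = nn.Linear(feat_dim, num_comp)                         # mixture weights
        self.mu = nn.Linear(feat_dim, num_comp * self.out_dim)          # component means
        self.log_sigma = nn.Linear(feat_dim, num_comp * self.out_dim)   # log std devs

    def forward(self, feat):
        B = feat.size(0)
        log_pi = F.log_softmax(self.pi(feat), dim=-1)
        mu = self.mu(feat).view(B, self.num_comp, self.out_dim)
        sigma = self.log_sigma(feat).view(B, self.num_comp, self.out_dim).clamp(-5, 5).exp()
        return log_pi, mu, sigma

def mdn_nll(log_pi, mu, sigma, target):
    # Negative log-likelihood of the target motion under the mixture (diagonal Gaussians).
    target = target.unsqueeze(1)                                  # (B, 1, D)
    comp = torch.distributions.Normal(mu, sigma)
    log_prob = comp.log_prob(target).sum(dim=-1) + log_pi         # (B, K)
    return -torch.logsumexp(log_prob, dim=-1).mean()

def sample_motions(log_pi, mu, sigma, n=10):
    # Draw n diverse future-motion hypotheses per image by sampling mixture components.
    idx = torch.distributions.Categorical(logits=log_pi).sample((n,))   # (n, B)
    mu_sel = mu[torch.arange(mu.size(0)), idx]                          # (n, B, D)
    sigma_sel = sigma[torch.arange(sigma.size(0)), idx]
    return torch.normal(mu_sel, sigma_sel)
```

Here `feat` would come from an image encoder (assumed, not shown); repeatedly calling `sample_motions` on the same feature is what yields multiple plausible futures for a single image.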
Related papers
- MDMP: Multi-modal Diffusion for supervised Motion Predictions with uncertainty [7.402769693163035]
This paper introduces a Multi-modal Diffusion model for Motion Prediction (MDMP).
It integrates skeletal data and textual descriptions of actions to generate refined long-term motion predictions with quantifiable uncertainty.
Our model consistently outperforms existing generative techniques in accurately predicting long-term motions.
arXiv Detail & Related papers (2024-10-04T18:49:00Z)
- DivDiff: A Conditional Diffusion Model for Diverse Human Motion Prediction [9.447439259813112]
We propose a conditional diffusion-based generative model, called DivDiff, to predict more diverse and realistic human motions.
Specifically, DivDiff employs DDPM as its backbone and incorporates Discrete Cosine Transform (DCT) and transformer mechanisms (a hedged DCT sketch follows this entry).
We design a diversified reinforcement sampling function (DRSF) to enforce human skeletal constraints on the predicted human motions.
arXiv Detail & Related papers (2024-08-16T04:51:32Z)
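The DivDiff entry above mentions a Discrete Cosine Transform of the motion. Purely as a hedged illustration of that ingredient, and not of DivDiff's actual pipeline, the snippet below compresses a pose sequence into a few low-frequency DCT coefficients along the time axis and reconstructs a smooth trajectory from them.

```python
# Hedged illustration of a DCT motion representation over time; not DivDiff itself.
import numpy as np
from scipy.fft import dct, idct

def to_dct(motion, n_coeffs=10):
    # motion: (T, J) array of T frames by J joint coordinates.
    coeffs = dct(motion, type=2, axis=0, norm='ortho')   # frequency domain along time
    return coeffs[:n_coeffs]                              # keep low frequencies only

def from_dct(coeffs, T):
    full = np.zeros((T, coeffs.shape[1]))
    full[:coeffs.shape[0]] = coeffs
    return idct(full, type=2, axis=0, norm='ortho')       # back to a smooth trajectory

motion = np.cumsum(np.random.randn(50, 48) * 0.01, axis=0)  # toy 50-frame sequence
recon = from_dct(to_dct(motion), T=50)
print(np.abs(motion - recon).mean())   # error from keeping only 10 of 50 coefficients
```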
- TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction [1.8923948104852863]
We propose TransFusion, an innovative and practical diffusion-based model for 3D human motion prediction.
Our model leverages Transformer as the backbone with long skip connections between shallow and deep layers.
In contrast to prior diffusion-based models that use extra modules such as cross-attention and adaptive layer normalization, we treat all inputs, including conditions, as tokens to create a more lightweight model (a hedged sketch of this token treatment follows this entry).
arXiv Detail & Related papers (2023-07-30T01:52:07Z)
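The TransFusion entry above describes treating all inputs, including conditions, as tokens instead of using cross-attention or adaptive layer normalization. The sketch below is a generic, hedged rendering of that idea with a stock PyTorch TransformerEncoder; the long skip connections mentioned above and the rest of the paper's architecture are not reproduced.

```python
# Hedged sketch: concatenate condition tokens with noisy motion tokens and run a
# plain Transformer encoder, instead of cross-attention or adaptive layer norm.
import torch
import torch.nn as nn

class TokenConditionedDenoiser(nn.Module):
    def __init__(self, pose_dim=48, cond_dim=64, d_model=128, n_layers=4):
        super().__init__()
        self.pose_in = nn.Linear(pose_dim, d_model)
        self.cond_in = nn.Linear(cond_dim, d_model)
        self.step_in = nn.Linear(1, d_model)          # diffusion timestep as one token
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.pose_out = nn.Linear(d_model, pose_dim)

    def forward(self, noisy_motion, cond, t):
        # noisy_motion: (B, T, pose_dim); cond: (B, C, cond_dim); t: (B, 1)
        tokens = torch.cat([self.step_in(t).unsqueeze(1),
                            self.cond_in(cond),
                            self.pose_in(noisy_motion)], dim=1)
        out = self.encoder(tokens)
        n_extra = 1 + cond.size(1)
        return self.pose_out(out[:, n_extra:])        # predictions for motion tokens only
```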
- Executing your Commands via Motion Diffusion in Latent Space [51.64652463205012]
We propose a Motion Latent-based Diffusion model (MLD) to produce vivid motion sequences conforming to the given conditional inputs.
Our MLD achieves significant improvements over the state-of-the-art methods across extensive human motion generation tasks.
arXiv Detail & Related papers (2022-12-08T03:07:00Z)
- PREF: Predictability Regularized Neural Motion Fields [68.60019434498703]
Knowing 3D motions in a dynamic scene is essential to many vision applications.
We leverage a neural motion field for estimating the motion of all points in a multiview setting.
We propose to regularize the estimated motion to be predictable (a hedged sketch of one such regularizer follows this entry).
arXiv Detail & Related papers (2022-09-21T22:32:37Z)
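PREF's entry above says the estimated motion is regularized to be predictable. The loss below is only one hedged, generic reading of that phrase: a small recurrent predictor forecasts each step from the preceding ones and deviations from the forecast are penalized. The actual PREF formulation may differ substantially.

```python
# Hedged, generic "predictability" regularizer: penalize motion that a simple
# autoregressive predictor cannot forecast. Not PREF's actual formulation.
import torch
import torch.nn as nn

class Predictor(nn.Module):
    def __init__(self, dim=3, hidden=64):
        super().__init__()
        self.gru = nn.GRU(dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, dim)

    def forward(self, past):                 # past: (B, T, dim)
        h, _ = self.gru(past)
        return self.head(h[:, -1])           # forecast of the next step

def predictability_loss(motion, predictor, context=5):
    # motion: (B, T, dim) points estimated by the motion field.
    losses = []
    for t in range(context, motion.size(1)):
        pred = predictor(motion[:, t - context:t])
        losses.append(((motion[:, t] - pred) ** 2).mean())
    return torch.stack(losses).mean()
```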
- Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction [63.62263239934777]
We conduct an in-depth study on various pose representations with a focus on their effects on the motion prediction task.
We propose a novel RNN architecture termed AHMR (Attentive Hierarchical Motion Recurrent network) for motion prediction.
Our approach outperforms state-of-the-art methods in short-term prediction and substantially improves long-term prediction.
arXiv Detail & Related papers (2021-12-30T10:45:22Z)
- Generating Smooth Pose Sequences for Diverse Human Motion Prediction [90.45823619796674]
We introduce a unified deep generative network for both diverse and controllable motion prediction.
Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy.
arXiv Detail & Related papers (2021-08-19T00:58:00Z)
- Probabilistic Human Motion Prediction via A Bayesian Neural Network [71.16277790708529]
We propose a probabilistic model for human motion prediction in this paper.
Our model can generate several future motions when given an observed motion sequence (a hedged sampling sketch follows this entry).
We extensively validate our approach on the large-scale benchmark dataset Human3.6M.
arXiv Detail & Related papers (2021-07-14T09:05:33Z)
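The Bayesian neural network entry above generates several future motions from one observed sequence. As one common way to obtain such samples, assumed here rather than taken from the paper, the sketch keeps dropout active at test time (Monte Carlo dropout) and runs repeated forward passes.

```python
# Hedged sketch: sample multiple futures by keeping dropout active at test time
# (Monte Carlo dropout). The paper's actual Bayesian treatment may differ.
import torch
import torch.nn as nn

class DropoutMotionNet(nn.Module):
    def __init__(self, pose_dim=48, hidden=256, horizon=25):
        super().__init__()
        self.encoder = nn.GRU(pose_dim, hidden, batch_first=True)
        self.drop = nn.Dropout(p=0.2)
        self.decoder = nn.Linear(hidden, horizon * pose_dim)

    def forward(self, past):                           # past: (B, T, pose_dim)
        _, h = self.encoder(past)
        return self.decoder(self.drop(h[-1]))

def sample_futures(model, past, n=10):
    model.train()                                      # keep dropout stochastic
    with torch.no_grad():
        return torch.stack([model(past) for _ in range(n)])   # (n, B, horizon*pose_dim)
```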
- 3D Human motion anticipation and classification [8.069283749930594]
We propose a novel sequence-to-sequence model for human motion prediction and feature learning.
Our model learns to predict multiple future sequences of human poses from the same input sequence (a bare-bones encoder-decoder sketch follows this entry).
We show that an activity recognition network can be trained in less than half the number of epochs by using the features learned from the discriminator.
arXiv Detail & Related papers (2020-12-31T00:19:39Z)
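The entry above describes a sequence-to-sequence model that predicts future pose sequences from an input sequence. Below is a bare-bones, hedged encoder-decoder in that spirit; the multi-sequence prediction, discriminator, and feature-learning components of the paper are not reproduced.

```python
# Hedged, bare-bones seq2seq pose predictor (GRU encoder-decoder); illustrative only.
import torch
import torch.nn as nn

class Seq2SeqMotion(nn.Module):
    def __init__(self, pose_dim=48, hidden=256):
        super().__init__()
        self.encoder = nn.GRU(pose_dim, hidden, batch_first=True)
        self.decoder = nn.GRUCell(pose_dim, hidden)
        self.out = nn.Linear(hidden, pose_dim)

    def forward(self, past, horizon=25):
        # past: (B, T, pose_dim); returns (B, horizon, pose_dim)
        _, h = self.encoder(past)
        h = h[-1]
        pose = past[:, -1]                      # start decoding from the last seen pose
        preds = []
        for _ in range(horizon):
            h = self.decoder(pose, h)
            pose = pose + self.out(h)           # predict residual motion per step
            preds.append(pose)
        return torch.stack(preds, dim=1)
```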
- Adversarial Refinement Network for Human Motion Prediction [61.50462663314644]
Two popular methods, recurrent neural networks and feed-forward deep networks, are able to predict a rough motion trend.
We propose an Adversarial Refinement Network (ARNet) following a simple yet effective coarse-to-fine mechanism with novel adversarial error augmentation (a hedged sketch of the coarse-to-fine structure follows this entry).
arXiv Detail & Related papers (2020-11-23T05:42:20Z)
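ARNet's entry above follows a coarse-to-fine mechanism: a coarse prediction is refined by a second network. The sketch below shows only that generic residual-refinement structure; the adversarial error augmentation is left out.

```python
# Hedged sketch of a coarse-to-fine refinement structure; adversarial parts omitted.
import torch
import torch.nn as nn

class Refiner(nn.Module):
    def __init__(self, pose_dim=48, horizon=25, hidden=256):
        super().__init__()
        dim = horizon * pose_dim
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim))

    def forward(self, coarse):                 # coarse: (B, horizon*pose_dim)
        return coarse + self.net(coarse)       # predict a corrective residual

def coarse_to_fine(coarse_predictor, refiner, past):
    coarse = coarse_predictor(past)            # any rough predictor, e.g. an RNN
    return refiner(coarse)
```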
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.