HiT-DVAE: Human Motion Generation via Hierarchical Transformer Dynamical
VAE
- URL: http://arxiv.org/abs/2204.01565v1
- Date: Mon, 4 Apr 2022 15:12:34 GMT
- Title: HiT-DVAE: Human Motion Generation via Hierarchical Transformer Dynamical
VAE
- Authors: Xiaoyu Bie, Wen Guo, Simon Leglaive, Laurent Girin, Francesc
Moreno-Noguer, Xavier Alameda-Pineda
- Abstract summary: We propose the Hierarchical Transformer Dynamical Variational Autoencoder (HiT-DVAE), which implements auto-regressive generation with transformer-like attention mechanisms.
We evaluate the proposed method on HumanEva-I and Human3.6M with various evaluation methods, and outperform the state-of-the-art methods on most of the metrics.
- Score: 37.23381308240617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Studies on the automatic processing of 3D human pose data have flourished in
the recent past. In this paper, we are interested in the generation of
plausible and diverse future human poses following an observed 3D pose
sequence. Current methods address this problem by injecting random variables
from a single latent space into a deterministic motion prediction framework,
which precludes the inherent multi-modality in human motion generation. In
addition, to our knowledge, previous works rarely explore the use of attention
to select which frames inform the generation process. To
overcome these limitations, we propose the Hierarchical Transformer Dynamical
Variational Autoencoder (HiT-DVAE), which implements auto-regressive generation
with transformer-like attention mechanisms. HiT-DVAE simultaneously learns the
evolution of data and latent space distribution with time-correlated
probabilistic dependencies, thus enabling the generative model to learn a more
complex and time-varying latent space as well as diverse and realistic human
motions. Furthermore, the auto-regressive generation offers more flexibility in
observation and prediction, i.e., one can observe a sequence of any length and
predict arbitrarily long sequences of poses with a single pre-trained model. We
evaluate the proposed method on HumanEva-I and Human3.6M with various
evaluation methods, and outperform the state-of-the-art methods on most of the
metrics.
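The flexibility described in the abstract (any observation length, any prediction horizon, one model) follows from generating one frame at a time. The sketch below is a toy illustration of that loop only, not the authors' model: the function name `rollout`, the mean-over-history context (a crude stand-in for attention), and the additive "decoder" are all assumptions made for illustration.

```python
import random

def rollout(observed, horizon, seed=0):
    """Toy auto-regressive rollout: one fresh latent sample per generated frame."""
    rng = random.Random(seed)
    poses = [list(p) for p in observed]  # copy the observed pose sequence
    for _ in range(horizon):
        prev = poses[-1]
        # Assumption: the mean over all past frames stands in for attention over history.
        context = [sum(col) / len(poses) for col in zip(*poses)]
        # Time-varying latent: drawn at every step, conditioned on the past.
        z = [rng.gauss(0.1 * c, 1.0) for c in context]
        # Hypothetical decoder: previous pose nudged by the latent sample.
        poses.append([p + 0.05 * zi for p, zi in zip(prev, z)])
    return poses[len(observed):]  # only the generated future frames

# Same function, different observation lengths and horizons.
short = rollout([[0.0, 1.0]], horizon=5)
longer = rollout([[0.0, 1.0], [0.1, 1.1], [0.2, 1.2]], horizon=20)
print(len(short), len(longer))
```

Changing `seed` yields a different future for the same observation, which is the sense in which a per-step latent variable supports diverse predictions.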
Related papers
- Multi-Transmotion: Pre-trained Model for Human Motion Prediction [68.87010221355223]
Multi-Transmotion is an innovative transformer-based model designed for cross-modality pre-training.
Our methodology demonstrates competitive performance across various datasets on several downstream tasks.
arXiv Detail & Related papers (2024-11-04T23:15:21Z)
- Synthetic location trajectory generation using categorical diffusion
models [50.809683239937584]
Diffusion probabilistic models (DPMs) have rapidly evolved to be one of the predominant generative models for the simulation of synthetic data.
We propose using DPMs for the generation of synthetic individual location trajectories (ILTs), which are sequences of variables representing physical locations visited by individuals.
arXiv Detail & Related papers (2024-02-19T15:57:39Z)
- Hierarchical Generation of Human-Object Interactions with Diffusion
Probabilistic Models [71.64318025625833]
This paper presents a novel approach to generating the 3D motion of a human interacting with a target object.
Our framework first generates a set of milestones and then synthesizes the motion along them.
The experiments on the NSM, COUCH, and SAMP datasets show that our approach outperforms previous methods by a large margin in both quality and diversity.
arXiv Detail & Related papers (2023-10-03T17:50:23Z)
- TransFusion: A Practical and Effective Transformer-based Diffusion Model
for 3D Human Motion Prediction [1.8923948104852863]
We propose TransFusion, an innovative and practical diffusion-based model for 3D human motion prediction.
Our model leverages Transformer as the backbone with long skip connections between shallow and deep layers.
In contrast to prior diffusion-based models that utilize extra modules like cross-attention and adaptive layer normalization, we treat all inputs, including conditions, as tokens to create a more lightweight model.
arXiv Detail & Related papers (2023-07-30T01:52:07Z)
- SPOTR: Spatio-temporal Pose Transformers for Human Motion Prediction [12.248428883804763]
3D human motion prediction is a research area of high significance and a challenge in computer vision.
Traditionally, autoregressive models have been used to predict human motion.
We present a non-autoregressive model for human motion prediction.
arXiv Detail & Related papers (2023-03-11T01:44:29Z)
- Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted the attention of the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z)
- Generating Smooth Pose Sequences for Diverse Human Motion Prediction [90.45823619796674]
We introduce a unified deep generative network for both diverse and controllable motion prediction.
Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy.
arXiv Detail & Related papers (2021-08-19T00:58:00Z)
- Multi-frame sequence generator of 4D human body motion [0.0]
We propose a generative auto-encoder-based framework, which encodes global locomotion, including translation and rotation, and multi-frame temporal motion as a single latent space vector.
Our results validate the ability of the model to reconstruct 4D sequences of human morphology within a low error bound.
We also illustrate the benefits of the approach for 4D human motion prediction of future frames from initial human frames.
arXiv Detail & Related papers (2021-06-07T13:56:46Z)
- Multimodal Deep Generative Models for Trajectory Prediction: A
Conditional Variational Autoencoder Approach [34.70843462687529]
We provide a self-contained tutorial on a conditional variational autoencoder approach to human behavior prediction.
The goals of this tutorial paper are to review and build a taxonomy of state-of-the-art methods in human behavior prediction.
arXiv Detail & Related papers (2020-08-10T03:18:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.