Executing your Commands via Motion Diffusion in Latent Space
- URL: http://arxiv.org/abs/2212.04048v3
- Date: Fri, 19 May 2023 08:14:04 GMT
- Title: Executing your Commands via Motion Diffusion in Latent Space
- Authors: Xin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Jingyi
Yu, Gang Yu
- Abstract summary: We propose a Motion Latent-based Diffusion model (MLD) to produce vivid motion sequences conforming to the given conditional inputs.
Our MLD achieves significant improvements over state-of-the-art methods across a wide range of human motion generation tasks.
- Score: 51.64652463205012
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study a challenging task, conditional human motion generation, which
produces plausible human motion sequences according to various conditional
inputs, such as action classes or textual descriptors. Since human motions are
highly diverse and distributed quite differently from conditional modalities
such as natural-language textual descriptors, it is hard to learn a
probabilistic mapping from the desired conditional modality to human motion
sequences. Besides, raw motion data from motion capture systems can be
redundant across sequences and contaminated by noise; directly modeling the
joint distribution over raw motion sequences and conditional modalities would
incur heavy computational overhead and might introduce artifacts from the
captured noise. To learn a better representation
of the various human motion sequences, we first design a powerful Variational
AutoEncoder (VAE) and arrive at a representative and low-dimensional latent
code for a human motion sequence. Then, instead of using a diffusion model to
establish the connections between the raw motion sequences and the conditional
inputs, we perform a diffusion process on the motion latent space. Our proposed
Motion Latent-based Diffusion model (MLD) could produce vivid motion sequences
conforming to the given conditional inputs and substantially reduce the
computational overhead in both the training and inference stages. Extensive
experiments on various human motion generation tasks demonstrate that our MLD
achieves significant improvements over state-of-the-art methods, while being
two orders of magnitude faster than previous diffusion models operating on raw
motion sequences.
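The two-stage pipeline the abstract describes (a VAE compresses each motion sequence into a low-dimensional latent code, and diffusion then runs on that latent rather than on the raw sequence) can be sketched as follows. This is a minimal illustration only: the 196-frame, 263-feature motion size and the 256-d latent are assumed shapes, a fixed random projection stands in for the trained VAE encoder, and a standard DDPM forward-noising schedule stands in for the paper's diffusion process; it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy raw "motion": 196 frames x 263 features (assumed, HumanML3D-style size).
motion = rng.standard_normal((196, 263))

# Stand-in for the trained VAE encoder: a fixed random projection of the
# flattened sequence to a 256-d latent code.
W_enc = rng.standard_normal((196 * 263, 256)) / np.sqrt(196 * 263)
z0 = motion.reshape(-1) @ W_enc          # latent code, shape (256,)

# Forward diffusion on the latent (standard DDPM schedule):
#   z_t = sqrt(abar_t) * z_0 + sqrt(1 - abar_t) * eps,  eps ~ N(0, I)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def q_sample(z0, t, eps):
    """Noise the latent z0 to diffusion step t."""
    return np.sqrt(alphas_bar[t]) * z0 + np.sqrt(1.0 - alphas_bar[t]) * eps

eps = rng.standard_normal(z0.shape)
zt = q_sample(z0, 500, eps)

# The denoiser would be trained to predict eps from (zt, t, condition);
# here we only show that diffusion operates on the 256-d latent, not on
# the 51,548-d raw sequence.
print(motion.size, z0.shape, zt.shape)
```

Running the diffusion chain in a 256-d latent instead of the tens-of-thousands-dimensional raw sequence is what makes the training- and inference-time savings the abstract claims plausible; the VAE decoder maps the denoised latent back to a full motion sequence.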
Related papers
- Text-driven Human Motion Generation with Motion Masked Diffusion Model [23.637853270123045]
Text-driven human motion generation is a task that synthesizes human motion sequences conditioned on natural language.
Current diffusion model-based approaches have outstanding performance in the diversity and multimodality of generation.
We propose the Motion Masked Diffusion Model (MMDM), a novel masking mechanism for human motion diffusion models.
arXiv Detail & Related papers (2024-09-29T12:26:24Z) - Human Motion Synthesis: A Diffusion Approach for Motion Stitching and In-Betweening [2.5165775267615205]
We propose a diffusion model with a transformer-based denoiser to generate realistic human motion.
Our method demonstrated strong performance in generating in-betweening sequences.
We present a performance evaluation of our method using quantitative metrics such as Fréchet Inception Distance (FID), Diversity, and Multimodality.
arXiv Detail & Related papers (2024-09-10T18:02:32Z) - M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models [18.125860678409804]
We introduce the Multi-Motion Discrete Diffusion Models (M2D2M), a novel approach for human motion generation from text descriptions.
M2D2M adeptly addresses the challenge of generating multi-motion sequences, ensuring seamless transitions of motions and coherence across a series of actions.
arXiv Detail & Related papers (2024-07-19T17:57:33Z) - Shape Conditioned Human Motion Generation with Diffusion Model [0.0]
We propose a Shape-conditioned Motion Diffusion model (SMD), which enables the generation of motion sequences directly in mesh format.
We also propose a Spectral-Temporal Autoencoder (STAE) to leverage cross-temporal dependencies within the spectral domain.
arXiv Detail & Related papers (2024-05-10T19:06:41Z) - DiffusionPhase: Motion Diffusion in Frequency Domain [69.811762407278]
We introduce a learning-based method for generating high-quality human motion sequences from text descriptions.
Existing techniques struggle with motion diversity and smooth transitions in generating arbitrary-length motion sequences.
We develop a network encoder that converts the motion space into a compact yet expressive parameterized phase space.
arXiv Detail & Related papers (2023-12-07T04:39:22Z) - DiverseMotion: Towards Diverse Human Motion Generation via Discrete
Diffusion [70.33381660741861]
We present DiverseMotion, a new approach for synthesizing high-quality human motions conditioned on textual descriptions.
We show that our DiverseMotion achieves the state-of-the-art motion quality and competitive motion diversity.
arXiv Detail & Related papers (2023-09-04T05:43:48Z) - Priority-Centric Human Motion Generation in Discrete Latent Space [59.401128190423535]
We introduce a Priority-Centric Motion Discrete Diffusion Model (M2DM) for text-to-motion generation.
M2DM incorporates a global self-attention mechanism and a regularization term to counteract code collapse.
We also present a motion discrete diffusion model that employs an innovative noise schedule, determined by the significance of each motion token.
arXiv Detail & Related papers (2023-08-28T10:40:16Z) - MoDi: Unconditional Motion Synthesis from Diverse Data [51.676055380546494]
We present MoDi, an unconditional generative model that synthesizes diverse motions.
Our model is trained in a completely unsupervised setting from a diverse, unstructured and unlabeled motion dataset.
We show that despite the lack of any structure in the dataset, the latent space can be semantically clustered.
arXiv Detail & Related papers (2022-06-16T09:06:25Z) - Weakly-supervised Action Transition Learning for Stochastic Human Motion
Prediction [81.94175022575966]
We introduce the task of action-driven human motion prediction.
It aims to predict multiple plausible future motions given a sequence of action labels and a short motion history.
arXiv Detail & Related papers (2022-05-31T08:38:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.