TEMOS: Generating diverse human motions from textual descriptions
- URL: http://arxiv.org/abs/2204.14109v1
- Date: Mon, 25 Apr 2022 14:53:06 GMT
- Title: TEMOS: Generating diverse human motions from textual descriptions
- Authors: Mathis Petrovich, Michael J. Black, Gül Varol
- Abstract summary: We address the problem of generating diverse 3D human motions from textual descriptions.
We propose TEMOS, a text-conditioned generative model leveraging variational autoencoder (VAE) training with human motion data.
We show that the TEMOS framework can produce both skeleton-based animations, as in prior work, as well as more expressive SMPL body motions.
- Score: 53.85978336198444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the problem of generating diverse 3D human motions from textual
descriptions. This challenging task requires joint modeling of both modalities:
understanding and extracting useful human-centric information from the text,
and then generating plausible and realistic sequences of human poses. In
contrast to most previous work which focuses on generating a single,
deterministic, motion from a textual description, we design a variational
approach that can produce multiple diverse human motions. We propose TEMOS, a
text-conditioned generative model leveraging variational autoencoder (VAE)
training with human motion data, in combination with a text encoder that
produces distribution parameters compatible with the VAE latent space. We show
that the TEMOS framework can produce both skeleton-based animations, as in prior
work, as well as more expressive SMPL body motions. We evaluate our approach on
the KIT Motion-Language benchmark and, despite being relatively
straightforward, demonstrate significant improvements over the state of the
art. Code and models are available on our project page.
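The abstract outlines the core design: a motion VAE trained on human motion data, plus a text encoder whose output distribution parameters live in the same latent space, so diverse motions can be sampled and decoded for a single description. Below is a minimal PyTorch sketch of that idea; the module sizes, mean pooling, pose-feature dimension, and loss weighting are illustrative assumptions, not the released TEMOS implementation (see the project page for the official code and models).

```python
# Minimal sketch of a TEMOS-style text-conditioned VAE (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM = 256

class SequenceEncoder(nn.Module):
    """Transformer encoder that pools a sequence (text tokens or poses) into
    Gaussian parameters in the shared latent space. Positional encodings are
    omitted for brevity."""
    def __init__(self, in_dim, dim=256, layers=4):
        super().__init__()
        self.proj = nn.Linear(in_dim, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)
        self.mu = nn.Linear(dim, LATENT_DIM)
        self.logvar = nn.Linear(dim, LATENT_DIM)

    def forward(self, seq):                       # seq: (B, T, in_dim)
        h = self.encoder(self.proj(seq)).mean(1)  # simple mean pooling (assumption)
        return self.mu(h), self.logvar(h)

class MotionDecoder(nn.Module):
    """Decodes one latent vector into a pose sequence of the requested length."""
    def __init__(self, pose_dim, dim=256, layers=4, max_frames=512):
        super().__init__()
        self.latent_proj = nn.Linear(LATENT_DIM, dim)
        self.frame_pos = nn.Embedding(max_frames, dim)  # learned per-frame queries
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.decoder = nn.TransformerEncoder(layer, num_layers=layers)
        self.out = nn.Linear(dim, pose_dim)

    def forward(self, z, num_frames):                           # z: (B, LATENT_DIM)
        pos = self.frame_pos.weight[:num_frames].unsqueeze(0)   # (1, T, dim)
        h = self.latent_proj(z).unsqueeze(1) + pos              # (B, T, dim)
        return self.out(self.decoder(h))                        # (B, T, pose_dim)

def reparameterize(mu, logvar):
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

def kl_to_standard_normal(mu, logvar):
    return -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

# Toy forward/backward pass with random data. In practice the text features
# would come from a pretrained language model (DistilBERT in the authors' setup),
# and the pose dimension depends on the skeleton or SMPL representation.
text_enc   = SequenceEncoder(in_dim=768)   # text-token features
motion_enc = SequenceEncoder(in_dim=135)   # per-frame pose features (size illustrative)
decoder    = MotionDecoder(pose_dim=135)

text_feats = torch.randn(2, 12, 768)       # (batch, tokens, dim)
motions    = torch.randn(2, 60, 135)       # (batch, frames, pose_dim)

mu_t, logvar_t = text_enc(text_feats)
mu_m, logvar_m = motion_enc(motions)
recon = decoder(reparameterize(mu_m, logvar_m), num_frames=60)

loss = (F.smooth_l1_loss(recon, motions)           # motion reconstruction
        + kl_to_standard_normal(mu_t, logvar_t)    # keep text posterior near the prior
        + kl_to_standard_normal(mu_m, logvar_m)    # keep motion posterior near the prior
        + F.l1_loss(mu_t, mu_m))                   # align the two modalities
loss.backward()

# At test time, diverse motions for one prompt come from sampling several
# z ~ N(mu_t, sigma_t) and decoding each sample.
```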
Related papers
- Towards Open Domain Text-Driven Synthesis of Multi-Person Motions [36.737740727883924]
We curate human pose and motion datasets by estimating pose information from large-scale image and video datasets.
Our method is the first to generate multi-subject motion sequences with high diversity and fidelity from a large variety of textual prompts.
arXiv Detail & Related papers (2024-05-28T18:00:06Z)
- Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs [67.59291068131438]
Motion-Agent is a conversational framework designed for general human motion generation, editing, and understanding.
Motion-Agent employs an open-source pre-trained language model to develop a generative agent, MotionLLM, that bridges the gap between motion and text.
arXiv Detail & Related papers (2024-05-27T09:57:51Z)
- Generating Human Motion in 3D Scenes from Text Descriptions [60.04976442328767]
This paper focuses on the task of generating human motions in 3D indoor scenes given text descriptions of the human-scene interactions.
We propose a new approach that decomposes the complex problem into two more manageable sub-problems.
For language grounding of the target object, we leverage the power of large language models; for motion generation, we design an object-centric scene representation.
arXiv Detail & Related papers (2024-05-13T14:30:12Z)
- Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance [48.986552871497]
We introduce a novel two-stage framework that employs scene affordance as an intermediate representation.
By leveraging scene affordance maps, our method overcomes the difficulty in generating human motion under multimodal condition signals.
Our approach consistently outperforms all baselines on established benchmarks, including HumanML3D and HUMANISE.
arXiv Detail & Related papers (2024-03-26T18:41:07Z)
- BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics [50.88842027976421]
We propose BOTH57M, a novel multi-modal dataset for two-hand motion generation.
Our dataset includes accurate motion tracking for the human body and hands.
We also provide a strong baseline method, BOTH2Hands, for the novel task.
arXiv Detail & Related papers (2023-12-13T07:30:19Z)
- HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models [42.62823339416957]
We address the problem of generating realistic 3D human-object interactions (HOIs) driven by textual prompts.
We first develop a dual-branch diffusion model (HOI-DM) to generate both human and object motions conditioned on the input text.
We also develop an affordance prediction diffusion model (APDM) to predict the contacting area between the human and object.
arXiv Detail & Related papers (2023-12-11T17:41:17Z)
- Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model [11.873294782380984]
We propose a fine-grained method for generating high-quality, conditional human motion sequences that support precise text descriptions.
Our approach consists of two key components: 1) a linguistics-structure assisted module that constructs accurate and complete language features to fully utilize the text information; and 2) a context-aware progressive reasoning module that learns neighborhood and overall semantic linguistic features from shallow and deep graph neural networks to achieve multi-step inference.
arXiv Detail & Related papers (2023-09-12T14:43:47Z)
- DiverseMotion: Towards Diverse Human Motion Generation via Discrete Diffusion [70.33381660741861]
We present DiverseMotion, a new approach for synthesizing high-quality human motions conditioned on textual descriptions.
We show that DiverseMotion achieves state-of-the-art motion quality and competitive motion diversity.
arXiv Detail & Related papers (2023-09-04T05:43:48Z)
- TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts [20.336481832461168]
Inspired by the strong ties between vision and language, our paper aims to explore the generation of 3D human full-body motions from texts.
We propose the use of motion tokens, a discrete and compact motion representation.
Our approach is flexible and can be used for both text2motion and motion2text tasks; a minimal sketch of the motion-token idea follows this list.
arXiv Detail & Related papers (2022-07-04T19:52:18Z)
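The TM2T entry above relies on discrete "motion tokens". The sketch below shows one common way such tokens are obtained: quantizing continuous motion features against a learned codebook (a vector-quantization step). The class name, codebook size, and feature dimensions are illustrative assumptions, not TM2T's released code.

```python
# Illustrative vector-quantization step for turning continuous motion features
# into discrete "motion tokens" (assumption-based sketch, not TM2T's implementation).
import torch
import torch.nn as nn

class MotionTokenizer(nn.Module):
    def __init__(self, feat_dim=256, codebook_size=1024):
        super().__init__()
        # Learnable codebook: each row is one motion-token embedding.
        self.codebook = nn.Embedding(codebook_size, feat_dim)

    def forward(self, feats):                            # feats: (B, T, feat_dim)
        flat = feats.reshape(-1, feats.size(-1))          # (B*T, feat_dim)
        dists = torch.cdist(flat, self.codebook.weight)   # (B*T, codebook_size)
        tokens = dists.argmin(dim=-1).view(feats.shape[:-1])  # (B, T) token ids
        quantized = self.codebook(tokens)                 # back to embeddings
        # Straight-through estimator so gradients can reach the encoder features.
        quantized = feats + (quantized - feats).detach()
        return tokens, quantized

tokenizer = MotionTokenizer()
motion_feats = torch.randn(2, 16, 256)                    # downsampled motion features
tokens, quantized = tokenizer(motion_feats)
print(tokens.shape)                                       # torch.Size([2, 16])

# Once motions are token sequences, text2motion and motion2text can both be
# treated as sequence-to-sequence translation over discrete vocabularies.
```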