Human Motion Generation: A Survey
- URL: http://arxiv.org/abs/2307.10894v3
- Date: Wed, 15 Nov 2023 06:26:21 GMT
- Title: Human Motion Generation: A Survey
- Authors: Wentao Zhu, Xiaoxuan Ma, Dongwoo Ro, Hai Ci, Jinlu Zhang, Jiaxin Shi,
Feng Gao, Qi Tian, and Yizhou Wang
- Abstract summary: Human motion generation aims to generate natural human pose sequences and shows immense potential for real-world applications.
Most research within this field focuses on generating human motions based on conditional signals, such as text, audio, and scene contexts.
We present a comprehensive literature review of human motion generation, which is the first of its kind in this field.
- Score: 67.38982546213371
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human motion generation aims to generate natural human pose sequences and
shows immense potential for real-world applications. Substantial progress has
been made recently in motion data collection technologies and generation
methods, laying the foundation for increasing interest in human motion
generation. Most research within this field focuses on generating human motions
based on conditional signals, such as text, audio, and scene contexts. While
significant advancements have been made in recent years, the task continues to
pose challenges due to the intricate nature of human motion and its implicit
relationship with conditional signals. In this survey, we present a
comprehensive literature review of human motion generation, which, to the best
of our knowledge, is the first of its kind in this field. We begin by
introducing the background of human motion and generative models, followed by
an examination of representative methods for three mainstream sub-tasks:
text-conditioned, audio-conditioned, and scene-conditioned human motion
generation. Additionally, we provide an overview of common datasets and
evaluation metrics. Lastly, we discuss open problems and outline potential
future research directions. We hope that this survey could provide the
community with a comprehensive glimpse of this rapidly evolving field and
inspire novel ideas that address the outstanding challenges.
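Concretely, the problem the survey studies can be pictured as a function from a conditioning signal (text, audio, or scene) to a natural pose sequence. The sketch below is a minimal, hypothetical illustration of that interface only; the module, dimensions, and tensor shapes are assumptions and do not come from the paper.

```python
# Minimal, hypothetical sketch of the conditional motion generation setup the
# survey covers: an encoded condition (text, audio, or scene) plus noise is
# mapped to a sequence of human poses. All names, dimensions, and shapes here
# are illustrative assumptions, not the survey's method.
import torch
import torch.nn as nn

T, J = 120, 22            # assumed clip length (frames) and number of body joints
POSE_DIM = J * 3          # 3D joint positions per frame
COND_DIM = 512            # assumed size of the encoded condition
NOISE_DIM = 64


class ConditionalMotionGenerator(nn.Module):
    """Toy generator: condition embedding + noise -> pose sequence (T, J, 3)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(COND_DIM + NOISE_DIM, 1024), nn.ReLU(),
            nn.Linear(1024, T * POSE_DIM),
        )

    def forward(self, cond: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
        x = torch.cat([cond, noise], dim=-1)        # (B, COND_DIM + NOISE_DIM)
        return self.net(x).view(-1, T, J, 3)        # (B, T, J, 3) pose sequence


gen = ConditionalMotionGenerator()
cond = torch.randn(1, COND_DIM)                     # stand-in for an encoded prompt, e.g. "a person waves"
motion = gen(cond, torch.randn(1, NOISE_DIM))
print(motion.shape)                                 # torch.Size([1, 120, 22, 3])
```

In practice the condition embedding would come from a text, audio, or scene encoder, and the mapping itself is usually realized with a diffusion model, VAE, or GAN rather than a single MLP; those are the families the survey's sub-task chapters compare.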
Related papers
- A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights [8.192172339127657]
Human video generation aims to synthesize 2D human body video sequences with generative models given control conditions such as text, audio, and pose.
Recent advancements in generative models have laid a solid foundation for the growing interest in this area.
Despite significant progress, human video generation remains challenging due to difficulties in maintaining character consistency, modeling complex human motion, and capturing the relationship between humans and their environment.
arXiv Detail & Related papers (2024-07-11T12:09:05Z)
- LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment [27.38638713080283]
We introduce LaserHuman, a pioneering dataset engineered to revolutionize Scene-Text-to-Motion research.
LaserHuman stands out with its inclusion of genuine human motions within 3D environments.
We propose a simple but effective multi-conditional diffusion model that achieves state-of-the-art performance on existing datasets; a minimal sketch of the multi-conditioning idea follows this entry.
arXiv Detail & Related papers (2024-03-20T05:11:10Z)
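As a hedged illustration of the multi-conditioning idea mentioned above (combining, e.g., a language prompt and a scene encoding in one diffusion denoiser), the sketch below fuses two condition embeddings and runs one simplified DDPM-style reverse step. It is not the LaserHuman architecture; the module layout, dimensions, and schedule constants are assumptions.

```python
# Hedged sketch of multi-conditional diffusion for motion: a denoiser that is
# conditioned on two signals at once. Not the LaserHuman model; all modules,
# sizes, and schedule values below are illustrative assumptions.
import torch
import torch.nn as nn


class MultiCondDenoiser(nn.Module):
    """Predicts the noise in x_t given the timestep and two fused conditions."""

    def __init__(self, motion_dim: int = 66, cond_dim: int = 256):
        super().__init__()
        self.fuse = nn.Linear(2 * cond_dim, cond_dim)        # fuse text + scene embeddings
        self.net = nn.Sequential(
            nn.Linear(motion_dim + cond_dim + 1, 512), nn.SiLU(),
            nn.Linear(512, motion_dim),
        )

    def forward(self, x_t, t, text_emb, scene_emb):
        cond = self.fuse(torch.cat([text_emb, scene_emb], dim=-1))
        t = t.float().unsqueeze(-1) / 1000.0                 # crude timestep encoding
        return self.net(torch.cat([x_t, cond, t], dim=-1))


def reverse_step(model, x_t, t, text_emb, scene_emb, alpha=0.99, alpha_bar=0.5):
    """One simplified DDPM reverse step with toy schedule constants."""
    eps = model(x_t, t, text_emb, scene_emb)
    beta = 1.0 - alpha
    mean = (x_t - beta / (1.0 - alpha_bar) ** 0.5 * eps) / alpha ** 0.5
    return mean + beta ** 0.5 * torch.randn_like(x_t)


x = torch.randn(1, 66)                                       # a noisy (flattened) pose vector
x = reverse_step(MultiCondDenoiser(), x, torch.tensor([999]),
                 torch.randn(1, 256), torch.randn(1, 256))
```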
- Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks.
It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection.
Our work categorizes data augmentation methods into two main types: data generation and data perturbation; a minimal perturbation example follows this entry.
arXiv Detail & Related papers (2024-03-13T16:05:18Z)
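To make the generation/perturbation distinction above concrete, here is a small, hypothetical perturbation-style augmentation for a pose sequence of shape (frames, joints, 3): spatial jitter, a random temporal crop, and a mirror flip. The operations and magnitudes are illustrative assumptions rather than techniques taken from the surveyed papers.

```python
# Hypothetical "data perturbation" augmentation for a pose sequence (T, J, 3).
# Magnitudes and operations are assumptions chosen for illustration only.
import numpy as np


def perturb_pose_sequence(seq: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Jitter joints, crop randomly in time, and mirror the sequence in x."""
    out = seq + rng.normal(scale=0.01, size=seq.shape)    # small spatial jitter
    start = int(rng.integers(0, max(1, len(out) // 4)))   # random temporal crop
    out = out[start:].copy()
    if rng.random() < 0.5:
        # mirror in x; a full skeleton mirror would also swap left/right joints
        out[..., 0] *= -1.0
    return out


# "Data generation", by contrast, synthesizes entirely new samples,
# e.g. with a generative model or by rendering motions into new scenes.
rng = np.random.default_rng(0)
augmented = perturb_pose_sequence(np.zeros((120, 22, 3)), rng)
print(augmented.shape)            # roughly (90-120, 22, 3), depending on the crop
```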
- Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations [61.659439423703155]
We present TOHO, a method for task-oriented human-object interaction generation with implicit neural representations.
Our method generates continuous motions that are parameterized only by the temporal coordinate; a minimal sketch of such a representation follows this entry.
This work takes a step further toward general human-scene interaction simulation.
arXiv Detail & Related papers (2023-03-23T09:31:56Z)
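The TOHO entry above describes motions parameterized only by the temporal coordinate. Below is a minimal sketch of that idea, not the TOHO architecture: a small MLP maps a continuous time stamp to a pose, so the motion can be queried at any frame rate. The dimensions and layer sizes are assumptions, and real implicit representations typically also add sinusoidal encodings of t.

```python
# Hedged sketch of a motion represented as an implicit function of time.
import torch
import torch.nn as nn


class ImplicitMotion(nn.Module):
    """Maps a continuous time stamp t in [0, 1] to a pose vector."""

    def __init__(self, pose_dim: int = 66, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, pose_dim),
        )

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (N, 1) time stamps -> (N, pose_dim) poses; real implicit models
        # usually encode t with sinusoidal features before the MLP
        return self.mlp(t)


motion = ImplicitMotion()
t = torch.linspace(0, 1, steps=240).unsqueeze(-1)   # query 240 time stamps (any rate works)
poses = motion(t)                                   # (240, 66) continuously sampled poses
```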
- A Comprehensive Review of Data-Driven Co-Speech Gesture Generation [11.948557523215316]
The automatic generation of co-speech gestures is a long-standing problem in computer animation.
Gesture generation has seen surging interest recently, owing to the emergence of more and larger datasets of human gesture motion.
This review article summarizes co-speech gesture generation research, with a particular focus on deep generative models.
arXiv Detail & Related papers (2023-01-13T00:20:05Z)
- GIMO: Gaze-Informed Human Motion Prediction in Context [75.52839760700833]
We propose a large-scale human motion dataset that delivers high-quality body pose sequences, scene scans, and ego-centric views with eye gaze.
Our data collection is not tied to specific scenes, which further boosts the motion dynamics observed from our subjects.
To realize the full potential of gaze, we propose a novel network architecture that enables bidirectional communication between the gaze and motion branches; a minimal sketch of this two-branch exchange follows this entry.
arXiv Detail & Related papers (2022-04-20T13:17:39Z)
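The GIMO entry above mentions bidirectional communication between gaze and motion branches. The sketch below illustrates one common way to realize such an exchange, cross-attention in both directions with residual updates; it is an assumed layout for illustration, not GIMO's actual network.

```python
# Hedged sketch of bidirectional gaze<->motion feature exchange via
# cross-attention in both directions. Dimensions are assumptions.
import torch
import torch.nn as nn


class BidirectionalFusion(nn.Module):
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.gaze_to_motion = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.motion_to_gaze = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, motion_feat: torch.Tensor, gaze_feat: torch.Tensor):
        # motion features attend to gaze features, and vice versa
        m, _ = self.gaze_to_motion(motion_feat, gaze_feat, gaze_feat)
        g, _ = self.motion_to_gaze(gaze_feat, motion_feat, motion_feat)
        return motion_feat + m, gaze_feat + g    # residual updates for both branches


fusion = BidirectionalFusion()
motion_feat = torch.randn(1, 60, 128)   # (batch, motion frames, feature dim)
gaze_feat = torch.randn(1, 60, 128)     # (batch, gaze samples, feature dim)
motion_feat, gaze_feat = fusion(motion_feat, gaze_feat)
```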
- 3D Human Motion Prediction: A Survey [23.605334184939164]
3D human motion prediction, i.e., forecasting future poses from an observed sequence, is a problem of great significance and difficulty in computer vision and machine intelligence.
We conduct a comprehensive survey of 3D human motion prediction to review and analyze relevant works from the existing literature.
arXiv Detail & Related papers (2022-03-03T09:46:43Z)
- Scene-aware Generative Network for Human Motion Synthesis [125.21079898942347]
We propose a new framework that takes the interaction between the scene and the human motion into account.
Given the inherent uncertainty of human motion, we formulate this problem as a generative task.
We derive a GAN-based learning approach in which discriminators enforce compatibility between the human motion and the contextual scene; a minimal sketch of such a discriminator follows this entry.
arXiv Detail & Related papers (2021-05-31T09:05:50Z)
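As a hedged illustration of the scene-compatibility discriminator described above, the snippet below scores (motion, scene) pairs and plugs into a standard GAN critic loss. The architecture and feature sizes are assumptions; the original paper's discriminators are not reproduced here.

```python
# Hedged sketch of a GAN discriminator that judges whether a motion is
# compatible with its scene context. Shapes and layers are assumptions.
import torch
import torch.nn as nn


class SceneMotionDiscriminator(nn.Module):
    """Scores a (motion, scene) pair; high logits = plausible and scene-compatible."""

    def __init__(self, motion_dim: int = 66, scene_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(motion_dim + scene_dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1),
        )

    def forward(self, motion: torch.Tensor, scene: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([motion, scene], dim=-1))   # (B, 1) logits


disc = SceneMotionDiscriminator()
bce = nn.BCEWithLogitsLoss()
real_motion, fake_motion = torch.randn(8, 66), torch.randn(8, 66)
scene = torch.randn(8, 256)                                   # stand-in for an encoded scene
d_loss = bce(disc(real_motion, scene), torch.ones(8, 1)) + \
         bce(disc(fake_motion, scene), torch.zeros(8, 1))     # standard GAN critic loss
```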
- Long-term Human Motion Prediction with Scene Context [60.096118270451974]
We propose a novel three-stage framework for predicting human motion.
Our method first samples multiple human motion goals, then plans 3D human paths towards each goal, and finally predicts 3D human pose sequences following each path; a minimal sketch of this pipeline follows the entry.
arXiv Detail & Related papers (2020-07-07T17:59:53Z)
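The last entry describes a three-stage pipeline: sample motion goals, plan a path to each goal, then predict pose sequences along each path. The sketch below mirrors only that staging; each function body is a deliberately trivial placeholder (uniform goal sampling, straight-line paths, root-only poses) rather than the paper's learned components.

```python
# Hedged sketch of the goals -> paths -> poses staging; placeholder logic only.
import numpy as np


def sample_goals(scene: np.ndarray, k: int, rng) -> np.ndarray:
    # placeholder: scene is accepted but unused; sample k random 3D goal positions
    return rng.uniform(-1, 1, size=(k, 3))


def plan_path(start: np.ndarray, goal: np.ndarray, steps: int = 30) -> np.ndarray:
    return np.linspace(start, goal, steps)            # straight-line path placeholder


def predict_poses(path: np.ndarray, n_joints: int = 22) -> np.ndarray:
    # placeholder: the root joint follows the path, other joints stay at zero offsets
    poses = np.zeros((len(path), n_joints, 3))
    poses[:, 0, :] = path
    return poses


rng = np.random.default_rng(0)
scene = np.zeros((1000, 3))                           # stand-in for a scene point cloud
start = np.zeros(3)
predictions = [predict_poses(plan_path(start, g)) for g in sample_goals(scene, k=3, rng=rng)]
print(len(predictions), predictions[0].shape)         # 3 candidate futures, each (30, 22, 3)
```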
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.