3D Human Interaction Generation: A Survey
- URL: http://arxiv.org/abs/2503.13120v1
- Date: Mon, 17 Mar 2025 12:47:33 GMT
- Title: 3D Human Interaction Generation: A Survey
- Authors: Siyuan Fan, Wenke Huang, Xiantao Cai, Bo Du
- Abstract summary: 3D human interaction generation focuses on producing dynamic and contextually relevant interactions between humans and interactive entities. Recent advancements in 3D model representation methods, motion capture technologies, and generative models have laid a solid foundation for the growing interest in this domain. Despite the rapid advancements in this area, challenges remain due to the need for naturalness in human motion generation and the accurate interaction between humans and interactive entities.
- Score: 25.736432845850576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D human interaction generation has emerged as a key research area, focusing on producing dynamic and contextually relevant interactions between humans and various interactive entities. Recent rapid advancements in 3D model representation methods, motion capture technologies, and generative models have laid a solid foundation for the growing interest in this domain. Existing research in this field can be broadly categorized into three areas: human-scene interaction, human-object interaction, and human-human interaction. Despite the rapid advancements in this area, challenges remain due to the need for naturalness in human motion generation and the accurate interaction between humans and interactive entities. In this survey, we present a comprehensive literature review of human interaction generation, which, to the best of our knowledge, is the first of its kind. We begin by introducing the foundational technologies, including model representations, motion capture methods, and generative models. Subsequently, we introduce the approaches proposed for the three sub-tasks, along with their corresponding datasets and evaluation metrics. Finally, we discuss potential future research directions in this area and conclude the survey. Through this survey, we aim to offer a comprehensive overview of the current advancements in the field, highlight key challenges, and inspire future research works.
Related papers
- A Survey on Human Interaction Motion Generation [16.56813883497309]
Humans inhabit a world defined by interactions -- with other humans, objects, and environments. Interactive movements convey our relationships with our surroundings and demonstrate how we perceive and communicate with the real world. Replicating these interaction behaviors in digital systems has emerged as an important topic for applications in robotics, virtual reality, and animation.
arXiv Detail & Related papers (2025-03-17T02:55:10Z)
- Human-Centric Foundation Models: Perception, Generation and Agentic Modeling [79.97999901785772]
Human-centric Foundation Models unify diverse human-centric tasks into a single framework.
We present a comprehensive overview of HcFMs by proposing a taxonomy that categorizes current approaches into four groups.
This survey aims to serve as a roadmap for researchers and practitioners working towards more robust, versatile, and intelligent digital human and embodiments modeling.
arXiv Detail & Related papers (2025-02-12T16:38:40Z)
- Human Action Anticipation: A Survey [86.415721659234]
The literature on behavior prediction spans various tasks, including action anticipation, activity forecasting, intent prediction, goal prediction, and so on.
Our survey aims to tie together this fragmented literature, covering recent technical innovations as well as the development of new large-scale datasets for model training and evaluation.
arXiv Detail & Related papers (2024-10-17T21:37:40Z)
- A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights [8.192172339127657]
Human video generation aims to synthesize 2D human body video sequences with generative models given control conditions such as text, audio, and pose.
Recent advancements in generative models have laid a solid foundation for the growing interest in this area.
Despite the significant progress, the task of human video generation remains challenging due to the consistency of characters, the complexity of human motion, and difficulties in their relationship with the environment.
arXiv Detail & Related papers (2024-07-11T12:09:05Z)
- THOR: Text to Human-Object Interaction Diffusion via Relation Intervention [51.02435289160616]
We propose a novel Text-guided Human-Object Interaction diffusion model with Relation Intervention (THOR).
In each diffusion step, we initiate text-guided human and object motion and then leverage human-object relations to intervene in object motion.
We construct Text-BEHAVE, a Text2HOI dataset that seamlessly integrates textual descriptions with the currently largest publicly available 3D HOI dataset.
arXiv Detail & Related papers (2024-03-17T13:17:25Z)
- Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks.
It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection.
Our work categorizes data augmentation methods into two main types: data generation and data perturbation.
arXiv Detail & Related papers (2024-03-13T16:05:18Z)
- ContactGen: Contact-Guided Interactive 3D Human Generation for Partners [9.13466172688693]
We introduce a new task of 3D human generation in terms of physical contact.
A given partner human can have diverse poses and different contact regions according to the type of interaction.
We propose a novel method of generating interactive 3D humans for a given partner human based on a guided diffusion framework.
arXiv Detail & Related papers (2024-01-30T17:57:46Z)
- Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models [56.257840490146]
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors.
arXiv Detail & Related papers (2023-11-26T09:11:32Z)
- Human Motion Generation: A Survey [67.38982546213371]
Human motion generation aims to generate natural human pose sequences and shows immense potential for real-world applications.
Most research within this field focuses on generating human motions based on conditional signals, such as text, audio, and scene contexts.
We present a comprehensive literature review of human motion generation, which is the first of its kind in this field.
arXiv Detail & Related papers (2023-07-20T14:15:20Z)
- Didn't see that coming: a survey on non-verbal social human behavior forecasting [47.99589136455976]
Non-verbal social human behavior forecasting has increasingly attracted the interest of the research community in recent years.
Its direct applications to human-robot interaction and socially-aware human motion generation make it a very attractive field.
We define the behavior forecasting problem for multiple interactive agents in a generic way that aims at unifying the fields of social signals prediction and human motion forecasting.
arXiv Detail & Related papers (2022-03-04T18:25:30Z)