Digital Life Project: Autonomous 3D Characters with Social Intelligence
- URL: http://arxiv.org/abs/2312.04547v1
- Date: Thu, 7 Dec 2023 18:58:59 GMT
- Title: Digital Life Project: Autonomous 3D Characters with Social Intelligence
- Authors: Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan
Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan,
Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren,
Yukun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu
- Abstract summary: Digital Life Project is a framework utilizing language as the universal medium to build autonomous 3D characters.
Our framework comprises two primary components: SocioMind and MoMat-MoGen.
- Score: 86.2845109451914
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we present Digital Life Project, a framework utilizing language
as the universal medium to build autonomous 3D characters, who are capable of
engaging in social interactions and expressing themselves through articulated body
motions,
thereby simulating life in a digital environment. Our framework comprises two
primary components: 1) SocioMind: a meticulously crafted digital brain that
models personalities with systematic few-shot exemplars, incorporates a
reflection process based on psychology principles, and emulates autonomy by
initiating dialogue topics; 2) MoMat-MoGen: a text-driven motion synthesis
paradigm for controlling the character's digital body. It integrates motion
matching, a proven industry technique to ensure motion quality, with
cutting-edge advancements in motion generation for diversity. Extensive
experiments demonstrate that each module achieves state-of-the-art performance
in its respective domain. Collectively, they enable virtual characters to
initiate and sustain dialogues autonomously, while evolving their
socio-psychological states. Concurrently, these characters can perform
contextually relevant bodily movements. Additionally, a motion captioning
module further allows the virtual character to recognize and appropriately
respond to human players' actions. Homepage: https://digital-life-project.com/
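To make the two-component architecture described above concrete, here is a minimal, purely illustrative Python sketch of how a "digital brain" module could drive a motion-matching-plus-generation body module. All names and behaviors (PsychState, socio_mind_step, momat_mogen, the clip library, the frame-interleave blending) are assumptions for illustration only and are not taken from the paper's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class PsychState:
    """Hypothetical socio-psychological state tracked per character."""
    persona: str                                      # condensed personality exemplars
    mood: str = "neutral"
    reflections: list = field(default_factory=list)   # psychology-inspired summaries

def socio_mind_step(state: PsychState, incoming_utterance: str) -> tuple[str, str]:
    """Stand-in for the 'digital brain': pick a reply and a motion intent.

    In the paper this role is played by a language model with personality
    exemplars, reflection, and autonomous topic initiation; here it is stubbed.
    """
    reply = f"[{state.persona} | {state.mood}] responds to: {incoming_utterance}"
    motion_text = "nods and gestures while speaking"   # text prompt for the body
    state.reflections.append(f"talked about: {incoming_utterance}")
    return reply, motion_text

def motion_match(motion_text: str, library: dict) -> list:
    """Retrieve the closest captured clip (motion matching: quality)."""
    return library.get(motion_text, library["idle"])

def motion_generate(motion_text: str) -> list:
    """Stand-in for a text-to-motion generative model (diversity)."""
    return [f"generated_frame({motion_text}, {i})" for i in range(4)]

def momat_mogen(motion_text: str, library: dict, blend: float = 0.5) -> list:
    """Combine matched and generated motion; the naive frame interleave here
    is only a placeholder for however the paradigm actually blends them."""
    matched = motion_match(motion_text, library)
    generated = motion_generate(motion_text)
    n = int(len(generated) * blend)
    return matched[: len(matched) - n] + generated[:n]

if __name__ == "__main__":
    clip_library = {
        "idle": ["idle_0", "idle_1"],
        "nods and gestures while speaking": ["nod_0", "nod_1", "gesture_0"],
    }
    alice = PsychState(persona="curious optimist")
    reply, motion_text = socio_mind_step(alice, "Did you see the aurora last night?")
    print(reply)
    print(momat_mogen(motion_text, clip_library))
```

The point of the sketch is the division of labor: language mediates everything, with the dialogue/psychology module emitting a text motion intent that the body module turns into frames by mixing retrieved clips with generated ones.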
Related papers
- ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions [66.87211993793807]
We present ReMoS, a denoising diffusion-based model that synthesizes the full-body motion of a person in two-person interaction scenarios.
We demonstrate ReMoS across challenging two-person scenarios such as pair dancing, Ninjutsu, kickboxing, and acrobatics.
We also contribute the ReMoCap dataset for two-person interactions, containing full-body and finger motions.
arXiv Detail & Related papers (2023-11-28T18:59:52Z)
- GRIP: Generating Interaction Poses Using Spatial Cues and Latent Consistency [57.9920824261925]
Hands are dexterous and highly versatile manipulators that are central to how humans interact with objects and their environment.
Modeling realistic hand-object interactions is therefore critical for applications in computer graphics, computer vision, and mixed reality.
GRIP is a learning-based method that takes as input the 3D motion of the body and the object, and synthesizes realistic motion for both hands before, during, and after object interaction.
arXiv Detail & Related papers (2023-08-22T17:59:51Z)
- IMoS: Intent-Driven Full-Body Motion Synthesis for Human-Object Interactions [69.95820880360345]
We present the first framework to synthesize the full-body motion of virtual human characters with 3D objects placed within their reach.
Our system takes as input textual instructions specifying the objects and the associated intentions of the virtual characters.
We show that our synthesized full-body motions appear more realistic to the participants in more than 80% of scenarios.
arXiv Detail & Related papers (2022-12-14T23:59:24Z)
- Generating Holistic 3D Human Motion from Speech [97.11392166257791]
We build a high-quality dataset of 3D holistic body meshes with synchronous speech.
We then define a novel speech-to-motion generation framework in which the face, body, and hands are modeled separately.
arXiv Detail & Related papers (2022-12-08T17:25:19Z)
- Triangular Character Animation Sampling with Motion, Emotion, and Relation [78.80083186208712]
We present a novel framework to sample and synthesize animations by associating the characters' body motions, facial expressions, and social relations.
Our method can provide animators with an automatic way to generate 3D character animations, help synthesize interactions between Non-Player Characters (NPCs), and enhance machine emotional intelligence in virtual reality (VR).
arXiv Detail & Related papers (2022-03-09T18:19:03Z)
- DASH: Modularized Human Manipulation Simulation with Vision and Language for Embodied AI [25.144827619452105]
We present Dynamic and Autonomous Simulated Human (DASH), an embodied virtual human that, given natural language commands, performs grasp-and-stack tasks in a physically-simulated cluttered environment.
By factoring the DASH system into a vision module, a language module, and manipulation modules of two skill categories, we can mix and match analytical and machine learning techniques for different modules so that DASH is able to not only perform randomly arranged tasks with a high success rate, but also do so under anthropomorphic constraints.
arXiv Detail & Related papers (2021-08-28T00:22:30Z)