X2C: A Dataset Featuring Nuanced Facial Expressions for Realistic Humanoid Imitation
- URL: http://arxiv.org/abs/2505.11146v1
- Date: Fri, 16 May 2025 11:48:19 GMT
- Title: X2C: A Dataset Featuring Nuanced Facial Expressions for Realistic Humanoid Imitation
- Authors: Peizhen Li, Longbing Cao, Xiao-Ming Wu, Runze Yang, Xiaohan Yu
- Abstract summary: The ability to imitate realistic facial expressions is essential for humanoid robots engaged in affective human-robot communication. We introduce X2C, a dataset featuring nuanced facial expressions for realistic humanoid imitation. X2CNet, a novel human-to-humanoid facial expression imitation framework, learns the correspondence between nuanced humanoid expressions and their underlying control values from X2C.
- Score: 27.987188226933846
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to imitate realistic facial expressions is essential for humanoid robots engaged in affective human-robot communication. However, the lack of datasets containing diverse humanoid facial expressions with proper annotations hinders progress in realistic humanoid facial expression imitation. To address these challenges, we introduce X2C (Anything to Control), a dataset featuring nuanced facial expressions for realistic humanoid imitation. With X2C, we contribute: 1) a high-quality, high-diversity, large-scale dataset comprising 100,000 (image, control value) pairs. Each image depicts a humanoid robot displaying a diverse range of facial expressions, annotated with 30 control values representing the ground-truth expression configuration; 2) X2CNet, a novel human-to-humanoid facial expression imitation framework that learns the correspondence between nuanced humanoid expressions and their underlying control values from X2C. It enables facial expression imitation in the wild for different human performers, providing a baseline for the imitation task, showcasing the potential value of our dataset; 3) real-world demonstrations on a physical humanoid robot, highlighting its capability to advance realistic humanoid facial expression imitation. Code and Data: https://lipzh5.github.io/X2CNet/
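The abstract describes each X2C entry as an image of the humanoid face paired with 30 ground-truth control values. The sketch below illustrates how such (image, control value) pairs might be loaded; the directory layout, the annotations.json filename, and the annotation format are assumptions made for illustration, not the official X2C loader.

```python
# Minimal sketch of reading X2C-style (image, control value) pairs.
# Assumptions: images live under <root>/images/ and a hypothetical
# annotations.json maps each image file name to its 30 control values.
import json
from pathlib import Path

import numpy as np
from PIL import Image


def load_x2c_pairs(root: str):
    """Yield (image, control_vector) pairs, each with a 30-dim control vector."""
    root = Path(root)
    with open(root / "annotations.json") as f:        # hypothetical annotation file
        annotations = json.load(f)                     # e.g. {"img_00001.png": [0.12, ...]}
    for image_name, controls in annotations.items():
        image = np.asarray(Image.open(root / "images" / image_name).convert("RGB"))
        controls = np.asarray(controls, dtype=np.float32)
        assert controls.shape == (30,)                 # 30 control values per expression
        yield image, controls
```

A framework like X2CNet would then regress these 30 control values from a human expression image, so a pair loader of this shape is a natural starting point for training or evaluating an imitation baseline.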
Related papers
- EVA: Expressive Virtual Avatars from Multi-view Videos [51.33851869426057]
We introduce Expressive Virtual Avatars (EVA), an actor-specific, fully controllable, and expressive human avatar framework. EVA achieves high-fidelity, lifelike renderings in real time while enabling independent control of facial expressions, body movements, and hand gestures. This work represents a significant advancement towards fully drivable digital human models.
arXiv Detail & Related papers (2025-05-21T11:22:52Z) - Learning from Massive Human Videos for Universal Humanoid Pose Control [46.417054298537195]
This paper introduces Humanoid-X, a large-scale dataset of over 20 million humanoid robot poses with corresponding text-based motion descriptions. We train a large humanoid model, UH-1, which takes text instructions as input and outputs corresponding actions to control a humanoid robot. Our scalable training approach leads to superior generalization in text-based humanoid control, marking a significant step toward adaptable, real-world-ready humanoid robots.
arXiv Detail & Related papers (2024-12-18T18:59:56Z) - Towards Localized Fine-Grained Control for Facial Expression Generation [54.82883891478555]
Humans, particularly their faces, are central to content generation due to their ability to convey rich expressions and intent.
Current generative models mostly generate flat neutral expressions and characterless smiles without authenticity.
We propose the use of AUs (action units) for facial expression control in face generation.
arXiv Detail & Related papers (2024-07-25T18:29:48Z) - HINT: Learning Complete Human Neural Representations from Limited Viewpoints [69.76947323932107]
We propose a NeRF-based algorithm able to learn a detailed and complete human model from limited viewing angles.
As a result, our method can reconstruct complete humans even from a few viewing angles, increasing performance by more than 15% PSNR.
arXiv Detail & Related papers (2024-05-30T05:43:09Z) - CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation [13.27632316528572]
Speech-driven 3D facial animation technology has been developed for years, but its practical applications still fall short of expectations.
The main challenges lie in data limitations, lip alignment, and the naturalness of facial expressions.
This paper proposes a method called CSTalk that models the correlations among different regions of facial movements and supervises the training of the generative model to generate realistic expressions.
arXiv Detail & Related papers (2024-04-29T11:19:15Z) - Driving Animatronic Robot Facial Expression From Speech [7.8799497614708605]
This paper introduces a novel, skinning-centric approach to drive animatronic robot facial expressions from speech input.
The proposed approach employs linear blend skinning (LBS) as a unifying representation, guiding innovations in both embodiment design and motion synthesis.
This approach demonstrates the capability to produce highly realistic facial expressions on an animatronic face in real time at over 4000 fps on a single Nvidia RTX 4090.
arXiv Detail & Related papers (2024-03-19T12:11:57Z) - RealDex: Towards Human-like Grasping for Robotic Dexterous Hand [64.33746404551343]
We introduce RealDex, a pioneering dataset capturing authentic dexterous hand grasping motions infused with human behavioral patterns. RealDex holds immense promise in advancing humanoid robots toward automated perception, cognition, and manipulation in real-world scenarios.
arXiv Detail & Related papers (2024-02-21T14:59:46Z) - CapHuman: Capture Your Moments in Parallel Universes [60.06408546134581]
We present a new framework named CapHuman.
CapHuman encodes identity features and then learns to align them into the latent space.
We introduce the 3D facial prior to equip our model with control over the human head in a flexible and 3D-consistent manner.
arXiv Detail & Related papers (2024-02-01T14:41:59Z) - XAGen: 3D Expressive Human Avatars Generation [76.69560679209171]
XAGen is the first 3D generative model for human avatars capable of expressive control over body, face, and hands.
We propose a multi-part rendering technique that disentangles the synthesis of body, face, and hands.
Experiments show that XAGen surpasses state-of-the-art methods in terms of realism, diversity, and expressive control abilities.
arXiv Detail & Related papers (2023-11-22T18:30:42Z) - Towards Inclusive HRI: Using Sim2Real to Address Underrepresentation in Emotion Expression Recognition [5.819149317261972]
We aim to build a system that can perceive humans in a more transparent and inclusive manner.
We take a Sim2Real approach that employs a suite of 3D simulated human models.
By augmenting a small dynamic emotional expression dataset with a synthetic dataset containing 4536 samples, we achieved an improvement in accuracy of 15%.
arXiv Detail & Related papers (2022-08-15T23:37:13Z) - HSPACE: Synthetic Parametric Humans Animated in Complex Environments [67.8628917474705]
We build a large-scale photo-realistic dataset, Human-SPACE, of animated humans placed in complex indoor and outdoor environments.
We combine a hundred diverse individuals of varying ages, gender, proportions, and ethnicity, with hundreds of motions and scenes, in order to generate an initial dataset of over 1 million frames.
Assets are generated automatically, at scale, and are compatible with existing real time rendering and game engines.
arXiv Detail & Related papers (2021-12-23T22:27:55Z) - Smile Like You Mean It: Driving Animatronic Robotic Face with Learned Models [11.925808365657936]
The ability to generate intelligent and generalizable facial expressions is essential for building human-like social robots.
We develop a vision-based self-supervised learning framework for facial mimicry.
Our method enables accurate and diverse face mimicry across diverse human subjects.
arXiv Detail & Related papers (2021-05-26T17:57:19Z)