Related papers: FABG : End-to-end Imitation Learning for Embodied Affective Human-Robot Interaction

FABG : End-to-end Imitation Learning for Embodied Affective Human-Robot Interaction

URL: http://arxiv.org/abs/2503.01363v2
Date: Tue, 04 Mar 2025 07:51:38 GMT
Title: FABG : End-to-end Imitation Learning for Embodied Affective Human-Robot Interaction
Authors: Yanghai Zhang, Changyi Liu, Keting Fu, Wenbin Zhou, Qingdu Li, Jianwei Zhang,
Abstract summary: This paper proposes FABG (Facial Affective Behavior Generation), an end-to-end imitation learning system for human-robot interaction.<n>We develop an immersive virtual reality (VR) demonstration system that allows operators to perceive stereoscopic environments.<n>We deploy FABG on a real-world 25-degree-of-freedom humanoid robot, validating its effectiveness through four fundamental interaction tasks.
Score: 3.8177867835232004
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper proposes FABG (Facial Affective Behavior Generation), an end-to-end imitation learning system for human-robot interaction, designed to generate natural and fluid facial affective behaviors. In interaction, effectively obtaining high-quality demonstrations remains a challenge. In this work, we develop an immersive virtual reality (VR) demonstration system that allows operators to perceive stereoscopic environments. This system ensures "the operator's visual perception matches the robot's sensory input" and "the operator's actions directly determine the robot's behaviors" - as if the operator replaces the robot in human interaction engagements. We propose a prediction-driven latency compensation strategy to reduce robotic reaction delays and enhance interaction fluency. FABG naturally acquires human interactive behaviors and subconscious motions driven by intuition, eliminating manual behavior scripting. We deploy FABG on a real-world 25-degree-of-freedom (DoF) humanoid robot, validating its effectiveness through four fundamental interaction tasks: expression response, dynamic gaze, foveated attention, and gesture recognition, supported by data collection and policy training. Project website: https://cybergenies.github.io

Related papers

Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis [51.95817740348585]
Human-X is a novel framework designed to enable immersive and physically plausible human interactions across diverse entities.<n>Our method jointly predicts actions and reactions in real-time using an auto-regressive reaction diffusion planner.<n>Our framework is validated in real-world applications, including virtual reality interface for human-robot interaction.
arXiv Detail & Related papers (2025-08-04T06:35:48Z)
Recognizing Actions from Robotic View for Natural Human-Robot Interaction [52.00935005918032]
Natural Human-Robot Interaction (N-HRI) requires robots to recognize human actions at varying distances and states, regardless of whether the robot itself is in motion or stationary.<n>Existing benchmarks for N-HRI fail to address the unique complexities in N-HRI due to limited data, modalities, task categories, and diversity of subjects and environments.<n>We introduce (Action from Robotic View) a large-scale dataset for perception-centric robotic views prevalent in mobile service robots.
arXiv Detail & Related papers (2025-07-30T09:48:34Z)
Real-Time Imitation of Human Head Motions, Blinks and Emotions by Nao Robot: A Closed-Loop Approach [2.473948454680334]
This paper introduces a novel approach for enabling real-time imitation of human head motion by a robot. By using the MediaPipe as a computer vision library and the DeepFace as an emotion recognition library, this research endeavors to capture the subtleties of human head motion. The proposed approach holds promise in improving communication for children with autism, offering them a valuable tool for more effective interaction.
arXiv Detail & Related papers (2025-04-28T17:01:54Z)
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation [66.18557528695924]
We introduce Moto, which converts video content into latent Motion Token sequences by a Latent Motion Tokenizer.<n>We pre-train Moto-GPT through motion token autoregression, enabling it to capture diverse visual motion knowledge.<n>To transfer learned motion priors to real robot actions, we implement a co-fine-tuning strategy that seamlessly bridges latent motion token prediction and real robot control.
arXiv Detail & Related papers (2024-12-05T18:57:04Z)
EMOTION: Expressive Motion Sequence Generation for Humanoid Robots with In-Context Learning [10.266351600604612]
This paper introduces a framework, called EMOTION, for generating expressive motion sequences in humanoid robots. We conduct online user studies comparing the naturalness and understandability of the motions generated by EMOTION and its human-feedback version, EMOTION++.
arXiv Detail & Related papers (2024-10-30T17:22:45Z)
Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris. Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models. We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z)
Real-Time Dynamic Robot-Assisted Hand-Object Interaction via Motion Primitives [45.256762954338704]
We propose an approach to enhancing physical HRI with a focus on dynamic robot-assisted hand-object interaction. We employ a transformer-based algorithm to perform real-time 3D modeling of human hands from single RGB images. The robot's action implementation is dynamically fine-tuned using the continuously updated 3D hand models.
arXiv Detail & Related papers (2024-05-29T21:20:16Z)
Robot Interaction Behavior Generation based on Social Motion Forecasting for Human-Robot Interaction [9.806227900768926]
We propose to model social motion forecasting in a shared human-robot representation space. ECHO operates in the aforementioned shared space to predict the future motions of the agents encountered in social scenarios. We evaluate our model in multi-person and human-robot motion forecasting tasks and obtain state-of-the-art performance by a large margin.
arXiv Detail & Related papers (2024-02-07T11:37:14Z)
What Matters to You? Towards Visual Representation Alignment for Robot Learning [81.30964736676103]
When operating in service of people, robots need to optimize rewards aligned with end-user preferences. We propose Representation-Aligned Preference-based Learning (RAPL), a method for solving the visual representation alignment problem.
arXiv Detail & Related papers (2023-10-11T23:04:07Z)
Synthesis and Execution of Communicative Robotic Movements with Generative Adversarial Networks [59.098560311521034]
We focus on how to transfer on two different robotic platforms the same kinematics modulation that humans adopt when manipulating delicate objects. We choose to modulate the velocity profile adopted by the robots' end-effector, inspired by what humans do when transporting objects with different characteristics. We exploit a novel Generative Adversarial Network architecture, trained with human kinematics examples, to generalize over them and generate new and meaningful velocity profiles.
arXiv Detail & Related papers (2022-03-29T15:03:05Z)
A ROS Architecture for Personalised HRI with a Bartender Social Robot [61.843727637976045]
BRILLO project has the overall goal of creating an autonomous robotic bartender that can interact with customers while accomplishing its bartending tasks. We present the developed three-layers ROS architecture integrating a perception layer managing the processing of different social signals, a decision-making layer for handling multi-party interactions, and an execution layer controlling the behaviour of a complex robot composed of arms and a face.
arXiv Detail & Related papers (2022-03-13T11:33:06Z)
A MultiModal Social Robot Toward Personalized Emotion Interaction [1.2183405753834562]
This study demonstrates a multimodal human-robot interaction (HRI) framework with reinforcement learning to enhance the robotic interaction policy. The goal is to apply this framework in social scenarios that can let the robots generate a more natural and engaging HRI framework.
arXiv Detail & Related papers (2021-10-08T00:35:44Z)
Show Me What You Can Do: Capability Calibration on Reachable Workspace for Human-Robot Collaboration [83.4081612443128]
We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground-truth. We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
arXiv Detail & Related papers (2021-03-06T09:14:30Z)
Affect-Driven Modelling of Robot Personality for Collaborative Human-Robot Interactions [16.40684407420441]
Collaborative interactions require social robots to adapt to the dynamics of human affective behaviour. We propose a novel framework for personality-driven behaviour generation in social robots.
arXiv Detail & Related papers (2020-10-14T16:34:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.