Real-time Addressee Estimation: Deployment of a Deep-Learning Model on
the iCub Robot
- URL: http://arxiv.org/abs/2311.05334v1
- Date: Thu, 9 Nov 2023 13:01:21 GMT
- Title: Real-time Addressee Estimation: Deployment of a Deep-Learning Model on
the iCub Robot
- Authors: Carlo Mazzola, Francesco Rea, Alessandra Sciutti
- Abstract summary: Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
- Score: 52.277579221741746
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Addressee Estimation is the ability to understand to whom a person is
talking, a skill essential for social robots to interact smoothly with humans.
As such, it is one of the problems that must be tackled to develop effective
conversational agents in multi-party and unstructured scenarios. For humans,
one of the channels that mainly leads to such an estimation is the non-verbal
behavior of speakers: above all, their gaze and body pose. Inspired by these
human perceptual skills, in the present work a deep-learning model for
Addressee Estimation relying on these two non-verbal features is designed,
trained, and deployed on an iCub robot. The study presents the procedure of
this implementation and the performance of the model deployed in real-time
human-robot interaction, compared to previous tests on the dataset used for
training.
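To make the approach concrete, below is a minimal PyTorch sketch of the kind of architecture the abstract describes: a CNN branch for face crops, an MLP branch for body-pose keypoints, and an LSTM that integrates both over a speech segment. All layer sizes, input shapes, and the three-class label set are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch of an addressee-estimation network: a CNN encodes face
# crops, an MLP encodes body-pose keypoints, and an LSTM integrates the
# fused features over the speech segment. Sizes are illustrative only.
import torch
import torch.nn as nn

class AddresseeEstimator(nn.Module):
    def __init__(self, n_keypoints=18, hidden=128, n_classes=3):
        super().__init__()
        # Face branch: small CNN over 64x64 RGB face crops (assumed size).
        self.face_cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> (B*T, 32)
        )
        # Pose branch: MLP over flattened 2-D keypoint coordinates.
        self.pose_mlp = nn.Sequential(nn.Linear(n_keypoints * 2, 64), nn.ReLU())
        # Temporal integration across the T frames of an utterance.
        self.lstm = nn.LSTM(32 + 64, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)  # e.g. robot / other / group

    def forward(self, faces, poses):
        # faces: (B, T, 3, 64, 64); poses: (B, T, n_keypoints * 2)
        B, T = faces.shape[:2]
        f = self.face_cnn(faces.flatten(0, 1)).view(B, T, -1)
        p = self.pose_mlp(poses)
        out, _ = self.lstm(torch.cat([f, p], dim=-1))
        return self.head(out[:, -1])      # classify from the last time step

model = AddresseeEstimator()
logits = model(torch.randn(2, 10, 3, 64, 64), torch.randn(2, 10, 36))
print(logits.shape)  # torch.Size([2, 3])
```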
Related papers
- Human-Robot Mutual Learning through Affective-Linguistic Interaction and Differential Outcomes Training [Pre-Print] [0.3811184252495269]
We test how affective-linguistic communication, in combination with differential outcomes training, affects mutual learning in a human-robot context.
Taking inspiration from child-caregiver dynamics, our human-robot interaction setup consists of a (simulated) robot attempting to learn how best to communicate internal, homeostatically-controlled needs.
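A toy simulation can make the differential-outcomes idea concrete: when each stimulus-response pair earns its own distinct reinforcer, the learner can chain stimulus -> expected outcome -> response, whereas a common outcome carries no information about which response is correct. The learning rule and parameters below are illustrative assumptions, not the paper's implementation.

```python
# Toy simulation of differential outcomes training (DOT); an illustrative
# model of the protocol, not the paper's implementation.
import random

def run(dot, n_trials=3000, n_stim=4, lr=0.2, eps=0.1, seed=0):
    rng = random.Random(seed)
    so = [[0.0] * n_stim for _ in range(n_stim)]   # stimulus -> outcome
    o_r = [[0.0] * n_stim for _ in range(n_stim)]  # outcome -> response
    hits = 0
    for _ in range(n_trials):
        s = rng.randrange(n_stim)
        o_hat = max(range(n_stim), key=lambda o: so[s][o])  # expected outcome
        r = rng.randrange(n_stim) if rng.random() < eps else \
            max(range(n_stim), key=lambda r_: o_r[o_hat][r_])
        hit = r == s                     # response s is correct for stimulus s
        hits += hit
        if hit:
            o = s if dot else 0          # DOT: a unique outcome per pair
            so[s][o] += lr * (1 - so[s][o])
            o_r[o][r] += lr * (1 - o_r[o][r])
    return hits / n_trials

print("DOT accuracy:           ", run(dot=True))
print("Common-outcome accuracy:", run(dot=False))
```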
arXiv Detail & Related papers (2024-07-01T13:35:08Z)
- A Multi-Modal Explainability Approach for Human-Aware Robots in Multi-Party Conversation [39.87346821309096]
We present an addressee estimation model with improved performance compared to the previous SOTA.
We also propose several ways to incorporate explainability and transparency in the aforementioned architecture.
arXiv Detail & Related papers (2024-05-20T13:09:32Z)
- HandMeThat: Human-Robot Communication in Physical and Social Environments [73.91355172754717]
HandMeThat is a benchmark for a holistic evaluation of instruction understanding and following in physical and social environments.
HandMeThat contains 10,000 episodes of human-robot interactions.
We show that both offline and online reinforcement learning algorithms perform poorly on HandMeThat.
arXiv Detail & Related papers (2023-10-05T16:14:46Z)
- Towards a Real-time Measure of the Perception of Anthropomorphism in Human-robot Interaction [5.112850258732114]
We conducted an online human-robot interaction experiment in an educational use-case scenario.
43 English-speaking participants took part in the study.
We found that the degree of subjective and objective perception of anthropomorphism positively correlates with acoustic-prosodic entrainment.
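For illustration only, the snippet below runs the kind of analysis this finding implies on synthetic data: a crude pitch-based entrainment proxy is correlated with anthropomorphism ratings via Pearson's r. The entrainment measure, variable names, and data are assumptions; the paper's actual pipeline may differ substantially.

```python
# Illustrative sketch (not the paper's pipeline): correlate a toy
# acoustic-prosodic entrainment proxy with anthropomorphism ratings.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 43  # participant count from the study

# Hypothetical per-participant data: mean F0 (Hz) of human turns, an assumed
# fixed robot F0, and a 1-7 anthropomorphism rating. Data here is synthetic,
# so the printed correlation is meaningless; only the pipeline is shown.
robot_f0 = 180.0
human_f0 = rng.normal(200, 25, n)
ratings = rng.integers(1, 8, n).astype(float)

# Entrainment proxy: the closer the human's pitch moves to the robot's,
# the higher the entrainment score (negative absolute distance).
entrainment = -np.abs(human_f0 - robot_f0)

r, p = pearsonr(entrainment, ratings)
print(f"Pearson r = {r:.3f}, p = {p:.3f}")
```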
arXiv Detail & Related papers (2022-01-24T11:10:37Z)
- Few-Shot Visual Grounding for Natural Human-Robot Interaction [0.0]
We propose a software architecture that segments a target object from a crowded scene, indicated verbally by a human user.
At the core of our system, we employ a multi-modal deep neural network for visual grounding.
We evaluate the performance of the proposed model on real RGB-D data collected from public scene datasets.
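A minimal sketch of such a multi-modal grounding network follows, under the assumption that it fuses visual features with a phrase embedding to score spatial locations; the 4-channel input stands in for RGB-D, and the bag-of-words text encoder and layer sizes are illustrative, not the paper's network.

```python
# Minimal sketch of multi-modal visual grounding: image features and a
# phrase embedding are fused to score each spatial location, yielding a
# rough segmentation/attention mask over the verbally indicated object.
import torch
import torch.nn as nn

class GroundingNet(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.backbone = nn.Sequential(       # toy CNN over RGB-D (4 channels)
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.text = nn.EmbeddingBag(vocab, dim)  # bag-of-words phrase encoder

    def forward(self, image, tokens):
        feat = self.backbone(image)              # (B, dim, H/4, W/4)
        query = self.text(tokens)                # (B, dim)
        # Dot-product similarity between the phrase and every location.
        scores = torch.einsum("bchw,bc->bhw", feat, query)
        return scores.sigmoid()                  # soft mask of the target

net = GroundingNet()
mask = net(torch.randn(1, 4, 96, 96), torch.randint(0, 1000, (1, 5)))
print(mask.shape)  # torch.Size([1, 24, 24])
```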
arXiv Detail & Related papers (2021-03-17T15:24:02Z)
- Cognitive architecture aided by working-memory for self-supervised multi-modal humans recognition [54.749127627191655]
The ability to recognize human partners is an important social skill to build personalized and long-term human-robot interactions.
Deep learning networks have achieved state-of-the-art results and have proven to be suitable tools for addressing this task.
One solution is to make robots learn from their first-hand sensory data with self-supervision.
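One plausible reading of this idea, sketched below with stand-in models: confident identity predictions from one modality (e.g., face) are held in a working-memory buffer and reused as pseudo-labels to train another modality (e.g., voice). The threshold, encoders, and buffer size are assumptions, not the paper's architecture.

```python
# Sketch of cross-modal self-supervision with a working-memory buffer:
# confident face-based identity predictions are stored and later used as
# pseudo-labels to train the voice-based recognizer.
from collections import deque
import torch
import torch.nn as nn

face_model = nn.Linear(128, 10)          # stand-ins for real encoders
voice_model = nn.Linear(64, 10)
opt = torch.optim.Adam(voice_model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
memory = deque(maxlen=256)               # working memory of recent pairs

def step(face_feat, voice_feat, threshold=0.9):
    with torch.no_grad():
        probs = face_model(face_feat).softmax(-1)
        conf, label = probs.max(-1)
    if conf.item() >= threshold:         # keep only confident pseudo-labels
        memory.append((voice_feat, label))
    if len(memory) >= 32:                # train the voice model from memory
        feats, labels = zip(*memory)
        loss = loss_fn(voice_model(torch.stack(feats)), torch.stack(labels))
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

step(torch.randn(128), torch.randn(64))  # one sensory observation at a time
```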
arXiv Detail & Related papers (2021-03-16T13:50:24Z)
- Joint Mind Modeling for Explanation Generation in Complex Human-Robot Collaborative Tasks [83.37025218216888]
We propose a novel explainable AI (XAI) framework for achieving human-like communication in human-robot collaborations.
The robot builds a hierarchical mind model of the human user and generates explanations of its own mind as a form of communication.
Results show that the explanations generated by our approach significantly improve both collaboration performance and the users' perception of the robot.
arXiv Detail & Related papers (2020-07-24T23:35:03Z)
- Human Trajectory Forecasting in Crowds: A Deep Learning Perspective [89.4600982169]
We present an in-depth analysis of existing deep learning-based methods for modelling social interactions.
We propose two knowledge-based data-driven methods to effectively capture these social interactions.
We develop TrajNet++, a large-scale interaction-centric benchmark and a significant yet missing component in the field of human trajectory forecasting.
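A minimal sketch in the spirit of such interaction modeling (not the TrajNet++ reference models, whose details are not reproduced here): each pedestrian's recurrent state is mixed with a mean-pooled context of all agents before predicting the next position.

```python
# Minimal sketch of interaction-aware trajectory forecasting: a per-agent
# LSTM whose input is augmented with a mean-pooled "social" context of the
# other agents' hidden states (pooling over all agents, a simplification).
import torch
import torch.nn as nn

class SocialLSTMCell(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.embed = nn.Linear(2, hidden)            # (x, y) -> feature
        self.cell = nn.LSTMCell(hidden * 2, hidden)  # own + pooled input
        self.decode = nn.Linear(hidden, 2)           # next-step offset

    def forward(self, pos, h, c):
        # pos: (N, 2) positions of N pedestrians at one time step.
        own = self.embed(pos)
        pooled = h.mean(dim=0, keepdim=True).expand_as(h)  # social context
        h, c = self.cell(torch.cat([own, pooled], dim=-1), (h, c))
        return pos + self.decode(h), h, c            # predicted next position

N, H = 5, 64
cell = SocialLSTMCell(H)
pos, h, c = torch.randn(N, 2), torch.zeros(N, H), torch.zeros(N, H)
for _ in range(12):                                  # roll out 12 steps
    pos, h, c = cell(pos, h, c)
print(pos.shape)  # torch.Size([5, 2])
```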
arXiv Detail & Related papers (2020-07-07T17:19:56Z)
- Human Grasp Classification for Reactive Human-to-Robot Handovers [50.91803283297065]
We propose an approach for human-to-robot handovers in which the robot meets the human halfway.
We collect a human grasp dataset which covers typical ways of holding objects with various hand shapes and poses.
We present a planning and execution approach that takes the object from the human hand according to the detected grasp and hand position.
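As a hedged sketch of the classification step (the grasp taxonomy, keypoint format, and network below are illustrative assumptions, not the paper's method): an MLP maps estimated hand keypoints to a grasp category that the handover planner can condition on.

```python
# Sketch of a hand-grasp classifier: an MLP maps 21 3-D hand keypoints to a
# grasp category; a handover planner could select its approach accordingly.
import torch
import torch.nn as nn

GRASPS = ["on-palm", "pinch-top", "pinch-side", "wrap", "no-object"]  # assumed

classifier = nn.Sequential(
    nn.Flatten(),                 # (B, 21, 3) -> (B, 63)
    nn.Linear(63, 128), nn.ReLU(),
    nn.Linear(128, len(GRASPS)),
)

keypoints = torch.randn(1, 21, 3)  # e.g. output of a hand-pose estimator
grasp = GRASPS[classifier(keypoints).argmax(-1).item()]
print("Detected grasp:", grasp)
```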
arXiv Detail & Related papers (2020-03-12T19:58:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.