Why Robots Are Bad at Detecting Their Mistakes: Limitations of Miscommunication Detection in Human-Robot Dialogue
- URL: http://arxiv.org/abs/2506.20268v1
- Date: Wed, 25 Jun 2025 09:25:04 GMT
- Title: Why Robots Are Bad at Detecting Their Mistakes: Limitations of Miscommunication Detection in Human-Robot Dialogue
- Authors: Ruben Janssens, Jens De Bock, Sofie Labat, Eva Verhelst, Veronique Hoste, Tony Belpaeme
- Abstract summary: This research evaluates the effectiveness of machine learning models in detecting miscommunications in robot dialogue. After each conversational turn, users provided feedback on whether they perceived an error, enabling an analysis of the models' ability to accurately detect robot mistakes.
- Score: 0.6118899177909359
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting miscommunication in human-robot interaction is a critical function for maintaining user engagement and trust. While humans effortlessly detect communication errors in conversations through both verbal and non-verbal cues, robots face significant challenges in interpreting non-verbal feedback, despite advances in computer vision for recognizing affective expressions. This research evaluates the effectiveness of machine learning models in detecting miscommunications in robot dialogue. Using a multi-modal dataset of 240 human-robot conversations, in which four distinct types of conversational failures were systematically introduced, we assess the performance of state-of-the-art computer vision models. After each conversational turn, users provided feedback on whether they perceived an error, enabling an analysis of the models' ability to accurately detect robot mistakes. Despite the use of state-of-the-art models, performance barely exceeds random chance in identifying miscommunication, whereas on a dataset with more expressive emotional content the same models successfully identified confused states. To explore the underlying cause, we asked human raters to do the same; they, too, could identify only around half of the induced miscommunications, similar to our models. These results uncover a fundamental limitation in identifying robot miscommunications in dialogue: even when users perceive an induced miscommunication as such, they often do not communicate this to their robotic conversation partner. This knowledge can shape expectations of the performance of computer vision models and can help researchers design better human-robot conversations by deliberately eliciting feedback where needed.
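The evaluation reduces to per-turn binary classification: after each robot turn, a model predicts whether a miscommunication occurred, and the prediction is scored against the user's self-reported perception of an error. Below is a minimal sketch of that scoring against a chance baseline; it is an illustration under assumed data, not the authors' pipeline, and `y_true` and `y_pred` are hypothetical stand-ins for the per-turn user labels and a vision model's outputs.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)
n_turns = 500  # hypothetical number of conversational turns

# y_true[i] = 1 if the user reported perceiving an error after turn i.
y_true = rng.integers(0, 2, size=n_turns)

# Stand-in for a vision model's per-turn predictions (e.g. derived from
# facial-expression features); here just a placeholder.
y_pred = rng.integers(0, 2, size=n_turns)

# Chance baseline: coin flips at the empirical base rate of errors.
y_chance = rng.binomial(1, y_true.mean(), size=n_turns)

print(f"model  acc={accuracy_score(y_true, y_pred):.3f} "
      f"f1={f1_score(y_true, y_pred):.3f}")
print(f"chance acc={accuracy_score(y_true, y_chance):.3f} "
      f"f1={f1_score(y_true, y_chance):.3f}")
# If the model row does not clearly beat the chance row, the model has
# not learned to read the users' (often absent) non-verbal error signals.
```

Since failures were deliberately induced on only some turns, the error class is plausibly a minority, in which case an F1-style comparison against a matched chance baseline is more informative than accuracy alone.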
Related papers
- ERR@HRI 2.0 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Conversations [15.140345369639215]
ERR@HRI 2.0 Challenge provides a dataset of conversational robot failures during human-robot conversations. The dataset includes 16 hours of dyadic human-robot interactions, incorporating facial, speech, and head movement features. Participants are invited to form teams and develop machine learning models that detect these failures using multimodal data (a toy fusion baseline is sketched after this entry).
arXiv Detail & Related papers (2025-07-17T18:21:45Z)
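A plausible entry-level baseline for this kind of challenge is early fusion: concatenate per-window features from each modality and train a single classifier. The sketch below shows that pattern with invented feature dimensions and random stand-in data; it is not the challenge's reference model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400  # hypothetical number of interaction windows

# Invented stand-ins for per-window features from each modality:
facial = rng.normal(size=(n, 32))  # e.g. facial action-unit activations
speech = rng.normal(size=(n, 16))  # e.g. prosodic statistics
head = rng.normal(size=(n, 8))     # e.g. head-pose velocities
y = rng.integers(0, 2, size=n)     # 1 = annotated robot failure

# Early fusion: one concatenated feature vector per window.
X = np.concatenate([facial, speech, head], axis=1)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("failure-detection F1:", round(f1_score(y_te, clf.predict(X_te)), 3))
```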
- Human strategies for correcting 'human-robot' errors during a laundry sorting task [3.9697512504288373]
Video analysis from 42 participants found speech patterns, including laughter, verbal expressions, and filler words such as "oh" and "ok". Common strategies deployed when errors occurred included correcting and teaching, taking responsibility, and displays of frustration. An anthropomorphic robot may not be ideally suited to this kind of task.
arXiv Detail & Related papers (2025-04-11T09:53:36Z)
- Human-Robot Interaction and Perceived Irrationality: A Study of Trust Dynamics and Error Acknowledgment [0.0]
This study systematically examines trust dynamics and system design by analyzing human reactions to robot failures. We conducted a four-stage survey to explore how trust evolves throughout human-robot interactions. Results indicate that trust in robotic systems significantly increased when robots acknowledged their errors or limitations.
arXiv Detail & Related papers (2024-03-21T11:00:11Z)
- Real-time Addressee Estimation: Deployment of a Deep-Learning Model on the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z)
- Robots-Dont-Cry: Understanding Falsely Anthropomorphic Utterances in Dialog Systems [64.10696852552103]
Highly anthropomorphic responses might make users uncomfortable or implicitly deceive them into thinking they are interacting with a human.
We collect human ratings on the feasibility of approximately 900 two-turn dialogs sampled from 9 diverse data sources.
arXiv Detail & Related papers (2022-10-22T12:10:44Z)
- Robots with Different Embodiments Can Express and Influence Carefulness in Object Manipulation [104.5440430194206]
This work investigates the perception of object manipulations performed with a communicative intent by two robots.
We designed the robots' movements to communicate carefulness, or its absence, during the transportation of objects.
arXiv Detail & Related papers (2022-08-03T13:26:52Z)
- Continuous ErrP detections during multimodal human-robot interaction [2.5199066832791535]
We implement a multimodal human-robot interaction (HRI) scenario, in which a simulated robot communicates with its human partner through speech and gestures.
The human partner, in turn, evaluates whether the robot's verbal announcement (intention) matches the action (pointing gesture) chosen by the robot.
Intrinsic evaluations of robot actions by humans, evident in the EEG, were recorded in real time, continuously segmented online, and classified asynchronously (a generic sketch of such a loop follows this entry).
arXiv Detail & Related papers (2022-07-25T15:39:32Z)
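As a rough illustration of what "continuously segmented online and classified asynchronously" can mean, the sketch below slides an analysis window over a streaming EEG buffer and flags windows that a classifier scores as error-related. It is a generic sliding-window sketch, not the paper's implementation; the sampling rate, window length, and `classify_window` stub are all assumptions.

```python
import numpy as np

FS = 250         # assumed EEG sampling rate in Hz
WIN = FS         # 1-second analysis window
STEP = FS // 10  # slide every 100 ms -> continuous, asynchronous output

# Hypothetical pretrained ErrP classifier returning P(error) for one
# window of shape (channels, samples). A real system would use a
# calibrated model (e.g. LDA on band-power features); this is a stub.
def classify_window(window: np.ndarray) -> float:
    return float(window.mean() > 0.1)  # placeholder decision rule

stream = np.random.randn(8, 10 * FS)  # stand-in for 10 s of 8-channel EEG

# Overlapping windows over the incoming stream, each classified
# independently, so detections can fire at any time rather than being
# locked to known stimulus onsets.
for start in range(0, stream.shape[1] - WIN + 1, STEP):
    p_err = classify_window(stream[:, start:start + WIN])
    if p_err > 0.5:
        print(f"possible ErrP at t={start / FS:.1f}s (p={p_err:.2f})")
```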
- Data-driven emotional body language generation for social robotics [58.88028813371423]
In social robotics, endowing humanoid robots with the ability to generate bodily expressions of affect can improve human-robot interaction and collaboration.
We implement a deep learning data-driven framework that learns from a few hand-designed robotic bodily expressions.
The evaluation study found that the anthropomorphism and animacy of the generated expressions are not perceived differently from the hand-designed ones.
arXiv Detail & Related papers (2022-05-02T09:21:39Z)
- AAAI SSS-22 Symposium on Closing the Assessment Loop: Communicating Proficiency and Intent in Human-Robot Teaming [4.787322716745613]
How should a robot convey predicted ability on a new task?
How should a robot adapt its proficiency criteria based on human intentions and values?
There are no agreed-upon standards for evaluating proficiency and intent-based interactions.
arXiv Detail & Related papers (2022-04-05T18:28:01Z)
- Let's be friends! A rapport-building 3D embodied conversational agent for the Human Support Robot [0.0]
Partial subtle mirroring of nonverbal behaviors during conversations (also known as mimicking or parallel empathy) is essential for rapport building.
Our research question is whether integrating an ECA able to mirror its interlocutor's facial expressions and head movements with a human-service robot will improve the user's experience.
Our contribution is the integration of an expressive ECA, able to track its interlocutor's face and to mirror his or her facial expressions and head movements in real time, with a human support robot.
arXiv Detail & Related papers (2021-03-08T01:02:41Z)
- Joint Inference of States, Robot Knowledge, and Human (False-)Beliefs [90.20235972293801]
Aiming to understand how human (false-)beliefs, a core socio-cognitive ability, would affect human interactions with robots, this paper proposes to adopt a graphical model to unify the representation of object states, robot knowledge, and human (false-)beliefs.
An inference algorithm is derived to fuse the individual parse graphs (pg) from all robots across multiple views into a joint parse graph, which affords a more effective reasoning capability to overcome the errors that originate from a single view.
arXiv Detail & Related papers (2020-04-25T23:02:04Z)
- You Impress Me: Dialogue Generation via Mutual Persona Perception [62.89449096369027]
Research in cognitive science suggests that understanding is an essential signal for a high-quality chit-chat conversation.
Motivated by this, we propose P2 Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding.
arXiv Detail & Related papers (2020-04-11T12:51:07Z)