Maia: A Real-time Non-Verbal Chat for Human-AI Interaction
- URL: http://arxiv.org/abs/2402.06385v1
- Date: Fri, 9 Feb 2024 13:07:22 GMT
- Title: Maia: A Real-time Non-Verbal Chat for Human-AI Interaction
- Authors: Dragos Costea, Alina Marcu, Cristina Lazar, Marius Leordeanu
- Abstract summary: We propose an alternative to text chats for Human-AI interaction, using facial expressions and head movements that mirror, but also improvise over, those of the human user.
Our goal is to track and analyze facial expressions and other non-verbal cues in real time, and to use this information to build models that can predict and understand human behavior.
- Score: 11.558827428811385
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Face-to-face communication modeling in computer vision is an area of research
focusing on developing algorithms that can recognize and analyze non-verbal
cues and behaviors during face-to-face interactions. We propose an alternative
to text chats for Human-AI interaction, based solely on non-verbal visual
communication: facial expressions and head movements that mirror, but also
improvise over, those of the human user, in order to engage users efficiently
and capture their attention in a low-cost, real-time fashion. Our goal is to
track and analyze facial expressions and other non-verbal cues in real time,
and to use this information to build models that can predict and understand
human behavior. We offer three complementary approaches, based on
retrieval, statistical, and deep learning techniques. We provide human as well
as automatic evaluations and discuss the advantages and disadvantages of each
direction.
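The abstract names retrieval, statistical, and deep learning approaches without giving implementation details. The sketch below is a minimal, hypothetical retrieval-style baseline (not the authors' code), assuming each frame is represented by facial features such as action-unit intensities and head pose: the agent responds to a live human frame with the stored agent frame that followed the most similar human frame in a recorded corpus. All names and feature dimensions are illustrative assumptions.

```python
import numpy as np

# Illustrative retrieval-style baseline (not the paper's implementation):
# given paired recordings of (human frame features, agent frame features),
# respond to a live human frame with the stored agent frame whose
# preceding human frame is closest in feature space.

class RetrievalResponder:
    def __init__(self, human_frames: np.ndarray, agent_frames: np.ndarray):
        # human_frames: (N, D) features, e.g. action-unit intensities + head pose
        # agent_frames: (N, D) agent features recorded at the following time step
        self.human_frames = human_frames
        self.agent_frames = agent_frames

    def respond(self, live_frame: np.ndarray) -> np.ndarray:
        # Nearest-neighbour lookup by Euclidean distance.
        distances = np.linalg.norm(self.human_frames - live_frame, axis=1)
        best = int(np.argmin(distances))
        # A real-time system would additionally smooth consecutive outputs
        # to avoid jitter between retrieved frames.
        return self.agent_frames[best]

# Usage with random stand-in features (17 action units + 3 head angles).
rng = np.random.default_rng(42)
responder = RetrievalResponder(rng.normal(size=(5000, 20)),
                               rng.normal(size=(5000, 20)))
agent_reaction = responder.respond(rng.normal(size=20))
```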
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms representative models in terms of objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation [70.52558242336988]
We focus on predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion.
In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation.
We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a "multimodal transcript".
arXiv Detail & Related papers (2024-09-13T18:28:12Z)
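The exact transcript format is not given in the summary above; the following is a minimal sketch, assuming timestamped behavioral events are serialized as plain text so that an LLM can reason over verbal and non-verbal cues jointly. The event schema, field names, and formatting are hypothetical, not the paper's pipeline.

```python
from dataclasses import dataclass

# Hypothetical illustration of a "multimodal transcript": verbal and
# non-verbal cues are serialized into plain text for an LLM prompt.

@dataclass
class Event:
    t: float          # timestamp in seconds
    speaker: str      # "A" or "B"
    modality: str     # "speech", "gaze", "expression", ...
    value: str        # utterance text or cue label

def to_multimodal_transcript(events: list[Event]) -> str:
    lines = []
    for e in sorted(events, key=lambda e: e.t):
        if e.modality == "speech":
            lines.append(f"[{e.t:06.1f}s] {e.speaker}: {e.value}")
        else:
            lines.append(f"[{e.t:06.1f}s] {e.speaker} ({e.modality}): {e.value}")
    return "\n".join(lines)

transcript = to_multimodal_transcript([
    Event(1.2, "A", "speech", "So, what did you think of the movie?"),
    Event(2.0, "B", "gaze", "looks away"),
    Event(2.8, "B", "expression", "neutral, low activation"),
    Event(3.5, "B", "speech", "It was... fine, I guess."),
])
# The resulting text would be placed in a prompt that asks the LLM for an
# engagement rating, e.g. "Rate B's engagement from 1 to 5."
print(transcript)
```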
- Real-time Addressee Estimation: Deployment of a Deep-Learning Model on the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the procedure of such implementation and the performance of the model deployed in real-time human-robot interaction.
arXiv Detail & Related papers (2023-11-09T13:01:21Z)
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations: the most frequently used nonverbal cue is speaking activity, the most common computational method is the support vector machine, and the typical interaction setting is a meeting of 3-4 persons sensed with microphones and cameras.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
- BOSS: A Benchmark for Human Belief Prediction in Object-context Scenarios [14.23697277904244]
This paper uses the combined knowledge of Theory of Mind (ToM) and Object-Context Relations to investigate methods for enhancing collaboration between humans and autonomous systems.
We propose a novel and challenging multimodal video dataset for assessing the capability of artificial intelligence (AI) systems in predicting human belief states in an object-context scenario.
arXiv Detail & Related papers (2022-06-21T18:29:17Z)
- Multi-Cue Adaptive Emotion Recognition Network [4.570705738465714]
We propose a new deep learning approach for emotion recognition based on adaptive multi-cues.
We compare the proposed approach with state-of-the-art approaches on the CAER-S dataset.
arXiv Detail & Related papers (2021-11-03T15:08:55Z)
- Let's be friends! A rapport-building 3D embodied conversational agent for the Human Support Robot [0.0]
Partial, subtle mirroring of nonverbal behaviors during conversations (also known as mimicking or parallel empathy) is essential for rapport building.
Our research question is whether integrating an ECA able to mirror its interlocutor's facial expressions and head movements with a human-service robot will improve the user's experience.
Our contribution is the integration of an expressive ECA, able to track its interlocutor's face and to mirror their facial expressions and head movements in real time, with a human support robot.
arXiv Detail & Related papers (2021-03-08T01:02:41Z)
- You Impress Me: Dialogue Generation via Mutual Persona Perception [62.89449096369027]
Research in cognitive science suggests that understanding is an essential signal for a high-quality chit-chat conversation.
Motivated by this, we propose P2 Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding.
arXiv Detail & Related papers (2020-04-11T12:51:07Z)
- Learning Human-Object Interaction Detection using Interaction Points [140.0200950601552]
We propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs.
Our network predicts interaction points, which directly localize and classify the interaction.
Experiments are performed on two popular benchmarks: V-COCO and HICO-DET.
arXiv Detail & Related papers (2020-03-31T08:42:06Z)
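The interaction-point entry above describes a fully-convolutional detector but not its decoding step. Below is a toy sketch of the general idea, under the assumption (not the paper's exact decoder) that the network emits a per-class heatmap and that a human-object pair is scored by the heatmap value at the midpoint of the two box centers; all names and the stride value are illustrative.

```python
import numpy as np

# Toy sketch of the interaction-point idea (assumed reading, not the
# paper's exact method): score each human-object pair by the predicted
# heatmap value at the midpoint of the two box centers.

def box_center(box):
    x1, y1, x2, y2 = box
    return np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])

def score_pairs(human_boxes, object_boxes, heatmap, stride=8):
    """Return (human_idx, object_idx, score) for every pair, best first."""
    results = []
    h, w = heatmap.shape
    for i, hb in enumerate(human_boxes):
        for j, ob in enumerate(object_boxes):
            midpoint = (box_center(hb) + box_center(ob)) / 2.0
            # Map image coordinates to heatmap cells.
            cx = int(np.clip(midpoint[0] / stride, 0, w - 1))
            cy = int(np.clip(midpoint[1] / stride, 0, h - 1))
            results.append((i, j, float(heatmap[cy, cx])))
    return sorted(results, key=lambda r: -r[2])

# Stand-in data: one human box, two object boxes, a random per-class heatmap.
rng = np.random.default_rng(0)
pairs = score_pairs([(10, 20, 110, 220)],
                    [(120, 40, 200, 120), (300, 50, 380, 130)],
                    rng.random((60, 80)))
print(pairs[0])  # highest-scoring human-object pair for this class
```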