Multiple Appropriate Facial Reaction Generation in Dyadic Interaction Settings: What, Why and How?
- URL: http://arxiv.org/abs/2302.06514v4
- Date: Thu, 23 Mar 2023 16:58:41 GMT
- Title: Multiple Appropriate Facial Reaction Generation in Dyadic Interaction Settings: What, Why and How?
- Authors: Siyang Song, Micol Spitale, Yiming Luo, Batuhan Bal, Hatice Gunes
- Abstract summary: This paper defines the Multiple Appropriate Reaction Generation task for the first time in the literature.
It then proposes a new set of objective evaluation metrics to evaluate the appropriateness of the generated reactions.
The paper subsequently introduces a framework to predict, generate, and evaluate multiple appropriate facial reactions.
- Score: 11.130984858239412
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: According to the Stimulus-Organism-Response (SOR) theory, all human
behavioral reactions are stimulated by context, where people will process the
received stimulus and produce an appropriate reaction. This implies that in a
specific context for a given input stimulus, a person can react differently
according to their internal state and other contextual factors. Analogously, in
dyadic interactions, humans communicate using verbal and nonverbal cues, where
a broad spectrum of listeners' non-verbal reactions might be appropriate for
responding to a specific speaker behaviour. A body of work has already investigated the problem of automatically generating an appropriate reaction for a given input. However, none has attempted to automatically generate multiple appropriate reactions in the context of dyadic interactions, or to evaluate the appropriateness of those reactions using objective measures. This
paper starts by defining the facial Multiple Appropriate Reaction Generation
(fMARG) task for the first time in the literature and proposes a new set of
objective evaluation metrics to evaluate the appropriateness of the generated
reactions. The paper subsequently introduces a framework to predict, generate,
and evaluate multiple appropriate facial reactions.
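The one-to-many nature of fMARG means a generated reaction should be scored against the whole *set* of appropriate real reactions rather than a single ground truth. The sketch below illustrates that idea with a hypothetical nearest-neighbour distance over toy feature vectors; the paper's actual evaluation metrics are defined differently, and the function and features here are illustrative assumptions only.

```python
import math

def appropriateness(generated, appropriate_set):
    """Toy appropriateness score: distance from a generated reaction to the
    *closest* reaction in the set of appropriate real reactions. Lower is
    better; 0 means the generation matches one appropriate reaction exactly.
    (Illustrative only; not the metrics defined in the paper.)"""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(dist(generated, r) for r in appropriate_set)

# Two different reactions can both be "appropriate" for one stimulus:
appropriate = [[0.0, 1.0], [1.0, 0.0]]  # e.g. smile vs. nod, as toy features
print(appropriateness([0.0, 1.0], appropriate))  # 0.0: matches one appropriate reaction
print(appropriateness([0.5, 0.5], appropriate))  # ≈0.707: close, but matches neither
```

The key design point is that a generation is penalised only by its distance to the *nearest* appropriate reaction, so producing any one of several valid reactions scores well.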
Related papers
- ReactXT: Understanding Molecular "Reaction-ship" via Reaction-Contextualized Molecule-Text Pretraining [76.51346919370005]
We propose ReactXT for reaction-text modeling and OpenExp for experimental procedure prediction.
ReactXT features three types of input contexts to incrementally pretrain LMs.
Our code is available at https://github.com/syr-cn/ReactXT.
arXiv Detail & Related papers (2024-05-23T06:55:59Z)
- ReGenNet: Towards Human Action-Reaction Synthesis [87.57721371471536]
We analyze the asymmetric, dynamic, synchronous, and detailed nature of human-human interactions.
We propose the first multi-setting human action-reaction benchmark to generate human reactions conditioned on given human actions.
arXiv Detail & Related papers (2024-03-18T15:33:06Z)
- Emotional Listener Portrait: Realistic Listener Motion Simulation in Conversation [50.35367785674921]
Listener head generation centers on generating non-verbal behaviors of a listener in reference to the information delivered by a speaker.
A significant challenge when generating such responses is the non-deterministic nature of fine-grained facial expressions during a conversation.
We propose the Emotional Listener Portrait (ELP), which treats each fine-grained facial motion as a composition of several discrete motion-codewords.
Our ELP model can not only automatically generate natural and diverse responses toward a given speaker via sampling from the learned distribution but also generate controllable responses with a predetermined attitude.
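ELP's idea of treating each fine-grained facial motion as a composition of discrete motion-codewords can be made concrete with a minimal vector-quantization step. The codebook and `quantize` function below are hypothetical stand-ins (ELP learns its codebook from data); they only illustrate how continuous motion frames become sequences of discrete codeword indices.

```python
# Hypothetical 2-D codebook of discrete motion "codewords"; ELP's real
# codebook is learned, not hand-written.
CODEBOOK = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

def quantize(motion_frames):
    """Map each continuous motion frame to its nearest codeword index, so a
    motion sequence becomes a composition of discrete codewords."""
    def nearest(frame):
        return min(range(len(CODEBOOK)),
                   key=lambda i: sum((f - c) ** 2
                                     for f, c in zip(frame, CODEBOOK[i])))
    return [nearest(f) for f in motion_frames]

print(quantize([[0.1, 0.9], [0.8, 0.2]]))  # → [1, 2]
```

Once motion lives in a discrete codeword space, sampling diverse responses reduces to sampling codeword sequences from a learned distribution.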
arXiv Detail & Related papers (2023-09-29T18:18:32Z)
- Leveraging Implicit Feedback from Deployment Data in Dialogue [83.02878726357523]
We study improving social conversational agents by learning from natural dialogue between users and a deployed model.
We leverage signals like user response length, sentiment and reaction of the future human utterances in the collected dialogue episodes.
arXiv Detail & Related papers (2023-07-26T11:34:53Z)
- MRecGen: Multimodal Appropriate Reaction Generator [31.60823534748163]
This paper proposes the first multiple and multimodal (verbal and nonverbal) appropriate human reaction generation framework.
It can be applied to various human-computer interaction scenarios by generating appropriate virtual agent/robot behaviours.
arXiv Detail & Related papers (2023-07-05T19:07:00Z)
- ReactFace: Online Multiple Appropriate Facial Reaction Generation in Dyadic Interactions [46.66378299720377]
In dyadic interaction, predicting the listener's facial reactions is challenging as different reactions could be appropriate in response to the same speaker's behaviour.
This paper reformulates the task as an extrapolation or prediction problem, and proposes a novel framework (called ReactFace) to generate multiple different but appropriate facial reactions.
arXiv Detail & Related papers (2023-05-25T05:55:53Z)
- Reversible Graph Neural Network-based Reaction Distribution Learning for Multiple Appropriate Facial Reactions Generation [22.579200870471475]
This paper proposes the first multiple appropriate facial reaction generation framework.
It re-formulates the one-to-many mapping facial reaction generation problem as a one-to-one mapping problem.
Experimental results demonstrate that our approach outperforms existing models in generating more appropriate, realistic, and synchronized facial reactions.
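The one-to-one reformulation described above can be sketched as mapping each speaker behaviour to a single reaction *distribution*, from which many appropriate reactions are then sampled. The `predict_distribution` function below is a hypothetical placeholder for the learned model (the paper uses a reversible graph neural network, not this toy mapping).

```python
import random

def predict_distribution(speaker_feature):
    """Hypothetical stand-in for the learned model: one speaker behaviour
    maps to one reaction distribution (here a diagonal Gaussian), turning
    the one-to-many mapping into a one-to-one mapping."""
    mean = [0.5 * x for x in speaker_feature]  # placeholder mapping
    std = 0.1
    return mean, std

def sample_reactions(speaker_feature, n, seed=0):
    """Draw n candidate reactions from the predicted distribution."""
    rng = random.Random(seed)
    mean, std = predict_distribution(speaker_feature)
    return [[rng.gauss(m, std) for m in mean] for _ in range(n)]

reactions = sample_reactions([1.0, 0.0], n=3)
print(len(reactions))  # 3 distinct candidate reactions from one input
```

The design benefit is that training targets a single deterministic object (the distribution), while diversity comes back for free at sampling time.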
arXiv Detail & Related papers (2023-05-24T15:56:26Z)
- Use of a Taxonomy of Empathetic Response Intents to Control and Interpret Empathy in Neural Chatbots [4.264192013842096]
A recent trend in open-domain conversational agents is enabling them to converse empathetically in response to emotional prompts.
Current approaches either operate end-to-end or condition the responses on similar emotion labels to generate empathetic responses.
We propose several rule-based and neural approaches to predict the next response's emotion/intent and generate responses conditioned on these predicted emotions/intents.
arXiv Detail & Related papers (2023-05-17T10:03:03Z)
- Exemplars-guided Empathetic Response Generation Controlled by the Elements of Human Communication [88.52901763928045]
We propose an approach that relies on exemplars to cue the generative model on fine stylistic properties that signal empathy to the interlocutor.
We empirically show that these approaches yield significant improvements in empathetic response quality in terms of both automated and human-evaluated metrics.
arXiv Detail & Related papers (2021-06-22T14:02:33Z)
- Mapping the Space of Chemical Reactions Using Attention-Based Neural Networks [0.3848364262836075]
This work shows that transformer-based models can infer reaction classes from non-annotated, simple text-based representations of chemical reactions.
Our best model reaches a classification accuracy of 98.2%.
The insights into chemical reaction space enabled by our learned fingerprints are illustrated by an interactive reaction atlas.
arXiv Detail & Related papers (2020-12-09T10:25:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.