EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics
- URL: http://arxiv.org/abs/2408.08782v2
- Date: Fri, 11 Oct 2024 12:04:11 GMT
- Title: EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics
- Authors: Chenwei Wan, Matthieu Labeau, Chloé Clavel
- Abstract summary: EmoDynamiX models the discourse dynamics between user fine-grained emotions and system strategies using a heterogeneous graph for better performance and transparency.
Experimental results on two ESC datasets show EmoDynamiX outperforms previous state-of-the-art methods by a significant margin.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Designing emotionally intelligent conversational systems to provide comfort and advice to people experiencing distress is a compelling area of research. Recently, with advancements in large language models (LLMs), end-to-end dialogue agents without explicit strategy prediction steps have become prevalent. However, implicit strategy planning lacks transparency, and recent studies show that LLMs' inherent preference bias towards certain socio-emotional strategies hinders the delivery of high-quality emotional support. To address this challenge, we propose decoupling strategy prediction from language generation, and introduce a novel dialogue strategy prediction framework, EmoDynamiX, which models the discourse dynamics between user fine-grained emotions and system strategies using a heterogeneous graph for better performance and transparency. Experimental results on two ESC datasets show EmoDynamiX outperforms previous state-of-the-art methods by a significant margin (better proficiency and lower preference bias). Our approach also exhibits better transparency by allowing backtracing of decision-making.
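The core of the framework is a heterogeneous graph connecting user-emotion nodes and system-strategy nodes across dialogue turns. Below is a minimal sketch of how such a graph could be assembled and encoded with PyTorch Geometric; the node types, edge types, feature sizes, and the 8-way strategy head are illustrative assumptions, not the authors' exact schema.

```python
# Minimal sketch of a heterogeneous dialogue graph, assuming PyTorch
# Geometric. Node/edge types and dimensions are illustrative guesses.
import torch
from torch_geometric.data import HeteroData
from torch_geometric.nn import HGTConv

data = HeteroData()
# One node per user turn (fine-grained emotion state) and one per past
# system turn (strategy taken); 16-dim features are placeholders.
data["emotion"].x = torch.randn(3, 16)
data["strategy"].x = torch.randn(2, 16)
# Discourse edges: user emotion i elicits system strategy i, and each
# strategy precedes the user's next emotional reaction.
data["emotion", "elicits", "strategy"].edge_index = torch.tensor([[0, 1], [0, 1]])
data["strategy", "precedes", "emotion"].edge_index = torch.tensor([[0, 1], [1, 2]])

conv = HGTConv(16, 32, data.metadata(), heads=2)
out = conv(data.x_dict, data.edge_index_dict)        # per-type embeddings
# Score the next strategy from the latest user-emotion node; the 8
# classes stand in for an ESConv-style strategy inventory.
logits = torch.nn.Linear(32, 8)(out["emotion"][-1])
```

Because the predicted strategy is an explicit label rather than an implicit LLM choice, the chosen class and the graph's attention patterns can be inspected afterwards, which is the basis of the transparency claim above.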
Related papers
- Dynamic Modality and View Selection for Multimodal Emotion Recognition with Missing Modalities
Multiple channels, such as speech (voice) and facial expressions (image), are crucial in understanding human emotions.
One significant hurdle is how AI models manage the absence of a particular modality.
This study's central focus is assessing the performance and resilience of two strategies when one modality is missing.
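As a concrete illustration of one common fallback when a modality is absent, the sketch below zero-imputes the missing stream and gates the fusion with an availability mask; the module and its sizes are hypothetical and are not the specific strategies compared in the paper.

```python
# Hypothetical sketch of handling a missing modality: impute the absent
# stream with zeros and let an availability mask gate the fusion.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim=128, num_emotions=7):
        super().__init__()
        self.gate = nn.Linear(2, 2)              # per-modality weights from the mask
        self.classifier = nn.Linear(dim, num_emotions)

    def forward(self, speech, image, mask):
        # mask[:, 0/1] is 1.0 when speech/image is present, else 0.0
        w = torch.softmax(self.gate(mask), dim=-1) * mask
        fused = w[:, :1] * speech + w[:, 1:] * image
        return self.classifier(fused)

model = GatedFusion()
speech = torch.randn(4, 128)
image = torch.zeros(4, 128)                      # image modality absent
mask = torch.tensor([[1.0, 0.0]]).expand(4, 2)
logits = model(speech, image, mask)              # (4, 7) emotion scores
```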
arXiv Detail & Related papers (2024-04-18T15:18:14Z)
- ASEM: Enhancing Empathy in Chatbot through Attention-based Sentiment and Emotion Modeling
We present a novel solution employing a mixture of experts (multiple encoders) to offer distinct perspectives on the emotional state of the user's utterance.
We propose an end-to-end model architecture called ASEM that performs emotion analysis on top of sentiment analysis for open-domain chatbots.
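A minimal sketch of that idea follows, assuming a few recurrent expert encoders whose views are mixed by attention before a sentiment-then-emotion readout; the layer choices and the two-stage split are assumptions, not ASEM's exact architecture.

```python
# Hypothetical sketch of a mixture of expert encoders with a
# sentiment-then-emotion readout, loosely in ASEM's spirit.
import torch
import torch.nn as nn

class MixtureOfEncoders(nn.Module):
    def __init__(self, dim=256, num_experts=3, num_emotions=7):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.GRU(dim, dim, batch_first=True) for _ in range(num_experts)]
        )
        self.attn = nn.Linear(dim, 1)            # scores each expert's view
        self.sentiment = nn.Linear(dim, 3)       # coarse: neg / neutral / pos
        self.emotion = nn.Linear(dim + 3, num_emotions)

    def forward(self, utterance):                # (batch, seq, dim)
        views = torch.stack(
            [enc(utterance)[1][-1] for enc in self.experts], dim=1
        )                                        # (batch, experts, dim)
        weights = torch.softmax(self.attn(views), dim=1)
        mixed = (weights * views).sum(dim=1)     # attention-weighted mixture
        sent = self.sentiment(mixed)             # sentiment analysis first...
        return self.emotion(torch.cat([mixed, sent], dim=-1))  # ...then emotion

logits = MixtureOfEncoders()(torch.randn(2, 10, 256))  # (2, 7)
```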
arXiv Detail & Related papers (2024-02-25T20:36:51Z)
- Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation
This work initially analyzes the results of large language models (LLMs) on ESConv.
We observe that exhibiting high preference for specific strategies hinders effective emotional support.
Our findings emphasize that (1) high preference for specific strategies hinders the progress of emotional support, (2) external assistance helps reduce preference bias, and (3) existing LLMs alone cannot become good emotional supporters.
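One simple way to make "preference bias" concrete is to compare a model's strategy usage against the gold distribution. The KL-based toy metric below is an illustration only, not the metric defined in the paper, and the counts are fabricated placeholders.

```python
# Toy preference-bias measure: KL(pred || gold) over strategy usage.
from collections import Counter
from math import log

STRATEGIES = ["Question", "Reflection", "Suggestion", "Affirmation"]

def preference_bias(predicted, gold):
    """0 means usage matches the data; larger means the model
    over-uses some strategies relative to the gold distribution."""
    p, q = Counter(predicted), Counter(gold)
    n, m = len(predicted), len(gold)
    return sum(
        (p[s] / n) * log((p[s] / n) / (q[s] / m))
        for s in STRATEGIES if p[s] > 0
    )

gold = ["Question"] * 25 + ["Reflection"] * 25 + ["Suggestion"] * 25 + ["Affirmation"] * 25
biased = ["Suggestion"] * 70 + ["Question"] * 30   # over-prefers giving advice
print(preference_bias(biased, gold))                # > 0: noticeable bias
```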
arXiv Detail & Related papers (2024-02-20T18:21:32Z)
- Knowledge-enhanced Memory Model for Emotional Support Conversation
We propose a knowledge-enhanced Memory mODEl for emotional suppoRt coNversation (MODERN).
Specifically, we first devise a knowledge-enriched dialogue context encoding to perceive the dynamic emotion changes across different periods of the conversation.
We then implement a novel memory-enhanced strategy modeling module to model the semantic patterns behind the strategy categories.
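A toy sketch of a memory-enhanced strategy module in this spirit: one learned slot per strategy category, queried by the dialogue context. It illustrates the idea only; MODERN's actual module is more elaborate.

```python
# Hypothetical memory-enhanced strategy module: one learned memory slot
# per strategy category, softly read by the dialogue context.
import torch
import torch.nn as nn

class StrategyMemory(nn.Module):
    def __init__(self, dim=256, num_strategies=8):
        super().__init__()
        # Each row stores a semantic pattern for one strategy category.
        self.memory = nn.Parameter(torch.randn(num_strategies, dim))

    def forward(self, context):                      # (batch, dim)
        scores = context @ self.memory.T             # similarity to each pattern
        read = torch.softmax(scores, -1) @ self.memory  # soft memory read
        return scores, context + read                # logits, enriched context

logits, enriched = StrategyMemory()(torch.randn(4, 256))
```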
arXiv Detail & Related papers (2023-10-11T17:51:28Z)
- Building Emotional Support Chatbots in the Era of LLMs
We introduce an innovative methodology that synthesizes human insights with the computational prowess of Large Language Models (LLMs).
By utilizing the in-context learning potential of ChatGPT, we generate an ExTensible Emotional Support dialogue dataset, named ExTES.
Following this, we deploy advanced tuning techniques on the LLaMA model, examining the impact of diverse training strategies, ultimately yielding an LLM meticulously optimized for emotional support interactions.
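The tuning stage might look like the LoRA-based supervised fine-tuning sketch below; the checkpoint name, adapter hyperparameters, and dialogue format are assumptions, since the abstract does not specify which tuning techniques were used.

```python
# Hypothetical sketch of supervised fine-tuning of a LLaMA checkpoint
# on support dialogues with LoRA adapters (an assumed technique).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"              # assumed checkpoint
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Low-rank adapters keep tuning cheap; ranks/targets are placeholders.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)

# Each ExTES-style example becomes a seeker/supporter exchange.
example = "Seeker: I failed my exam.\nSupporter: That sounds really hard."
batch = tok(example, return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss  # standard LM loss
loss.backward()
```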
arXiv Detail & Related papers (2023-08-17T10:49:18Z)
- Improving Multi-turn Emotional Support Dialogue Generation with Lookahead Strategy Planning
We propose MultiESC, a novel system for providing emotional support.
For strategy planning, we propose lookahead heuristics that estimate future user feedback after particular strategies are used.
For user state modeling, MultiESC focuses on capturing users' subtle emotional expressions and understanding their emotion causes.
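In sketch form, lookahead planning scores each candidate strategy by its prior probability weighted by the user feedback it is predicted to produce; the dictionaries below are stand-ins for MultiESC's learned estimators.

```python
# Hypothetical sketch of lookahead strategy planning; p_strategy and
# expected_feedback stand in for learned models.
def plan_strategy(p_strategy, expected_feedback):
    """Pick argmax over p(strategy | context) * E[user feedback | strategy]."""
    return max(p_strategy, key=lambda s: p_strategy[s] * expected_feedback[s])

p_strategy = {"Question": 0.5, "Suggestion": 0.3, "Reflection": 0.2}
expected_feedback = {"Question": 0.4, "Suggestion": 0.9, "Reflection": 0.6}
# "Suggestion" wins despite a lower prior: its predicted feedback is higher.
print(plan_strategy(p_strategy, expected_feedback))
```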
arXiv Detail & Related papers (2022-10-09T12:23:47Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
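Late fusion here means each modality is classified on its own and only the output distributions are combined; the sketch below uses fixed fusion weights, where the paper's combination would be learned or tuned.

```python
# Hypothetical sketch of late fusion over two unimodal classifiers.
import torch

def late_fuse(speech_logits, text_logits, w_speech=0.5, w_text=0.5):
    probs_s = torch.softmax(speech_logits, dim=-1)
    probs_t = torch.softmax(text_logits, dim=-1)
    return w_speech * probs_s + w_text * probs_t

speech_logits = torch.randn(4, 4)   # e.g., from a speaker-recognition backbone
text_logits = torch.randn(4, 4)     # e.g., from a fine-tuned BERT
pred = late_fuse(speech_logits, text_logits).argmax(dim=-1)  # per-utterance emotion
```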
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
- An Attribute-Aligned Strategy for Learning Speech Representation
We propose an attribute-aligned learning strategy to derive speech representations that can flexibly address these issues via an attribute-selection mechanism.
Specifically, we propose a layered-representation variational autoencoder (LR-VAE), which factorizes speech representation into attribute-sensitive nodes.
Our proposed method achieves competitive performance on identity-free speech emotion recognition (SER) and better performance on emotionless speaker verification (SV).
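A minimal sketch of an attribute-partitioned latent space follows, assuming a simple two-way split into emotion- and identity-sensitive halves; this shows the selection mechanism only, not LR-VAE's actual layered factorization.

```python
# Hypothetical attribute-partitioned VAE encoder: the latent splits into
# an emotion-sensitive half and an identity-sensitive half, and each
# downstream task reads only the half it needs.
import torch
import torch.nn as nn

class PartitionedEncoder(nn.Module):
    def __init__(self, in_dim=80, emo_dim=32, spk_dim=32):
        super().__init__()
        self.net = nn.Linear(in_dim, 2 * (emo_dim + spk_dim))  # mu and logvar
        self.emo_dim, self.spk_dim = emo_dim, spk_dim

    def forward(self, x):
        mu, logvar = self.net(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        z_emo, z_spk = z.split([self.emo_dim, self.spk_dim], dim=-1)
        return z_emo, z_spk   # identity-free SER uses z_emo; SV uses z_spk

z_emo, z_spk = PartitionedEncoder()(torch.randn(4, 80))
```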
arXiv Detail & Related papers (2021-06-05T06:19:14Z)
- Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability
Emotional text-to-speech synthesis (ETTS) has seen much progress in recent years.
We propose a new interactive training paradigm for ETTS, denoted as i-ETTS.
We formulate an iterative training strategy with reinforcement learning to ensure the quality of i-ETTS optimization.
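The reinforcement signal can be sketched as follows: an emotion classifier scores the synthesized output, and its confidence in the target emotion rewards a REINFORCE-style update. Both modules below are toy stubs, not the real i-ETTS components.

```python
# Toy REINFORCE loop with an emotion-classifier reward, loosely in the
# spirit of discriminability-driven ETTS training.
import torch
import torch.nn as nn

class StubTTS(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, 16)           # "synthesizer" placeholder

    def forward(self, text_emb):
        dist = torch.distributions.Normal(self.proj(text_emb), 1.0)
        audio = dist.sample()                  # sampled acoustic features
        return audio, dist.log_prob(audio).sum()

tts, clf = StubTTS(), nn.Linear(16, 4)         # clf: emotion discriminator
opt = torch.optim.Adam(tts.parameters(), lr=1e-3)

audio, log_prob = tts(torch.randn(8))
with torch.no_grad():                          # reward: P(target emotion | audio)
    reward = torch.softmax(clf(audio), -1)[2]
(-reward * log_prob).backward()                # policy-gradient step
opt.step()
```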
arXiv Detail & Related papers (2021-04-03T13:52:47Z)