Watch the Speakers: A Hybrid Continuous Attribution Network for Emotion
Recognition in Conversation With Emotion Disentanglement
- URL: http://arxiv.org/abs/2309.09799v2
- Date: Tue, 19 Sep 2023 12:26:48 GMT
- Title: Watch the Speakers: A Hybrid Continuous Attribution Network for Emotion
Recognition in Conversation With Emotion Disentanglement
- Authors: Shanglin Lei and Xiaoping Wang and Guanting Dong and Jiang Li and
Yingjian Liu
- Abstract summary: Emotion Recognition in Conversation (ERC) has attracted widespread attention in the natural language processing field.
Existing ERC methods face challenges in achieving generalization to diverse scenarios due to insufficient modeling of context.
We present a Hybrid Continuous Attributive Network (HCAN) to address these issues from the perspective of emotional continuation and emotional attribution.
- Score: 8.17164107060944
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Emotion Recognition in Conversation (ERC) has attracted widespread attention
in the natural language processing field due to its enormous potential for
practical applications. Existing ERC methods face challenges in achieving
generalization to diverse scenarios due to insufficient modeling of context,
ambiguous capture of dialogue relationships and overfitting in speaker
modeling. In this work, we present a Hybrid Continuous Attributive Network
(HCAN) to address these issues from the perspective of emotional continuation and
emotional attribution. Specifically, HCAN adopts a hybrid recurrent and
attention-based module to model global emotion continuity. Then a novel
Emotional Attribution Encoding (EAE) is proposed to model intra- and
inter-emotional attribution for each utterance. Moreover, aiming to enhance the
robustness of the model in speaker modeling and improve its performance in
different scenarios, a comprehensive emotional cognitive loss
$\mathcal{L}_{\rm EC}$ is proposed to alleviate emotional drift and to overcome
overfitting in speaker modeling. Our model achieves
state-of-the-art performance on three datasets, demonstrating the superiority
of our work. Extensive comparative experiments and ablation studies on the
three benchmarks are conducted to provide evidence for the efficacy of each
module. Further experiments on generalization ability demonstrate the
plug-and-play nature of the EAE module in our method.
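As a rough illustration of the "hybrid recurrent and attention-based module" described in the abstract, the sketch below combines a BiGRU (local emotion continuity) with multi-head self-attention (global dialogue context) over utterance embeddings. The class name `HybridContinuityEncoder`, the dimensions, and the concatenation-based fusion are assumptions made for illustration; they are not taken from the HCAN paper or any released code.

```python
# Minimal sketch, assuming a PyTorch implementation; names and fusion scheme are
# illustrative only and do not reproduce the authors' actual HCAN architecture.
import torch
import torch.nn as nn


class HybridContinuityEncoder(nn.Module):
    """Encodes a dialogue of utterance embeddings with a BiGRU (recurrent stream)
    and multi-head self-attention (attentive stream), then fuses both streams."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.gru = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * dim, dim)   # fold the BiGRU output back to dim
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)   # concat(recurrent, attentive) -> dim

    def forward(self, utterances: torch.Tensor) -> torch.Tensor:
        # utterances: (batch, seq_len, dim) utterance-level features
        recurrent, _ = self.gru(utterances)
        recurrent = self.proj(recurrent)
        attentive, _ = self.attn(recurrent, recurrent, recurrent)
        return self.fuse(torch.cat([recurrent, attentive], dim=-1))


if __name__ == "__main__":
    encoder = HybridContinuityEncoder()
    dialogue = torch.randn(2, 12, 256)        # 2 dialogues, 12 utterances each
    print(encoder(dialogue).shape)            # torch.Size([2, 12, 256])
```

In this sketch the recurrent stream carries emotion continuity between adjacent utterances, while the attention stream lets each utterance attend to the whole conversation; how HCAN actually combines the two, and how the EAE module and $\mathcal{L}_{\rm EC}$ loss attach to this backbone, is left to the paper.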
Related papers
- Robust Facial Reactions Generation: An Emotion-Aware Framework with Modality Compensation [27.2792182180834]
We propose an Emotion-aware Modality Compensatory (EMC) framework.
Our framework ensures resilience when faced with missing modality data.
It also generates more appropriate emotion-aware reactions via the Emotion-aware Attention (EA) module.
arXiv Detail & Related papers (2024-07-22T17:00:02Z) - BiosERC: Integrating Biography Speakers Supported by LLMs for ERC Tasks [2.9873893715462176]
This work introduces a novel framework named BiosERC, which investigates speaker characteristics in a conversation.
By employing Large Language Models (LLMs), we extract the "biographical information" of the speaker within a conversation.
Our proposed method achieved state-of-the-art (SOTA) results on three famous benchmark datasets.
arXiv Detail & Related papers (2024-07-05T06:25:34Z) - Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning [55.127202990679976]
We introduce the MERR dataset, containing 28,618 coarse-grained and 4,487 fine-grained annotated samples across diverse emotional categories.
This dataset enables models to learn from varied scenarios and generalize to real-world applications.
We propose Emotion-LLaMA, a model that seamlessly integrates audio, visual, and textual inputs through emotion-specific encoders.
arXiv Detail & Related papers (2024-06-17T03:01:22Z) - ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains [61.50113532215864]
Causal Emotion Entailment (CEE) aims to identify the causal utterances in a conversation that stimulate the emotions expressed in a target utterance.
Current works in CEE mainly focus on modeling semantic and emotional interactions in conversations.
We introduce a step-by-step reasoning method, Emotion-Cause Reasoning Chain (ECR-Chain), to infer the stimulus from the target emotional expressions in conversations.
arXiv Detail & Related papers (2024-05-17T15:45:08Z) - Building Emotional Support Chatbots in the Era of LLMs [64.06811786616471]
We introduce an innovative methodology that synthesizes human insights with the computational prowess of Large Language Models (LLMs).
By utilizing the in-context learning potential of ChatGPT, we generate an ExTensible Emotional Support dialogue dataset, named ExTES.
Following this, we deploy advanced tuning techniques on the LLaMA model, examining the impact of diverse training strategies, ultimately yielding an LLM meticulously optimized for emotional support interactions.
arXiv Detail & Related papers (2023-08-17T10:49:18Z) - EmotionIC: emotional inertia and contagion-driven dependency modeling for emotion recognition in conversation [34.24557248359872]
We propose an emotional inertia and contagion-driven dependency modeling approach (EmotionIC) for ERC task.
Our EmotionIC consists of three main components, i.e., Identity Masked Multi-Head Attention (IMMHA), Dialogue-based Gated Recurrent Unit (DiaGRU), and Skip-chain Conditional Random Field (SkipCRF).
Experimental results show that our method can significantly outperform the state-of-the-art models on four benchmark datasets.
arXiv Detail & Related papers (2023-03-20T13:58:35Z) - Leveraging TCN and Transformer for effective visual-audio fusion in
continuous emotion recognition [0.5370906227996627]
We present our approach to the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge.
We propose a novel multi-modal fusion model that leverages Temporal Convolutional Networks (TCN) and Transformer to enhance the performance of continuous emotion recognition.
arXiv Detail & Related papers (2023-03-15T04:15:57Z) - A Hierarchical Regression Chain Framework for Affective Vocal Burst
Recognition [72.36055502078193]
We propose a hierarchical framework, based on chain regression models, for affective recognition from vocal bursts.
To address the challenge of data sparsity, we also use self-supervised learning (SSL) representations with layer-wise and temporal aggregation modules.
The proposed systems participated in the ACII Affective Vocal Burst (A-VB) Challenge 2022 and ranked first in the "TWO" and "CULTURE" tasks.
arXiv Detail & Related papers (2023-03-14T16:08:45Z) - TSAM: A Two-Stream Attention Model for Causal Emotion Entailment [50.07800752967995]
Causal Emotion Entailment (CEE) aims to discover the potential causes behind an emotion in a conversational utterance.
We classify multiple utterances synchronously to capture the correlations between utterances in a global view.
We propose a Two-Stream Attention Model (TSAM) to effectively model the speaker's emotional influences in the conversational history.
arXiv Detail & Related papers (2022-03-02T02:11:41Z) - Multimodal Emotion Recognition using Transfer Learning from Speaker
Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the interactive emotional dyadic motion capture dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z) - Shapes of Emotions: Multimodal Emotion Recognition in Conversations via
Emotion Shifts [2.443125107575822]
Emotion Recognition in Conversations (ERC) is an important and active research problem.
Recent work has shown the benefits of using multiple modalities for the ERC task.
We propose a multimodal ERC model and augment it with an emotion-shift component.
arXiv Detail & Related papers (2021-12-03T14:39:04Z)