EmotionIC: emotional inertia and contagion-driven dependency modeling for emotion recognition in conversation
- URL: http://arxiv.org/abs/2303.11117v5
- Date: Wed, 20 Mar 2024 11:35:04 GMT
- Title: EmotionIC: emotional inertia and contagion-driven dependency modeling for emotion recognition in conversation
- Authors: Yingjian Liu, Jiang Li, Xiaoping Wang, Zhigang Zeng
- Abstract summary: We propose an emotional inertia and contagion-driven dependency modeling approach (EmotionIC) for the ERC task.
Our EmotionIC consists of three main components, i.e., Identity Masked Multi-Head Attention (IMMHA), Dialogue-based Gated Recurrent Unit (DiaGRU), and Skip-chain Conditional Random Field (SkipCRF).
Experimental results show that our method significantly outperforms state-of-the-art models on four benchmark datasets.
- Score: 34.24557248359872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Emotion Recognition in Conversation (ERC) has attracted growing attention in recent years as a result of the advancement and implementation of human-computer interface technologies. In this paper, we propose an emotional inertia and contagion-driven dependency modeling approach (EmotionIC) for the ERC task. Our EmotionIC consists of three main components, i.e., Identity Masked Multi-Head Attention (IMMHA), Dialogue-based Gated Recurrent Unit (DiaGRU), and Skip-chain Conditional Random Field (SkipCRF). Compared to previous ERC models, EmotionIC can model a conversation more thoroughly at both the feature-extraction and classification levels. The proposed model attempts to integrate the advantages of attention- and recurrence-based methods at the feature-extraction level. Specifically, IMMHA is applied to capture identity-based global contextual dependencies, while DiaGRU is utilized to extract speaker- and temporal-aware local contextual information. At the classification level, SkipCRF can explicitly mine complex emotional flows from higher-order neighboring utterances in the conversation. Experimental results show that our method significantly outperforms state-of-the-art models on four benchmark datasets. The ablation studies confirm that our modules can effectively model emotional inertia and contagion.
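The abstract ties emotional inertia to a speaker's own history and contagion to influence from other speakers. As a minimal sketch of the identity-masking idea only (single-head attention with an assumed per-utterance speaker ID; not the authors' IMMHA implementation), attention can be restricted to same-speaker or other-speaker utterances:

```python
import torch
import torch.nn.functional as F

def identity_masked_attention(utt: torch.Tensor,
                              speaker_ids: torch.Tensor,
                              same_speaker: bool = True) -> torch.Tensor:
    """Single-head sketch of identity-masked attention over one conversation.

    utt:          (seq_len, d) utterance features
    speaker_ids:  (seq_len,) integer speaker identity per utterance
    same_speaker: attend within a speaker (inertia) or across speakers (contagion)
    """
    d = utt.size(-1)
    scores = (utt @ utt.T) / d ** 0.5                     # (seq_len, seq_len)
    same = speaker_ids.unsqueeze(0) == speaker_ids.unsqueeze(1)
    mask = same if same_speaker else ~same
    mask = mask | torch.eye(len(speaker_ids), dtype=torch.bool)  # keep self to avoid empty rows
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ utt                # identity-aware context

x = torch.randn(5, 16)                   # five utterances, 16-d features
spk = torch.tensor([0, 1, 0, 1, 1])      # two speakers
intra = identity_masked_attention(x, spk, same_speaker=True)   # inertia-style view
inter = identity_masked_attention(x, spk, same_speaker=False)  # contagion-style view
```

Running both masked views side by side yields separate intra-speaker (inertia-style) and inter-speaker (contagion-style) context vectors; IMMHA presumably generalizes this with multiple heads and learned projections.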
Related papers
- Enhancing Emotion Recognition in Conversation through Emotional Cross-Modal Fusion and Inter-class Contrastive Learning [40.101313334772016]
The purpose of emotion recognition in conversation (ERC) is to identify the emotion category of an utterance based on contextual information.
Previous ERC methods relied on simple connections for cross-modal fusion.
We propose a cross-modal fusion emotion prediction network based on vector connections; a rough sketch of the fusion and contrastive objective follows this entry.
arXiv Detail & Related papers (2024-05-28T07:22:30Z) - ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains [61.50113532215864]
- ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains [61.50113532215864]
Causal Emotion Entailment (CEE) aims to identify the causal utterances in a conversation that stimulate the emotions expressed in a target utterance.
Current works in CEE mainly focus on modeling semantic and emotional interactions in conversations.
We introduce a step-by-step reasoning method, Emotion-Cause Reasoning Chain (ECR-Chain), to infer the stimulus from the target emotional expressions in conversations; a toy prompt-chain sketch follows this entry.
arXiv Detail & Related papers (2024-05-17T15:45:08Z) - Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer [78.35816158511523]
- Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer [78.35816158511523]
We present a single-stage emotion recognition approach, employing a Decoupled Subject-Context Transformer (DSCT) for simultaneous subject localization and emotion classification.
We evaluate our single-stage framework on two widely used context-aware emotion recognition datasets, CAER-S and EMOTIC.
arXiv Detail & Related papers (2024-04-26T07:30:32Z) - Emotion Rendering for Conversational Speech Synthesis with Heterogeneous
Graph-Based Context Modeling [50.99252242917458]
Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.
To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity.
Our model outperforms the baseline models in understanding and rendering emotions.
arXiv Detail & Related papers (2023-12-19T08:47:50Z) - Watch the Speakers: A Hybrid Continuous Attribution Network for Emotion
Recognition in Conversation With Emotion Disentanglement [8.17164107060944]
Emotion Recognition in Conversation (ERC) has attracted widespread attention in the natural language processing field.
Existing ERC methods face challenges in achieving generalization to diverse scenarios due to insufficient modeling of context.
We present a Hybrid Continuous Attributive Network (HCAN) to address these issues from the perspectives of emotional continuation and emotional attribution.
arXiv Detail & Related papers (2023-09-18T14:18:16Z) - Dynamic Causal Disentanglement Model for Dialogue Emotion Detection [77.96255121683011]
We propose a Dynamic Causal Disentanglement Model based on hidden variable separation.
This model effectively decomposes the content of dialogues and investigates the temporal accumulation of emotions.
Specifically, we propose a dynamic temporal disentanglement model to infer the propagation of utterances and hidden variables.
arXiv Detail & Related papers (2023-09-13T12:58:09Z) - CFN-ESA: A Cross-Modal Fusion Network with Emotion-Shift Awareness for Dialogue Emotion Recognition [34.24557248359872]
We propose a Cross-modal Fusion Network with Emotion-Shift Awareness (CFN-ESA) for emotion recognition in conversation.
CFN-ESA consists of a unimodal encoder (RUME), a cross-modal encoder (ACME), and an emotion-shift module (LESM).
arXiv Detail & Related papers (2023-07-28T09:29:42Z) - A Hierarchical Regression Chain Framework for Affective Vocal Burst
Recognition [72.36055502078193]
We propose a hierarchical framework, based on chain regression models, for affective recognition from vocal bursts.
To address the challenge of data sparsity, we also use self-supervised learning (SSL) representations with layer-wise and temporal aggregation modules; a generic sketch of this aggregation follows this entry.
The proposed systems participated in the ACII Affective Vocal Burst (A-VB) Challenge 2022 and ranked first in the "TWO" and "CULTURE" tasks.
arXiv Detail & Related papers (2023-03-14T16:08:45Z) - Multimodal Emotion Recognition using Transfer Learning from Speaker
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from the speech and text modalities; a minimal late-fusion sketch follows this entry.
We evaluate the effectiveness of our proposed multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z) - Shapes of Emotions: Multimodal Emotion Recognition in Conversations via
- Shapes of Emotions: Multimodal Emotion Recognition in Conversations via Emotion Shifts [2.443125107575822]
Emotion Recognition in Conversations (ERC) is an important and active research problem.
Recent work has shown the benefits of using multiple modalities for the ERC task.
We propose a multimodal ERC model and augment it with an emotion-shift component; a toy sketch of such a component follows this entry.
arXiv Detail & Related papers (2021-12-03T14:39:04Z)