Emotion Dynamics Modeling via BERT
- URL: http://arxiv.org/abs/2104.07252v1
- Date: Thu, 15 Apr 2021 05:58:48 GMT
- Title: Emotion Dynamics Modeling via BERT
- Authors: Haiqin Yang and Jianping Shen
- Abstract summary: We develop a series of BERT-based models to capture the inter-interlocutor and intra-interlocutor dependencies of the conversational emotion dynamics.
Our proposed models attain around 5% and 10% improvement over state-of-the-art baselines on two benchmark datasets, respectively.
- Score: 7.3785751096660555
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Emotion dynamics modeling is a significant task in emotion recognition in
conversation. It aims to predict conversational emotions when building
empathetic dialogue systems. Existing studies mainly develop models based on
Recurrent Neural Networks (RNNs). They cannot benefit from the power of the
recently-developed pre-training strategies for better token representation
learning in conversations. More importantly, simply assembling features on top of RNNs makes it hard to distinguish intra-interlocutor dependencies from the emotional influence among interlocutors. In this paper, we develop a
series of BERT-based models to specifically capture the inter-interlocutor and
intra-interlocutor dependencies of the conversational emotion dynamics.
Concretely, we first substitute BERT for RNNs to enrich the token
representations. Then, a Flat-structured BERT (F-BERT) is applied to link up
utterances in a conversation directly, and a Hierarchically-structured BERT
(H-BERT) is employed to distinguish the interlocutors when linking up
utterances. More importantly, a Spatial-Temporal-structured BERT, namely
ST-BERT, is proposed to further determine the emotional influence among
interlocutors. Finally, we conduct extensive experiments on two popular emotion
recognition in conversation benchmark datasets and demonstrate that our
proposed models attain around 5% and 10% improvement over the
state-of-the-art baselines on the two datasets, respectively.
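To make the flat and hierarchical structures concrete, below is a minimal sketch of how F-BERT and H-BERT style inputs might be assembled; the token conventions, function names, and grouping are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code) of assembling inputs for
# a flat (F-BERT) versus hierarchical (H-BERT) conversation encoder.

def flat_input(utterances):
    """F-BERT: link up all utterances directly as one flat sequence."""
    tokens = ["[CLS]"]
    for utt in utterances:
        tokens += utt.split() + ["[SEP]"]
    return tokens

def hierarchical_input(utterances, speakers):
    """H-BERT: encode each utterance separately and keep its speaker tag,
    so an upper-level encoder can distinguish intra- from
    inter-interlocutor dependencies when linking utterances up."""
    return [(spk, ["[CLS]"] + utt.split() + ["[SEP]"])
            for utt, spk in zip(utterances, speakers)]

conv = ["i failed my exam", "oh no , what happened ?", "i barely slept"]
spks = ["A", "B", "A"]
print(flat_input(conv))                # one token sequence for the dialogue
print(hierarchical_input(conv, spks))  # one (speaker, tokens) pair per turn
```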
Related papers
- Acknowledgment of Emotional States: Generating Validating Responses for Empathetic Dialogue [21.621844911228315]
This study introduces the first framework designed to engender empathetic dialogue with validating responses.
Our approach incorporates a tripartite module system: 1) validation timing detection, 2) users' emotional state identification, and 3) validating response generation (a stub pipeline sketch follows this entry).
arXiv Detail & Related papers (2024-02-20T07:20:03Z)
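As a rough illustration of such a tripartite pipeline, the stub below wires the three stages together; every function body is a placeholder heuristic invented for this sketch, not the paper's modules.

```python
# Stub pipeline mirroring the three stages named above; the heuristics
# and templates here are invented placeholders, not the paper's models.

def should_validate(history):
    """Stage 1: validation timing detection (placeholder heuristic)."""
    return any(w in history[-1] for w in ("sad", "upset", "stressed"))

def identify_emotion(utterance):
    """Stage 2: user emotional state identification (placeholder)."""
    return "sadness" if "sad" in utterance else "stress"

def generate_validation(emotion):
    """Stage 3: validating response generation (placeholder template)."""
    return f"It makes sense that you feel {emotion}; that sounds really hard."

history = ["i have been so sad since the layoff"]
if should_validate(history):
    print(generate_validation(identify_emotion(history[-1])))
```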
- Dynamic Causal Disentanglement Model for Dialogue Emotion Detection [77.96255121683011]
We propose a Dynamic Causal Disentanglement Model based on hidden variable separation.
This model effectively decomposes the content of dialogues and investigates the temporal accumulation of emotions.
Specifically, we propose a dynamic temporal disentanglement model to infer the propagation of utterances and hidden variables.
arXiv Detail & Related papers (2023-09-13T12:58:09Z)
- Context-Dependent Embedding Utterance Representations for Emotion Recognition in Conversations [1.8126187844654875]
We approach Emotion Recognition in Conversations leveraging the conversational context.
We propose context-dependent embedding representations of each utterance.
The effectiveness of our approach is validated on the open-domain DailyDialog dataset and on the task-oriented EmoWOZ dataset.
arXiv Detail & Related papers (2023-04-17T12:37:57Z)
- EmotionIC: Emotional Inertia and Contagion-Driven Dependency Modeling for Emotion Recognition in Conversation [34.24557248359872]
We propose an emotional inertia and contagion-driven dependency modeling approach (EmotionIC) for ERC task.
Our EmotionIC consists of three main components, i.e., Identity Masked Multi-Head Attention (IMMHA), Dialogue-based Gated Recurrent Unit (DiaGRU), and Skip-chain Conditional Random Field (SkipCRF); a minimal sketch of the identity-masking idea follows this entry.
Experimental results show that our method can significantly outperform the state-of-the-art models on four benchmark datasets.
arXiv Detail & Related papers (2023-03-20T13:58:35Z)
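Below is a minimal sketch of the identity-masking idea behind a component like IMMHA: a boolean speaker mask restricts attention to same-speaker (intra) or other-speaker (inter) positions. The shapes, names, and single-head simplification are assumptions for illustration, not EmotionIC's code.

```python
# Illustrative single-head, identity-masked attention.
import torch

def identity_mask(speakers, intra=True):
    s = torch.tensor(speakers)
    same = s.unsqueeze(0) == s.unsqueeze(1)  # (T, T) same-speaker matrix
    return same if intra else ~same

def masked_attention(q, k, v, mask):
    scores = q @ k.transpose(-2, -1) / k.size(-1) ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))  # block masked positions
    return torch.softmax(scores, dim=-1) @ v

T, d = 4, 8
x = torch.randn(T, d)  # one vector per utterance
intra_out = masked_attention(x, x, x, identity_mask([0, 1, 0, 1], intra=True))
print(intra_out.shape)  # torch.Size([4, 8])
```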
- Deep Learning of Segment-Level Feature Representation for Speech Emotion Recognition in Conversations [9.432208348863336]
We propose a conversational speech emotion recognition method that captures attentive contextual dependencies and speaker-sensitive interactions.
First, we use a pretrained VGGish model to extract segment-based audio representations from individual utterances.
Second, an attentive bi-directional gated recurrent unit (GRU) models context-sensitive information and explores intra- and inter-speaker dependencies jointly (a minimal sketch follows this entry).
arXiv Detail & Related papers (2023-02-05T16:15:46Z)
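A minimal sketch of an attentive bi-directional GRU over segment-level features, assuming VGGish-style 128-dimensional inputs; the dimensions, attention form, and classifier head are illustrative choices, not the paper's exact architecture.

```python
# Illustrative attentive BiGRU over segment-level audio features.
import torch
import torch.nn as nn

class AttentiveBiGRU(nn.Module):
    def __init__(self, feat_dim=128, hidden=64, num_classes=4):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)   # scores each time step
        self.clf = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                       # x: (batch, time, feat_dim)
        h, _ = self.gru(x)                      # (batch, time, 2 * hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention over time steps
        ctx = (w * h).sum(dim=1)                # attention-weighted pooling
        return self.clf(ctx)                    # emotion logits

logits = AttentiveBiGRU()(torch.randn(2, 10, 128))
print(logits.shape)  # torch.Size([2, 4])
```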
- Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs.
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
- Contextual Information and Commonsense Based Prompt for Emotion Recognition in Conversation [14.651642872901496]
Emotion recognition in conversation (ERC) aims to detect the emotion for each utterance in a given conversation.
Recent ERC models have leveraged pre-trained language models (PLMs) with the paradigm of pre-training and fine-tuning to obtain good performance.
We propose CISPER, a novel ERC model with the paradigm of prompt and language model (LM) tuning.
arXiv Detail & Related papers (2022-07-27T02:34:05Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from the speech and text modalities (a minimal fusion sketch follows this entry).
We evaluate the effectiveness of our proposed multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
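A minimal late-fusion sketch: the speech and text models are abstracted to their output logits, and one learned weight blends the class probabilities. The blending scheme is an illustrative assumption, not the paper's exact fusion.

```python
# Illustrative late fusion of speech- and text-model predictions.
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(0.0))  # learnable fusion weight

    def forward(self, speech_logits, text_logits):
        a = torch.sigmoid(self.alpha)  # keep the weight in (0, 1)
        return a * speech_logits.softmax(-1) + (1 - a) * text_logits.softmax(-1)

fuse = LateFusion()
probs = fuse(torch.randn(2, 4), torch.randn(2, 4))
print(probs.sum(dim=-1))  # each row sums to 1
```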
- EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model [56.75775793011719]
We introduce and publicly release a Mandarin emotion speech dataset including 9,724 samples with audio files and human-labeled emotion annotations.
Unlike models that need additional reference audio as input, our model can predict emotion labels directly from the input text and generate more expressive speech conditioned on the emotion embedding.
In the experiments, we first validate the effectiveness of our dataset via an emotion classification task, then train our model on the proposed dataset and conduct a series of subjective evaluations.
arXiv Detail & Related papers (2021-06-17T08:34:21Z)
- Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability [82.39099867188547]
Emotional text-to-speech synthesis (ETTS) has seen much progress in recent years.
We propose a new interactive training paradigm for ETTS, denoted as i-ETTS.
We formulate an iterative training strategy with reinforcement learning to ensure the quality of i-ETTS optimization.
arXiv Detail & Related papers (2021-04-03T13:52:47Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling (a minimal sketch follows this entry).
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
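A minimal sketch of injecting speaker-role tokens before masked language modeling, in the spirit of TOD-BERT; the token strings and example dialogue are assumptions for illustration rather than the released configuration.

```python
# Illustrative setup: add user/system role tokens to a BERT tokenizer
# so masked LM pre-training can see who is speaking in each turn.
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
tokenizer.add_special_tokens({"additional_special_tokens": ["[usr]", "[sys]"]})

model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.resize_token_embeddings(len(tokenizer))  # make room for the new tokens

# Prefix each turn with its speaker-role token before masking.
dialogue = "[usr] i need a cheap hotel [sys] how many nights will you stay ?"
inputs = tokenizer(dialogue, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))
```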
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.