Emotion Dynamics Modeling via BERT
- URL: http://arxiv.org/abs/2104.07252v1
- Date: Thu, 15 Apr 2021 05:58:48 GMT
- Title: Emotion Dynamics Modeling via BERT
- Authors: Haiqin Yang and Jianping Shen
- Abstract summary: We develop a series of BERT-based models to capture the inter-interlocutor and intra-interlocutor dependencies of the conversational emotion dynamics.
Our proposed models attain around 5% and 10% improvement over state-of-the-art baselines on two benchmark datasets, respectively.
- Score: 7.3785751096660555
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Emotion dynamics modeling is a significant task in emotion recognition in
conversation. It aims to predict conversational emotions when building
empathetic dialogue systems. Existing studies mainly develop models based on
Recurrent Neural Networks (RNNs). They cannot benefit from the power of the
recently-developed pre-training strategies for better token representation
learning in conversations. More importantly, simply assembling features on top of RNNs makes it hard to distinguish intra-interlocutor dependencies from the emotional influence among interlocutors. In this paper, we develop a
series of BERT-based models to specifically capture the inter-interlocutor and
intra-interlocutor dependencies of the conversational emotion dynamics.
Concretely, we first substitute BERT for RNNs to enrich the token
representations. Then, a Flat-structured BERT (F-BERT) is applied to link up
utterances in a conversation directly, and a Hierarchically-structured BERT
(H-BERT) is employed to distinguish the interlocutors when linking up
utterances. More importantly, a Spatial-Temporal-structured BERT, namely
ST-BERT, is proposed to further determine the emotional influence among
interlocutors. Finally, we conduct extensive experiments on two popular emotion
recognition in conversation benchmark datasets and demonstrate that our
proposed models attain around 5% and 10% improvement over the
state-of-the-art baselines on the two datasets, respectively.
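To make the flat and hierarchical structures concrete, below is a minimal sketch of how F-BERT and H-BERT style inputs might be assembled; the token conventions, function names, and grouping are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code) of assembling inputs for
# a flat (F-BERT) versus hierarchical (H-BERT) conversation encoder.

def flat_input(utterances):
    """F-BERT: link up all utterances directly as one flat sequence."""
    tokens = ["[CLS]"]
    for utt in utterances:
        tokens += utt.split() + ["[SEP]"]
    return tokens

def hierarchical_input(utterances, speakers):
    """H-BERT: encode each utterance separately and keep its speaker tag,
    so an upper-level encoder can distinguish intra- from
    inter-interlocutor dependencies when linking utterances up."""
    return [(spk, ["[CLS]"] + utt.split() + ["[SEP]"])
            for utt, spk in zip(utterances, speakers)]

conv = ["i failed my exam", "oh no , what happened ?", "i barely slept"]
spks = ["A", "B", "A"]
print(flat_input(conv))                # one token sequence for the dialogue
print(hierarchical_input(conv, spks))  # one (speaker, tokens) pair per turn
```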
Related papers
- Acknowledgment of Emotional States: Generating Validating Responses for Empathetic Dialogue [21.621844911228315]
This study introduces the first framework designed to engender empathetic dialogue with validating responses.
Our approach incorporates a tripartite module system: 1) validation timing detection, 2) users' emotional state identification, and 3) validating response generation (a stub pipeline sketch follows this entry).
arXiv Detail & Related papers (2024-02-20T07:20:03Z)
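As a rough illustration of such a tripartite pipeline, the stub below wires the three stages together; every function body is a placeholder heuristic invented for this sketch, not the paper's modules.

```python
# Stub pipeline mirroring the three stages named above; the heuristics
# and templates here are invented placeholders, not the paper's models.

def should_validate(history):
    """Stage 1: validation timing detection (placeholder heuristic)."""
    return any(w in history[-1] for w in ("sad", "upset", "stressed"))

def identify_emotion(utterance):
    """Stage 2: user emotional state identification (placeholder)."""
    return "sadness" if "sad" in utterance else "stress"

def generate_validation(emotion):
    """Stage 3: validating response generation (placeholder template)."""
    return f"It makes sense that you feel {emotion}; that sounds really hard."

history = ["i have been so sad since the layoff"]
if should_validate(history):
    print(generate_validation(identify_emotion(history[-1])))
```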
- Dynamic Causal Disentanglement Model for Dialogue Emotion Detection [77.96255121683011]
We propose a Dynamic Causal Disentanglement Model based on hidden variable separation.
This model effectively decomposes the content of dialogues and investigates the temporal accumulation of emotions.
Specifically, we propose a dynamic temporal disentanglement model to infer the propagation of utterances and hidden variables.
arXiv Detail & Related papers (2023-09-13T12:58:09Z)
- Context-Dependent Embedding Utterance Representations for Emotion Recognition in Conversations [1.8126187844654875]
We approach Emotion Recognition in Conversations leveraging the conversational context.
We propose context-dependent embedding representations of each utterance.
The effectiveness of our approach is validated on the open-domain DailyDialog dataset and on the task-oriented EmoWOZ dataset.
arXiv Detail & Related papers (2023-04-17T12:37:57Z)
- EmotionIC: Emotional Inertia and Contagion-Driven Dependency Modeling for Emotion Recognition in Conversation [34.24557248359872]
We propose an emotional inertia and contagion-driven dependency modeling approach (EmotionIC) for ERC task.
Our EmotionIC consists of three main components, i.e., Identity Masked Multi-Head Attention (IMMHA), Dialogue-based Gated Recurrent Unit (DiaGRU), and Skip-chain Conditional Random Field (SkipCRF); a minimal sketch of the identity-masking idea follows this entry.
Experimental results show that our method can significantly outperform the state-of-the-art models on four benchmark datasets.
arXiv Detail & Related papers (2023-03-20T13:58:35Z)
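Below is a minimal sketch of the identity-masking idea behind a component like IMMHA: a boolean speaker mask restricts attention to same-speaker (intra) or other-speaker (inter) positions. The shapes, names, and single-head simplification are assumptions for illustration, not EmotionIC's code.

```python
# Illustrative single-head, identity-masked attention.
import torch

def identity_mask(speakers, intra=True):
    s = torch.tensor(speakers)
    same = s.unsqueeze(0) == s.unsqueeze(1)  # (T, T) same-speaker matrix
    return same if intra else ~same

def masked_attention(q, k, v, mask):
    scores = q @ k.transpose(-2, -1) / k.size(-1) ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))  # block masked positions
    return torch.softmax(scores, dim=-1) @ v

T, d = 4, 8
x = torch.randn(T, d)  # one vector per utterance
intra_out = masked_attention(x, x, x, identity_mask([0, 1, 0, 1], intra=True))
print(intra_out.shape)  # torch.Size([4, 8])
```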
- Deep Learning of Segment-Level Feature Representation for Speech Emotion Recognition in Conversations [9.432208348863336]
We propose a conversational speech emotion recognition method that captures attentive contextual dependencies and speaker-sensitive interactions.
First, we use a pretrained VGGish model to extract segment-based audio representations from individual utterances.
Second, an attentive bi-directional gated recurrent unit (GRU) models context-sensitive information and explores intra- and inter-speaker dependencies jointly (a minimal sketch follows this entry).
arXiv Detail & Related papers (2023-02-05T16:15:46Z)
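A minimal sketch of an attentive bi-directional GRU over segment-level features, assuming VGGish-style 128-dimensional inputs; the dimensions, attention form, and classifier head are illustrative choices, not the paper's exact architecture.

```python
# Illustrative attentive BiGRU over segment-level audio features.
import torch
import torch.nn as nn

class AttentiveBiGRU(nn.Module):
    def __init__(self, feat_dim=128, hidden=64, num_classes=4):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)   # scores each time step
        self.clf = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                       # x: (batch, time, feat_dim)
        h, _ = self.gru(x)                      # (batch, time, 2 * hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention over time steps
        ctx = (w * h).sum(dim=1)                # attention-weighted pooling
        return self.clf(ctx)                    # emotion logits

logits = AttentiveBiGRU()(torch.randn(2, 10, 128))
print(logits.shape)  # torch.Size([2, 4])
```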
- Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs.
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
- Contextual Information and Commonsense Based Prompt for Emotion Recognition in Conversation [14.651642872901496]
Emotion recognition in conversation (ERC) aims to detect the emotion for each utterance in a given conversation.
Recent ERC models have leveraged pre-trained language models (PLMs) with the paradigm of pre-training and fine-tuning to obtain good performance.
We propose CISPER, a novel ERC model with the paradigm of prompt and language model (LM) tuning.
arXiv Detail & Related papers (2022-07-27T02:34:05Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from the speech and text modalities (a minimal fusion sketch follows this entry).
We evaluate the effectiveness of our proposed multimodal approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
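A minimal late-fusion sketch: the speech and text models are abstracted to their output logits, and one learned weight blends the class probabilities. The blending scheme is an illustrative assumption, not the paper's exact fusion.

```python
# Illustrative late fusion of speech- and text-model predictions.
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(0.0))  # learnable fusion weight

    def forward(self, speech_logits, text_logits):
        a = torch.sigmoid(self.alpha)  # keep the weight in (0, 1)
        return a * speech_logits.softmax(-1) + (1 - a) * text_logits.softmax(-1)

fuse = LateFusion()
probs = fuse(torch.randn(2, 4), torch.randn(2, 4))
print(probs.sum(dim=-1))  # each row sums to 1
```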
- EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model [56.75775793011719]
We introduce and publicly release a Mandarin emotion speech dataset including 9,724 samples with audio files and human-labeled emotion annotations.
Unlike models that need additional reference audio as input, our model can predict emotion labels directly from the input text and generate more expressive speech conditioned on the emotion embedding.
In the experiments, we first validate the effectiveness of our dataset via an emotion classification task, then train our model on the proposed dataset and conduct a series of subjective evaluations.
arXiv Detail & Related papers (2021-06-17T08:34:21Z)
- Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability [82.39099867188547]
Emotional text-to-speech synthesis (ETTS) has seen much progress in recent years.
We propose a new interactive training paradigm for ETTS, denoted as i-ETTS.
We formulate an iterative training strategy with reinforcement learning to ensure the quality of i-ETTS optimization.
arXiv Detail & Related papers (2021-04-03T13:52:47Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling (a minimal sketch follows this entry).
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
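A minimal sketch of injecting speaker-role tokens before masked language modeling, in the spirit of TOD-BERT; the token strings and example dialogue are assumptions for illustration rather than the released configuration.

```python
# Illustrative setup: add user/system role tokens to a BERT tokenizer
# so masked LM pre-training can see who is speaking in each turn.
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
tokenizer.add_special_tokens({"additional_special_tokens": ["[usr]", "[sys]"]})

model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.resize_token_embeddings(len(tokenizer))  # make room for the new tokens

# Prefix each turn with its speaker-role token before masking.
dialogue = "[usr] i need a cheap hotel [sys] how many nights will you stay ?"
inputs = tokenizer(dialogue, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))
```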
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.