Supervised Prototypical Contrastive Learning for Emotion Recognition in
Conversation
- URL: http://arxiv.org/abs/2210.08713v2
- Date: Wed, 19 Oct 2022 08:52:55 GMT
- Title: Supervised Prototypical Contrastive Learning for Emotion Recognition in
Conversation
- Authors: Xiaohui Song, Longtao Huang, Hui Xue, Songlin Hu
- Abstract summary: We propose a Supervised Prototypical Contrastive Learning (SPCL) loss for the emotion recognition task.
We design a difficulty measure function based on the distance between classes and introduce curriculum learning to alleviate the impact of extreme samples.
We achieve state-of-the-art results on three widely used benchmarks.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Capturing emotions within a conversation plays an essential role in modern
dialogue systems. However, the weak correlation between emotions and semantics
brings many challenges to emotion recognition in conversation (ERC). Even for
semantically similar utterances, the emotion may vary drastically depending on
the context or speaker. In this paper, we propose a Supervised Prototypical
Contrastive Learning (SPCL) loss for the ERC task. Leveraging the Prototypical
Network, SPCL targets the imbalanced classification problem through
contrastive learning and does not require a large batch size.
Meanwhile, we design a difficulty measure function based on the distance
between classes and introduce curriculum learning to alleviate the impact of
extreme samples. We achieve state-of-the-art results on three widely used
benchmarks. Further, we conduct analytical experiments to demonstrate the
effectiveness of our proposed SPCL and curriculum learning strategy. We release
the code at https://github.com/caskcsg/SPCL.
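The two core ideas of the abstract, prototype-based contrastive learning and a distance-based difficulty measure for curriculum learning, can be sketched concretely. The snippet below is a minimal, illustrative NumPy sketch, not the authors' released implementation (see the linked repository for that): it builds class prototypes as normalized mean embeddings, contrasts each sample against the prototypes rather than against other in-batch samples (so no large batch is needed), and derives a per-sample difficulty proxy from inter-class prototype distances. All function names and exact formulas here are assumptions for illustration.

```python
import numpy as np

def class_prototypes(embeddings, labels, num_classes):
    """Mean embedding per class, L2-normalized."""
    protos = np.zeros((num_classes, embeddings.shape[1]))
    for c in range(num_classes):
        protos[c] = embeddings[labels == c].mean(axis=0)
    return protos / np.linalg.norm(protos, axis=1, keepdims=True)

def spcl_like_loss(embeddings, labels, protos, temperature=0.1):
    """Contrast each sample against class prototypes instead of all other
    in-batch samples, so small batches suffice (prototype-based SCL)."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    logits = z @ protos.T / temperature          # (batch, num_classes)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-likelihood of each sample's own class prototype.
    return -log_probs[np.arange(len(labels)), labels].mean()

def difficulty(protos, labels):
    """Per-sample difficulty proxy: samples from classes whose prototype
    lies close to another class's prototype are considered harder."""
    dists = np.linalg.norm(protos[:, None] - protos[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    margin = dists.min(axis=1)   # distance to the nearest other class
    return 1.0 / margin[labels]  # larger value = harder sample
```

A curriculum schedule would then present samples in increasing order of this difficulty score, which mirrors the paper's idea of ordering training by a class-distance-based measure to soften the impact of extreme samples.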
Related papers
- Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer [78.35816158511523]
We present a single-stage emotion recognition approach, employing a Decoupled Subject-Context Transformer (DSCT) for simultaneous subject localization and emotion classification.
We evaluate our single-stage framework on two widely used context-aware emotion recognition datasets, CAER-S and EMOTIC.
arXiv Detail & Related papers (2024-04-26T07:30:32Z)
- Emotion-Anchored Contrastive Learning Framework for Emotion Recognition in Conversation [23.309174697717374]
Emotion Recognition in Conversation (ERC) involves detecting the underlying emotion behind each utterance within a conversation.
We propose an Emotion-Anchored Contrastive Learning framework that can generate more distinguishable utterance representations for similar emotions.
Our proposed EACL achieves state-of-the-art emotion recognition performance and exhibits superior performance on similar emotions.
arXiv Detail & Related papers (2024-03-29T17:00:55Z)
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling [50.99252242917458]
Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.
To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity.
Our model outperforms the baseline models in understanding and rendering emotions.
arXiv Detail & Related papers (2023-12-19T08:47:50Z)
- ERNetCL: A novel emotion recognition network in textual conversation based on curriculum learning strategy [37.41082775317849]
We propose a novel emotion recognition network based on a curriculum learning strategy (ERNetCL).
The proposed ERNetCL primarily consists of a temporal encoder (TE), a spatial encoder (SE), and a curriculum learning (CL) loss.
Our proposed method is effective and substantially outperforms other baseline models.
arXiv Detail & Related papers (2023-08-12T03:05:44Z)
- A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition [72.36055502078193]
We propose a hierarchical framework, based on chain regression models, for affective recognition from vocal bursts.
To address the challenge of data sparsity, we also use self-supervised learning (SSL) representations with layer-wise and temporal aggregation modules.
The proposed systems participated in the ACII Affective Vocal Burst (A-VB) Challenge 2022 and ranked first in the "TWO" and "CULTURE" tasks.
arXiv Detail & Related papers (2023-03-14T16:08:45Z)
- Cluster-Level Contrastive Learning for Emotion Recognition in Conversations [13.570186295041644]
A key challenge for Emotion Recognition in Conversations (ERC) is to distinguish semantically similar emotions.
Some works utilise Supervised Contrastive Learning (SCL), which uses categorical emotion labels as supervision signals and contrasts samples in a high-dimensional semantic space.
We propose a novel low-dimensional Supervised Cluster-level Contrastive Learning (SCCL) method, which first reduces the high-dimensional SCL space to a three-dimensional affect representation space.
arXiv Detail & Related papers (2023-02-07T14:49:20Z)
- Multimodal Emotion Recognition with Modality-Pairwise Unsupervised Contrastive Loss [80.79641247882012]
We focus on unsupervised feature learning for Multimodal Emotion Recognition (MER).
We consider discrete emotions and use text, audio, and vision as modalities.
Our method, based on a contrastive loss between pairwise modalities, is the first such attempt in the MER literature.
arXiv Detail & Related papers (2022-07-23T10:11:24Z)
- Hybrid Curriculum Learning for Emotion Recognition in Conversation [10.912215835115063]
Our framework consists of two curricula: (1) a conversation-level curriculum (CC); and (2) an utterance-level curriculum (UC).
With the proposed model-agnostic hybrid curriculum learning strategy, we observe significant performance boosts over a wide range of existing ERC models.
arXiv Detail & Related papers (2021-12-22T08:02:58Z)
- Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability [82.39099867188547]
Emotional text-to-speech synthesis (ETTS) has seen much progress in recent years.
We propose a new interactive training paradigm for ETTS, denoted as i-ETTS.
We formulate an iterative training strategy with reinforcement learning to ensure the quality of i-ETTS optimization.
arXiv Detail & Related papers (2021-04-03T13:52:47Z)
- SpanEmo: Casting Multi-label Emotion Classification as Span-prediction [15.41237087996244]
We propose a new model, SpanEmo, which casts multi-label emotion classification as span prediction.
We introduce a loss function focused on modelling multiple co-existing emotions in the input sentence.
Experiments performed on the SemEval2018 multi-label emotion data over three language sets demonstrate our method's effectiveness.
arXiv Detail & Related papers (2021-01-25T12:11:04Z)
- COSMIC: COmmonSense knowledge for eMotion Identification in Conversations [95.71018134363976]
We propose COSMIC, a new framework that incorporates different elements of commonsense such as mental states, events, and causal relations.
We show that COSMIC achieves new state-of-the-art results for emotion recognition on four different benchmark conversational datasets.
arXiv Detail & Related papers (2020-10-06T15:09:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.