Analyzing the Influence of Dataset Composition for Emotion Recognition
- URL: http://arxiv.org/abs/2103.03700v1
- Date: Fri, 5 Mar 2021 14:20:59 GMT
- Title: Analyzing the Influence of Dataset Composition for Emotion Recognition
- Authors: A. Sutherland, S. Magg, C. Weber, S. Wermter
- Abstract summary: We analyze the influence data collection methodology has on two multimodal emotion recognition datasets.
Experiments with the full IEMOCAP dataset indicate that the composition negatively influences generalization performance when compared to the OMG-Emotion Behavior dataset.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recognizing emotions from text in multimodal architectures has yielded
promising results, surpassing video and audio modalities under certain
circumstances. However, the method by which multimodal data is collected can be
significant for recognizing emotional features in language. In this paper, we
address the influence data collection methodology has on two multimodal emotion
recognition datasets, the IEMOCAP dataset and the OMG-Emotion Behavior dataset,
by analyzing textual dataset compositions and emotion recognition accuracy.
Experiments with the full IEMOCAP dataset indicate that the composition
negatively influences generalization performance when compared to the
OMG-Emotion Behavior dataset. We conclude by discussing the impact this may
have on HRI experiments.
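No analysis code accompanies this abstract; as a minimal, hypothetical sketch of what "analyzing textual dataset compositions" can look like in practice, the snippet below computes per-label shares and transcript-length statistics for two datasets. The CSV layout, column names, and file names are illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: compare label balance and utterance length between two
# emotion datasets (e.g., IEMOCAP-style vs. OMG-Emotion-style transcripts).
# The CSV layout (columns: "text", "label") is an assumption for illustration.
from collections import Counter
import csv
import statistics

def composition_stats(csv_path):
    """Return label shares and transcript-length statistics for one dataset."""
    labels, lengths = [], []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            labels.append(row["label"])
            lengths.append(len(row["text"].split()))
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        "label_share": {k: round(v / total, 3) for k, v in counts.items()},
        "mean_words": round(statistics.mean(lengths), 1),
        "median_words": statistics.median(lengths),
    }

if __name__ == "__main__":
    # File names are placeholders; neither path comes from the paper.
    for name in ("iemocap_transcripts.csv", "omg_emotion_transcripts.csv"):
        print(name, composition_stats(name))
```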
Related papers
- Speech Emotion Recognition under Resource Constraints with Data Distillation [64.36799373890916]
Speech emotion recognition (SER) plays a crucial role in human-computer interaction.
The emergence of edge devices in the Internet of Things presents challenges in constructing intricate deep learning models.
We propose a data distillation framework to facilitate efficient development of SER models in IoT applications.
arXiv Detail & Related papers (2024-06-21T13:10:46Z)
- Deep Imbalanced Learning for Multimodal Emotion Recognition in Conversations [15.705757672984662]
Multimodal Emotion Recognition in Conversations (MERC) is an important research direction for machine intelligence.
Much of the data in MERC naturally exhibits an imbalanced distribution of emotion categories, yet researchers often ignore the negative impact of this imbalance on emotion recognition.
We propose the Class Boundary Enhanced Representation Learning (CBERL) model to address the imbalanced distribution of emotion categories in raw data.
We have conducted extensive experiments on the IEMOCAP and MELD benchmark datasets, and the results show that CBERL improves the performance of emotion recognition.
arXiv Detail & Related papers (2023-12-11T12:35:17Z)
- SER_AMPEL: a multi-source dataset for speech emotion recognition of Italian older adults [58.49386651361823]
SER_AMPEL is a multi-source dataset for speech emotion recognition (SER).
It was collected to provide a reference for speech emotion recognition for Italian older adults.
The need for such a dataset emerges from an analysis of the state of the art.
arXiv Detail & Related papers (2023-11-24T13:47:25Z)
- Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances [76.34037366117234]
We introduce a new dataset called Robot Control Gestures (RoCoG-v2).
The dataset is composed of both real and synthetic videos from seven gesture classes.
We present results using state-of-the-art action recognition and domain adaptation algorithms.
arXiv Detail & Related papers (2023-03-17T23:23:55Z)
- REDAffectiveLM: Leveraging Affect Enriched Embedding and Transformer-based Neural Language Model for Readers' Emotion Detection [3.6678641723285446]
We propose a novel approach for Readers' Emotion Detection from short-text documents using a deep learning model called REDAffectiveLM.
We leverage context-specific and affect enriched representations by using a transformer-based pre-trained language model in tandem with affect enriched Bi-LSTM+Attention.
arXiv Detail & Related papers (2023-01-21T19:28:25Z)
- A Comparative Study of Data Augmentation Techniques for Deep Learning Based Emotion Recognition [11.928873764689458]
We conduct a comprehensive evaluation of popular deep learning approaches for emotion recognition.
We show that long-range dependencies in the speech signal are critical for emotion recognition.
Speed/rate augmentation offers the most robust performance gain across models.
arXiv Detail & Related papers (2022-11-09T17:27:03Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the interactive emotional dyadic motion capture (IEMOCAP) dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
- Towards Unbiased Visual Emotion Recognition via Causal Intervention [63.74095927462]
We propose a novel Interventional Emotion Recognition Network (IERN) to alleviate the negative effects brought by dataset bias.
A series of designed tests validate the effectiveness of IERN, and experiments on three emotion benchmarks demonstrate that IERN outperforms other state-of-the-art approaches.
arXiv Detail & Related papers (2021-07-26T10:40:59Z)
- iMiGUE: An Identity-free Video Dataset for Micro-Gesture Understanding and Emotion Analysis [23.261770969903065]
iMiGUE is an identity-free video dataset for micro-gesture understanding and emotion analysis.
iMiGUE focuses on micro-gestures, i.e., unintentional behaviors driven by inner feelings.
arXiv Detail & Related papers (2021-07-01T08:15:14Z)
- Affective Image Content Analysis: Two Decades Review and New Perspectives [132.889649256384]
We will comprehensively review the development of affective image content analysis (AICA) over the past two decades.
We will focus on the state-of-the-art methods with respect to three main challenges -- the affective gap, perception subjectivity, and label noise and absence.
We discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.
arXiv Detail & Related papers (2021-06-30T15:20:56Z)
- Contrastive Unsupervised Learning for Speech Emotion Recognition [22.004507213531102]
Speech emotion recognition (SER) is a key technology to enable more natural human-machine communication.
We show that the contrastive predictive coding (CPC) method can learn salient representations from unlabeled datasets.
arXiv Detail & Related papers (2021-02-12T06:06:02Z)
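The last entry above describes learning speech representations from unlabeled audio with contrastive predictive coding (CPC). As a minimal sketch of the underlying idea, and not that paper's implementation, the snippet below scores how well a context vector predicts a future latent against negatives drawn from the same batch, an InfoNCE-style objective; tensor shapes and sizes are illustrative assumptions.

```python
# Minimal InfoNCE-style sketch of contrastive predictive coding (CPC) for
# speech frames. Shapes and dimensions are illustrative assumptions, not the
# configuration used in the cited paper.
import torch
import torch.nn.functional as F

def info_nce(context, future_latents):
    """context: (B, D) summary of past frames; future_latents: (B, D) true
    future encodings. Each sample's positive is its own future; the other
    batch entries act as negatives."""
    logits = context @ future_latents.t()    # (B, B) similarity scores
    targets = torch.arange(context.size(0))  # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: random tensors stand in for real encoder outputs.
batch, dim = 8, 256
context = torch.randn(batch, dim)
future = torch.randn(batch, dim)
print(f"InfoNCE loss: {info_nce(context, future).item():.3f}")
```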