Implicit Design Choices and Their Impact on Emotion Recognition Model
Development and Evaluation
- URL: http://arxiv.org/abs/2309.03238v1
- Date: Wed, 6 Sep 2023 02:45:42 GMT
- Title: Implicit Design Choices and Their Impact on Emotion Recognition Model
Development and Evaluation
- Authors: Mimansa Jaiswal
- Abstract summary: The subjectivity of emotions poses significant challenges in developing accurate and robust computational models.
This thesis examines critical facets of emotion recognition, beginning with the collection of diverse datasets.
To handle the challenge of non-representative training data, this work collects the Multimodal Stressed Emotion dataset.
- Score: 5.534160116442057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Emotion recognition is a complex task due to the inherent subjectivity in
both the perception and production of emotions. The subjectivity of emotions
poses significant challenges in developing accurate and robust computational
models. This thesis examines critical facets of emotion recognition, beginning
with the collection of diverse datasets that account for psychological factors
in emotion production.
To handle the challenge of non-representative training data, this work
collects the Multimodal Stressed Emotion dataset, which introduces controlled
stressors during data collection to better represent real-world influences on
emotion production. To address issues with label subjectivity, this research
comprehensively analyzes how data augmentation techniques and annotation
schemes impact emotion perception and annotator labels. It further handles
natural confounding variables and variations by employing adversarial networks
to isolate key factors like stress from learned emotion representations during
model training. For tackling concerns about leakage of sensitive demographic
variables, this work leverages adversarial learning to strip sensitive
demographic information from multimodal encodings. Additionally, it proposes
optimized sociological evaluation metrics aligned with cost-effective,
real-world needs for model testing.
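As a concrete picture of the adversarial stripping described above, here is a minimal PyTorch-style sketch using a gradient reversal layer; the module names, dimensions, and attribute choices are illustrative assumptions, not the thesis's actual architecture:

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) gradients on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class AdversarialEmotionModel(nn.Module):
    """Emotion classifier whose shared encoding is trained so that an adversary
    fails to predict a sensitive attribute (e.g., stress condition or gender)."""
    def __init__(self, feat_dim=128, hidden=64, n_emotions=4, n_sensitive=2, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.encoder = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
        self.emotion_head = nn.Linear(hidden, n_emotions)
        self.adversary_head = nn.Linear(hidden, n_sensitive)

    def forward(self, x):
        z = self.encoder(x)
        emo_logits = self.emotion_head(z)
        # The adversary learns to predict the sensitive attribute, while the
        # encoder receives the reversed gradient and learns to hide it.
        adv_logits = self.adversary_head(GradientReversal.apply(z, self.lambd))
        return emo_logits, adv_logits

# Joint objective: minimize emotion error while the reversed adversary branch
# pushes sensitive information out of the shared encoding.
model = AdversarialEmotionModel()
ce = nn.CrossEntropyLoss()
x = torch.randn(8, 128)
y_emo = torch.randint(0, 4, (8,))
y_sens = torch.randint(0, 2, (8,))
emo_logits, adv_logits = model(x)
loss = ce(emo_logits, y_emo) + ce(adv_logits, y_sens)
loss.backward()
```

The single scalar lambd trades off emotion accuracy against how aggressively the sensitive attribute is scrubbed from the representation.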
This research advances robust, practical emotion recognition through
multifaceted studies of challenges in datasets, labels, modeling, demographic
and membership variable encoding in representations, and evaluation. The
groundwork has been laid for cost-effective, generalizable emotion recognition
models that are less likely to encode sensitive demographic information.
Related papers
- CAPE: A Chinese Dataset for Appraisal-based Emotional Generation using Large Language Models [30.40159858361768]
We introduce a two-stage automatic data generation framework to create CAPE, a Chinese Cognitive Appraisal theory-based Emotional corpus.
This corpus facilitates the generation of dialogues with contextually appropriate emotional responses by accounting for diverse personal and situational factors.
Our study shows the potential for advancing emotional expression in conversational agents, paving the way for more nuanced and meaningful human-computer interactions.
arXiv Detail & Related papers (2024-10-18T03:33:18Z)
- Emotion Detection through Body Gesture and Face [0.0]
The project addresses the challenge of emotion recognition by focusing on non-facial cues, specifically hand movements and body gestures.
Traditional emotion recognition systems mainly rely on facial expression analysis and often ignore the rich emotional information conveyed through body language.
The project aims to contribute to the field of affective computing by enhancing the ability of machines to interpret and respond to human emotions in a more comprehensive and nuanced way.
arXiv Detail & Related papers (2024-07-13T15:15:50Z)
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling [50.99252242917458]
Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.
To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity.
Our model outperforms the baseline models in understanding and rendering emotions.
arXiv Detail & Related papers (2023-12-19T08:47:50Z)
- Deep Imbalanced Learning for Multimodal Emotion Recognition in Conversations [15.705757672984662]
Multimodal Emotion Recognition in Conversations (MERC) is an important research direction for machine intelligence.
MERC data naturally exhibit an imbalanced distribution of emotion categories, yet researchers have largely ignored the negative impact of this imbalance on emotion recognition.
We propose the Class Boundary Enhanced Representation Learning (CBERL) model to address the imbalanced distribution of emotion categories in raw data.
We have conducted extensive experiments on the IEMOCAP and MELD benchmark datasets, and the results show that CBERL delivers measurable gains in emotion recognition performance.
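As background for the imbalance problem this entry tackles, below is a minimal, generic sketch of inverse-frequency loss weighting in PyTorch; the per-class counts are made up, and CBERL's actual class-boundary mechanism goes well beyond this:

```python
import torch
import torch.nn as nn

# Weight the loss inversely to class frequency so rare emotions are not
# drowned out by majority classes. (Illustrative only; hypothetical counts.)
counts = torch.tensor([1200., 300., 150., 80.])   # per-emotion sample counts
weights = counts.sum() / (len(counts) * counts)   # normalized inverse frequency
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(16, 4)                       # stand-in model outputs
labels = torch.randint(0, 4, (16,))
loss = criterion(logits, labels)                  # rare classes count more
```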
arXiv Detail & Related papers (2023-12-11T12:35:17Z)
- Dynamic Causal Disentanglement Model for Dialogue Emotion Detection [77.96255121683011]
We propose a Dynamic Causal Disentanglement Model based on hidden variable separation.
This model effectively decomposes the content of dialogues and investigates the temporal accumulation of emotions.
Specifically, we propose a dynamic temporal disentanglement model to infer the propagation of utterances and hidden variables.
arXiv Detail & Related papers (2023-09-13T12:58:09Z)
- Computer Vision Estimation of Emotion Reaction Intensity in the Wild [1.5481864635049696]
We describe our submission to the newly introduced Emotional Reaction Intensity (ERI) Estimation challenge.
We developed four deep neural networks trained in the visual domain and a multimodal model trained with both visual and audio features to predict emotion reaction intensity.
arXiv Detail & Related papers (2023-03-19T19:09:41Z)
- Seeking Subjectivity in Visual Emotion Distribution Learning [93.96205258496697]
Visual Emotion Analysis (VEA) aims to predict people's emotions towards different visual stimuli.
Existing methods often predict visual emotion distribution in a unified network, neglecting the inherent subjectivity in its crowd voting process.
We propose a novel Subjectivity Appraise-and-Match Network (SAMNet) to investigate the subjectivity in visual emotion distribution.
arXiv Detail & Related papers (2022-07-25T02:20:03Z)
- A cross-corpus study on speech emotion recognition [29.582678406878568]
This study investigates whether information learnt from acted emotions is useful for detecting natural emotions.
Four adult English datasets covering acted, elicited and natural emotions are considered.
A state-of-the-art model is proposed to accurately quantify the degradation in performance across corpora.
arXiv Detail & Related papers (2022-07-05T15:15:22Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the interactive emotional dyadic motion capture (IEMOCAP) dataset.
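A minimal sketch of what such late fusion can look like in PyTorch is given below; the encoder output sizes, the learnable fusion weight, and the class count are illustrative assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class LateFusionEmotionClassifier(nn.Module):
    """Hypothetical late-fusion head: pretrained speech and text encoders
    (e.g., a speaker-recognition model and a BERT-style model) are assumed
    to already yield fixed-size utterance embeddings, which are combined
    only at the scoring stage."""
    def __init__(self, speech_dim=192, text_dim=768, n_emotions=4):
        super().__init__()
        self.speech_head = nn.Linear(speech_dim, n_emotions)
        self.text_head = nn.Linear(text_dim, n_emotions)
        self.alpha = nn.Parameter(torch.tensor(0.5))  # learnable fusion weight

    def forward(self, speech_emb, text_emb):
        s = self.speech_head(speech_emb)
        t = self.text_head(text_emb)
        return self.alpha * s + (1 - self.alpha) * t  # late fusion of logits

model = LateFusionEmotionClassifier()
logits = model(torch.randn(2, 192), torch.randn(2, 768))
```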
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
- Affective Image Content Analysis: Two Decades Review and New Perspectives [132.889649256384]
We comprehensively review the development of affective image content analysis (AICA) over the past two decades.
We focus on state-of-the-art methods with respect to three main challenges: the affective gap, perception subjectivity, and label noise and absence.
We also discuss remaining challenges and promising future research directions, such as image content and context understanding, group emotion clustering, and viewer-image interaction.
arXiv Detail & Related papers (2021-06-30T15:20:56Z)
- Enhancing Cognitive Models of Emotions with Representation Learning [58.2386408470585]
We present a novel deep learning-based framework to generate embedding representations of fine-grained emotions.
Our framework integrates a contextualized embedding encoder with a multi-head probing model.
Our model is evaluated on the Empathetic Dialogue dataset and achieves state-of-the-art results for classifying 32 emotions.
arXiv Detail & Related papers (2021-04-20T16:55:15Z)