Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression
- URL: http://arxiv.org/abs/2306.06760v1
- Date: Sun, 11 Jun 2023 20:07:29 GMT
- Title: Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression
- Authors: Wen Wu, Chao Zhang, Philip C. Woodland
- Abstract summary: In automatic emotion recognition, labels assigned by different human annotators to the same utterance are often inconsistent.
This paper proposes a Bayesian approach, deep evidential emotion regression (DEER), to estimate the uncertainty in emotion attributes.
Experiments on the widely used MSP-Podcast and IEMOCAP datasets showed DEER produced state-of-the-art results.
- Score: 17.26466867595571
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In automatic emotion recognition (AER), labels assigned by different human
annotators to the same utterance are often inconsistent due to the inherent
complexity of emotion and the subjectivity of perception. Though deterministic
labels generated by averaging or voting are often used as the ground truth,
this practice ignores the intrinsic uncertainty revealed by the inconsistent labels. This
paper proposes a Bayesian approach, deep evidential emotion regression (DEER),
to estimate the uncertainty in emotion attributes. Treating the emotion
attribute labels of an utterance as samples drawn from an unknown Gaussian
distribution, DEER places an utterance-specific normal-inverse gamma prior over
the Gaussian likelihood and predicts its hyper-parameters using a deep neural
network model. It enables a joint estimation of emotion attributes along with
the aleatoric and epistemic uncertainties. AER experiments on the widely used
MSP-Podcast and IEMOCAP datasets showed DEER produced state-of-the-art results
for both the mean values and the distribution of emotion attributes.
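The core recipe in the abstract, predicting the four normal-inverse gamma (NIG) hyper-parameters with a network and reading off the point estimate plus both uncertainties, can be sketched as follows. This is a minimal illustration in the spirit of deep evidential regression, assuming a generic utterance-level feature extractor; the layer names and sizes are not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EvidentialHead(nn.Module):
    """Predict the four NIG hyper-parameters (gamma, nu, alpha, beta)
    for one emotion attribute from an utterance embedding."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.proj = nn.Linear(feat_dim, 4)

    def forward(self, h: torch.Tensor):
        gamma, log_nu, log_alpha, log_beta = self.proj(h).unbind(-1)
        nu = F.softplus(log_nu)              # nu > 0
        alpha = F.softplus(log_alpha) + 1.0  # alpha > 1 keeps both variances finite
        beta = F.softplus(log_beta)          # beta > 0
        return gamma, nu, alpha, beta

def nig_nll(y, gamma, nu, alpha, beta):
    """Negative log-likelihood of a label y under the Student-t marginal
    obtained by integrating the Gaussian likelihood against the NIG prior."""
    omega = 2.0 * beta * (1.0 + nu)
    return (0.5 * torch.log(torch.pi / nu)
            - alpha * torch.log(omega)
            + (alpha + 0.5) * torch.log(nu * (y - gamma) ** 2 + omega)
            + torch.lgamma(alpha) - torch.lgamma(alpha + 0.5))

# From the predicted NIG parameters:
#   point estimate:         E[mu]      = gamma
#   aleatoric uncertainty:  E[sigma^2] = beta / (alpha - 1)
#   epistemic uncertainty:  Var[mu]    = beta / (nu * (alpha - 1))
```

Consistent with the abstract's framing of annotator labels as samples from an unknown Gaussian, the loss would be evaluated on each annotator's label for an utterance rather than only on their average, so the disagreement across annotators feeds the uncertainty estimates directly.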
Related papers
- Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors [63.194053817609024]
We introduce eye behaviors as an important emotional cue for the creation of a new Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset.
For the first time, we provide annotations for both Emotion Recognition (ER) and Facial Expression Recognition (FER) in the EMER dataset.
We specifically design a new EMERT architecture to concurrently enhance performance in both ER and FER.
arXiv Detail & Related papers (2024-11-08T04:53:55Z)
- Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation [45.53789836426869]
The subjective perception of emotion leads to inconsistent labels from human annotators.
This paper investigates three methods to handle ambiguous emotion.
We show that incorporating utterances without majority-agreed labels as an additional class in the classifier reduces the classification performance of the other emotion classes.
We also propose detecting utterances with ambiguous emotions as out-of-domain samples by quantifying the uncertainty in emotion classification using evidential deep learning.
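A minimal sketch of the out-of-domain detection step just described, assuming the standard evidential deep-learning construction in which non-negative class evidence parameterises a Dirichlet; the softplus activation and the detection threshold below are illustrative choices, not the paper's.

```python
import torch
import torch.nn.functional as F

def dirichlet_uncertainty(logits: torch.Tensor) -> torch.Tensor:
    """logits: (batch, K) raw classifier outputs -> per-utterance uncertainty in (0, 1]."""
    evidence = F.softplus(logits)   # non-negative evidence per emotion class
    alpha = evidence + 1.0          # Dirichlet concentration parameters
    K = alpha.shape[-1]
    return K / alpha.sum(dim=-1)    # u = K / sum(alpha); u -> 1 means total ignorance

def is_ambiguous(logits: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Flag utterances whose uncertainty exceeds a chosen threshold as out-of-domain."""
    return dirichlet_uncertainty(logits) > threshold
```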
arXiv Detail & Related papers (2024-02-20T09:53:38Z)
- Towards Assumption-free Bias Mitigation [47.5131072745805]
We propose an assumption-free framework to detect the related attributes automatically by modeling feature interaction for bias mitigation.
Experimental results on four real-world datasets demonstrate that our proposed framework can significantly alleviate unfair prediction behaviors.
arXiv Detail & Related papers (2023-07-09T05:55:25Z)
- Distribution-based Emotion Recognition in Conversation [17.26466867595571]
This paper proposes a distribution-based framework that formulates ERC as a sequence-to-sequence problem for emotion distribution estimation.
Experimental results on the IEMOCAP dataset show that the distribution-based framework outperformed the single-utterance-based system.
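As a rough illustration of the sequence-to-sequence formulation, the sketch below runs a recurrent encoder over a conversation's utterance embeddings and emits one emotion distribution per utterance; the GRU backbone, the sizes, and the softmax head are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Seq2SeqERC(nn.Module):
    def __init__(self, utt_dim: int = 768, hidden: int = 256, n_classes: int = 4):
        super().__init__()
        # Context encoder over the sequence of utterances in a conversation.
        self.encoder = nn.GRU(utt_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, utt_embs: torch.Tensor) -> torch.Tensor:
        """utt_embs: (batch, n_utterances, utt_dim) ->
        (batch, n_utterances, n_classes) per-utterance emotion distributions,
        trainable against annotator label distributions with e.g. a KL loss."""
        ctx, _ = self.encoder(utt_embs)
        return torch.softmax(self.head(ctx), dim=-1)
```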
arXiv Detail & Related papers (2022-11-09T12:16:28Z)
- Label Uncertainty Modeling and Prediction for Speech Emotion Recognition using t-Distributions [15.16865739526702]
We propose to model the label distribution using a Student's t-distribution.
We derive the corresponding Kullback-Leibler divergence based loss function and use it to train an estimator for the distribution of emotion labels.
Results reveal that our t-distribution based approach improves over the Gaussian approach and achieves state-of-the-art uncertainty modeling results.
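A minimal sketch of the t-distribution idea: a head predicts the degrees of freedom, location, and scale of a Student's t per utterance. The paper trains with a Kullback-Leibler divergence loss derived for t-distributions; as a simpler stand-in, this sketch fits the predicted distribution to the individual annotator labels by maximum likelihood. All names and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import StudentT

class TDistHead(nn.Module):
    """Map an utterance embedding to (df, loc, scale) of a Student's t."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.proj = nn.Linear(feat_dim, 3)

    def forward(self, h: torch.Tensor):
        raw_df, loc, raw_scale = self.proj(h).unbind(-1)
        df = F.softplus(raw_df) + 2.0          # df > 2 so the variance exists
        scale = F.softplus(raw_scale) + 1e-6   # scale > 0
        return df, loc, scale

def t_nll(annotator_labels: torch.Tensor, df, loc, scale) -> torch.Tensor:
    """annotator_labels: (batch, n_annotators) emotion-attribute ratings."""
    dist = StudentT(df.unsqueeze(-1), loc.unsqueeze(-1), scale.unsqueeze(-1))
    return -dist.log_prob(annotator_labels).mean()
```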
arXiv Detail & Related papers (2022-07-25T12:38:20Z)
- Seeking Subjectivity in Visual Emotion Distribution Learning [93.96205258496697]
Visual Emotion Analysis (VEA) aims to predict people's emotions towards different visual stimuli.
Existing methods often predict visual emotion distribution in a unified network, neglecting the inherent subjectivity in its crowd voting process.
We propose a novel Subjectivity Appraise-and-Match Network (SAMNet) to investigate the subjectivity in visual emotion distribution.
arXiv Detail & Related papers (2022-07-25T02:20:03Z)
- Estimating the Uncertainty in Emotion Class Labels with Utterance-Specific Dirichlet Priors [24.365876333182207]
We propose a novel training loss based on per-utterance Dirichlet prior distributions for verbal emotion recognition.
An additional metric is used to evaluate the performance by detecting test utterances with high labelling uncertainty.
Experiments with the widely used IEMOCAP dataset demonstrate that the two-branch structure achieves state-of-the-art classification results.
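One plausible reading of the per-utterance Dirichlet prior loss is a Dirichlet-multinomial likelihood over the annotators' label counts, sketched below; this is an illustrative reconstruction, not the paper's exact training objective.

```python
import torch
import torch.nn.functional as F

def dirichlet_multinomial_nll(logits: torch.Tensor,
                              label_counts: torch.Tensor) -> torch.Tensor:
    """logits: (batch, K) network outputs; label_counts: (batch, K) annotator
    votes per emotion class. The constant multinomial coefficient is omitted
    since it does not depend on the predicted parameters."""
    alpha = F.softplus(logits) + 1e-6   # Dirichlet concentrations, alpha > 0
    alpha0 = alpha.sum(-1)
    n = label_counts.sum(-1)
    log_lik = (torch.lgamma(alpha0) - torch.lgamma(alpha0 + n)
               + (torch.lgamma(alpha + label_counts)
                  - torch.lgamma(alpha)).sum(-1))
    return -log_lik.mean()

# High labelling uncertainty then shows up as a flat predicted Dirichlet
# (small, near-uniform alpha), which can be used to flag hard test utterances.
```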
arXiv Detail & Related papers (2022-03-08T23:30:01Z)
- End-to-end label uncertainty modeling for speech emotion recognition using Bayesian neural networks [16.708069984516964]
We introduce an end-to-end Bayesian neural network architecture to capture the inherent subjectivity in emotions.
At training, the network learns a distribution of weights to capture the inherent uncertainty related to subjective emotion annotations.
We evaluate the proposed approach on the AVEC'16 emotion recognition dataset.
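One standard way to "learn a distribution of weights" is a mean-field variational layer in the spirit of Bayes-by-backprop, sketched below; the paper's actual architecture and variational family may differ, so everything here is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesLinear(nn.Module):
    """Linear layer whose weights are sampled from a learned Gaussian."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.w_mu = nn.Parameter(0.01 * torch.randn(d_out, d_in))
        self.w_rho = nn.Parameter(torch.full((d_out, d_in), -5.0))  # softplus(-5) ~ small std
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        std = F.softplus(self.w_rho)
        w = self.w_mu + std * torch.randn_like(std)  # reparameterised weight sample
        return F.linear(x, w, self.bias)

# At test time, several stochastic forward passes give a predictive mean and a
# spread that reflects the annotation-driven uncertainty:
#   preds = torch.stack([model(x) for _ in range(30)])
#   mean, var = preds.mean(0), preds.var(0)
```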
arXiv Detail & Related papers (2021-10-07T09:34:28Z)
- Label Distribution Amendment with Emotional Semantic Correlations for Facial Expression Recognition [69.18918567657757]
We propose a new method that amends the label distribution of each facial image by leveraging correlations among expressions in the semantic space.
By comparing semantic and task class-relation graphs of each image, the confidence of its label distribution is evaluated.
Experimental results demonstrate that the proposed method is more effective than competing state-of-the-art methods.
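A rough sketch of the graph-comparison step, under the assumption that both class-relation graphs are pairwise-similarity matrices and that their agreement gates how much the original label distribution is trusted; the specific blending rule below is invented for illustration.

```python
import torch
import torch.nn.functional as F

def relation_graph(class_vectors: torch.Tensor) -> torch.Tensor:
    """Pairwise cosine similarities between per-class vectors -> (K, K) graph."""
    v = F.normalize(class_vectors, dim=-1)
    return v @ v.T

def amended_distribution(label_dist: torch.Tensor,
                         semantic_graph: torch.Tensor,
                         task_graph: torch.Tensor,
                         prior: torch.Tensor) -> torch.Tensor:
    """Blend the image's label distribution with a semantic prior, weighted by
    how well the semantic and task class-relation graphs agree."""
    confidence = F.cosine_similarity(semantic_graph.flatten(),
                                     task_graph.flatten(), dim=0)
    c = confidence.clamp(0.0, 1.0)   # agreement -> trust the original labels
    return c * label_dist + (1.0 - c) * prior
```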
arXiv Detail & Related papers (2021-07-23T07:46:14Z)
- A Circular-Structured Representation for Visual Emotion Distribution Learning [82.89776298753661]
We propose a well-grounded circular-structured representation to utilize the prior knowledge for visual emotion distribution learning.
To be specific, we first construct an Emotion Circle to unify any emotional state within it.
On the proposed Emotion Circle, each emotion distribution is represented with an emotion vector, which is defined with three attributes.
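A small sketch of the circular representation under assumed attribute names: the angle on the circle, the vector length, and a polarity flag stand in for the three attributes the abstract mentions, and the distribution-to-vector mapping below is an illustrative guess rather than the paper's definition.

```python
import math
from dataclasses import dataclass

@dataclass
class EmotionVector:
    angle: float    # position on the Emotion Circle in radians (emotion type)
    length: float   # distance from the centre (emotion intensity)
    polarity: int   # +1 for the positive half of the circle, -1 for the negative

def from_distribution(probs: list[float], class_angles: list[float]) -> EmotionVector:
    """Collapse a categorical emotion distribution into one vector by summing
    the unit vectors of each class, weighted by its probability."""
    x = sum(p * math.cos(a) for p, a in zip(probs, class_angles))
    y = sum(p * math.sin(a) for p, a in zip(probs, class_angles))
    angle = math.atan2(y, x)
    return EmotionVector(angle, math.hypot(x, y),
                         +1 if math.sin(angle) >= 0 else -1)
```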
arXiv Detail & Related papers (2021-06-23T14:53:27Z)