Revisiting Emotions Representation for Recognition in the Wild
- URL: http://arxiv.org/abs/2602.06778v1
- Date: Fri, 06 Feb 2026 15:32:12 GMT
- Title: Revisiting Emotions Representation for Recognition in the Wild
- Authors: Joao Baptista Cardia Neto, Claudio Ferrari, Stefano Berretti,
- Abstract summary: We propose a novel approach to describe complex emotional states as probability distributions over a set of emotion classes. We estimate the likelihood of a face image belonging to each of the distributions, so that emotional states can be described as a mixture of emotions. In a preliminary set of experiments, we illustrate the advantages of this solution and a new possible direction of investigation.
- Score: 15.292517580358528
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial emotion recognition has been typically cast as a single-label classification problem of one out of six prototypical emotions. However, that is an oversimplification that is unsuitable for representing the multifaceted spectrum of spontaneous emotional states, which are most often the result of a combination of multiple emotions contributing at different intensities. Building on this, a promising direction that was explored recently is to cast emotion recognition as a distribution learning problem. Still, such approaches are limited in that research datasets are typically annotated with a single emotion class. In this paper, we contribute a novel approach to describe complex emotional states as probability distributions over a set of emotion classes. To do so, we propose a solution to automatically re-label existing datasets by exploiting the result of a study in which a large set of both basic and compound emotions is mapped to probability distributions in the Valence-Arousal-Dominance (VAD) space. In this way, given a face image annotated with VAD values, we can estimate the likelihood of it belonging to each of the distributions, so that emotional states can be described as a mixture of emotions, enriching their description, while also accounting for the ambiguous nature of their perception. In a preliminary set of experiments, we illustrate the advantages of this solution and a new possible direction of investigation. Data annotations are available at https://github.com/jbcnrlz/affectnet-b-annotation.
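The re-labeling idea in the abstract (scoring a VAD-annotated face against per-emotion distributions and normalizing the scores into a mixture) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the abstract does not state the form of the distributions, so the sketch assumes axis-aligned Gaussians in VAD space with a uniform prior, and the emotion names, means, and standard deviations in `EMOTIONS` are hypothetical placeholders rather than values from the paper or the study it builds on. The function names `log_gaussian` and `emotion_mixture` are likewise invented for this example.

```python
import math

# Hypothetical per-emotion Gaussians in VAD space: (mean, std) per axis.
# Placeholder numbers for illustration only, not values from the paper.
EMOTIONS = {
    "happy":   ((0.8, 0.5, 0.4), (0.2, 0.3, 0.3)),
    "sad":     ((-0.6, -0.3, -0.3), (0.2, 0.3, 0.3)),
    "angry":   ((-0.5, 0.6, 0.3), (0.2, 0.3, 0.3)),
    "neutral": ((0.0, 0.0, 0.0), (0.2, 0.2, 0.2)),
}


def log_gaussian(x, mean, std):
    """Log-density of an axis-aligned (diagonal-covariance) Gaussian."""
    return sum(
        -0.5 * ((xi - mi) / si) ** 2 - math.log(si) - 0.5 * math.log(2 * math.pi)
        for xi, mi, si in zip(x, mean, std)
    )


def emotion_mixture(vad, emotions=EMOTIONS):
    """Turn per-emotion likelihoods of a VAD point into a probability distribution.

    Assumes a uniform prior over emotions, so the result is the normalized likelihood.
    """
    logs = {name: log_gaussian(vad, m, s) for name, (m, s) in emotions.items()}
    top = max(logs.values())  # subtract the max before exponentiating, for stability
    weights = {name: math.exp(v - top) for name, v in logs.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    # A mildly positive, mildly aroused face: expect mass shared across emotions.
    probs = emotion_mixture((0.3, 0.3, 0.2))
    for name, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {p:.3f}")
```

A point that sits between several emotion centres receives non-negligible probability for more than one class, which is the "mixture of emotions" behaviour the abstract describes; a point at an emotion's mean is dominated by that emotion.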
Related papers
- EmoVerse: A MLLMs-Driven Emotion Representation Dataset for Interpretable Visual Emotion Analysis [61.87711517626139]
EmoVerse is a large-scale open-source dataset that enables interpretable visual emotion analysis. With over 219k images, the dataset further includes dual annotations in Categorical Emotion States (CES) and Dimensional Emotion Space (DES).
arXiv Detail & Related papers (2025-11-16T11:16:50Z)
- HeLo: Heterogeneous Multi-Modal Fusion with Label Correlation for Emotion Distribution Learning [25.95933218051548]
We propose a multi-modal emotion distribution learning framework, named HeLo, to explore the heterogeneity and complementary information in multi-modal emotional data. Experimental results on two publicly available datasets demonstrate the superiority of our proposed method in emotion distribution learning.
arXiv Detail & Related papers (2025-07-09T13:08:58Z)
- Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation [63.94836524433559]
DICE-Talk is a framework for disentangling identity with emotion and cooperating emotions with similar characteristics. We develop a disentangled emotion embedder that jointly models audio-visual emotional cues through cross-modal attention. Second, we introduce a correlation-enhanced emotion conditioning module with learnable Emotion Banks. Third, we design an emotion discrimination objective that enforces affective consistency during the diffusion process.
arXiv Detail & Related papers (2025-04-25T05:28:21Z)
- The Whole Is Bigger Than the Sum of Its Parts: Modeling Individual Annotators to Capture Emotional Variability [9.953472660494088]
Emotion expression and perception are nuanced, complex, and highly subjective processes. Most speech emotion recognition tasks address this by averaging annotator labels as ground truth. Previous work has attempted to learn distributions to capture emotion variability, but these methods also lose information about the individual annotators. We introduce a novel method to create distributions from continuous model outputs that permit the learning of emotion distributions during model training.
arXiv Detail & Related papers (2024-08-21T19:24:06Z)
- Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation [45.53789836426869]
The subjective perception of emotion leads to inconsistent labels from human annotators.
This paper investigates three methods to handle ambiguous emotion.
We show that incorporating utterances without majority-agreed labels as an additional class in the classifier reduces the classification performance of the other emotion classes.
We also propose detecting utterances with ambiguous emotions as out-of-domain samples by quantifying the uncertainty in emotion classification using evidential deep learning.
arXiv Detail & Related papers (2024-02-20T09:53:38Z)
- Seeking Subjectivity in Visual Emotion Distribution Learning [93.96205258496697]
Visual Emotion Analysis (VEA) aims to predict people's emotions towards different visual stimuli.
Existing methods often predict visual emotion distribution in a unified network, neglecting the inherent subjectivity in its crowd voting process.
We propose a novel Subjectivity Appraise-and-Match Network (SAMNet) to investigate the subjectivity in visual emotion distribution.
arXiv Detail & Related papers (2022-07-25T02:20:03Z)
- A Circular-Structured Representation for Visual Emotion Distribution Learning [82.89776298753661]
We propose a well-grounded circular-structured representation to utilize the prior knowledge for visual emotion distribution learning.
To be specific, we first construct an Emotion Circle to unify any emotional state within it.
On the proposed Emotion Circle, each emotion distribution is represented with an emotion vector, which is defined with three attributes.
arXiv Detail & Related papers (2021-06-23T14:53:27Z)
- Modality-Transferable Emotion Embeddings for Low-Resource Multimodal Emotion Recognition [55.44502358463217]
We propose a modality-transferable model with emotion embeddings to tackle the aforementioned issues.
Our model achieves state-of-the-art performance on most of the emotion categories.
Our model also outperforms existing baselines in the zero-shot and few-shot scenarios for unseen emotions.
arXiv Detail & Related papers (2020-09-21T06:10:39Z)
- EmoGraph: Capturing Emotion Correlations using Graph Networks [71.53159402053392]
We propose EmoGraph that captures the dependencies among different emotions through graph networks.
EmoGraph outperforms strong baselines, especially for macro-F1.
An experiment illustrates the captured emotion correlations can also benefit a single-label classification task.
arXiv Detail & Related papers (2020-08-21T08:59:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.