Direct Classification of Emotional Intensity
- URL: http://arxiv.org/abs/2011.07460v1
- Date: Sun, 15 Nov 2020 06:32:48 GMT
- Title: Direct Classification of Emotional Intensity
- Authors: Jacob Ouyang, Isaac R Galatzer-Levy, Vidya Koesmahargyo, Li Zhang
- Abstract summary: We train a model using videos of different people smiling that outputs an intensity score from 0-10.
Our model then employs an adaptive learning technique to improve performance when dealing with new subjects.
- Score: 4.360819666001918
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present a model that directly predicts an emotion intensity score from video inputs, instead of deriving it from action units. Using a 3D DNN that incorporates dynamic emotion information, we train a model on videos of different people smiling that outputs an intensity score from 0-10. Each video is labeled framewise using a normalized action-unit-based intensity score. Our model then employs an adaptive learning technique to improve performance when dealing with new subjects. Compared to other models, ours generalizes better across different people and provides a new framework for directly classifying emotional intensity.
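As a rough, hedged illustration of the setup the abstract describes (not the authors' released architecture; the clip length, layer sizes, and MSE loss below are assumptions), a minimal 3D-CNN intensity regressor in PyTorch might look like this:

```python
# Minimal sketch of a 3D-CNN intensity regressor (assumed layer sizes and loss;
# not the paper's exact model). Maps a short video clip to a 0-10 intensity score.
import torch
import torch.nn as nn

class IntensityRegressor3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1),  # (B, 3, T, H, W) -> (B, 16, T, H, W)
            nn.ReLU(),
            nn.MaxPool3d(2),                             # halve time and spatial dimensions
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),                     # global spatio-temporal pooling
        )
        self.head = nn.Linear(32, 1)

    def forward(self, clip):
        # clip: (batch, channels=3, frames, height, width)
        x = self.features(clip).flatten(1)
        # squash the output into the 0-10 range used for the intensity labels
        return 10.0 * torch.sigmoid(self.head(x)).squeeze(-1)

model = IntensityRegressor3D()
clip = torch.randn(2, 3, 16, 112, 112)   # two 16-frame RGB clips
target = torch.tensor([3.5, 8.0])        # framewise AU-based intensity labels (toy values)
loss = nn.functional.mse_loss(model(clip), target)
loss.backward()
```

The framewise action-unit-based labeling and the adaptive learning step described in the abstract would sit on top of a backbone like this, for example by briefly fine-tuning the regression head on a few labeled clips from each new subject.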
Related papers
- ESARM: 3D Emotional Speech-to-Animation via Reward Model from Automatically-Ranked Demonstrations [22.85503397110192]
This paper proposes a novel 3D speech-to-animation (STA) generation framework to address the shortcomings of existing models.
We introduce a novel STA model coupled with a reward model. This combination enables the decoupling of emotion and content under audio conditions.
We conduct extensive empirical experiments on a benchmark dataset, and the results validate the effectiveness of our proposed framework.
arXiv Detail & Related papers (2024-11-20T07:37:37Z)
- Emotion Detection in Reddit: Comparative Study of Machine Learning and Deep Learning Techniques [0.0]
This study concentrates on text-based emotion detection by leveraging the GoEmotions dataset.
We employed a range of models for this task, including six machine learning models, three ensemble models, and a Long Short-Term Memory (LSTM) model.
Results indicate that the Stacking classifier outperforms the other models in accuracy and overall performance (a minimal stacking sketch appears after this list).
arXiv Detail & Related papers (2024-11-15T16:28:25Z)
- AICL: Action In-Context Learning for Video Diffusion Model [124.39948693332552]
We propose AICL, which empowers the generative model with the ability to understand action information in reference videos.
Extensive experiments demonstrate that AICL effectively captures the action and achieves state-of-the-art generation performance.
arXiv Detail & Related papers (2024-03-18T07:41:19Z)
- Helping Hands: An Object-Aware Ego-Centric Video Recognition Model [60.350851196619296]
We introduce an object-aware decoder for improving the performance of ego-centric representations on ego-centric videos.
We show that the model can act as a drop-in replacement for an ego-awareness video model to improve performance through visual-text grounding.
arXiv Detail & Related papers (2023-08-15T17:58:11Z)
- Computer Vision Estimation of Emotion Reaction Intensity in the Wild [1.5481864635049696]
We describe our submission to the newly introduced Emotional Reaction Intensity (ERI) Estimation challenge.
We developed four deep neural networks trained in the visual domain and a multimodal model trained with both visual and audio features to predict emotion reaction intensity.
arXiv Detail & Related papers (2023-03-19T19:09:41Z)
- Recognizing Emotions evoked by Movies using Multitask Learning [3.4290619267487488]
Methods for recognizing evoked emotions are usually trained on human-annotated data.
We propose two deep learning architectures: a Single-Task (ST) architecture and a Multi-Task (MT) architecture.
Our results show that the MT approach can more accurately model each viewer and the aggregated annotation when compared to methods that are directly trained on the aggregated annotations.
arXiv Detail & Related papers (2021-07-30T10:21:40Z)
- Learning Local Recurrent Models for Human Mesh Recovery [50.85467243778406]
We present a new method for video mesh recovery that divides the human mesh into several local parts following the standard skeletal model.
We then model the dynamics of each local part with separate recurrent models, with each model conditioned appropriately based on the known kinematic structure of the human body.
This results in a structure-informed local recurrent learning architecture that can be trained in an end-to-end fashion with available annotations.
arXiv Detail & Related papers (2021-07-27T14:30:33Z)
- Enhancing Cognitive Models of Emotions with Representation Learning [58.2386408470585]
We present a novel deep learning-based framework to generate embedding representations of fine-grained emotions.
Our framework integrates a contextualized embedding encoder with a multi-head probing model.
Our model is evaluated on the Empathetic Dialogue dataset and shows the state-of-the-art result for classifying 32 emotions.
arXiv Detail & Related papers (2021-04-20T16:55:15Z)
- Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality [84.69595956853908]
We present Affect2MM, a learning method for time-series emotion prediction for multimedia content.
Our goal is to automatically capture the varying emotions depicted by characters in real-life human-centric situations and behaviors.
arXiv Detail & Related papers (2021-03-11T09:07:25Z)
- Modality-Transferable Emotion Embeddings for Low-Resource Multimodal Emotion Recognition [55.44502358463217]
We propose a modality-transferable model with emotion embeddings to tackle the aforementioned issues.
Our model achieves state-of-the-art performance on most of the emotion categories.
Our model also outperforms existing baselines in the zero-shot and few-shot scenarios for unseen emotions.
arXiv Detail & Related papers (2020-09-21T06:10:39Z)
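The Reddit study listed above reports that a stacking ensemble gave the strongest results on GoEmotions. As a hedged, minimal sketch of that idea (the base learners, TF-IDF features, and toy data are assumptions for illustration, not the study's configuration), a scikit-learn stacking classifier for text emotion labels can be assembled like this:

```python
# Minimal sketch of a stacked text-emotion classifier.
# Base learners and TF-IDF features are assumptions, not the study's exact setup.
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

stack = StackingClassifier(
    estimators=[
        ("nb", MultinomialNB()),                          # fast lexical baseline
        ("rf", RandomForestClassifier(n_estimators=100)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),    # meta-learner over base predictions
    cv=2,                                                 # small cv only for this toy example
)
model = make_pipeline(TfidfVectorizer(), stack)

# Toy data standing in for GoEmotions-style (text, emotion) pairs.
texts = [
    "I love this!", "What a wonderful surprise",
    "This is so frustrating", "I can't stand it anymore",
]
labels = ["joy", "joy", "anger", "anger"]
model.fit(texts, labels)
print(model.predict(["such great news today"]))           # predicts an emotion label for new text
```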