Continuous Emotion Recognition with Spatiotemporal Convolutional Neural
Networks
- URL: http://arxiv.org/abs/2011.09280v2
- Date: Fri, 15 Jan 2021 14:49:00 GMT
- Authors: Thomas Teixeira, Eric Granger, Alessandro Lameiras Koerich
- Abstract summary: We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short term-memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial expressions are one of the most powerful ways of depicting specific
patterns in human behavior and describing the human emotional state. Despite the
impressive advances of affective computing over the last decade, automatic
video-based systems for facial expression recognition still cannot properly
handle variations in facial expression among individuals, nor cross-cultural
and demographic aspects. Indeed, recognizing facial expressions is a difficult
task even for humans. In this paper, we investigate
the suitability of state-of-the-art deep learning architectures based on
convolutional neural networks (CNNs) for continuous emotion recognition using
long video sequences captured in-the-wild. This study focuses on deep learning
models that allow encoding spatiotemporal relations in videos considering a
complex and multi-dimensional emotion space, where values of valence and
arousal must be predicted. We have developed and evaluated convolutional
recurrent neural networks combining 2D-CNNs and long short-term memory units,
and inflated 3D-CNN models, which are built by inflating the weights of a
pre-trained 2D-CNN model during fine-tuning, using application-specific videos.
Experimental results on the challenging SEWA-DB dataset have shown that these
architectures can effectively be fine-tuned to encode the spatiotemporal
information from successive raw pixel images and achieve state-of-the-art
results on such a dataset.
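The two architectures described above can be sketched roughly as follows. This is an illustrative PyTorch sketch, not the authors' code: the tiny convolutional stack, the layer sizes, and the names `CNNLSTMRegressor` and `inflate_conv2d` are assumptions made for the example.

```python
import torch
import torch.nn as nn

class CNNLSTMRegressor(nn.Module):
    """2D-CNN applied per frame + LSTM over time -> valence/arousal per frame."""
    def __init__(self, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(            # tiny stand-in for a pre-trained 2D-CNN
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.lstm = nn.LSTM(16, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)     # two continuous outputs: valence, arousal

    def forward(self, x):                    # x: (batch, time, 3, H, W)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)  # per-frame features
        out, _ = self.lstm(feats)            # temporal encoding
        return self.head(out)                # (batch, time, 2)

def inflate_conv2d(conv2d, time_depth):
    """Inflate a 2D conv into a 3D conv: repeat kernels over time, rescale."""
    conv3d = nn.Conv3d(conv2d.in_channels, conv2d.out_channels,
                       (time_depth, *conv2d.kernel_size),
                       padding=(time_depth // 2, *conv2d.padding))
    w = conv2d.weight.data.unsqueeze(2).repeat(1, 1, time_depth, 1, 1)
    conv3d.weight.data.copy_(w / time_depth)  # divide so activations keep their scale
    if conv2d.bias is not None:
        conv3d.bias.data.copy_(conv2d.bias.data)
    return conv3d
```

Dividing the repeated kernels by the temporal depth keeps the 3D filter's response on a temporally constant input equal to the original 2D filter's response, which is the usual way inflated 3D-CNNs bootstrap from 2D pre-trained weights before fine-tuning.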
Related papers
- Alleviating Catastrophic Forgetting in Facial Expression Recognition with Emotion-Centered Models [49.3179290313959]
The proposed method, emotion-centered generative replay (ECgr), tackles this challenge by integrating synthetic images from generative adversarial networks.
ECgr incorporates a quality assurance algorithm to ensure the fidelity of generated images.
The experimental results on four diverse facial expression datasets demonstrate that incorporating images generated by our pseudo-rehearsal method enhances training on the targeted dataset and the source dataset.
arXiv Detail & Related papers (2024-04-18T15:28:34Z) - Deep Learning Approaches for Human Action Recognition in Video Data [0.8080830346931087]
This study conducts an in-depth analysis of various deep learning models to address this challenge.
We focus on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Two-Stream ConvNets.
The results of this study underscore the potential of composite models in achieving robust human action recognition.
arXiv Detail & Related papers (2024-03-11T15:31:25Z) - A Hybrid End-to-End Spatio-Temporal Attention Neural Network with
Graph-Smooth Signals for EEG Emotion Recognition [1.6328866317851187]
We introduce a deep neural network that acquires interpretable representations via a hybrid structure of spatio-temporal encoding and recurrent attention blocks.
We demonstrate that our proposed architecture exceeds state-of-the-art results for emotion classification on the publicly available DEAP dataset.
arXiv Detail & Related papers (2023-07-06T15:35:14Z) - Facial Expressions Recognition with Convolutional Neural Networks [0.0]
We implement a system for facial expression recognition (FER) using neural networks.
We demonstrate a state-of-the-art single-network accuracy of 70.10% on the FER2013 dataset without using any additional training data.
arXiv Detail & Related papers (2021-07-19T06:41:00Z) - Video-based Facial Expression Recognition using Graph Convolutional
Networks [57.980827038988735]
We introduce a Graph Convolutional Network (GCN) layer into a common CNN-RNN based model for video-based facial expression recognition.
We evaluate our method on three widely-used datasets, CK+, Oulu-CASIA and MMI, and also one challenging wild dataset AFEW8.0.
arXiv Detail & Related papers (2020-10-26T07:31:51Z) - The FaceChannel: A Fast & Furious Deep Neural Network for Facial
Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train.
We formalize the FaceChannel, a light-weight neural network that has much fewer parameters than common deep neural networks.
We demonstrate how our model achieves a comparable, if not better, performance to the current state-of-the-art in FER.
arXiv Detail & Related papers (2020-09-15T09:25:37Z) - Real-time Facial Expression Recognition "In The Wild" by Disentangling
3D Expression from Identity [6.974241731162878]
This paper proposes a novel method for human emotion recognition from a single RGB image.
We construct a large-scale dataset of facial videos, rich in facial dynamics, identities, expressions, appearance and 3D pose variations.
Our proposed framework runs at 50 frames per second and is capable of robustly estimating parameters of 3D expression variation.
arXiv Detail & Related papers (2020-05-12T01:32:55Z) - TimeConvNets: A Deep Time Windowed Convolution Neural Network Design for
Real-time Video Facial Expression Recognition [93.0013343535411]
This study explores a novel deep time windowed convolutional neural network design (TimeConvNets) for the purpose of real-time video facial expression recognition.
We show that TimeConvNets can better capture the transient nuances of facial expressions and boost classification accuracy while maintaining a low inference time.
arXiv Detail & Related papers (2020-03-03T20:58:52Z) - Continuous Emotion Recognition via Deep Convolutional Autoencoder and
Support Vector Regressor [70.2226417364135]
It is crucial that a machine be able to recognize the emotional state of the user with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.