TimeConvNets: A Deep Time Windowed Convolution Neural Network Design for
Real-time Video Facial Expression Recognition
- URL: http://arxiv.org/abs/2003.01791v1
- Date: Tue, 3 Mar 2020 20:58:52 GMT
- Authors: James Ren Hou Lee and Alexander Wong
- Abstract summary: This study explores a novel deep time windowed convolutional neural network design (TimeConvNets) for the purpose of real-time video facial expression recognition.
We show that TimeConvNets can better capture the transient nuances of facial expressions and boost classification accuracy while maintaining a low inference time.
- Score: 93.0013343535411
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A core challenge faced by the majority of individuals with Autism Spectrum
Disorder (ASD) is an impaired ability to infer other people's emotions based on
their facial expressions. Given significant recent advances in machine
learning, one potential way to leverage technology to help such individuals
better recognize facial expressions, and thereby reduce the risk of
loneliness and depression due to social isolation, is the design of computer
vision-driven facial expression recognition systems. Motivated by this
social need as well as
the low latency requirement of such systems, this study explores a novel deep
time windowed convolutional neural network design (TimeConvNets) for the
purpose of real-time video facial expression recognition. More specifically, we
explore an efficient convolutional deep neural network design for
spatiotemporal encoding of time windowed video frame sub-sequences and study
the respective balance between speed and accuracy. Furthermore, to evaluate the
proposed TimeConvNet design, we introduce a more difficult dataset called
BigFaceX, composed of a modified aggregation of the extended Cohn-Kanade (CK+),
BAUM-1, and the eNTERFACE public datasets. Different variants of the proposed
TimeConvNet design with different backbone network architectures were evaluated
using BigFaceX alongside other network designs for capturing spatiotemporal
information, and experimental results demonstrate that TimeConvNets can better
capture the transient nuances of facial expressions and boost classification
accuracy while maintaining a low inference time.
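The abstract does not give the exact window length or stride used by TimeConvNets, but the core idea of slicing a video into time-windowed frame sub-sequences for spatiotemporal encoding can be sketched as follows (the window and stride values here are illustrative, not the paper's):

```python
import numpy as np

def time_windows(video, window=5, stride=2):
    """Slice a video of shape (T, H, W, C) into overlapping
    sub-sequences of shape (num_windows, window, H, W, C)."""
    T = video.shape[0]
    starts = range(0, T - window + 1, stride)
    return np.stack([video[s:s + window] for s in starts])

# Example: a 16-frame "video" of 32x32 RGB frames.
clip = np.zeros((16, 32, 32, 3), dtype=np.float32)
subseqs = time_windows(clip, window=5, stride=2)
print(subseqs.shape)  # (6, 5, 32, 32, 3)
```

Each sub-sequence would then be fed to a backbone network for spatiotemporal encoding; overlapping windows let the classifier track transient changes in expression between consecutive predictions.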
Related papers
- The Disappearance of Timestep Embedding in Modern Time-Dependent Neural Networks [11.507779310946853]
We report a vulnerability of vanishing timestep embedding, which disables the time-awareness of a time-dependent neural network.
Our analysis provides a detailed description of this phenomenon as well as several solutions to address the root cause.
arXiv Detail & Related papers (2024-05-23T02:58:23Z) - STIP: A SpatioTemporal Information-Preserving and Perception-Augmented
Model for High-Resolution Video Prediction [78.129039340528]
We propose a SpatioTemporal Information-Preserving and Perception-Augmented Model (STIP) to solve the above two problems.
The proposed model aims to preserve the spatiotemporal information of videos during feature extraction and state transitions.
Experimental results show that the proposed STIP can predict videos with more satisfactory visual quality compared with a variety of state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T09:49:04Z) - Backpropagation with Biologically Plausible Spatio-Temporal Adjustment
For Training Deep Spiking Neural Networks [5.484391472233163]
The success of deep learning is inseparable from backpropagation.
First, we propose a biologically plausible spatial adjustment, which rethinks the relationship between membrane potential and spikes.
Second, we propose a biologically plausible temporal adjustment that lets the error propagate across spikes in the temporal dimension.
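The temporal adjustment summarized above resembles backpropagation through time with a surrogate derivative for the non-differentiable spike function. As a rough illustration (not the authors' exact formulation), a leaky integrate-and-fire neuron with a rectangular surrogate gradient can propagate errors backwards across the temporal dimension like this:

```python
import numpy as np

THRESH, DECAY, A = 1.0, 0.9, 0.5  # spike threshold, leak factor, surrogate half-width

def lif_forward(x):
    """Leaky integrate-and-fire neuron over a time series of inputs;
    returns spikes and pre-reset membrane potentials (hard reset after a spike)."""
    v, vs, spikes = 0.0, [], []
    for xt in x:
        v = DECAY * v + xt
        s = float(v >= THRESH)
        vs.append(v)
        spikes.append(s)
        v = v * (1.0 - s)  # hard reset on spike
    return np.array(spikes), np.array(vs)

def surrogate(v):
    # Rectangular surrogate for the spike function's derivative.
    return (np.abs(v - THRESH) < A) / (2.0 * A)

def lif_backward(x, grad_spikes):
    """Propagate errors across spikes in the temporal dimension (BPTT-style)."""
    spikes, vs = lif_forward(x)
    grad_x = np.zeros_like(x)
    gv = 0.0  # dL/dv carried backwards in time
    for t in reversed(range(len(x))):
        gv = grad_spikes[t] * surrogate(vs[t]) + gv * DECAY * (1.0 - spikes[t])
        grad_x[t] = gv
    return grad_x

x = np.array([0.5, 0.2, 0.1, 0.1])
grad = lif_backward(x, grad_spikes=np.array([0.0, 0.0, 0.0, 1.0]))
# Error injected at the last step reaches earlier timesteps through the leak term.
```

The key point the summary makes is the backward recurrence: error at time t flows both through the surrogate spike derivative and backwards through the leaky membrane dynamics to earlier timesteps.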
arXiv Detail & Related papers (2021-10-17T15:55:51Z) - Facial Expressions Recognition with Convolutional Neural Networks [0.0]
We implement a system for facial expression recognition (FER) by leveraging neural networks.
We demonstrate a state-of-the-art single-network accuracy of 70.10% on the FER2013 dataset without using any additional training data.
arXiv Detail & Related papers (2021-07-19T06:41:00Z) - Continuous Emotion Recognition with Spatiotemporal Convolutional Neural
Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory (LSTM) units, as well as inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
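Weight inflation as summarized above (repeating a pre-trained 2D kernel along the time axis and rescaling so that the network's response on a static clip matches the original 2D network) can be sketched as:

```python
import numpy as np

def inflate_2d_to_3d(w2d, time_depth):
    """Inflate a pre-trained 2D conv kernel of shape (out, in, kH, kW)
    into a 3D kernel of shape (out, in, time_depth, kH, kW) by repeating
    it along the new time axis and dividing by the depth, so that summing
    over identical frames reproduces the 2D response."""
    w3d = np.repeat(w2d[:, :, None, :, :], time_depth, axis=2)
    return w3d / time_depth

# Illustrative shapes: 8 output channels, 3 input channels, 3x3 kernel.
w2d = np.random.randn(8, 3, 3, 3)
w3d = inflate_2d_to_3d(w2d, time_depth=5)
```

Dividing by the depth is the standard bootstrapping trick: a 3D convolution over `time_depth` copies of the same frame then yields exactly the pre-trained 2D activations, giving fine-tuning a sensible starting point.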
arXiv Detail & Related papers (2020-11-18T13:42:05Z) - AEGIS: A real-time multimodal augmented reality computer vision based
system to assist facial expression recognition for individuals with autism
spectrum disorder [93.0013343535411]
This paper presents the development of a multimodal augmented reality (AR) system which combines computer vision and deep convolutional neural networks (CNNs).
The proposed system, which we call AEGIS, is an assistive technology deployable on a variety of user devices including tablets, smartphones, video conference systems, or smartglasses.
We leverage both spatial and temporal information in order to provide an accurate expression prediction, which is then converted into its corresponding visualization and drawn on top of the original video frame.
arXiv Detail & Related papers (2020-10-22T17:20:38Z) - The FaceChannel: A Fast & Furious Deep Neural Network for Facial
Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train.
We formalize the FaceChannel, a light-weight neural network that has much fewer parameters than common deep neural networks.
We demonstrate how our model achieves a comparable, if not better, performance to the current state-of-the-art in FER.
arXiv Detail & Related papers (2020-09-15T09:25:37Z) - The FaceChannel: A Light-weight Deep Neural Network for Facial
Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic FER are based on very deep neural networks that are difficult to train.
We formalize the FaceChannel, a light-weight neural network that has much fewer parameters than common deep neural networks.
We demonstrate how the FaceChannel achieves a comparable, if not better, performance, as compared to the current state-of-the-art in FER.
arXiv Detail & Related papers (2020-04-17T12:03:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.