Multi-Domain Norm-referenced Encoding Enables Data Efficient Transfer
Learning of Facial Expression Recognition
- URL: http://arxiv.org/abs/2304.02309v1
- Date: Wed, 5 Apr 2023 09:06:30 GMT
- Title: Multi-Domain Norm-referenced Encoding Enables Data Efficient Transfer
Learning of Facial Expression Recognition
- Authors: Michael Stettler, Alexander Lappe, Nick Taubert, Martin Giese
- Abstract summary: We propose a biologically-inspired mechanism for transfer learning in facial expression recognition.
Our proposed architecture provides an explanation for how the human brain might innately recognize facial expressions on varying head shapes.
Our model achieves a classification accuracy of 92.15% on the FERG dataset with extreme data efficiency.
- Score: 62.997667081978825
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: People can innately recognize human facial expressions in unnatural forms,
such as when depicted on the unusual faces drawn in cartoons or when applied to
an animal's features. However, current machine learning algorithms struggle
with out-of-domain transfer in facial expression recognition (FER). We propose
a biologically-inspired mechanism for such transfer learning, which is based on
norm-referenced encoding, where patterns are encoded in terms of difference
vectors relative to a domain-specific reference vector. By incorporating
domain-specific reference frames, we demonstrate high data efficiency in
transfer learning across multiple domains. Our proposed architecture provides
an explanation for how the human brain might innately recognize facial
expressions on varying head shapes (humans, monkeys, and cartoon avatars)
without extensive training. Norm-referenced encoding also allows the intensity
of the expression to be read out directly from neural unit activity, similar to
face-selective neurons in the brain. Our model achieves a classification
accuracy of 92.15\% on the FERG dataset with extreme data efficiency. We train
our proposed mechanism with only 12 images, including a single image of each
class (facial expression) and one image per domain (avatar). In comparison, the
authors of the FERG dataset achieved a classification accuracy of 89.02\% with
their FaceExpr model, which was trained on 43,000 images.
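The core mechanism described in the abstract, encoding a face as a difference vector from a domain-specific reference and reading expression class from the vector's direction and intensity from its length, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the feature dimension, the random templates, and all variable names are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16  # illustrative feature dimension, not from the paper

# One reference (neutral-face) vector per domain, e.g. human / monkey / cartoon avatar.
references = {d: rng.normal(size=dim) for d in ["human", "monkey", "cartoon"]}

# One unit direction template per expression class (in the paper, each is
# learned from a single example image; here they are random placeholders).
expressions = {e: rng.normal(size=dim) for e in ["happy", "sad", "angry"]}
for e in expressions:
    expressions[e] /= np.linalg.norm(expressions[e])

def encode(feature_vec, domain):
    """Norm-referenced code: difference from the domain's reference vector."""
    return feature_vec - references[domain]

def classify(feature_vec, domain):
    """Direction of the difference vector selects the expression class;
    its length serves as a direct read-out of expression intensity."""
    diff = encode(feature_vec, domain)
    norm = np.linalg.norm(diff)
    direction = diff / (norm + 1e-12)
    scores = {e: float(direction @ t) for e, t in expressions.items()}
    label = max(scores, key=scores.get)
    return label, norm

# A stimulus constructed as reference + intensity * expression direction
# is recovered with the matching label and intensity, regardless of domain.
stim = references["cartoon"] + 0.7 * expressions["happy"]
label, intensity = classify(stim, "cartoon")
```

Because the reference vector absorbs the domain-specific head shape, the same expression templates transfer across domains, which is why so few training images (one per class plus one per domain) suffice in this scheme.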
Related papers
- Improving Generalization of Image Captioning with Unsupervised Prompt
Learning [63.26197177542422]
Generalization of Image Captioning (GeneIC) learns a domain-specific prompt vector for the target domain without requiring annotated data.
GeneIC aligns visual and language modalities with a pre-trained Contrastive Language-Image Pre-Training (CLIP) model.
arXiv Detail & Related papers (2023-08-05T12:27:01Z)
- Disentangling Identity and Pose for Facial Expression Recognition [54.50747989860957]
We propose an identity and pose disentangled facial expression recognition (IPD-FER) model to learn more discriminative feature representation.
For identity encoder, a well pre-trained face recognition model is utilized and fixed during training, which alleviates the restriction on specific expression training data.
By comparing the difference between synthesized neutral and expressional images of the same individual, the expression component is further disentangled from identity and pose.
arXiv Detail & Related papers (2022-08-17T06:48:13Z)
- Exploiting Emotional Dependencies with Graph Convolutional Networks for
Facial Expression Recognition [31.40575057347465]
This paper proposes a novel multi-task learning framework to recognize facial expressions in-the-wild.
A shared feature representation is learned for both discrete and continuous recognition in a MTL setting.
The results of our experiments show that our method outperforms the current state-of-the-art methods on discrete FER.
arXiv Detail & Related papers (2021-06-07T10:20:05Z)
- Facial expression and attributes recognition based on multi-task
learning of lightweight neural networks [9.162936410696409]
We examine the multi-task training of lightweight convolutional neural networks for face identification and classification of facial attributes.
It is shown that it is still necessary to fine-tune these networks in order to predict facial expressions.
Several models are presented based on MobileNet, EfficientNet and RexNet architectures.
arXiv Detail & Related papers (2021-03-31T14:21:04Z)
- Human Expression Recognition using Facial Shape Based Fourier
Descriptors Fusion [15.063379178217717]
This paper aims to produce a new facial expression recognition method based on the changes in the facial muscles.
The geometric features are used to specify the facial regions i.e., mouth, eyes, and nose.
A multi-class support vector machine is applied to classify seven human expressions.
arXiv Detail & Related papers (2020-12-28T05:01:44Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural
Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory (LSTM) units, as well as inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Video-based Facial Expression Recognition using Graph Convolutional
Networks [57.980827038988735]
We introduce a Graph Convolutional Network (GCN) layer into a common CNN-RNN based model for video-based facial expression recognition.
We evaluate our method on three widely-used datasets, CK+, Oulu-CASIA and MMI, and also one challenging wild dataset AFEW8.0.
arXiv Detail & Related papers (2020-10-26T07:31:51Z)
- Seeing eye-to-eye? A comparison of object recognition performance in
humans and deep convolutional neural networks under image manipulation [0.0]
This study aims towards a behavioral comparison of visual core object recognition performance between humans and feedforward neural networks.
Analyses of accuracy revealed that humans not only outperform DCNNs on all conditions, but also display significantly greater robustness towards shape and most notably color alterations.
arXiv Detail & Related papers (2020-07-13T10:26:30Z)
- Domain Adaptation with Morphologic Segmentation [8.0698976170854]
We present a novel domain adaptation framework that uses morphologic segmentation to translate images from arbitrary input domains (real and synthetic) into a uniform output domain.
Our goal is to establish a preprocessing step that unifies data from multiple sources into a common representation.
We showcase the effectiveness of our approach by qualitatively and quantitatively evaluating our method on four data sets of simulated and real data of urban scenes.
arXiv Detail & Related papers (2020-06-16T17:06:02Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.