Landmark-Aware and Part-based Ensemble Transfer Learning Network for
Facial Expression Recognition from Static images
- URL: http://arxiv.org/abs/2104.11274v1
- Date: Thu, 22 Apr 2021 18:38:33 GMT
- Title: Landmark-Aware and Part-based Ensemble Transfer Learning Network for
Facial Expression Recognition from Static images
- Authors: Rohan Wadhawan and Tapan K. Gandhi
- Abstract summary: Part-based Ensemble Transfer Learning network models how humans recognize facial expressions.
It consists of 5 sub-networks, in which each sub-network performs transfer learning from one of the five subsets of facial landmarks.
It requires only 3.28 $\times$ $10^{6}$ FLOPS, which ensures computational efficiency for real-time deployment.
- Score: 0.5156484100374059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial Expression Recognition from static images is a challenging problem in
computer vision applications. Convolutional Neural Network (CNN), the
state-of-the-art method for various computer vision tasks, has had limited
success in predicting expressions from faces having extreme poses,
illumination, and occlusion conditions. To mitigate this issue, CNNs are often
accompanied by techniques like transfer, multi-task, or ensemble learning that
often provide high accuracy at the cost of high computational complexity. In
this work, we propose a Part-based Ensemble Transfer Learning network, which
models how humans recognize facial expressions by correlating the spatial
orientation pattern of the facial features with a specific expression. It
consists of 5 sub-networks, in which each sub-network performs transfer
learning from one of the five subsets of facial landmarks: eyebrows, eyes,
nose, mouth, or jaw to expression classification. We test the proposed network
on the CK+, JAFFE, and SFEW datasets, and it outperforms the benchmark for CK+
and JAFFE datasets by 0.51\% and 5.34\%, respectively. Additionally, it
consists of a total of 1.65M model parameters and requires only 3.28 $\times$
$10^{6}$ FLOPS, which ensures computational efficiency for real-time
deployment. Grad-CAM visualizations of our proposed ensemble highlight the
complementary nature of its sub-networks, a key design parameter of an
effective ensemble network. Lastly, cross-dataset evaluation results reveal
that our proposed ensemble has a high generalization capacity. Our model
trained on the SFEW Train dataset achieves an accuracy of 47.53\% on the CK+
dataset, which is higher than what it achieves on the SFEW Valid dataset.
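The part-based ensemble described in the abstract can be sketched minimally: five stand-in "sub-networks" (here fixed random linear maps, not the paper's transfer-learned CNNs) each score the features of one landmark subset, and their class probabilities are averaged. The part names follow the abstract; the feature size, class count, and averaging rule are illustrative assumptions, not the authors' exact fusion scheme.

```python
import numpy as np

# Landmark subsets named in the abstract, one per sub-network.
LANDMARK_PARTS = ["eyebrows", "eyes", "nose", "mouth", "jaw"]
NUM_EXPRESSIONS = 7  # assumed number of expression classes

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Stand-in sub-networks: each maps a part's feature vector to class logits.
weights = {part: rng.normal(size=(16, NUM_EXPRESSIONS)) for part in LANDMARK_PARTS}

def sub_network(part, features):
    return features @ weights[part]  # logits from one part-specific model

def ensemble_predict(part_features):
    """Average the per-part class probabilities (a simple ensemble rule)."""
    probs = [softmax(sub_network(p, f)) for p, f in part_features.items()]
    return np.mean(probs, axis=0)

features = {part: rng.normal(size=16) for part in LANDMARK_PARTS}
probs = ensemble_predict(features)
```

Averaging probabilities keeps the sub-networks complementary rather than competing, which is the ensemble property the Grad-CAM analysis in the abstract highlights.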
Related papers
- Bridging the Gaps: Utilizing Unlabeled Face Recognition Datasets to Boost Semi-Supervised Facial Expression Recognition [5.750927184237346]
We focus on utilizing large unlabeled Face Recognition (FR) datasets to boost semi-supervised FER.
Specifically, we first perform face reconstruction pre-training on large-scale facial images without annotations.
To further alleviate the scarcity of labeled and diverse images, we propose a Mixup-based data augmentation strategy.
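As a rough illustration of a Mixup-style augmentation (the paper's exact strategy may differ), two samples and their one-hot labels can be blended with a Beta-sampled coefficient:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two samples and their one-hot labels with a Beta(alpha, alpha) weight."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam

# Two toy "images" (flattened) with one-hot expression labels.
x1, y1 = np.full(8, 1.0), np.array([1.0, 0.0])
x2, y2 = np.full(8, 0.0), np.array([0.0, 1.0])
x, y, lam = mixup(x1, y1, x2, y2, alpha=0.2, rng=np.random.default_rng(0))
```

The soft labels let scarce labeled data cover the space between classes, which is the scarcity problem the summary describes.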
arXiv Detail & Related papers (2024-10-23T07:26:19Z)
- HSEmotion Team at the 7th ABAW Challenge: Multi-Task Learning and Compound Facial Expression Recognition [16.860963320038902]
We describe the results of the HSEmotion team in two tasks of the seventh Affective Behavior Analysis in-the-wild (ABAW) competition.
We propose an efficient pipeline based on frame-level facial feature extractors pre-trained in multi-task settings.
We ensure the privacy-awareness of our techniques by using the lightweight architectures of neural networks.
arXiv Detail & Related papers (2024-07-18T05:47:49Z)
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
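FEC's alternation between grouping pixels and updating representatives resembles a k-means loop; the sketch below is that simplified stand-in operating on raw pixel features, not the paper's deep-feature version:

```python
import numpy as np

def fec_style_clustering(pixels, k=3, iters=5, seed=0):
    """Alternate between assigning pixel features to representatives and
    updating each representative as the mean of its assigned features
    (a plain k-means loop standing in for FEC's alternation)."""
    rng = np.random.default_rng(seed)
    reps = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(iters):
        # Grouping step: nearest representative per pixel feature.
        d = np.linalg.norm(pixels[:, None, :] - reps[None, :, :], axis=-1)
        assign = d.argmin(axis=1)
        # Update step: recompute representatives from the current groups.
        for j in range(k):
            members = pixels[assign == j]
            if len(members):
                reps[j] = members.mean(axis=0)
    return reps, assign

# Three well-separated toy "pixel feature" blobs.
pixels = np.vstack([np.zeros((10, 4)), np.ones((10, 4)), np.full((10, 4), 5.0)])
reps, assign = fec_style_clustering(pixels, k=3)
```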
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
- Data Augmentation and Transfer Learning Approaches Applied to Facial Expressions Recognition [0.3481985817302898]
We propose a novel data augmentation technique that improves the performances in the recognition task.
We build from scratch GAN models able to generate new synthetic images for each emotion type.
On the augmented datasets we fine tune pretrained convolutional neural networks with different architectures.
arXiv Detail & Related papers (2024-02-15T14:46:03Z)
- SwinFace: A Multi-task Transformer for Face Recognition, Expression Recognition, Age Estimation and Attribute Estimation [60.94239810407917]
This paper presents a multi-purpose algorithm for simultaneous face recognition, facial expression recognition, age estimation, and face attribute estimation based on a single Swin Transformer.
To address the conflicts among multiple tasks, a Multi-Level Channel Attention (MLCA) module is integrated into each task-specific analysis.
Experiments show that the proposed model has a better understanding of the face and achieves excellent performance for all tasks.
arXiv Detail & Related papers (2023-08-22T15:38:39Z)
- Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision [106.77639982059014]
We present the ConST-CL framework to effectively learn spatio-temporally fine-grained representations.
We first design a region-based self-supervised task which requires the model to learn to transform instance representations from one view to another guided by context features.
We then introduce a simple design that effectively reconciles the simultaneous learning of both holistic and local representations.
arXiv Detail & Related papers (2021-12-09T19:13:41Z)
- Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z)
- Facial expression and attributes recognition based on multi-task learning of lightweight neural networks [9.162936410696409]
We examine the multi-task training of lightweight convolutional neural networks for face identification and classification of facial attributes.
It is shown that it is still necessary to fine-tune these networks in order to predict facial expressions.
Several models are presented based on MobileNet, EfficientNet and RexNet architectures.
arXiv Detail & Related papers (2021-03-31T14:21:04Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
- Deep Multi-Facial Patches Aggregation Network For Facial Expression Recognition [5.735035463793008]
We propose an approach for Facial Expressions Recognition (FER) based on a deep multi-facial patches aggregation network.
Deep features are learned from facial patches using deep sub-networks and aggregated within one deep architecture for expression classification.
arXiv Detail & Related papers (2020-02-20T17:57:06Z)
- Towards Reading Beyond Faces for Sparsity-Aware 4D Affect Recognition [55.15661254072032]
We present a sparsity-aware deep network for automatic 4D facial expression recognition (FER).
We first propose a novel augmentation method to combat the data limitation problem for deep learning.
We then present a sparsity-aware deep network to compute the sparse representations of convolutional features over multi-views.
arXiv Detail & Related papers (2020-02-08T13:09:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.