Benchmarking Deep Facial Expression Recognition: An Extensive Protocol
with Balanced Dataset in the Wild
- URL: http://arxiv.org/abs/2311.02910v1
- Date: Mon, 6 Nov 2023 06:48:49 GMT
- Title: Benchmarking Deep Facial Expression Recognition: An Extensive Protocol
with Balanced Dataset in the Wild
- Authors: Gianmarco Ipinze Tutuianu, Yang Liu, Ari Alamäki, Janne Kauttonen
- Abstract summary: Facial expression recognition (FER) is a crucial part of human-computer interaction.
We collected a new in-the-wild facial expression dataset for cross-domain validation.
We ranked network architectures and summarized a set of recommendations on deploying deep FER methods in real scenarios.
- Score: 5.044138778500218
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Facial expression recognition (FER) is a crucial part of human-computer
interaction. Existing FER methods achieve high accuracy and generalization
based on different open-source deep models and training approaches. However,
the performance of these methods is not always good when encountering practical
settings, which are seldom explored. In this paper, we collected a new
in-the-wild facial expression dataset for cross-domain validation. Twenty-three
commonly used network architectures were implemented and evaluated following a
uniform protocol. Moreover, various setups, in terms of input resolutions,
class balance management, and pre-trained strategies, were verified to show the
corresponding performance contribution. Based on extensive experiments on three
large-scale FER datasets and our practical cross-validation, we ranked network
architectures and summarized a set of recommendations on deploying deep FER
methods in real scenarios. In addition, potential ethical rules, privacy
issues, and regulations were discussed in practical FER applications such as
marketing, education, and entertainment business.
Related papers
- Semantics-Oriented Multitask Learning for DeepFake Detection: A Joint Embedding Approach [77.65459419417533]
We propose an automatic dataset expansion technique to support semantics-oriented DeepFake detection tasks.
We also resort to joint embedding of face images and their corresponding labels for prediction.
Our method improves the generalizability of DeepFake detection and renders some degree of model interpretation by providing human-understandable explanations.
arXiv Detail & Related papers (2024-08-29T07:11:50Z)
- Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents [16.78371134590167]
Key-value relations are prevalent in Visually-Rich Documents (VRDs).
These non-textual cues serve as important indicators that greatly enhance human comprehension and acquisition of such relation triplets.
Our research focuses on few-shot relational learning, specifically targeting the extraction of key-value relation triplets in VRDs.
arXiv Detail & Related papers (2024-03-23T08:40:35Z)
- Pre-trained Recommender Systems: A Causal Debiasing Perspective [19.712997823535066]
We develop a generic recommender that captures universal interaction patterns by training on generic user-item interaction data extracted from different domains.
Our empirical studies show that the proposed model could significantly improve the recommendation performance in zero- and few-shot learning settings.
arXiv Detail & Related papers (2023-10-30T03:37:32Z)
- NormAUG: Normalization-guided Augmentation for Domain Generalization [60.159546669021346]
We propose a simple yet effective method called NormAUG (Normalization-guided Augmentation) for deep learning.
Our method introduces diverse information at the feature level and improves the generalization of the main path.
In the test stage, we leverage an ensemble strategy to combine the predictions from the auxiliary path of our model, further boosting performance.
arXiv Detail & Related papers (2023-07-25T13:35:45Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
arXiv Detail & Related papers (2023-06-08T18:07:02Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Domain Generalization for Activity Recognition via Adaptive Feature Fusion [9.458837222079612]
We propose Adaptive Feature Fusion for Activity Recognition (AFFAR).
AFFAR learns to fuse the domain-invariant and domain-specific representations to improve the model's generalization performance.
We apply AFFAR to a real application, i.e., the diagnosis of Children's Attention Deficit Hyperactivity Disorder (ADHD).
arXiv Detail & Related papers (2022-07-21T02:14:09Z)
- One-Class Knowledge Distillation for Face Presentation Attack Detection [53.30584138746973]
This paper introduces a teacher-student framework to improve the cross-domain performance of face PAD with one-class domain adaptation.
Student networks are trained to mimic the teacher network and learn similar representations for genuine face samples of the target domain.
In the test phase, the similarity score between the representations of the teacher and student networks is used to distinguish attacks from genuine ones.
arXiv Detail & Related papers (2022-05-08T06:20:59Z)
- Deep Multi-Facial Patches Aggregation Network For Facial Expression Recognition [5.735035463793008]
We propose an approach for Facial Expressions Recognition (FER) based on a deep multi-facial patches aggregation network.
Deep features are learned from facial patches using deep sub-networks and aggregated within one deep architecture for expression classification.
arXiv Detail & Related papers (2020-02-20T17:57:06Z)