DFEW: A Large-Scale Database for Recognizing Dynamic Facial Expressions in the Wild
- URL: http://arxiv.org/abs/2008.05924v1
- Date: Thu, 13 Aug 2020 14:10:05 GMT
- Title: DFEW: A Large-Scale Database for Recognizing Dynamic Facial Expressions
in the Wild
- Authors: Xingxun Jiang, Yuan Zong, Wenming Zheng, Chuangao Tang, Wanchuang Xia,
Cheng Lu, Jiateng Liu
- Abstract summary: We present a new large-scale 'in-the-wild' dynamic facial expression database, DFEW, consisting of over 16,000 video clips from thousands of movies.
Second, we propose a novel Expression-Clustered Spatiotemporal Feature Learning (EC-STFL) framework to deal with dynamic FER in the wild.
Third, we conduct extensive benchmark experiments on DFEW using numerous spatiotemporal deep feature learning methods as well as our proposed EC-STFL.
- Score: 22.305429904593126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, facial expression recognition (FER) in the wild has attracted
considerable attention from researchers because it is key to moving FER
techniques from the laboratory to real-world applications. In this paper,
we focus on this challenging but interesting topic and make contributions from
three aspects. First, we present a new large-scale 'in-the-wild' dynamic facial
expression database, DFEW (Dynamic Facial Expression in the Wild), consisting
of over 16,000 video clips from thousands of movies. These video clips contain
various challenging interferences in practical scenarios such as extreme
illumination, occlusions, and capricious pose changes. Second, we propose a
novel method called Expression-Clustered Spatiotemporal Feature Learning
(EC-STFL) framework to deal with dynamic FER in the wild. Third, we conduct
extensive benchmark experiments on DFEW using many spatiotemporal deep
feature learning methods as well as our proposed EC-STFL. Experimental results
show that DFEW is a well-designed and challenging database, and the proposed
EC-STFL shows promise in improving the performance of existing spatiotemporal
deep neural networks on the problem of dynamic FER in the wild. Our DFEW
database is publicly available and can be freely downloaded from
https://dfew-dataset.github.io/.
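The abstract names the EC-STFL framework but does not spell out its objective. Purely as an illustrative sketch of the general "expression-clustered" idea (pulling clip-level features toward the centroid of their expression class), a clustering-style loss could look like the following; the function name and the exact loss form are assumptions, not the paper's actual formulation:

```python
from collections import defaultdict

def expression_cluster_loss(features, labels):
    """Hypothetical clustering-style loss: mean squared Euclidean distance
    of each clip-level feature vector to the centroid of its expression
    class. Illustrative only -- not the EC-STFL objective itself."""
    # Group feature vectors by expression label.
    groups = defaultdict(list)
    for f, y in zip(features, labels):
        groups[y].append(f)
    # Per-class centroids (element-wise mean over vectors in the class).
    centroids = {
        y: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for y, vecs in groups.items()
    }
    # Average squared distance of each vector to its own class centroid.
    total = 0.0
    for f, y in zip(features, labels):
        c = centroids[y]
        total += sum((a - b) ** 2 for a, b in zip(f, c))
    return total / len(features)
```

Minimizing a term of this shape encourages features of the same expression to cluster tightly, which is the intuition the framework's name suggests; the published method should be consulted for the real loss.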
Related papers
- A Survey on Facial Expression Recognition of Static and Dynamic Emotions [34.33582251069003]
Facial expression recognition (FER) aims to analyze emotional states from static images and dynamic sequences.
This paper offers a comprehensive survey of both image-based static FER (SFER) and video-based dynamic FER (DFER) methods.
arXiv Detail & Related papers (2024-08-28T13:15:25Z)
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- An Open-World, Diverse, Cross-Spatial-Temporal Benchmark for Dynamic Wild Person Re-Identification [58.5877965612088]
Person re-identification (ReID) has made great strides thanks to data-driven deep learning techniques.
The existing benchmark datasets lack diversity, and models trained on these data cannot generalize well to dynamic wild scenarios.
We develop a new Open-World, Diverse, Cross-Spatial-Temporal dataset named OWD with several distinct features.
arXiv Detail & Related papers (2024-03-22T11:21:51Z)
- From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos [88.08209394979178]
Dynamic facial expression recognition (DFER) in the wild is still hindered by data limitations.
We introduce a novel Static-to-Dynamic model (S2D) that leverages existing SFER knowledge and dynamic information implicitly encoded in extracted facial landmark-aware features.
arXiv Detail & Related papers (2023-12-09T03:16:09Z)
- SDFE-LV: A Large-Scale, Multi-Source, and Unconstrained Database for Spotting Dynamic Facial Expressions in Long Videos [21.7199719907133]
SDFE-LV consists of 1,191 long videos, each of which contains one or more complete dynamic facial expressions.
Each complete dynamic facial expression in its corresponding long video was independently labeled five times by 10 well-trained annotators.
arXiv Detail & Related papers (2022-09-18T01:59:12Z)
- Learning Vision Transformer with Squeeze and Excitation for Facial Expression Recognition [10.256620178727884]
We propose to learn a vision Transformer jointly with a Squeeze and Excitation (SE) block for FER task.
The proposed method is evaluated on several publicly available FER databases, including CK+, JAFFE, RAF-DB, and SFEW.
Experiments demonstrate that our model outperforms state-of-the-art methods on CK+ and SFEW.
arXiv Detail & Related papers (2021-07-07T09:49:01Z)
- Robust Facial Expression Recognition with Convolutional Visual Transformers [23.05378099875569]
We propose Convolutional Visual Transformers to tackle Facial Expression Recognition in the wild by two main steps.
First, we propose an attentional selective fusion (ASF) for leveraging the feature maps generated by two-branch CNNs.
Second, inspired by the success of Transformers in natural language processing, we propose to model relationships between these visual words with global self-attention.
arXiv Detail & Related papers (2021-03-31T07:07:56Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Video-based Facial Expression Recognition using Graph Convolutional Networks [57.980827038988735]
We introduce a Graph Convolutional Network (GCN) layer into a common CNN-RNN based model for video-based facial expression recognition.
We evaluate our method on three widely-used datasets, CK+, Oulu-CASIA and MMI, and also one challenging wild dataset AFEW8.0.
arXiv Detail & Related papers (2020-10-26T07:31:51Z)
- Learning to Augment Expressions for Few-shot Fine-grained Facial Expression Recognition [98.83578105374535]
We present a novel Fine-grained Facial Expression Database - F2ED.
It includes more than 200k images with 54 facial expressions from 119 persons.
Considering the phenomenon of uneven data distribution and lack of samples is common in real-world scenarios, we evaluate several tasks of few-shot expression learning.
We propose a unified task-driven framework - Compositional Generative Adversarial Network (Comp-GAN) learning to synthesize facial images.
arXiv Detail & Related papers (2020-01-17T03:26:32Z)
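The weight-inflation recipe mentioned in the continuous-emotion-recognition entry above is commonly implemented by replicating a pre-trained 2D kernel along the temporal axis and scaling by the temporal depth, so that on a temporally constant clip the inflated 3D filter reproduces the 2D filter's response. A minimal sketch under that assumption (the function name is hypothetical, and real implementations operate on framework tensors rather than nested lists):

```python
def inflate_kernel_2d_to_3d(kernel_2d, t):
    """Inflate a 2D conv kernel (H x W nested lists) into a 3D kernel
    (T x H x W) by replicating it t times along the temporal axis and
    scaling each weight by 1/t. On a temporally constant input, the
    3D convolution then matches the original 2D convolution's output."""
    return [
        [[w / t for w in row] for row in kernel_2d]  # one temporal slice
        for _ in range(t)
    ]
```

The 1/t scaling is what preserves the total kernel mass: the weights of the t temporal slices sum back to the original 2D weights, so the pre-trained activations are a sensible starting point for fine-tuning on video.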