Learning Disentangled Expression Representations from Facial Images
- URL: http://arxiv.org/abs/2008.07001v2
- Date: Tue, 18 Aug 2020 06:58:13 GMT
- Title: Learning Disentangled Expression Representations from Facial Images
- Authors: Marah Halawa, Manuel Wöllhaf, Eduardo Vellasques, Urko Sánchez Sanz, and Olaf Hellwich
- Abstract summary: We use a formulation of the adversarial loss to learn disentangled representations for face images.
The used model facilitates learning on single-task datasets and improves the state-of-the-art in expression recognition with an accuracy of 60.53%.
- Score: 2.2509387878255818
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face images are subject to many different factors of variation, especially in
unconstrained in-the-wild scenarios. For most tasks involving such images, e.g.
expression recognition from video streams, having enough labeled data is
prohibitively expensive. One common strategy to tackle such a problem is to
learn disentangled representations for the different factors of variation of
the observed data using adversarial learning. In this paper, we use a
formulation of the adversarial loss to learn disentangled representations for
face images. The used model facilitates learning on single-task datasets and
improves the state-of-the-art in expression recognition with an accuracy
of 60.53% on the AffectNet dataset, without using any additional data.
Related papers
- LightFFDNets: Lightweight Convolutional Neural Networks for Rapid Facial Forgery Detection [0.0]
This study focuses on image processing-based forgery detection using Fake-Vs-Real-Faces (Hard) [10] and 140k Real and Fake Faces [61] data sets.
Two lightweight deep learning models are proposed to conduct forgery detection using these images.
It's shown that the proposed lightweight deep learning models detect forgeries of facial imagery accurately, and computationally efficiently.
arXiv Detail & Related papers (2024-11-18T18:44:10Z)
- See or Guess: Counterfactually Regularized Image Captioning [32.82695612178604]
We present a generic image captioning framework that employs causal inference to make existing models more capable of interventional tasks, and counterfactually explainable.
Our method effectively reduces hallucinations and improves the model's faithfulness to images, demonstrating high portability across both small-scale and large-scale image-to-text models.
arXiv Detail & Related papers (2024-08-29T17:59:57Z)
- Contrastive Learning of View-Invariant Representations for Facial Expressions Recognition [27.75143621836449]
We propose ViewFX, a novel view-invariant FER framework based on contrastive learning.
We test the proposed framework on two public multi-view facial expression recognition datasets.
arXiv Detail & Related papers (2023-11-12T14:05:09Z)
- Multi-Domain Norm-referenced Encoding Enables Data Efficient Transfer Learning of Facial Expression Recognition [62.997667081978825]
We propose a biologically-inspired mechanism for transfer learning in facial expression recognition.
Our proposed architecture provides an explanation for how the human brain might innately recognize facial expressions on varying head shapes.
Our model achieves a classification accuracy of 92.15% on the FERG dataset with extreme data efficiency.
arXiv Detail & Related papers (2023-04-05T09:06:30Z)
- Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z)
- Context-driven Visual Object Recognition based on Knowledge Graphs [0.8701566919381223]
We propose an approach that enhances deep learning methods by using external contextual knowledge encoded in a knowledge graph.
We conduct a series of experiments to investigate the impact of different contextual views on the learned object representations for the same image dataset.
arXiv Detail & Related papers (2022-10-20T13:09:00Z)
- CIAO! A Contrastive Adaptation Mechanism for Non-Universal Facial Expression Recognition [80.07590100872548]
We propose Contrastive Inhibitory Adaptation (CIAO), a mechanism that adapts the last layer of facial encoders to depict specific affective characteristics on different datasets.
CIAO presents an improvement in facial expression recognition performance over six different datasets with very unique affective representations.
arXiv Detail & Related papers (2022-08-10T15:46:05Z)
- Semantic Diversity Learning for Zero-Shot Multi-label Classification [14.480713752871523]
This study introduces an end-to-end model training for multi-label zero-shot learning.
We propose to use an embedding matrix having principal embedding vectors trained using a tailored loss function.
In addition, during training, we suggest up-weighting in the loss function image samples presenting higher semantic diversity to encourage the diversity of the embedding matrix.
arXiv Detail & Related papers (2021-05-12T19:39:07Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configuration.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)