Occlusion-Robust FAU Recognition by Mining Latent Space of Masked
Autoencoders
- URL: http://arxiv.org/abs/2212.04029v1
- Date: Thu, 8 Dec 2022 01:57:48 GMT
- Title: Occlusion-Robust FAU Recognition by Mining Latent Space of Masked
Autoencoders
- Authors: Minyang Jiang, Yongwei Wang, Martin J. McKeown and Z. Jane Wang
- Abstract summary: Facial action units (FAUs) are critical for fine-grained facial expression analysis.
The new approach takes advantage of the rich information in the latent space of a masked autoencoder (MAE) and transforms it into FAU features.
Even under heavy occlusion, the proposed method achieves performance comparable to that of state-of-the-art methods under normal conditions.
- Score: 23.39566752915331
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial action units (FAUs) are critical for fine-grained facial expression analysis. Although FAU detection has been actively studied on ideal, high-quality images, it has not been thoroughly studied under heavily occluded conditions. In this paper, we propose the first occlusion-robust FAU recognition method, designed to maintain FAU detection performance under heavy occlusions. Our novel approach takes advantage of the rich information in the latent space of a masked autoencoder (MAE) and transforms it into FAU features. Bypassing the occlusion-reconstruction step, our model efficiently extracts FAU features of occluded faces by mining the latent space of a pretrained masked autoencoder. Node- and edge-level knowledge distillation are also employed to guide our model toward a mapping between latent-space vectors and FAU features. Facial occlusion conditions, including random small patches and large blocks, are studied thoroughly. Experimental results on the BP4D and DISFA datasets show that our method achieves state-of-the-art performance under the studied facial occlusions, significantly outperforming existing baseline methods. In particular, even under heavy occlusion, the proposed method achieves performance comparable to that of state-of-the-art methods under normal conditions.
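The abstract sketches a student-teacher structure: a lightweight head maps the latent tokens of a frozen, pretrained MAE (computed on the occluded face) to per-AU features and predictions, while node-level and edge-level knowledge distillation align those features with a teacher trained on unoccluded faces. The PyTorch sketch below is only an illustrative reading of that description; the module layout, the cosine-similarity definition of an "edge", the dimensions, and the loss weights are assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentToFAU(nn.Module):
    """Hypothetical student head: maps frozen MAE latent tokens to per-AU features."""
    def __init__(self, latent_dim=768, num_aus=12, feat_dim=128):
        super().__init__()
        self.proj = nn.Linear(latent_dim, feat_dim)
        # One learnable query per AU attends over the MAE latent tokens.
        self.au_queries = nn.Parameter(torch.randn(num_aus, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, mae_latents):                      # (B, num_tokens, latent_dim)
        tokens = self.proj(mae_latents)                  # (B, num_tokens, feat_dim)
        queries = self.au_queries.unsqueeze(0).expand(tokens.size(0), -1, -1).contiguous()
        au_feats, _ = self.attn(queries, tokens, tokens) # (B, num_aus, feat_dim)
        logits = self.classifier(au_feats).squeeze(-1)   # (B, num_aus)
        return au_feats, logits

def distillation_loss(student_feats, teacher_feats, logits, labels, w_node=1.0, w_edge=1.0):
    """Node-level KD matches per-AU features; edge-level KD matches pairwise AU relations."""
    node_kd = F.mse_loss(student_feats, teacher_feats)
    def edges(f):                                        # assumed "edge" = cosine-similarity matrix
        f = F.normalize(f, dim=-1)
        return f @ f.transpose(1, 2)                     # (B, num_aus, num_aus)
    edge_kd = F.mse_loss(edges(student_feats), edges(teacher_feats))
    bce = F.binary_cross_entropy_with_logits(logits, labels)
    return bce + w_node * node_kd + w_edge * edge_kd
```

In such a setup, `teacher_feats` would come from a teacher head applied to the unoccluded image, `mae_latents` from the frozen MAE applied to the occluded one, and only the student head would be updated.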
Related papers
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and are not limited to forgery-specific artifacts, so they generalize more strongly.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z)
- Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection [1.0358639819750703]
In unsupervised anomaly detection (UAD) research, it is necessary to develop a computationally efficient and scalable solution.
We revisit the reconstruction-by-inpainting approach and rethink how to improve it by analyzing its strengths and weaknesses.
We propose Feature Attenuation of Defective Representation (FADeR), which employs only two layers that attenuate the feature information of anomaly reconstruction.
arXiv Detail & Related papers (2024-07-05T15:44:53Z)
- Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model [15.61920157541529]
We propose a novel Deepfake detection approach that adapts foundation models and the rich information encoded inside them.
Inspired by recent advances in parameter-efficient fine-tuning, we propose a novel side-network-based decoder.
Our approach exhibits superior effectiveness in identifying unseen Deepfake samples, achieving notable performance improvement.
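In the parameter-efficient fine-tuning literature, a side-network decoder usually means a small trainable network that runs alongside a frozen backbone and consumes its intermediate activations. The sketch below illustrates that general pattern with a frozen torchvision ViT-B/16; the tapping points, fusion scheme, and classification head are illustrative assumptions, not the paper's design.

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16

class SideNetworkDecoder(nn.Module):
    """Generic side network: fuse intermediate features of a frozen ViT into a real/fake score."""
    def __init__(self, hidden_dim: int = 768, side_dim: int = 128):
        super().__init__()
        # In practice a pretrained foundation model would be loaded here.
        self.backbone = vit_b_16(weights=None).eval()
        for p in self.backbone.parameters():
            p.requires_grad = False                      # backbone stays frozen
        self._taps = []                                  # intermediate activations from hooks
        for block in self.backbone.encoder.layers:
            block.register_forward_hook(lambda m, i, o: self._taps.append(o))
        # Trainable side path: project each tapped layer's CLS token, average, classify.
        self.proj = nn.Linear(hidden_dim, side_dim)
        self.head = nn.Linear(side_dim, 1)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        self._taps.clear()
        with torch.no_grad():
            self.backbone(images)                        # populates self._taps via the hooks
        cls_tokens = torch.stack([t[:, 0] for t in self._taps], dim=1)  # (B, num_layers, hidden_dim)
        side = self.proj(cls_tokens).mean(dim=1)                        # (B, side_dim)
        return self.head(side).squeeze(-1)                              # real/fake logit

if __name__ == "__main__":
    model = SideNetworkDecoder()
    print(model(torch.randn(2, 3, 224, 224)).shape)      # torch.Size([2])
```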
arXiv Detail & Related papers (2024-04-08T14:58:52Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the Stable Diffusion denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for Occluded Facial Expression Recognition [0.0]
The proposed method can detect occluded parts of the face as if they were unoccluded, and recognize them, improving FER accuracy.
It involves three steps: First, the vision transformer (ViT)-based occlusion patch detector masks the occluded position by training only latent vectors from the unoccluded patches.
Second, the hybrid reconstruction network reconstructs the masked positions into a complete image using the ViT and a convolutional neural network (CNN).
Last, the expression-relevant latent vector extractor retrieves and uses expression-related information from all latent vectors by applying a CNN-based class activation map.
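The three-step pipeline described above lends itself to a compact outline. The version below is a hypothetical Python sketch with stand-in components (the occlusion detector, reconstruction network, CAM-guided extractor, and classifier are injected and only their interfaces are assumed); it is not the authors' code.

```python
import torch
import torch.nn as nn

class OcclusionAwareFER(nn.Module):
    """Hypothetical outline of the described detect-mask-reconstruct-extract pipeline."""
    def __init__(self, occlusion_detector, reconstructor, expression_extractor, classifier):
        super().__init__()
        self.occlusion_detector = occlusion_detector      # ViT-based patch-level occlusion detector
        self.reconstructor = reconstructor                # hybrid ViT+CNN inpainting network
        self.expression_extractor = expression_extractor  # CAM-guided latent-vector selector
        self.classifier = classifier

    def forward(self, image):
        occluded = self.occlusion_detector(image)         # step 1: flag occluded patches
        completed = self.reconstructor(image, occluded)   # step 2: inpaint masked positions
        expr_feats = self.expression_extractor(completed) # step 3: keep expression-relevant latents
        return self.classifier(expr_feats)

if __name__ == "__main__":
    B, P = 2, 196
    dummy = OcclusionAwareFER(
        occlusion_detector=lambda img: torch.zeros(B, P),         # pretend nothing is occluded
        reconstructor=lambda img, mask: img,                      # identity "inpainting"
        expression_extractor=lambda img: img.flatten(1)[:, :128], # fake 128-d latent selection
        classifier=nn.Linear(128, 7),                             # 7 basic expressions
    )
    print(dummy(torch.randn(B, 3, 224, 224)).shape)               # torch.Size([2, 7])
```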
arXiv Detail & Related papers (2023-07-21T07:56:32Z)
- RARE: Robust Masked Graph Autoencoder [45.485891794905946]
Masked graph autoencoder (MGAE) has emerged as a promising self-supervised graph pre-training (SGP) paradigm.
We propose a novel SGP method termed Robust mAsked gRaph autoEncoder (RARE) to improve the certainty in inferring masked data.
arXiv Detail & Related papers (2023-04-04T03:35:29Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
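Cluster-level pseudo-labelling, as named in the title, is commonly realized by clustering target-domain features and treating the cluster assignments as training labels. The snippet below is a generic, hedged illustration of that idea with scikit-learn k-means; it is not the paper's actual procedure, and the feature dimensions and class count are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_pseudo_labels(features: np.ndarray, num_classes: int, seed: int = 0):
    """Assign pseudo-labels to unlabeled target-domain embeddings via k-means clustering."""
    kmeans = KMeans(n_clusters=num_classes, n_init=10, random_state=seed)
    pseudo_labels = kmeans.fit_predict(features)          # one cluster id per sample
    return pseudo_labels, kmeans.cluster_centers_

if __name__ == "__main__":
    # Placeholder: 1,000 target samples with 512-d self-supervised embeddings, 7 expressions.
    feats = np.random.randn(1000, 512).astype(np.float32)
    labels, centers = cluster_pseudo_labels(feats, num_classes=7)
    print(labels.shape, centers.shape)                    # (1000,) (7, 512)
```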
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part.
We propose a novel Dynamic Prototype Mask (DPM) based on two pieces of self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z)
- Robust and Precise Facial Landmark Detection by Self-Calibrated Pose Attention Network [73.56802915291917]
We propose a semi-supervised framework to achieve more robust and precise facial landmark detection.
A Boundary-Aware Landmark Intensity (BALI) field is proposed to model more effective facial shape constraints.
A Self-Calibrated Pose Attention (SCPA) model is designed to provide a self-learned objective function that enforces intermediate supervision.
arXiv Detail & Related papers (2021-12-23T02:51:08Z)
- End2End Occluded Face Recognition by Masking Corrupted Features [82.27588990277192]
State-of-the-art general face recognition models do not generalize well to occluded face images.
This paper presents a novel face recognition method that is robust to occlusions based on a single end-to-end deep neural network.
Our approach, named FROM (Face Recognition with Occlusion Masks), learns to discover the corrupted features from the deep convolutional neural networks and clean them with dynamically learned masks.
arXiv Detail & Related papers (2021-08-21T09:08:41Z)
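FROM's core idea, as summarized in the entry above, is to predict a mask over deep feature maps so that occlusion-corrupted responses are suppressed before recognition. Below is a minimal, hypothetical sketch of that mechanism; the mask generator's architecture is an assumption and this is not the published FROM network.

```python
import torch
import torch.nn as nn

class FeatureMasking(nn.Module):
    """Hypothetical sketch: predict a mask from deep features and suppress corrupted entries."""
    def __init__(self, channels: int):
        super().__init__()
        # Small mask generator operating on the backbone's feature map.
        self.mask_net = nn.Sequential(
            nn.Conv2d(channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, kernel_size=1),
            nn.Sigmoid(),                                 # values in (0, 1): 0 suppresses, 1 keeps
        )

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        mask = self.mask_net(feature_map)                 # dynamically learned, input-dependent mask
        return feature_map * mask                         # "cleaned" features for the identity head

if __name__ == "__main__":
    f = torch.randn(2, 256, 14, 14)                       # feature map from a face-recognition backbone
    print(FeatureMasking(256)(f).shape)                   # torch.Size([2, 256, 14, 14])
```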