Identity-aware Facial Expression Recognition in Compressed Video
- URL: http://arxiv.org/abs/2101.00317v2
- Date: Thu, 7 Jan 2021 23:46:22 GMT
- Title: Identity-aware Facial Expression Recognition in Compressed Video
- Authors: Xiaofeng Liu, Linghao Jin, Xu Han, Jun Lu, Jane You, Lingsheng Kong
- Abstract summary: In the compressed domain, where data is compressed by up to two orders of magnitude, we can explicitly infer the expression from the residual frames.
We do not need the identity label or multiple expression samples from the same person for identity elimination.
Our solution can achieve comparable or better performance than recent decoded-image-based methods.
- Score: 27.14473209125735
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores facial expression representations in the
compressed video domain from which inter-subject variations have been
eliminated. Most previous methods process the RGB images of a sequence, even
though valuable, off-the-shelf expression-related muscle movement is already
embedded in the compression format. In the compressed domain, where data is
compressed by up to two orders of magnitude, we can explicitly infer the
expression from the residual frames, and it is possible to extract identity
factors from the I frame with a pre-trained face recognition network. By
enforcing marginal independence between the two, the expression feature is
expected to be purer for the expression and more robust to identity shifts. We
do not need the identity label or multiple expression samples from the same
person for identity elimination. Moreover, when the apex frame is annotated in
the dataset, a complementary constraint can be added to further regularize
the feature-level game. In testing, only the compressed residual frames are
required to achieve expression prediction. Our solution can achieve comparable
or better performance than recent decoded-image-based methods on
typical FER benchmarks, with about 3$\times$ faster inference on compressed
data.
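The marginal-independence idea in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's method: the abstract refers to a feature-level game whose details are not given here. Instead it uses a cross-covariance penalty, which is zero when expression and identity features are linearly uncorrelated (a necessary, not sufficient, condition for independence). All function names and toy feature values are hypothetical.

```python
from typing import List, Sequence

Batch = Sequence[Sequence[float]]


def center(batch: Batch) -> List[List[float]]:
    """Subtract the per-dimension mean from every feature vector in the batch."""
    n = len(batch)
    dims = len(batch[0])
    means = [sum(row[d] for row in batch) / n for d in range(dims)]
    return [[row[d] - means[d] for d in range(dims)] for row in batch]


def cross_covariance_penalty(expr_feats: Batch, id_feats: Batch) -> float:
    """Squared Frobenius norm of the sample cross-covariance matrix.

    expr_feats: hypothetical expression features (e.g. from residual frames).
    id_feats:   hypothetical identity features (e.g. from the I frame).
    Assumption: this decorrelation penalty stands in for the paper's
    independence constraint; it is not the authors' objective.
    """
    if len(expr_feats) != len(id_feats) or len(expr_feats) < 2:
        raise ValueError("need two batches of equal size with at least 2 samples")
    n = len(expr_feats)
    e = center(expr_feats)
    i = center(id_feats)
    penalty = 0.0
    for a in range(len(e[0])):
        for b in range(len(i[0])):
            cov = sum(e[k][a] * i[k][b] for k in range(n)) / (n - 1)
            penalty += cov * cov
    return penalty


# Toy data: four samples, one feature dimension each.
identity = [[1.0], [2.0], [3.0], [4.0]]
entangled = [[2.0], [4.0], [6.0], [8.0]]       # expression feature leaks identity
decorrelated = [[1.0], [-1.0], [-1.0], [1.0]]  # uncorrelated with identity

print("entangled penalty:", cross_covariance_penalty(entangled, identity))
print("decorrelated penalty:", cross_covariance_penalty(decorrelated, identity))
```

In a training loop, a term like this would be added to the expression classification loss so that the expression encoder is pushed away from encoding identity; note that it requires no identity labels, only the two feature batches.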
Related papers
- Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption [57.056311855630916]
We propose a Controllable Generative Image Compression framework, Control-GIC.
It is capable of fine-grained adaption across a broad spectrum while ensuring high-fidelity and generality compression.
We develop a conditional conditionalization that can trace back to historic encoded multi-granularity representations.
arXiv Detail & Related papers (2024-06-02T14:22:09Z)
- Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for Occluded Facial Expression Recognition [0.0]
The proposed method can detect occluded parts of the face as if they were unoccluded, and recognize them, improving FER accuracy.
It involves three steps: First, the vision transformer (ViT)-based occlusion patch detector masks the occluded position by training only latent vectors from the unoccluded patches.
Second, the hybrid reconstruction network generates the masking position as a complete image using the ViT and convolutional neural network (CNN).
Last, the expression-relevant latent vector extractor retrieves and uses expression-related information from all latent vectors by applying a CNN-based class activation map.
arXiv Detail & Related papers (2023-07-21T07:56:32Z)
- Set-Based Face Recognition Beyond Disentanglement: Burstiness Suppression With Variance Vocabulary [78.203301910422]
We argue that the two crucial issues in SFR, the face quality and burstiness, are both identity-irrelevant and variance-relevant.
We propose a lightweight set-based disentanglement framework to separate the identity features from the variance features.
To suppress face burstiness in the sets, we propose a vocabulary-based burst suppression (VBS) method.
arXiv Detail & Related papers (2023-04-13T04:02:58Z)
- SARGAN: Spatial Attention-based Residuals for Facial Expression Manipulation [1.7056768055368383]
We present a novel method named SARGAN that addresses the limitations from three perspectives.
We exploit a symmetric encoder-decoder network to attend to facial features at multiple scales.
Our proposed model performs significantly better than state-of-the-art methods.
arXiv Detail & Related papers (2023-03-30T08:15:18Z)
- Disentangling Identity and Pose for Facial Expression Recognition [54.50747989860957]
We propose an identity and pose disentangled facial expression recognition (IPD-FER) model to learn more discriminative feature representation.
For the identity encoder, a well pre-trained face recognition model is used and kept fixed during training, which alleviates the restriction on specific expression training data.
By comparing the difference between synthesized neutral and expressional images of the same individual, the expression component is further disentangled from identity and pose.
arXiv Detail & Related papers (2022-08-17T06:48:13Z)
- Mutual Information Regularized Identity-aware Facial Expression Recognition in Compressed Video [27.602648102881535]
We propose a novel collaborative min-min game for mutual information (MI) minimization in latent space.
We do not need the identity label or multiple expression samples from the same person for identity elimination.
Our solution can achieve comparable or better performance than the recent decoded image-based methods.
arXiv Detail & Related papers (2020-10-20T21:42:18Z)
- Blind Face Restoration via Deep Multi-scale Component Dictionaries [75.02640809505277]
We propose a deep face dictionary network (termed as DFDNet) to guide the restoration process of degraded observations.
DFDNet generates deep dictionaries for perceptually significant face components from high-quality images.
Component AdaIN is leveraged to eliminate the style diversity between the input and dictionary features.
arXiv Detail & Related papers (2020-08-02T07:02:07Z)
- LEED: Label-Free Expression Editing via Disentanglement [57.09545215087179]
The LEED framework is capable of editing the expression of both frontal and profile facial images without requiring any expression label.
Two novel losses are designed for optimal expression disentanglement and consistent synthesis.
arXiv Detail & Related papers (2020-07-17T13:36:15Z)
- Fine-Grained Expression Manipulation via Structured Latent Space [30.789513209376032]
We propose an end-to-end expression-guided generative adversarial network (EGGAN) to manipulate fine-grained expressions.
Our method can manipulate fine-grained expressions, and generate continuous intermediate expressions between source and target expressions.
arXiv Detail & Related papers (2020-04-21T06:18:34Z)
- Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.