Mutual Information Regularized Identity-aware Facial
Expression Recognition in Compressed Video
- URL: http://arxiv.org/abs/2010.10637v2
- Date: Sat, 5 Jun 2021 15:09:55 GMT
- Title: Mutual Information Regularized Identity-aware Facial
Expression Recognition in Compressed Video
- Authors: Xiaofeng Liu, Linghao Jin, Xu Han, Jane You
- Abstract summary: We propose a novel collaborative min-min game for mutual information (MI) minimization in latent space.
We do not need the identity label or multiple expression samples from the same person for identity elimination.
Our solution can achieve comparable or better performance than the recent decoded image-based methods.
- Score: 27.602648102881535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How to extract effective expression representations that are
invariant to identity-specific attributes is a long-standing problem in facial
expression recognition (FER). Most previous methods process the decoded RGB
images of a sequence, while we argue that the off-the-shelf and valuable
expression-related muscle movement is already embedded in the compression
format. In this paper, we aim to learn a facial expression representation in
the compressed video domain from which inter-subject variations are
eliminated. In this domain, compressed by up to two orders of magnitude, we
can explicitly infer the expression from the residual frames and, optionally,
extract identity factors from the I-frame with a pre-trained face recognition
network. By enforcing their marginal independence, the expression feature is
expected to be purer for the expression and robust to identity shifts.
Specifically, we propose a novel collaborative min-min game for mutual
information (MI) minimization in latent space. We do not need identity labels
or multiple expression samples from the same person for identity elimination.
Moreover, when the apex frame is annotated in the dataset, a complementary
constraint can be added to further regularize the feature-level game. At test
time, only the compressed residual frames are required to predict the
expression. Our solution achieves comparable or better performance than recent
decoded-image-based methods on typical FER benchmarks, with about 3 times
faster inference.
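To make the training setup concrete, here is a minimal PyTorch sketch of the feature-level min-min MI game described above. It assumes a toy expression encoder over stacked residual frames, a frozen pre-trained identity encoder (id_enc, returning 512-d embeddings), and a CLUB-style variational upper bound as the MI estimator; these architectures, the specific bound, and the weight lam are illustrative assumptions rather than the paper's exact implementation.

```python
# Minimal sketch: MI-regularized expression learning from compressed residuals.
# All modules below are simplified stand-ins, not the paper's exact networks.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExprEncoder(nn.Module):
    """Toy expression encoder over stacked residual frames."""
    def __init__(self, in_ch=3, dim=128, n_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim))
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, residuals):
        z_e = self.features(residuals)
        return z_e, self.classifier(z_e)


class MIEstimator(nn.Module):
    """Variational net q(z_id | z_e) used as a CLUB-style MI upper bound."""
    def __init__(self, dim_e=128, dim_id=512, hidden=256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim_e, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, dim_id)
        self.logvar = nn.Linear(hidden, dim_id)

    def forward(self, z_e):
        h = self.body(z_e)
        return self.mu(h), self.logvar(h)

    def loglik(self, z_e, z_id):
        mu, logvar = self(z_e)
        return (-0.5 * ((z_id - mu) ** 2 / logvar.exp() + logvar)).sum(1).mean()

    def mi_bound(self, z_e, z_id):
        mu, logvar = self(z_e)
        perm = torch.randperm(z_id.size(0), device=z_id.device)
        pos = -0.5 * ((z_id - mu) ** 2 / logvar.exp()).sum(1)        # matched pairs
        neg = -0.5 * ((z_id[perm] - mu) ** 2 / logvar.exp()).sum(1)  # shuffled pairs
        return (pos - neg).mean()


def train_step(expr_enc, id_enc, mi_est, opt_enc, opt_mi,
               residuals, i_frame, labels, lam=0.1):
    with torch.no_grad():                 # identity encoder stays frozen
        z_id = id_enc(i_frame)

    # Step 1 (min): fit q(z_id | z_e) by minimizing its negative log-likelihood.
    z_e, _ = expr_enc(residuals)
    nll = -mi_est.loglik(z_e.detach(), z_id)
    opt_mi.zero_grad(); nll.backward(); opt_mi.step()

    # Step 2 (min): classify expressions while minimizing the estimated MI,
    # pushing the expression feature toward independence from identity.
    z_e, logits = expr_enc(residuals)
    loss = F.cross_entropy(logits, labels) + lam * mi_est.mi_bound(z_e, z_id)
    opt_enc.zero_grad(); loss.backward(); opt_enc.step()
    return loss.item()
```

Both sub-objectives are minimized, mirroring the collaborative min-min formulation; at test time only expr_enc over the residual frames is needed, so the identity branch and the estimator can be dropped.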
Related papers
- Personalized Face Inpainting with Diffusion Models by Parallel Visual
Attention [55.33017432880408]
This paper proposes the use of Parallel Visual Attention (PVA) in conjunction with diffusion models to improve inpainting results.
We train the added attention modules and identity encoder on CelebAHQ-IDI, a dataset proposed for identity-preserving face inpainting.
Experiments demonstrate that PVA attains unparalleled identity resemblance in both face inpainting and face inpainting with language guidance tasks.
arXiv Detail & Related papers (2023-12-06T15:39:03Z)
- Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for
Occluded Facial Expression Recognition [0.0]
The proposed method can detect occluded parts of the face as if they were unoccluded, and recognize them, improving FER accuracy.
It involves three steps: First, the vision transformer (ViT)-based occlusion patch detector masks the occluded position by training only latent vectors from the unoccluded patches.
Second, the hybrid reconstruction network generates the masked position as a complete image using the ViT and a convolutional neural network (CNN).
Last, the expression-relevant latent vector extractor retrieves and uses expression-related information from all latent vectors by applying a CNN-based class activation map.
arXiv Detail & Related papers (2023-07-21T07:56:32Z)
- Set-Based Face Recognition Beyond Disentanglement: Burstiness
Suppression With Variance Vocabulary [78.203301910422]
We argue that the two crucial issues in SFR, the face quality and burstiness, are both identity-irrelevant and variance-relevant.
We propose a lightweight set-based disentanglement framework to separate the identity features from the variance features.
To suppress face burstiness in the sets, we propose a vocabulary-based burst suppression (VBS) method.
arXiv Detail & Related papers (2023-04-13T04:02:58Z)
- Optimal Transport-based Identity Matching for Identity-invariant Facial
Expression Recognition [33.072870202596725]
Identity-invariant facial expression recognition (FER) has been one of the challenging computer vision tasks.
This paper proposes to quantify the inter-identity variation by utilizing pairs of similar expressions explored through a specific matching process.
The proposed matching method is not only easy to plug into other models but also incurs only acceptable computational overhead.
arXiv Detail & Related papers (2022-09-25T07:30:44Z)
- Disentangling Identity and Pose for Facial Expression Recognition [54.50747989860957]
We propose an identity and pose disentangled facial expression recognition (IPD-FER) model to learn more discriminative feature representation.
For the identity encoder, a pre-trained face recognition model is utilized and kept fixed during training, which alleviates the restriction on specific expression training data.
By comparing the difference between synthesized neutral and expressional images of the same individual, the expression component is further disentangled from identity and pose.
arXiv Detail & Related papers (2022-08-17T06:48:13Z)
- Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part.
We propose a novel Dynamic Prototype Mask (DPM) based on two pieces of self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z)
- Identity-aware Facial Expression Recognition in Compressed Video [27.14473209125735]
In the up to two orders of magnitude compressed domain, we can explicitly infer the expression from the residual frames.
We do not need the identity label or multiple expression samples from the same person for identity elimination.
Our solution can achieve comparable or better performance than the recent decoded image-based methods.
arXiv Detail & Related papers (2021-01-01T21:03:13Z)
- Blind Face Restoration via Deep Multi-scale Component Dictionaries [75.02640809505277]
We propose a deep face dictionary network (termed DFDNet) to guide the restoration process of degraded observations.
DFDNet generates deep dictionaries for perceptually significant face components from high-quality images.
Component AdaIN is leveraged to eliminate the style diversity between the input and dictionary features.
arXiv Detail & Related papers (2020-08-02T07:02:07Z)
- LEED: Label-Free Expression Editing via Disentanglement [57.09545215087179]
The LEED framework is capable of editing the expression of both frontal and profile facial images without requiring any expression label.
Two novel losses are designed for optimal expression disentanglement and consistent synthesis.
arXiv Detail & Related papers (2020-07-17T13:36:15Z)
- Fine-Grained Expression Manipulation via Structured Latent Space [30.789513209376032]
We propose an end-to-end expression-guided generative adversarial network (EGGAN) to manipulate fine-grained expressions.
Our method can manipulate fine-grained expressions, and generate continuous intermediate expressions between source and target expressions.
arXiv Detail & Related papers (2020-04-21T06:18:34Z)