PAM: Pose Attention Module for Pose-Invariant Face Recognition
- URL: http://arxiv.org/abs/2111.11940v1
- Date: Tue, 23 Nov 2021 15:18:33 GMT
- Title: PAM: Pose Attention Module for Pose-Invariant Face Recognition
- Authors: En-Jung Tsai, Wei-Chang Yeh
- Abstract summary: We propose a lightweight and easy-to-implement attention block, named Pose Attention Module (PAM), for pose-invariant face recognition.
Specifically, PAM performs frontal-profile feature transformation in hierarchical feature space by learning residuals between pose variations with a soft gate mechanism.
- Score: 3.0839245814393723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pose variation is one of the key challenges in face recognition. Conventional
techniques mainly focus on face frontalization or face augmentation in image
space. However, transforming face images in image space is not guaranteed to
preserve the identity features of the original image without loss. Moreover, these
methods incur additional computational cost and memory overhead because they
require extra models. We argue that it is more desirable to perform feature
transformation in hierarchical feature space rather than image space, which can
take advantage of different feature levels and benefit from joint learning with
representation learning. To this end, we propose a lightweight and
easy-to-implement attention block, named Pose Attention Module (PAM), for
pose-invariant face recognition. Specifically, PAM performs frontal-profile
feature transformation in hierarchical feature space by learning residuals
between pose variations with a soft gate mechanism. We validated the
effectiveness of the PAM block design through extensive ablation studies and
verified the performance on several popular benchmarks, including LFW, CFP-FP,
AgeDB-30, CPLFW, and CALFW. Experimental results show that our method not only
outperforms state-of-the-art methods but also reduces memory requirements by a
factor of more than 75. It is noteworthy that our method is not
limited to face recognition with large pose variations. By adjusting the soft
gate mechanism of PAM to a specific coefficient, such a semantic attention block
can easily be extended to address other intra-class imbalance problems in face
recognition, including large variations in age, illumination, expression, etc.
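To make the mechanism concrete, the sketch below shows a soft-gated residual block in PyTorch: a small residual branch models the frontal-profile offset in feature space, and a learned sigmoid gate decides how much of it to apply. The module name, layer choices, and channel sizes are illustrative assumptions, not the authors' implementation of PAM.
```python
# Minimal sketch (PyTorch) of a soft-gated residual attention block in the
# spirit of PAM. Layer choices, channel sizes, and the gating form are
# illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class SoftGatedResidualBlock(nn.Module):
    """Learns a residual that maps profile-like features toward a frontal-like
    representation and blends it with the input via a learned soft gate."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Residual branch: a bottleneck that models the pose-dependent offset.
        self.residual = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.BatchNorm2d(channels // reduction),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Soft gate: per-channel coefficients in [0, 1] from global context.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        r = self.residual(x)   # estimated frontal-profile residual
        g = self.gate(x)       # soft gate, shape (N, C, 1, 1)
        return x + g * r       # near-frontal inputs can pass through (g ~ 0)


if __name__ == "__main__":
    # Such a block would be inserted into a hierarchical backbone at one or
    # more feature levels.
    block = SoftGatedResidualBlock(channels=64)
    feats = torch.randn(2, 64, 28, 28)
    print(block(feats).shape)  # torch.Size([2, 64, 28, 28])
```
Fixing the gate to a constant coefficient instead of predicting it, as the abstract suggests for other intra-class imbalance problems, would amount to replacing `g` with a scalar hyperparameter.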
Related papers
- Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models [69.50286698375386]
We propose a novel approach that better harnesses diffusion models for face-swapping.
We introduce a mask shuffling technique during inpainting training, which allows us to create a so-called universal model for swapping.
Our approach is relatively unified, and so it is resilient to errors in other off-the-shelf models.
arXiv Detail & Related papers (2024-09-11T13:43:53Z)
- DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation [84.0586749616249]
This paper presents DiffFAE, a one-stage and highly-efficient diffusion-based framework tailored for high-fidelity Facial Appearance Editing.
For high-fidelity query attributes transfer, we adopt Space-sensitive Physical Customization (SPC), which ensures the fidelity and generalization ability.
In order to preserve source attributes, we introduce the Region-responsive Semantic Composition (RSC) module.
This module is guided to learn decoupled source-regarding features, thereby better preserving the identity and alleviating artifacts from non-facial attributes such as hair, clothes, and background.
arXiv Detail & Related papers (2024-03-26T12:53:10Z)
- Fiducial Focus Augmentation for Facial Landmark Detection [4.433764381081446]
We propose a novel image augmentation technique to enhance the model's understanding of facial structures.
We employ a Siamese architecture-based training mechanism with a Deep Canonical Correlation Analysis (DCCA)-based loss.
Our approach outperforms multiple state-of-the-art approaches across various benchmark datasets.
arXiv Detail & Related papers (2024-02-23T01:34:00Z)
- MorphGANFormer: Transformer-based Face Morphing and De-Morphing [55.211984079735196]
StyleGAN-based approaches to face morphing are among the leading techniques.
We propose a transformer-based alternative to face morphing and demonstrate its superiority to StyleGAN-based methods.
arXiv Detail & Related papers (2023-02-18T19:09:11Z)
- FaceDancer: Pose- and Occlusion-Aware High Fidelity Face Swapping [62.38898610210771]
We present a new single-stage method for subject face swapping and identity transfer, named FaceDancer.
We make two major contributions: Adaptive Feature Fusion Attention (AFFA) and Interpreted Feature Similarity Regularization (IFSR).
arXiv Detail & Related papers (2022-10-19T11:31:38Z)
- PoseFace: Pose-Invariant Features and Pose-Adaptive Loss for Face Recognition [42.62320574369969]
We propose an efficient PoseFace framework that utilizes facial landmarks to disentangle pose-invariant features and exploits a pose-adaptive loss to handle the imbalance issue adaptively.
arXiv Detail & Related papers (2021-07-25T03:50:47Z)
- Attention-guided Progressive Mapping for Profile Face Recognition [12.792576041526289]
Cross-pose face recognition remains a significant challenge.
Learning pose-robust features by traversing to the feature space of frontal faces provides an effective and cheap way to alleviate this problem.
arXiv Detail & Related papers (2021-06-27T02:21:41Z)
- Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection [65.92058628082322]
Non-parametric face modeling aims to reconstruct 3D faces from images alone, without shape assumptions.
This paper presents a novel Learning to Aggregate and Personalize framework for unsupervised robust 3D face modeling.
arXiv Detail & Related papers (2021-06-15T03:10:17Z)
- A NIR-to-VIS face recognition via part adaptive and relation attention module [4.822208985805956]
In the face recognition application scenario, we need to process facial images captured in various conditions, such as at night by near-infrared (NIR) surveillance cameras.
The illumination difference between NIR and visible-light (VIS) causes a domain gap between facial images, and the variations in pose and emotion also make facial matching more difficult.
We propose a part relation attention module that crops facial parts obtained through a semantic mask and performs relational modeling using each of these representative features.
arXiv Detail & Related papers (2021-02-01T08:13:39Z)
- Multi-Margin based Decorrelation Learning for Heterogeneous Face Recognition [90.26023388850771]
This paper presents a deep neural network approach to extract decorrelation representations in a hyperspherical space for cross-domain face images.
The proposed framework can be divided into two components: a heterogeneous representation network and decorrelation representation learning.
Experimental results on two challenging heterogeneous face databases show that our approach achieves superior performance on both verification and recognition tasks.
arXiv Detail & Related papers (2020-05-25T07:01:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.