Related papers: More comprehensive facial inversion for more effective expression recognition

URL: http://arxiv.org/abs/2211.13564v1
Date: Thu, 24 Nov 2022 12:31:46 GMT
Title: More comprehensive facial inversion for more effective expression recognition
Authors: Jiawei Mao, Guangyi Zhao, Yuanqi Chang, Xuesong Yin, Xiaogang Peng, Rui Xu
Abstract summary: We propose a novel generative method based on the image inversion mechanism for the facial expression recognition (FER) task, termed Inversion FER (IFER). Its Adversarial Style Inversion Transformer (ASIT) is equipped with an image inversion discriminator that measures the cosine similarity of semantic features between source and generated images, constrained by a distribution alignment loss. We extensively evaluate ASIT on facial datasets such as FFHQ and CelebA-HQ, showing that our approach achieves state-of-the-art facial inversion performance.
Score: 8.102564078640274
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Facial expression recognition (FER) plays a significant role in the ubiquitous application of computer vision. We revisit this problem with a new perspective on whether it can acquire useful representations that improve FER performance in the image generation process, and propose a novel generative method based on the image inversion mechanism for the FER task, termed Inversion FER (IFER). Particularly, we devise a novel Adversarial Style Inversion Transformer (ASIT) towards IFER to comprehensively extract features of generated facial images. In addition, ASIT is equipped with an image inversion discriminator that measures the cosine similarity of semantic features between source and generated images, constrained by a distribution alignment loss. Finally, we introduce a feature modulation module to fuse the structural code and latent codes from ASIT for the subsequent FER work. We extensively evaluate ASIT on facial datasets such as FFHQ and CelebA-HQ, showing that our approach achieves state-of-the-art facial inversion performance. IFER also achieves competitive results in facial expression recognition datasets such as RAF-DB, SFEW and AffectNet. The code and models are available at https://github.com/Talented-Q/IFER-master.

Related papers

evTransFER: A Transfer Learning Framework for Event-based Facial Expression Recognition [0.0]
We propose a learning-based framework and architecture for face-expression recognition using event-based cameras. We show that this proposed transfer learning method greatly improves the ability to recognize facial expressions. In addition, we propose an architecture that incorporates an LSTM to capture longer-term facial expression dynamics.
arXiv Detail & Related papers (2025-08-05T16:26:09Z)
A Visual Self-attention Mechanism Facial Expression Recognition Network beyond Convnext [5.651484411686618]
This paper proposes a visual facial expression signal processing network based on a truncated ConvNeXt approach (Conv-cut). The network uses a truncated ConvNeXt-Base as the feature extractor, followed by a Detail Extraction Block designed to extract detailed features. To evaluate the proposed Conv-cut approach, we conducted experiments on the RAF-DB and FERPlus datasets, and the results show that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-04-12T04:35:37Z)
WEM-GAN: Wavelet transform based facial expression manipulation [2.0918868193463207]
We propose WEM-GAN, in short for wavelet-based expression manipulation GAN. We take advantage of the wavelet transform technique and combine it with our generator with a U-net autoencoder backbone. Our model performs better in preserving identity features, editing capability, and image generation quality on the AffectNet dataset.
arXiv Detail & Related papers (2024-12-03T16:23:02Z)
OSDFace: One-Step Diffusion Model for Face Restoration [72.5045389847792]
Diffusion models have demonstrated impressive performance in face restoration. We propose OSDFace, a novel one-step diffusion model for face restoration. Results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics.
arXiv Detail & Related papers (2024-11-26T07:07:48Z)
Bridging the Gaps: Utilizing Unlabeled Face Recognition Datasets to Boost Semi-Supervised Facial Expression Recognition [5.750927184237346]
We focus on utilizing large unlabeled Face Recognition (FR) datasets to boost semi-supervised FER. Specifically, we first perform face reconstruction pre-training on large-scale facial images without annotations. To further alleviate the scarcity of labeled and diverse images, we propose a Mixup-based data augmentation strategy.
arXiv Detail & Related papers (2024-10-23T07:26:19Z)
MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection [64.29452783056253]
The rapid development of photo-realistic face generation methods has raised significant concerns in society and academia. Although existing approaches mainly capture face forgery patterns using image modality, other modalities like fine-grained noises and texts are not fully explored. We propose a novel multi-modal fine-grained CLIP (MFCLIP) model, which mines comprehensive and fine-grained forgery traces across image-noise modalities.
arXiv Detail & Related papers (2024-09-15T13:08:59Z)
Fiducial Focus Augmentation for Facial Landmark Detection [4.433764381081446]
We propose a novel image augmentation technique to enhance the model's understanding of facial structures. We employ a Siamese architecture-based training mechanism with a Deep Canonical Correlation Analysis (DCCA)-based loss. Our approach outperforms multiple state-of-the-art approaches across various benchmark datasets.
arXiv Detail & Related papers (2024-02-23T01:34:00Z)
Text-Guided Face Recognition using Multi-Granularity Cross-Modal Contrastive Learning [0.0]
We introduce text-guided face recognition (TGFR) to analyze the impact of integrating facial attributes in the form of natural language descriptions. TGFR demonstrates remarkable improvements, particularly on low-quality images, over existing face recognition models.
arXiv Detail & Related papers (2023-12-14T22:04:22Z)
Improving Face Recognition from Caption Supervision with Multi-Granular Contextual Feature Aggregation [0.0]
We introduce caption-guided face recognition (CGFR) as a new framework to improve the performance of commercial-off-the-shelf (COTS) face recognition systems. We implement the proposed CGFR framework on two face recognition models (ArcFace and AdaFace) and evaluated its performance on the Multi-Modal CelebA-HQ dataset.
arXiv Detail & Related papers (2023-08-13T23:52:15Z)
FaceDancer: Pose- and Occlusion-Aware High Fidelity Face Swapping [62.38898610210771]
We present a new single-stage method for subject face swapping and identity transfer, named FaceDancer. We have two major contributions: Adaptive Feature Fusion Attention (AFFA) and Interpreted Feature Similarity Regularization (IFSR).
arXiv Detail & Related papers (2022-10-19T11:31:38Z)
Semantic Image Synthesis via Diffusion Models [174.24523061460704]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks. Recent work on semantic image synthesis mainly follows the de facto GAN-based approaches. We propose a novel framework based on DDPM for semantic image synthesis.
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
LT-GAN: Self-Supervised GAN with Latent Transformation Detection [10.405721171353195]
We propose a self-supervised approach (LT-GAN) to improve the generation quality and diversity of images. We experimentally demonstrate that our proposed LT-GAN can be effectively combined with other state-of-the-art training techniques for added benefits.
arXiv Detail & Related papers (2020-10-19T22:09:45Z)
DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition [94.96686189033869]
We propose a 3D model-assisted domain-transferred face augmentation network (DotFAN). DotFAN can generate a series of variants of an input face based on the knowledge distilled from existing rich face datasets collected from other domains. Experiments show that DotFAN is beneficial for augmenting small face datasets to improve their within-class diversity.
arXiv Detail & Related papers (2020-02-23T08:16:34Z)
Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER. The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions. In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image. We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image. Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
arXiv Detail & Related papers (2020-01-12T05:26:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.