ProS: Facial Omni-Representation Learning via Prototype-based
Self-Distillation
- URL: http://arxiv.org/abs/2311.01929v2
- Date: Tue, 7 Nov 2023 15:34:42 GMT
- Title: ProS: Facial Omni-Representation Learning via Prototype-based
Self-Distillation
- Authors: Xing Di, Yiyu Zheng, Xiaoming Liu, Yu Cheng
- Abstract summary: Prototype-based Self-Distillation (ProS) is a novel approach for unsupervised face representation learning.
ProS consists of two vision-transformers (teacher and student models) that are trained with different augmented images.
ProS achieves state-of-the-art performance on various tasks, both in full and few-shot settings.
- Score: 22.30414271893046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel approach, called Prototype-based
Self-Distillation (ProS), for unsupervised face representation learning. The
existing supervised methods heavily rely on a large amount of annotated
training facial data, which poses challenges in terms of data collection and
privacy concerns. To address these issues, we propose ProS, which leverages a
vast collection of unlabeled face images to learn a comprehensive facial
omni-representation. In particular, ProS consists of two vision-transformers
(teacher and student models) that are trained with different augmented images
(cropping, blurring, coloring, etc.). In addition, we build a face-aware retrieval
system, together with augmentations, to obtain curated images consisting
predominantly of facial regions. To enhance the discrimination of learned features,
we introduce a prototype-based matching loss that aligns the similarity
distributions between features (teacher or student) and a set of learnable
prototypes. After pre-training, the teacher vision transformer serves as a
backbone for downstream tasks, including attribute estimation, expression
recognition, and landmark alignment, achieved through simple fine-tuning with
additional layers. Extensive experiments demonstrate that our method achieves
state-of-the-art performance on various tasks, both in full and few-shot
settings. Furthermore, we investigate pre-training with synthetic face images,
and ProS exhibits promising performance in this scenario as well.
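As a rough illustration of the mechanism the abstract describes, the sketch below (PyTorch) shows one way a prototype-based matching loss between a teacher and a student can be written: features from two differently augmented views are compared against a shared set of learnable prototypes, and the student's softmax assignment distribution is trained to match the teacher's, while the teacher is updated as an EMA of the student. The class and function names (PrototypeMatchingLoss, ema_update) and all hyper-parameters (feature dimension, number of prototypes, temperatures, momentum) are illustrative assumptions, not details taken from ProS.

```python
# Minimal sketch of a prototype-based self-distillation objective.
# Hyper-parameters and names below are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeMatchingLoss(nn.Module):
    def __init__(self, feat_dim=256, num_prototypes=4096,
                 student_temp=0.1, teacher_temp=0.04):
        super().__init__()
        # Learnable prototypes shared by teacher and student branches.
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, feat_dim))
        self.student_temp = student_temp
        self.teacher_temp = teacher_temp

    def forward(self, student_feat, teacher_feat):
        # L2-normalise features and prototypes so similarities are cosine similarities.
        protos = F.normalize(self.prototypes, dim=-1)
        s = F.normalize(student_feat, dim=-1) @ protos.t()   # (B, K) student-prototype similarities
        t = F.normalize(teacher_feat, dim=-1) @ protos.t()   # (B, K) teacher-prototype similarities

        # Align the student's prototype-assignment distribution with the teacher's.
        teacher_probs = F.softmax(t / self.teacher_temp, dim=-1).detach()
        student_logp = F.log_softmax(s / self.student_temp, dim=-1)
        return -(teacher_probs * student_logp).sum(dim=-1).mean()

# Usage sketch: `student` and `teacher` are ViT backbones with identical architecture;
# the teacher receives no gradients and is updated as an EMA of the student.
@torch.no_grad()
def ema_update(teacher, student, momentum=0.996):
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(momentum).add_(ps.detach(), alpha=1 - momentum)
```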
Related papers
- FACE-AUDITOR: Data Auditing in Facial Recognition Systems [24.082527732931677]
Few-shot-based facial recognition systems have gained increasing attention due to their scalability and ability to work with a few face images.
To prevent the face images from being misused, one straightforward approach is to modify the raw face images before sharing them.
We propose a complete toolkit FACE-AUDITOR that can query the few-shot-based facial recognition model and determine whether any of a user's face images is used in training the model.
arXiv Detail & Related papers (2023-04-05T23:03:54Z)
- Understanding Self-Supervised Pretraining with Part-Aware Representation Learning [88.45460880824376]
We study the capability of self-supervised representation pretraining methods to learn part-aware representations.
Results show that the fully-supervised model outperforms self-supervised models for object-level recognition.
arXiv Detail & Related papers (2023-01-27T18:58:42Z)
- Pose-disentangled Contrastive Learning for Self-supervised Facial Representation [12.677909048435408]
We propose a novel Pose-disentangled Contrastive Learning (PCL) method for general self-supervised facial representation.
Our PCL first devises a pose-disentangled decoder (PDD), which disentangles the pose-related features from the face-aware features.
We then introduce a pose-related contrastive learning scheme that learns pose-related information based on data augmentation of the same image.
arXiv Detail & Related papers (2022-11-24T09:30:51Z)
- CIAO! A Contrastive Adaptation Mechanism for Non-Universal Facial Expression Recognition [80.07590100872548]
We propose Contrastive Inhibitory Adaptation (CIAO), a mechanism that adapts the last layer of facial encoders to depict specific affective characteristics on different datasets.
CIAO improves facial expression recognition performance across six datasets with distinct affective representations.
arXiv Detail & Related papers (2022-08-10T15:46:05Z)
- Emotion Separation and Recognition from a Facial Expression by Generating the Poker Face with Vision Transformers [57.1091606948826]
We propose a novel FER model, named Poker Face Vision Transformer or PF-ViT, to address these challenges.
PF-ViT aims to separate and recognize the disturbance-agnostic emotion from a static facial image via generating its corresponding poker face.
PF-ViT utilizes vanilla Vision Transformers, and its components are pre-trained as Masked Autoencoders on a large facial expression dataset.
arXiv Detail & Related papers (2022-07-22T13:39:06Z)
- General Facial Representation Learning in a Visual-Linguistic Manner [45.92447707178299]
We introduce a framework, called FaRL, for general Facial Representation Learning in a visual-linguistic manner.
We show that FaRL achieves better transfer performance compared with previous pre-trained models.
Our model surpasses the state-of-the-art methods on face analysis tasks including face parsing and face alignment.
arXiv Detail & Related papers (2021-12-06T15:22:05Z)
- Self-supervised Contrastive Learning of Multi-view Facial Expressions [9.949781365631557]
Facial expression recognition (FER) has emerged as an important component of human-computer interaction systems.
We propose Contrastive Learning of Multi-view facial Expressions (CL-MEx) to exploit facial images captured simultaneously from different angles towards FER.
arXiv Detail & Related papers (2021-08-15T11:23:34Z)
- Pro-UIGAN: Progressive Face Hallucination from Occluded Thumbnails [53.080403912727604]
We propose a multi-stage Progressive Upsampling and Inpainting Generative Adversarial Network, dubbed Pro-UIGAN.
It exploits facial geometry priors to replenish and upsample (8×) occluded and tiny faces.
Pro-UIGAN produces visually pleasing HR faces and achieves superior performance on downstream tasks.
arXiv Detail & Related papers (2021-08-02T02:29:24Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel method that jointly learns facial expression synthesis and recognition for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)