Training face verification models from generated face identity data
- URL: http://arxiv.org/abs/2108.00800v1
- Date: Mon, 2 Aug 2021 12:00:01 GMT
- Title: Training face verification models from generated face identity data
- Authors: Dennis Conway, Loic Simon, Alexis Lechervy, Frederic Jurie
- Abstract summary: We consider an approach to increase the privacy protection of data sets, as applied to face recognition.
We build on the StyleGAN generative adversarial network and feed it with latent codes combining two distinct sub-codes.
We find that the addition of a small amount of private data greatly improves the performance of our model.
- Score: 2.557825816851682
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning tools are becoming increasingly powerful and widely used.
Unfortunately membership attacks, which seek to uncover information from data
sets used in machine learning, have the potential to limit data sharing. In
this paper we consider an approach to increase the privacy protection of data
sets, as applied to face recognition. Using an auxiliary face recognition
model, we build on the StyleGAN generative adversarial network and feed it with
latent codes combining two distinct sub-codes, one encoding visual identity
factors, and, the other, non-identity factors. By independently varying these
vectors during image generation, we create a synthetic data set of fictitious
face identities. We use this data set to train a face recognition model. When
tested with a simple membership attack our model provides good privacy
protection; however, the model performance degrades in comparison to the
state-of-the-art of face verification. We find that the addition of a
small amount of private data greatly improves the performance of our model,
which highlights the limitations of using synthetic data to train machine
learning models.
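The latent-code construction described in the abstract can be illustrated with a minimal sketch. It only shows how one identity sub-code can be held fixed while the non-identity sub-code varies per image; it is not the authors' implementation. The function name `sample_latents`, the sub-code dimensions, and the Gaussian sampling are all assumptions made for illustration, and the generator that would map each latent code to an image is not shown.

```python
import numpy as np


def sample_latents(n_identities, images_per_identity,
                   id_dim=256, attr_dim=256, seed=0):
    """Build latent codes laid out as [identity sub-code | non-identity sub-code].

    Hypothetical helper: dimensions and sampling distribution are assumptions.
    """
    rng = np.random.default_rng(seed)
    # One identity sub-code per fictitious person, shared by all of that person's images.
    id_codes = rng.standard_normal((n_identities, id_dim))
    # A fresh non-identity sub-code (e.g. pose, lighting) for every image.
    attr_codes = rng.standard_normal((n_identities, images_per_identity, attr_dim))
    # Repeat each identity sub-code across its images, then concatenate the two parts.
    id_rep = np.repeat(id_codes[:, None, :], images_per_identity, axis=1)
    latents = np.concatenate([id_rep, attr_codes], axis=-1)
    # Integer class labels: one label per fictitious identity.
    labels = np.repeat(np.arange(n_identities), images_per_identity)
    return latents.reshape(-1, id_dim + attr_dim), labels
```

Each row of the returned array would be passed to a generator to produce one image, and the labels would serve as class targets when training a face recognition model on the synthetic set.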
Related papers
- Unveiling Synthetic Faces: How Synthetic Datasets Can Expose Real Identities [22.8742248559748]
We show that in 6 state-of-the-art synthetic face recognition datasets, several samples from the original real dataset are leaked.
This paper is the first work which shows the leakage from training data of generator models into the generated synthetic face recognition datasets.
arXiv Detail & Related papers (2024-10-31T15:17:14Z)
- Federated Face Forgery Detection Learning with Personalized Representation [63.90408023506508]
Deep generator technology can produce high-quality fake videos that are indistinguishable from real ones, posing a serious social threat.
Traditional forgery detection methods train directly on centralized data.
The paper proposes a novel federated learning approach to face forgery detection with personalized representation.
arXiv Detail & Related papers (2024-06-17T02:20:30Z)
- SDFR: Synthetic Data for Face Recognition Competition [51.9134406629509]
Large-scale face recognition datasets are collected by crawling the Internet and without individuals' consent, raising legal, ethical, and privacy concerns.
Recently several works proposed generating synthetic face recognition datasets to mitigate concerns in web-crawled face recognition datasets.
This paper presents the summary of the Synthetic Data for Face Recognition (SDFR) Competition held in conjunction with the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024).
The SDFR competition was split into two tasks, allowing participants to train face recognition systems using new synthetic datasets and/or existing ones.
arXiv Detail & Related papers (2024-04-06T10:30:31Z)
- Deep Variational Privacy Funnel: General Modeling with Applications in Face Recognition [3.351714665243138]
We develop a method for privacy-preserving representation learning using an end-to-end training framework.
We apply our model to state-of-the-art face recognition systems.
arXiv Detail & Related papers (2024-01-26T11:32:53Z)
- Segue: Side-information Guided Generative Unlearnable Examples for Facial Privacy Protection in Real World [64.4289385463226]
We propose Segue: Side-information guided generative unlearnable examples.
To improve transferability, we introduce side information such as true labels and pseudo labels.
It can resist JPEG compression, adversarial training, and some standard data augmentations.
arXiv Detail & Related papers (2023-10-24T06:22:37Z)
- Diff-Privacy: Diffusion-based Face Privacy Protection [58.1021066224765]
In this paper, we propose a novel face privacy protection method based on diffusion models, dubbed Diff-Privacy.
Specifically, we train our proposed multi-scale image inversion module (MSI) to obtain a set of SDM format conditional embeddings of the original image.
Based on the conditional embeddings, we design corresponding embedding scheduling strategies and construct different energy functions during the denoising process to achieve anonymization and visual identity information hiding.
arXiv Detail & Related papers (2023-09-11T09:26:07Z)
- SynthDistill: Face Recognition with Knowledge Distillation from Synthetic Data [8.026313049094146]
State-of-the-art face recognition networks are often computationally expensive and cannot be used for mobile applications.
We propose a new framework to train lightweight face recognition models by distilling the knowledge of a pretrained teacher face recognition model using synthetic data.
We use synthetic face images without identity labels, mitigating the problems in the intra-class variation generation of synthetic datasets.
arXiv Detail & Related papers (2023-08-28T19:15:27Z)
- Attribute-preserving Face Dataset Anonymization via Latent Code Optimization [64.4569739006591]
We present a task-agnostic anonymization procedure that directly optimizes the images' latent representation in the latent space of a pre-trained GAN.
We demonstrate through a series of experiments that our method is capable of anonymizing the identity of the images whilst, crucially, better preserving the facial attributes.
arXiv Detail & Related papers (2023-03-20T17:34:05Z)
- How to Boost Face Recognition with StyleGAN? [13.067766076889995]
State-of-the-art face recognition systems require vast amounts of labeled training data.
The self-supervised revolution in the industry motivates research on adapting related techniques to facial recognition.
We show that a simple approach based on fine-tuning the pSp encoder for StyleGAN allows us to improve upon the state of the art in facial recognition.
arXiv Detail & Related papers (2022-10-18T18:41:56Z)
- SFace: Privacy-friendly and Accurate Face Recognition using Synthetic Data [9.249824128880707]
We propose and investigate the feasibility of using a privacy-friendly synthetically generated face dataset to train face recognition models.
To address the privacy aspect of using such data to train a face recognition model, we provide extensive evaluation experiments on the identity relation between the synthetic dataset and the original authentic dataset used to train the generative model.
We also propose to train face recognition on our privacy-friendly dataset, SFace, using three different learning strategies, multi-class classification, label-free knowledge transfer, and combined learning of multi-class classification and knowledge transfer.
arXiv Detail & Related papers (2022-06-21T16:42:04Z)
- SynFace: Face Recognition with Synthetic Data [83.15838126703719]
We devise the SynFace with identity mixup (IM) and domain mixup (DM) to mitigate the performance gap.
We also perform a systematic empirical analysis on synthetic face images to provide some insights on how to effectively utilize synthetic data for face recognition.
arXiv Detail & Related papers (2021-08-18T03:41:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.