GANDiffFace: Controllable Generation of Synthetic Datasets for Face
Recognition with Realistic Variations
- URL: http://arxiv.org/abs/2305.19962v1
- Date: Wed, 31 May 2023 15:49:12 GMT
- Title: GANDiffFace: Controllable Generation of Synthetic Datasets for Face
Recognition with Realistic Variations
- Authors: Pietro Melzi, Christian Rathgeb, Ruben Tolosana, Ruben Vera-Rodriguez,
Dominik Lawatsch, Florian Domin, Maxim Schaubert
- Abstract summary: This study introduces GANDiffFace, a novel framework for the generation of synthetic datasets for face recognition.
GANDiffFace combines the power of Generative Adversarial Networks (GANs) and Diffusion models to overcome the limitations of existing synthetic datasets.
- Score: 2.7467281625529134
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Face recognition systems have significantly advanced in recent years, driven
by the availability of large-scale datasets. However, several issues have
recently come up, including privacy concerns that have led to the
discontinuation of well-established public datasets. Synthetic datasets have
emerged as a solution, even though current synthesis methods present other
drawbacks such as limited intra-class variations, lack of realism, and unfair
representation of demographic groups. This study introduces GANDiffFace, a
novel framework for the generation of synthetic datasets for face recognition
that combines the power of Generative Adversarial Networks (GANs) and Diffusion
models to overcome the limitations of existing synthetic datasets. In
GANDiffFace, we first propose the use of GANs to synthesize highly realistic
identities and meet target demographic distributions. Subsequently, we
fine-tune Diffusion models with the images generated with GANs, synthesizing
multiple images of the same identity with a variety of accessories, poses,
expressions, and contexts. We generate multiple synthetic datasets by changing
GANDiffFace settings, and compare their mated and non-mated score distributions
with the distributions provided by popular real-world datasets for face
recognition, i.e. VGG2 and IJB-C. Our results show the feasibility of the
proposed GANDiffFace, in particular the use of Diffusion models to enhance the
(limited) intra-class variations provided by GANs towards the level of
real-world datasets.
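The comparison of mated and non-mated score distributions mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each image has already been mapped to an embedding vector by some face recognition model and that cosine similarity is used as the comparison score; the function name `mated_non_mated_scores`, the toy embeddings, and the noise level are hypothetical and are not taken from the paper.

```python
import numpy as np

def mated_non_mated_scores(embeddings, labels):
    """Split pairwise cosine similarities into mated and non-mated scores.

    Mated pairs share an identity label; non-mated pairs do not.
    (Hypothetical helper for illustration, not the authors' code.)
    """
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # L2-normalise so that the dot product equals cosine similarity.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T
    # Keep each unordered pair once, excluding self-comparisons.
    i, j = np.triu_indices(len(labels), k=1)
    same_identity = labels[i] == labels[j]
    return sim[i, j][same_identity], sim[i, j][~same_identity]

# Toy data (assumption): 3 identities, 4 images each, 16-dimensional embeddings.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 16))
labels = np.repeat(np.arange(3), 4)
embeddings = centers[labels] + 0.1 * rng.normal(size=(12, 16))

mated, non_mated = mated_non_mated_scores(embeddings, labels)
print("mated pairs:", len(mated), "mean score:", mated.mean())
print("non-mated pairs:", len(non_mated), "mean score:", non_mated.mean())
```

In this toy setting, a dataset with little intra-class variation produces mated scores concentrated near 1, whereas greater variation in pose, expression, or accessories would be expected to spread the mated distribution towards lower scores.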
Related papers
- HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere [22.8742248559748]
Face recognition datasets are often collected by crawling the Internet without individuals' consent, raising ethical and privacy concerns.
Generating synthetic datasets for training face recognition models has emerged as a promising alternative.
We propose a new synthetic dataset generation approach, called HyperFace.
arXiv Detail & Related papers (2024-11-13T09:42:12Z)
- ID$^3$: Identity-Preserving-yet-Diversified Diffusion Models for Synthetic Face Recognition [60.15830516741776]
Synthetic face recognition (SFR) aims to generate datasets that mimic the distribution of real face data.
We introduce a diffusion-fueled SFR model termed ID$^3$.
ID$^3$ employs an ID-preserving loss to generate diverse yet identity-consistent facial appearances.
arXiv Detail & Related papers (2024-09-26T06:46:40Z)
- Synthetic Face Datasets Generation via Latent Space Exploration from Brownian Identity Diffusion [20.352548473293993]
Face Recognition (FR) models are trained on large-scale datasets, which have privacy and ethical concerns.
Lately, the use of synthetic data to complement or replace genuine data for the training of FR models has been proposed.
We introduce a new method, inspired by the physical motion of soft particles subjected to Brownian forces, allowing us to sample identities in a latent space under various constraints.
With this in hand, we generate several face datasets and benchmark them by training FR models, showing that data generated with our method exceeds the performance of previous GAN-based datasets and achieves competitive performance with state-of-the-
arXiv Detail & Related papers (2024-04-30T22:32:02Z)
- Second Edition FRCSyn Challenge at CVPR 2024: Face Recognition Challenge in the Era of Synthetic Data [104.45155847778584]
This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn).
FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations.
arXiv Detail & Related papers (2024-04-16T08:15:10Z)
- SDFR: Synthetic Data for Face Recognition Competition [51.9134406629509]
Large-scale face recognition datasets are collected by crawling the Internet and without individuals' consent, raising legal, ethical, and privacy concerns.
Recently several works proposed generating synthetic face recognition datasets to mitigate concerns in web-crawled face recognition datasets.
This paper presents the summary of the Synthetic Data for Face Recognition (SDFR) Competition held in conjunction with the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024).
The SDFR competition was split into two tasks, allowing participants to train face recognition systems using new synthetic datasets and/or existing ones.
arXiv Detail & Related papers (2024-04-06T10:30:31Z)
- Leveraging Synthetic Data for Generalizable and Fair Facial Action Unit Detection [9.404202619102943]
We propose to use synthetically generated data and multi-source domain adaptation (MSDA) to address the problems of the scarcity of labeled data and the diversity of subjects.
Specifically, we propose to generate a diverse dataset through synthetic facial expression re-targeting.
To further improve gender fairness, PM2 matches the features of the real data with a female and a male synthetic image.
arXiv Detail & Related papers (2024-03-15T23:50:18Z)
- Distribution-Aware Data Expansion with Diffusion Models [55.979857976023695]
We propose DistDiff, a training-free data expansion framework based on the distribution-aware diffusion model.
DistDiff consistently enhances accuracy across a diverse range of datasets compared to models trained solely on original data.
arXiv Detail & Related papers (2024-03-11T14:07:53Z)
- GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable.
Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology.
We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
- Unsupervised Face Recognition using Unlabeled Synthetic Data [16.494722503803196]
We propose an unsupervised face recognition model based on unlabeled synthetic data (USynthFace).
Our proposed USynthFace learns to maximize the similarity between two augmented images of the same synthetic instance.
We prove the effectiveness of our USynthFace in achieving relatively high recognition accuracies using unlabeled synthetic data.
arXiv Detail & Related papers (2022-11-14T14:05:19Z)
- On the use of automatically generated synthetic image datasets for benchmarking face recognition [2.0196229393131726]
Recent advances in Generative Adversarial Networks (GANs) to synthesize realistic face images provide a pathway to replace real datasets by synthetic datasets.
Benchmarking results on the synthetic dataset are a good substitute, often providing error rates and system ranking similar to the benchmarking on the real dataset.
arXiv Detail & Related papers (2021-06-08T09:54:02Z)
- Partially Conditioned Generative Adversarial Networks [75.08725392017698]
Generative Adversarial Networks (GANs) let one synthesise artificial datasets by implicitly modelling the underlying probability distribution of a real-world training dataset.
With the introduction of Conditional GANs and their variants, these methods were extended to generating samples conditioned on ancillary information available for each sample within the dataset.
In this work, we argue that standard Conditional GANs are not suitable for such a task and propose a new Adversarial Network architecture and training strategy.
arXiv Detail & Related papers (2020-07-06T15:59:28Z)