Bias and Diversity in Synthetic-based Face Recognition
- URL: http://arxiv.org/abs/2311.03970v1
- Date: Tue, 7 Nov 2023 13:12:34 GMT
- Title: Bias and Diversity in Synthetic-based Face Recognition
- Authors: Marco Huber, Anh Thi Luu, Fadi Boutros, Arjan Kuijper, Naser Damer
- Abstract summary: We investigate how the diversity of synthetic face recognition datasets compares to authentic datasets.
We look at the distribution of gender, ethnicity, age, and head position.
With regard to bias, it can be seen that the synthetic-based models share a similar bias behavior with the authentic-based models.
- Score: 12.408456748469426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Synthetic data is emerging as a substitute for authentic data to solve
ethical and legal challenges in handling authentic face data. The current
models can create real-looking face images of people who do not exist. However,
it is a known and sensitive problem that face recognition systems are
susceptible to bias, i.e. performance differences between different demographic
and non-demographic attributes, which can lead to unfair decisions. In this
work, we investigate how the diversity of synthetic face recognition datasets
compares to authentic datasets, and how the distribution of the training data
of the generative models affects the distribution of the synthetic data. To do
this, we looked at the distribution of gender, ethnicity, age, and head
position. Furthermore, we investigated the concrete bias of three recent
synthetic-based face recognition models on the studied attributes in comparison
to a baseline model trained on authentic data. Our results show that the
generators produce a distribution similar to that of their training data in terms of
the different attributes. With regard to bias, it can be seen that the
synthetic-based models share a similar bias behavior with the authentic-based
models. However, the uncovered lower intra-identity attribute consistency
seems to be beneficial in reducing bias.
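The abstract describes bias as performance differences across demographic groups. As a minimal, purely illustrative sketch (the group labels, accuracy values, and metric choices below are assumptions, not figures from the paper), cross-group bias is often summarized by the spread of per-group accuracy:

```python
# Hypothetical sketch: quantify demographic bias in a face recognition
# model by comparing verification accuracy across demographic groups.
# All group names and accuracy values are illustrative placeholders.
from statistics import mean, stdev


def bias_metrics(per_group_accuracy: dict) -> dict:
    """Summarize cross-group performance differences.

    Returns the mean accuracy, the standard deviation across groups
    (a common scalar bias indicator), and the worst-case gap between
    the best- and worst-performing groups.
    """
    accs = list(per_group_accuracy.values())
    return {
        "mean_accuracy": mean(accs),
        "std_across_groups": stdev(accs),  # higher std -> more bias
        "max_gap": max(accs) - min(accs),  # best minus worst group
    }


# Illustrative per-group accuracies for a hypothetical model:
groups = {"group_a": 0.97, "group_b": 0.95, "group_c": 0.91}
print(bias_metrics(groups))
```

A lower standard deviation and a smaller max gap indicate more uniform performance across groups; this is one common convention, and published benchmarks also use other measures such as per-group false match rates at a fixed threshold.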
Related papers
- The Impact of Balancing Real and Synthetic Data on Accuracy and Fairness in Face Recognition [10.849598219674132]
We investigate the impact of demographically balanced authentic and synthetic data, both individually and in combination, on the accuracy and fairness of face recognition models.
Our findings emphasize two main points: (i) the increased effectiveness of training data generated by diffusion-based models in enhancing accuracy, whether used alone or combined with subsets of authentic data, and (ii) the minimal impact of incorporating balanced data from pre-trained generative methods on fairness.
arXiv Detail & Related papers (2024-09-04T16:50:48Z) - Toward Fairer Face Recognition Datasets [69.04239222633795]
Face recognition and verification are computer vision tasks whose performance has progressed with the introduction of deep representations.
Ethical, legal, and technical challenges due to the sensitive character of face data and biases in real training datasets hinder their development.
We promote fairness by introducing a demographic attributes balancing mechanism in generated training datasets.
arXiv Detail & Related papers (2024-06-24T12:33:21Z) - SDFR: Synthetic Data for Face Recognition Competition [51.9134406629509]
Large-scale face recognition datasets are collected by crawling the Internet and without individuals' consent, raising legal, ethical, and privacy concerns.
Recently several works proposed generating synthetic face recognition datasets to mitigate concerns in web-crawled face recognition datasets.
This paper presents a summary of the Synthetic Data for Face Recognition (SDFR) Competition, held in conjunction with the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024).
The SDFR competition was split into two tasks, allowing participants to train face recognition systems using new synthetic datasets and/or existing ones.
arXiv Detail & Related papers (2024-04-06T10:30:31Z) - Synthetic Data for the Mitigation of Demographic Biases in Face Recognition [10.16490522214987]
This study investigates the possibility of mitigating the demographic biases that affect face recognition technologies through the use of synthetic data.
We use synthetic datasets generated with GANDiffFace, a novel framework able to synthesize datasets for face recognition with controllable demographic distribution and realistic intra-class variations.
Our results support the proposed approach and the use of synthetic data to mitigate demographic biases in face recognition.
arXiv Detail & Related papers (2024-02-02T14:57:42Z) - Toward responsible face datasets: modeling the distribution of a disentangled latent space for sampling face images from demographic groups [0.0]
Recently, it has been exposed that some modern facial recognition systems could discriminate specific demographic groups.
We propose to use a simple method for modeling and sampling a disentangled projection of a StyleGAN latent space to generate any combination of demographic groups.
Our experiments show that we can effectively synthesize any combination of demographic groups, and that the generated identities differ from those in the original training dataset.
arXiv Detail & Related papers (2023-09-15T14:42:04Z) - Benchmarking Algorithmic Bias in Face Recognition: An Experimental Approach Using Synthetic Faces and Human Evaluation [24.35436087740559]
We propose an experimental method for measuring bias in face recognition systems.
Our method is based on generating synthetic faces using a neural face generator.
We validate our method quantitatively by evaluating race and gender biases of three research-grade face recognition models.
arXiv Detail & Related papers (2023-08-10T08:57:31Z) - Analyzing Bias in Diffusion-based Face Generation Models [75.80072686374564]
Diffusion models are increasingly popular in synthetic data generation and image editing applications.
We investigate the presence of bias in diffusion-based face generation models with respect to attributes such as gender, race, and age.
We examine how dataset size affects the attribute composition and perceptual quality of both diffusion and Generative Adversarial Network (GAN) based face generation models.
arXiv Detail & Related papers (2023-05-10T18:22:31Z) - Are Commercial Face Detection Models as Biased as Academic Models? [64.71318433419636]
We compare academic and commercial face detection systems, specifically examining robustness to noise.
We find that state-of-the-art academic face detection models exhibit demographic disparities in their noise robustness.
We conclude that commercial models are as biased as, or more biased than, academic models.
arXiv Detail & Related papers (2022-01-25T02:21:42Z) - SynFace: Face Recognition with Synthetic Data [83.15838126703719]
We devise the SynFace with identity mixup (IM) and domain mixup (DM) to mitigate the performance gap.
We also perform a systematically empirical analysis on synthetic face images to provide some insights on how to effectively utilize synthetic data for face recognition.
arXiv Detail & Related papers (2021-08-18T03:41:54Z) - Unravelling the Effect of Image Distortions for Biased Prediction of Pre-trained Face Recognition Models [86.79402670904338]
We evaluate the performance of four state-of-the-art deep face recognition models in the presence of image distortions.
We have observed that image distortions have a relationship with the performance gap of the model across different subgroups.
arXiv Detail & Related papers (2021-08-14T16:49:05Z) - Transitioning from Real to Synthetic data: Quantifying the bias in model [1.6134566438137665]
This study aims to establish a trade-off between bias and fairness in models trained using synthetic data.
We demonstrate that models trained using synthetic data exhibit varying levels of bias.
arXiv Detail & Related papers (2021-05-10T06:57:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.