Related papers: Synthetic Data for the Mitigation of Demographic Biases in Face Recognition

URL: http://arxiv.org/abs/2402.01472v1
Date: Fri, 2 Feb 2024 14:57:42 GMT
Title: Synthetic Data for the Mitigation of Demographic Biases in Face Recognition
Authors: Pietro Melzi and Christian Rathgeb and Ruben Tolosana and Ruben Vera-Rodriguez and Aythami Morales and Dominik Lawatsch and Florian Domin and Maxim Schaubert
Abstract summary: This study investigates the possibility of mitigating the demographic biases that affect face recognition technologies through the use of synthetic data. We use synthetic datasets generated with GANDiffFace, a novel framework able to synthesize datasets for face recognition with controllable demographic distribution and realistic intra-class variations. Our results support the proposed approach and the use of synthetic data to mitigate demographic biases in face recognition.
Score: 10.16490522214987
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: This study investigates the possibility of mitigating the demographic biases that affect face recognition technologies through the use of synthetic data. Demographic biases have the potential to impact individuals from specific demographic groups, and can be identified by observing disparate performance of face recognition systems across demographic groups. They primarily arise from the unequal representations of demographic groups in the training data. In recent times, synthetic data have emerged as a solution to some problems that affect face recognition systems. In particular, during the generation process it is possible to specify the desired demographic and facial attributes of images, in order to control the demographic distribution of the synthesized dataset, and fairly represent the different demographic groups. We propose to use synthetic data to fine-tune existing face recognition systems that exhibit demographic biases. We use synthetic datasets generated with GANDiffFace, a novel framework able to synthesize datasets for face recognition with controllable demographic distribution and realistic intra-class variations. We consider multiple datasets representing different demographic groups for training and evaluation. Also, we fine-tune different face recognition systems, and evaluate their demographic fairness with different metrics. Our results support the proposed approach and the use of synthetic data to mitigate demographic biases in face recognition.
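
The abstract notes that demographic bias can be identified by observing disparate performance across demographic groups. The sketch below is one illustrative way to quantify such a disparity; it is not the paper's evaluation code. The abstract does not name its fairness metrics, so the two summaries used here (standard deviation of per-group accuracy and the best-to-worst gap) are assumptions, as are the function names, the group labels "A" and "B", and the toy outcomes.

```python
from collections import defaultdict
from statistics import pstdev


def per_group_accuracy(records):
    """Map each group label to its accuracy.

    records: iterable of (group, is_correct) pairs (hypothetical input format).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        hits[group] += int(is_correct)
    return {group: hits[group] / totals[group] for group in totals}


def fairness_summary(accuracies):
    """Summarize how unevenly accuracy is spread across groups.

    Both summaries are assumed stand-ins, not metrics named in the abstract:
    lower std and lower gap mean more even performance.
    """
    values = list(accuracies.values())
    return {
        "mean": sum(values) / len(values),
        "std": pstdev(values),             # population std over groups
        "gap": max(values) - min(values),  # best group minus worst group
    }


# Toy, made-up verification outcomes for two hypothetical groups.
records = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", True), ("B", False), ("B", False),
]
acc = per_group_accuracy(records)
summary = fairness_summary(acc)
print(acc)
print(summary)
```

Computing the same summary before and after fine-tuning would show whether the spread across groups narrows while mean accuracy holds, which is the kind of comparison the abstract describes.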

Related papers

Synthetic Counterfactual Faces [1.3062016289815055]
We build a generative AI framework to construct targeted, counterfactual, high-quality synthetic face data. Our pipeline has many use cases, including face recognition systems sensitivity evaluations and image understanding system probes. We showcase the efficacy of our face generation pipeline on a leading commercial vision model.
arXiv Detail & Related papers (2024-07-18T22:22:49Z)
SDFR: Synthetic Data for Face Recognition Competition [51.9134406629509]
Large-scale face recognition datasets are collected by crawling the Internet and without individuals' consent, raising legal, ethical, and privacy concerns. Recently several works proposed generating synthetic face recognition datasets to mitigate concerns in web-crawled face recognition datasets. This paper presents the summary of the Synthetic Data for Face Recognition (SDFR) Competition held in conjunction with the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024). The SDFR competition was split into two tasks, allowing participants to train face recognition systems using new synthetic datasets and/or existing ones.
arXiv Detail & Related papers (2024-04-06T10:30:31Z)
Bias and Diversity in Synthetic-based Face Recognition [12.408456748469426]
We investigate how the diversity of synthetic face recognition datasets compares to authentic datasets. We look at the distribution of gender, ethnicity, age, and head position. With regard to bias, it can be seen that the synthetic-based models share a similar bias behavior with the authentic-based models.
arXiv Detail & Related papers (2023-11-07T13:12:34Z)
Toward responsible face datasets: modeling the distribution of a disentangled latent space for sampling face images from demographic groups [0.0]
Recently, it has been shown that some modern facial recognition systems could discriminate against specific demographic groups. We propose to use a simple method for modeling and sampling a disentangled projection of a StyleGAN latent space to generate any combination of demographic groups. Our experiments show that we can synthesize any combination of demographic groups effectively and the identities are different from the original training dataset.
arXiv Detail & Related papers (2023-09-15T14:42:04Z)
Zero-shot racially balanced dataset generation using an existing biased StyleGAN2 [5.463417677777276]
We propose a methodology that leverages the biased generative model StyleGAN2 to create demographically diverse images of synthetic individuals. By training face recognition models with the resulting balanced dataset containing 50,000 identities per race, we can improve their performance and minimize biases that might have been present in a model trained on a real dataset.
arXiv Detail & Related papers (2023-05-12T18:07:10Z)
Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems. Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts. We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
Data Representativeness in Accessibility Datasets: A Meta-Analysis [7.6597163467929805]
We review datasets sourced by people with disabilities and older adults. We find that accessibility datasets represent diverse ages, but have gender and race representation gaps. We hope our effort expands the space of possibility for greater inclusion of marginalized communities in AI-infused systems.
arXiv Detail & Related papers (2022-07-16T23:32:19Z)
Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models. We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups. We find that relying on a single score threshold to differentiate between genuine and impostor sample pairs leads to suboptimal results. We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
Mitigating Face Recognition Bias via Group Adaptive Classifier [53.15616844833305]
This work aims to learn a fair face representation, where faces of every group could be more equally represented. Our work is able to mitigate face recognition bias across demographic groups while maintaining competitive accuracy.
arXiv Detail & Related papers (2020-06-13T06:43:37Z)
Enhancing Facial Data Diversity with Style-based Face Aging [59.984134070735934]
Face datasets are typically biased in terms of attributes such as gender, age, and race. We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns. We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)
Investigating Bias in Deep Face Analysis: The KANFace Dataset and Empirical Study [67.3961439193994]
We introduce the most comprehensive, large-scale dataset of facial images and videos to date. The data are manually annotated in terms of identity, exact age, gender and kinship. A method to debias network embeddings is introduced and tested on the proposed benchmarks.
arXiv Detail & Related papers (2020-05-15T00:14:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.