Toward responsible face datasets: modeling the distribution of a
disentangled latent space for sampling face images from demographic groups
- URL: http://arxiv.org/abs/2309.08442v1
- Date: Fri, 15 Sep 2023 14:42:04 GMT
- Title: Toward responsible face datasets: modeling the distribution of a
disentangled latent space for sampling face images from demographic groups
- Authors: Parsa Rahimi, Christophe Ecabert, Sebastien Marcel
- Abstract summary: It has recently been exposed that some modern facial recognition systems can discriminate against specific demographic groups.
We propose a simple method for modeling and sampling a disentangled projection of a StyleGAN latent space to generate any combination of demographic groups.
Our experiments show that we can effectively synthesize any combination of demographic groups, and that the generated identities differ from those in the original training dataset.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It has recently been exposed that some modern facial recognition
systems can discriminate against specific demographic groups, leading to unfair
treatment with respect to facial attributes such as gender and origin. The main
reason is the bias inside the datasets, with unbalanced demographics, used to
train these models. Unfortunately, collecting a large-scale dataset balanced
across various demographics is impractical.
In this paper, we investigate as an alternative the generation of a balanced,
and possibly bias-free, synthetic dataset that could be used to train,
regularize, or evaluate deep learning-based facial recognition models. We
propose a simple method for modeling and sampling a disentangled projection of
a StyleGAN latent space to generate any combination of demographic groups
(e.g. $hispanic-female$). Our experiments show that we can effectively
synthesize any combination of demographic groups, and that the generated
identities differ from those in the original training dataset. We also released
the source code.
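The core idea described in the abstract, fitting a distribution over a group's projected latent codes and sampling new codes from it, can be sketched as follows. This is an illustrative approximation, not the paper's released code: the function names (`fit_group_gaussian`, `sample_latents`) are invented, a single multivariate Gaussian is assumed as the per-group model, and random arrays stand in for real StyleGAN W-space projections of labeled face images.

```python
# Hypothetical sketch: fit a per-group Gaussian over disentangled W-space
# latent codes and sample new codes for a chosen demographic group.
import numpy as np

def fit_group_gaussian(latents):
    """Estimate the mean and covariance of one group's latent codes.

    latents: (n_samples, latent_dim) array of projected W-space codes.
    """
    mu = latents.mean(axis=0)
    cov = np.cov(latents, rowvar=False)
    return mu, cov

def sample_latents(mu, cov, n, seed=0):
    """Draw n new latent codes from the fitted group distribution."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mu, cov, size=n)

# Toy usage with random stand-in latents; a real pipeline would first project
# labeled face images into W space, then feed sampled codes to the generator.
fake_latents = np.random.default_rng(1).normal(size=(200, 16))
mu, cov = fit_group_gaussian(fake_latents)
new_codes = sample_latents(mu, cov, n=5)
print(new_codes.shape)  # (5, 16)
```

In this formulation, generating a combined group such as $hispanic-female$ amounts to fitting the Gaussian only on latents whose images carry both attribute labels, then sampling from that joint fit.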
Related papers
- Random Silicon Sampling: Simulating Human Sub-Population Opinion Using a
Large Language Model Based on Group-Level Demographic Information
Large language models exhibit societal biases associated with demographic information.
We propose "random silicon sampling," a method to emulate the opinions of human population sub-groups.
We find that language models can generate response distributions remarkably similar to the actual U.S. public opinion polls.
arXiv Detail & Related papers (2024-02-28T08:09:14Z)
- Synthetic Data for the Mitigation of Demographic Biases in Face
Recognition
This study investigates the possibility of mitigating the demographic biases that affect face recognition technologies through the use of synthetic data.
We use synthetic datasets generated with GANDiffFace, a novel framework able to synthesize datasets for face recognition with controllable demographic distribution and realistic intra-class variations.
Our results support the proposed approach and the use of synthetic data to mitigate demographic biases in face recognition.
arXiv Detail & Related papers (2024-02-02T14:57:42Z)
- On the steerability of large language models toward data-driven personas
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z)
- Bias and Diversity in Synthetic-based Face Recognition
We investigate how the diversity of synthetic face recognition datasets compares to authentic datasets.
We look at the distribution of gender, ethnicity, age, and head position.
With regard to bias, it can be seen that the synthetic-based models share a similar bias behavior with the authentic-based models.
arXiv Detail & Related papers (2023-11-07T13:12:34Z)
- Zero-shot racially balanced dataset generation using an existing biased
StyleGAN2
We propose a methodology that leverages the biased generative model StyleGAN2 to create demographically diverse images of synthetic individuals.
By training face recognition models with the resulting balanced dataset containing 50,000 identities per race, we can improve their performance and minimize biases that might have been present in a model trained on a real dataset.
arXiv Detail & Related papers (2023-05-12T18:07:10Z)
- Debiasing Vision-Language Models via Biased Prompts
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- Gender Stereotyping Impact in Facial Expression Recognition
In recent years, machine learning-based models have become the most popular approach to Facial Expression Recognition (FER).
In publicly available FER datasets, apparent gender representation is usually mostly balanced, but representation within individual labels is not.
We generate derivative datasets with different amounts of stereotypical bias by altering the gender proportions of certain labels.
We observe a discrepancy in the recognition of certain emotions between genders of up to 29% under the worst bias conditions.
arXiv Detail & Related papers (2022-10-11T10:52:23Z)
- Balancing Biases and Preserving Privacy on Balanced Faces in the Wild
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and impostor sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
- Mitigating Face Recognition Bias via Group Adaptive Classifier
This work aims to learn a fair face representation, where faces of every group could be more equally represented.
Our work is able to mitigate face recognition bias across demographic groups while maintaining competitive accuracy.
arXiv Detail & Related papers (2020-06-13T06:43:37Z)
- Enhancing Facial Data Diversity with Style-based Face Aging
In particular, face datasets are typically biased in terms of attributes such as gender, age, and race.
We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns.
We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)
- Investigating Bias in Deep Face Analysis: The KANFace Dataset and
Empirical Study
We introduce the most comprehensive, large-scale dataset of facial images and videos to date.
The data are manually annotated in terms of identity, exact age, gender and kinship.
A method to debias network embeddings is introduced and tested on the proposed benchmarks.
arXiv Detail & Related papers (2020-05-15T00:14:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.