How Does Gender Balance In Training Data Affect Face Recognition Accuracy?
- URL: http://arxiv.org/abs/2002.02934v2
- Date: Mon, 6 Apr 2020 21:30:35 GMT
- Title: How Does Gender Balance In Training Data Affect Face Recognition Accuracy?
- Authors: Vítor Albiero, Kai Zhang, and Kevin W. Bowyer
- Abstract summary: It is often speculated that lower accuracy for women is caused by under-representation in the training data.
This work investigates whether female under-representation in the training data is truly the cause of lower accuracy for females on test data.
- Score: 12.362029427868206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning methods have greatly increased the accuracy of face
recognition, but an old problem still persists: accuracy is usually higher for
men than women. It is often speculated that lower accuracy for women is caused
by under-representation in the training data. This work investigates whether
female under-representation in the training data is truly the cause of lower
accuracy for females on test data. Using a state-of-the-art deep CNN, three
different loss functions, and two training datasets, we train each on seven
subsets with different male/female ratios, totaling forty-two trainings, which
are tested on
three different datasets. Results show that (1) gender balance in the training
data does not translate into gender balance in the test accuracy, (2) the
"gender gap" in test accuracy is not minimized by a gender-balanced training
set, but by a training set with more male images than female images, and (3)
training to minimize the accuracy gap does not result in the highest female,
male, or average accuracy.
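As a rough illustration of the experimental design above, the following Python sketch builds fixed-size training subsets with controlled male/female ratios. It is a minimal sketch under assumed details, not the authors' released code: the helper name subsample_by_gender, the subset size, and the seven-point ratio grid are hypothetical.

    import random

    def subsample_by_gender(female_images, male_images, female_fraction, total_size, seed=0):
        # Draw a fixed-size training subset with a chosen female/male ratio.
        # Illustrative only; the paper's exact subset sizes may differ.
        rng = random.Random(seed)
        n_female = round(total_size * female_fraction)
        n_male = total_size - n_female
        subset = rng.sample(female_images, n_female) + rng.sample(male_images, n_male)
        rng.shuffle(subset)
        return subset

    # Hypothetical grid of seven female fractions (0% ... 100% female); the
    # paper's exact ratio grid is not given here, so these values are illustrative.
    FEMALE_FRACTIONS = [0.0, 1/6, 1/3, 0.5, 2/3, 5/6, 1.0]

    # Usage sketch: build one subset per ratio from gender-labeled image lists,
    # then train the same CNN and loss function on each subset.
    # subsets = {f: subsample_by_gender(female_paths, male_paths, f, total_size=100_000)
    #            for f in FEMALE_FRACTIONS}

Training the same network and loss function on each subset isolates the effect of the training-set gender ratio on the resulting accuracy gap.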
Related papers
- AI Gender Bias, Disparities, and Fairness: Does Training Data Matter? [3.509963616428399]
This study delves into pervasive gender issues in artificial intelligence (AI).
It analyzes more than 1000 human-graded student responses from male and female participants across six assessment items.
Results indicate that scoring accuracy for mixed-trained models shows an insignificant difference from either male- or female-trained models.
arXiv Detail & Related papers (2023-12-17T22:37:06Z)
- Exploring the Impact of Training Data Distribution and Subword Tokenization on Gender Bias in Machine Translation [19.719314005149883]
We study the effect of tokenization on gender bias in machine translation.
We observe that female and non-stereotypical gender inflections of profession names tend to be split into subword tokens.
We show that analyzing subword splits provides good estimates of gender-form imbalance in the training data.
arXiv Detail & Related papers (2023-09-21T21:21:55Z)
- The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages [51.2321117760104]
This paper describes the Gender-GAP Pipeline, an automatic pipeline to characterize gender representation in large-scale datasets for 55 languages.
The pipeline uses a multilingual lexicon of gendered person-nouns to quantify the gender representation in text.
We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
arXiv Detail & Related papers (2023-08-31T17:20:50Z)
- Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions [50.67412723291881]
Societal biases present in pre-trained large language models are a critical issue.
We propose data intervention strategies as a powerful yet simple technique to reduce gender bias in pre-trained models.
arXiv Detail & Related papers (2023-06-07T16:50:03Z)
- The Gender Gap in Face Recognition Accuracy Is a Hairy Problem [8.768049933358968]
We first demonstrate that female and male hairstyles have important differences that impact face recognition accuracy.
We then demonstrate that when the data used to estimate recognition accuracy is balanced across gender for how hairstyles occlude the face, the initially observed gender gap in accuracy largely disappears.
arXiv Detail & Related papers (2022-06-10T04:32:47Z)
- Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation [64.65911758042914]
We investigate how seniority impacts the degree of gender bias exhibited in pretrained neural generation models.
Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.
These results suggest that NLP applications built using GPT-2 may harm women in professional capacities.
arXiv Detail & Related papers (2022-05-19T20:05:02Z)
- Gendered Differences in Face Recognition Accuracy Explained by Hairstyles, Makeup, and Facial Morphology [11.50297186426025]
There is consensus in the research literature that face recognition accuracy is lower for females.
Controlling for equal amount of visible face in the test images mitigates the apparent higher false non-match rate for females.
Additional analysis shows that makeup-balanced datasets further reduce the false non-match rate for females.
arXiv Detail & Related papers (2021-12-29T17:07:33Z)
- Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting [88.83117372793737]
Forgetting information in the original training data may damage the model's downstream performance by a large margin.
We propose GEnder Equality Prompt (GEEP) to improve gender fairness of pre-trained models with less forgetting.
arXiv Detail & Related papers (2021-10-11T15:52:16Z)
- Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z)
- Analysis of Gender Inequality In Face Recognition Accuracy [11.6168015920729]
We show that accuracy is lower for women due to the combination of (1) the impostor distribution for women having a skew toward higher similarity scores, and (2) the genuine distribution for women having a skew toward lower similarity scores.
We show that this phenomenon of the impostor and genuine distributions for women shifting closer towards each other is general across datasets of African-American, Caucasian, and Asian faces.
arXiv Detail & Related papers (2020-01-31T21:32:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.