A Deep Dive into Dataset Imbalance and Bias in Face Identification
- URL: http://arxiv.org/abs/2203.08235v1
- Date: Tue, 15 Mar 2022 20:23:13 GMT
- Title: A Deep Dive into Dataset Imbalance and Bias in Face Identification
- Authors: Valeriia Cherepanova, Steven Reich, Samuel Dooley, Hossein Souri,
Micah Goldblum, Tom Goldstein
- Abstract summary: Media portrayals often center imbalance as the main source of bias in automated face recognition systems.
Previous studies of data imbalance in FR have focused exclusively on the face verification setting.
This work thoroughly explores the effects of each kind of imbalance possible in face identification and discusses other factors which may impact bias in this setting.
- Score: 49.210042420757894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the deployment of automated face recognition (FR) systems proliferates,
bias in these systems is not just an academic question, but a matter of public
concern. Media portrayals often center imbalance as the main source of bias,
i.e., that FR models perform worse on images of non-white people or women
because these demographic groups are underrepresented in training data. Recent
academic research paints a more nuanced picture of this relationship. However,
previous studies of data imbalance in FR have focused exclusively on the face
verification setting, while the face identification setting has been largely
ignored, despite being deployed in sensitive applications such as law
enforcement. This is an unfortunate omission, as 'imbalance' is a more complex
matter in identification; imbalance may arise in not only the training data,
but also the testing data, and furthermore may affect the proportion of
identities belonging to each demographic group or the number of images
belonging to each identity. In this work, we address this gap in the research
by thoroughly exploring the effects of each kind of imbalance possible in face
identification, and discuss other factors which may impact bias in this
setting.
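The abstract distinguishes two axes along which imbalance can arise in identification data: the proportion of identities belonging to each demographic group, and the number of images belonging to each identity. As a minimal toy sketch (not from the paper; the group names and counts are invented for illustration), the two axes can be made concrete with a small dataset generator:

```python
# Hypothetical sketch of the two imbalance axes described in the abstract,
# using toy (group, identity) image records. All names/counts are invented.
from collections import Counter

def make_split(identities_per_group, images_per_identity):
    """Build a list of (group, identity_id) image records.

    identities_per_group: {group: number of distinct identities}
    images_per_identity:  {group: images per identity in that group}
    """
    records = []
    for group, n_ids in identities_per_group.items():
        for i in range(n_ids):
            records += [(group, f"{group}_{i}")] * images_per_identity[group]
    return records

# Axis 1: groups contribute unequal numbers of identities (images per identity equal).
train = make_split({"A": 80, "B": 20}, {"A": 10, "B": 10})
# Axis 2: groups contribute equal identities but unequal images per identity.
test = make_split({"A": 50, "B": 50}, {"A": 20, "B": 2})

ids_per_group = Counter(g for g, _ in set(train))   # identity-level imbalance
imgs_per_group = Counter(g for g, _ in test)        # image-level imbalance
print(ids_per_group)   # A has 4x the identities of B
print(imgs_per_group)  # A has 10x the images of B
```

Note that, as the abstract stresses, either axis can appear independently in the training or the testing data, giving several distinct imbalance configurations to study.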
Related papers
- What Should Be Balanced in a "Balanced" Face Recognition Dataset? [8.820019122897154]
Various face image datasets have been proposed as 'fair' or 'balanced' to assess the accuracy of face recognition algorithms across demographics.
It is important to note that the number of identities and images in an evaluation dataset are not driving factors for 1-to-1 face matching accuracy.
We propose a bias-aware toolkit that facilitates creation of cross-demographic evaluation datasets balanced on factors mentioned in this paper.
arXiv Detail & Related papers (2023-04-17T22:02:03Z)
- Gender Stereotyping Impact in Facial Expression Recognition [1.5340540198612824]
In recent years, machine learning-based models have become the most popular approach to Facial Expression Recognition (FER).
In publicly available FER datasets, apparent gender representation is usually mostly balanced, but representation within individual labels is not.
We generate derivative datasets with different amounts of stereotypical bias by altering the gender proportions of certain labels.
We observe a discrepancy in the recognition of certain emotions between genders of up to 29% under the worst bias conditions.
arXiv Detail & Related papers (2022-10-11T10:52:23Z)
- Anatomizing Bias in Facial Analysis [86.79402670904338]
Existing facial analysis systems have been shown to yield biased results against certain demographic subgroups.
It has become imperative to ensure that these systems do not discriminate based on gender, identity, or skin tone of individuals.
This has led to research in the identification and mitigation of bias in AI systems.
arXiv Detail & Related papers (2021-12-13T09:51:13Z)
- Comparing Human and Machine Bias in Face Recognition [46.170389064229354]
We release improvements to the LFW and CelebA datasets which will enable future researchers to obtain measurements of algorithmic bias.
We also use these new data to develop a series of challenging facial identification and verification questions.
We find that both computer models and human survey participants perform significantly better at the verification task.
arXiv Detail & Related papers (2021-10-15T22:26:20Z)
- Robustness Disparities in Commercial Face Detection [72.25318723264215]
We present the first of its kind detailed benchmark of the robustness of three such systems: Amazon Rekognition, Microsoft Azure, and Google Cloud Platform.
We generally find that photos of individuals who are older, masculine-presenting, of darker skin type, or in dim lighting are more susceptible to errors than their counterparts in other identities.
arXiv Detail & Related papers (2021-08-27T21:37:16Z)
- Unravelling the Effect of Image Distortions for Biased Prediction of Pre-trained Face Recognition Models [86.79402670904338]
We evaluate the performance of four state-of-the-art deep face recognition models in the presence of image distortions.
We have observed that image distortions have a relationship with the performance gap of the model across different subgroups.
arXiv Detail & Related papers (2021-08-14T16:49:05Z)
- Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and impostor sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
- Asymmetric Rejection Loss for Fairer Face Recognition [1.52292571922932]
Research has shown differences in face recognition performance across different ethnic groups due to the racial imbalance in the training datasets.
This is actually symptomatic of the under-representation of non-Caucasian ethnic groups in the celebdom from which face datasets are usually gathered.
We propose an Asymmetric Rejection Loss, which aims at making full use of unlabeled images of those under-represented groups.
arXiv Detail & Related papers (2020-02-09T04:01:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.