One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision
- URL: http://arxiv.org/abs/2102.02320v1
- Date: Wed, 3 Feb 2021 22:50:04 GMT
- Title: One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision
- Authors: Zaid Khan and Yun Fu
- Abstract summary: We study the racial system encoded by computer vision datasets supplying categorical race labels for face images.
We find that each dataset encodes a substantially unique racial system, despite nominally equivalent racial categories.
We find evidence that racial categories encode stereotypes, and exclude ethnic groups from categories on the basis of nonconformity to stereotypes.
- Score: 75.82110684355979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computer vision is widely deployed, has highly visible, society-altering
applications, and has documented problems with bias and representation. Datasets
are critical for benchmarking progress in fair computer vision, and often
employ broad racial categories as population groups for measuring group
fairness. Similarly, diversity is often measured in computer vision datasets by
ascribing and counting categorical race labels. However, racial categories are
ill-defined, unstable temporally and geographically, and have a problematic
history of scientific use. Although the racial categories used across datasets
are superficially similar, the complexity of human race perception suggests the
racial system encoded by one dataset may be substantially inconsistent with
another. Using the insight that a classifier can learn the racial system
encoded by a dataset, we conduct an empirical study of computer vision datasets
supplying categorical race labels for face images to determine the
cross-dataset consistency and generalization of racial categories. We find that
each dataset encodes a substantially unique racial system, despite nominally
equivalent racial categories, and some racial categories are systemically less
consistent than others across datasets. We find evidence that racial categories
encode stereotypes, and exclude ethnic groups from categories on the basis of
nonconformity to stereotypes. Representing a billion humans under one racial
category may obscure disparities and create new ones by encoding stereotypes of
racial systems. The difficulty of adequately converting the abstract concept of
race into a tool for measuring fairness underscores the need for a method more
flexible and culturally aware than racial categories.
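As a minimal sketch of the core experimental idea, the cross-dataset consistency test, assuming two hypothetical face datasets with nominally equivalent race labels and a fixed pretrained embedding backbone (all names, shapes, and the placeholder data below are illustrative assumptions, not the paper's exact pipeline):

```python
# Sketch: measure how well one dataset's racial system transfers to another.
# Train a classifier on dataset A's (embedding, race-label) pairs, then check
# how often its predictions agree with dataset B's labels for the same
# nominal categories. Low agreement suggests inconsistent racial systems.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)
CATEGORIES = ["asian", "black", "indian", "white"]  # nominally shared labels

# Placeholder face embeddings; in practice these would come from a fixed
# pretrained face-recognition backbone applied to both datasets.
X_a, y_a = rng.normal(size=(2000, 128)), rng.integers(0, 4, 2000)
X_b, y_b = rng.normal(size=(1500, 128)), rng.integers(0, 4, 1500)

clf = LogisticRegression(max_iter=1000).fit(X_a, y_a)
y_b_pred = clf.predict(X_b)

# Cross-dataset consistency: agreement between A's learned racial system
# and B's ascribed labels, overall and per category.
print("overall agreement:", accuracy_score(y_b, y_b_pred))
cm = confusion_matrix(y_b, y_b_pred, normalize="true")
for i, cat in enumerate(CATEGORIES):
    print(f"{cat}: {cm[i, i]:.2f} of B's '{cat}' faces kept the label")
```

Agreement well below the classifier's in-dataset accuracy would indicate that the two datasets encode different racial systems despite sharing label names.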
Related papers
- Racial/Ethnic Categories in AI and Algorithmic Fairness: Why They Matter and What They Represent [0.0]
We show how racial categories adopted with unclear assumptions and little justification can lead to datasets that vary widely and represent groups poorly.
We also develop a framework, CIRCSheets, for documenting the choices and assumptions in choosing racial categories and the process of racialization into these categories.
arXiv Detail & Related papers (2024-04-10T04:04:05Z)
- Leveraging Diffusion Perturbations for Measuring Fairness in Computer Vision [25.414154497482162]
We demonstrate that diffusion models can be leveraged to create a dataset of demographically perturbed images suitable for fairness benchmarking.
We benchmark several vision-language models on a multi-class occupation classification task.
We find that images generated with non-Caucasian labels have a significantly higher occupation misclassification rate than images generated with Caucasian labels.
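A hedged sketch of the disparity measurement this summary describes, computing per-group occupation misclassification rates; the record layout here is a hypothetical assumption:

```python
# Sketch: per-group occupation misclassification rate on generated images.
# Each record pairs the race label used in the generation prompt with whether
# a vision-language model classified the depicted occupation correctly.
from collections import defaultdict

# Hypothetical records: (race_label_in_prompt, model_was_correct)
records = [("caucasian", True), ("caucasian", False),
           ("non-caucasian", False), ("non-caucasian", False),
           ("non-caucasian", True)]

errors, totals = defaultdict(int), defaultdict(int)
for group, correct in records:
    totals[group] += 1
    errors[group] += int(not correct)

for group in totals:
    rate = errors[group] / totals[group]
    print(f"{group}: misclassification rate {rate:.2f}")
```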
arXiv Detail & Related papers (2023-11-25T19:40:13Z)
- An Empirical Analysis of Racial Categories in the Algorithmic Fairness Literature [2.2713084727838115]
We analyze how race is conceptualized and formalized in algorithmic fairness frameworks.
We find that differing notions of race are adopted inconsistently, at times even within a single analysis.
We argue that the construction of racial categories is a value-laden process with significant social and political consequences.
arXiv Detail & Related papers (2023-09-12T21:23:29Z)
- Addressing Racial Bias in Facial Emotion Recognition [1.4896509623302834]
This study focuses on analyzing racial bias by sub-sampling training sets with varied racial distributions.
Our findings indicate that smaller datasets with posed faces improve on both fairness and performance metrics as the simulations approach racial balance.
In larger datasets with greater facial variation, fairness metrics generally remain constant, suggesting that racial balance by itself is insufficient to achieve parity in test performance across different racial groups.
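One plausible way to set up the sub-sampling experiments described above is stratified sampling toward a target racial distribution; the function below is an illustrative sketch, not the paper's code:

```python
# Sketch: subsample a training set to a target racial distribution, so that
# fairness/performance metrics can be compared as the mix approaches balance.
import random
from collections import defaultdict

def subsample(samples, target_dist, n_total, seed=0):
    """samples: list of (image_id, race_label); target_dist: label -> fraction."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for item in samples:
        by_group[item[1]].append(item)
    picked = []
    for label, frac in target_dist.items():
        pool = by_group[label]
        k = min(len(pool), round(frac * n_total))
        picked.extend(rng.sample(pool, k))
    rng.shuffle(picked)
    return picked

# e.g. a balanced 4-group training set of about 1000 images:
# balanced = subsample(all_samples, {g: 0.25 for g in groups}, 1000)
```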
arXiv Detail & Related papers (2023-08-09T03:03:35Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
- Studying Bias in GANs through the Lens of Race [91.95264864405493]
We study how the performance and evaluation of generative image models are impacted by the racial composition of their training datasets.
Our results show that the racial composition of generated images successfully preserves that of the training data.
However, we observe that truncation, a technique used to generate higher quality images during inference, exacerbates racial imbalances in the data.
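For context, truncation resamples latent components that fall outside a threshold, trading sample diversity for image quality; a minimal sketch (the generator itself is assumed, not implemented here):

```python
# Sketch: the truncation trick. Latent components that exceed a threshold
# are resampled, which improves visual quality but narrows the output
# distribution; this narrowing is what can exacerbate racial imbalance.
import numpy as np

def truncated_latents(n, dim, threshold=0.7, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, dim))
    mask = np.abs(z) > threshold
    while mask.any():  # resample out-of-range components
        z[mask] = rng.normal(size=mask.sum())
        mask = np.abs(z) > threshold
    return z

# images = generator(truncated_latents(16, 512))  # generator is hypothetical
```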
arXiv Detail & Related papers (2022-09-06T22:25:56Z)
- Demographic-Reliant Algorithmic Fairness: Characterizing the Risks of Demographic Data Collection in the Pursuit of Fairness [0.0]
We consider calls to collect more data on demographics to enable algorithmic fairness.
We show how these techniques largely ignore broader questions of data governance and systemic oppression.
arXiv Detail & Related papers (2022-04-18T04:50:09Z)
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm that can map individuals belonging to different groups into a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- Enhancing Facial Data Diversity with Style-based Face Aging [59.984134070735934]
Face datasets are typically biased in terms of attributes such as gender, age, and race.
We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns.
We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)
- Contrastive Examples for Addressing the Tyranny of the Majority [83.93825214500131]
We propose to create a balanced training dataset, consisting of the original dataset plus new data points in which the group memberships are intervened.
We show that current generative adversarial networks are a powerful tool for learning these data points, called contrastive examples.
arXiv Detail & Related papers (2020-04-14T14:06:44Z)