Avoiding bias when inferring race using name-based approaches
- URL: http://arxiv.org/abs/2104.12553v3
- Date: Tue, 12 Oct 2021 15:00:17 GMT
- Title: Avoiding bias when inferring race using name-based approaches
- Authors: Diego Kozlowski, Dakota S. Murray, Alexis Bell, Will Hulsey, Vincent
Larivière, Thema Monroe-White and Cassidy R. Sugimoto
- Abstract summary: We use information from the U.S. Census and mortgage applications to infer the race of U.S.-affiliated authors in the Web of Science.
Our results demonstrate that the validity of name-based inference varies by race/ethnicity and that threshold approaches underestimate Black authors and overestimate White authors.
- Score: 0.8543368663496084
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Racial disparity in academia is a widely acknowledged problem. The
quantitative understanding of race-based systemic inequalities is an
important step towards a more equitable research system. However, because of
the lack of robust information on authors' race, few large-scale analyses have
been performed on this topic. Algorithmic approaches offer one solution, using
known information about authors, such as their names, to infer their perceived
race. As with any other algorithm, the process of racial inference can generate
biases if it is not carefully considered. The goal of this article is to assess
the extent to which algorithmic bias is introduced using different approaches
for name-based racial inference. We use information from the U.S. Census and
mortgage applications to infer the race of U.S.-affiliated authors in the Web
of Science. We estimate the effects of using given and family names, thresholds
or continuous distributions, and imputation. Our results demonstrate that the
validity of name-based inference varies by race/ethnicity and that threshold
approaches underestimate Black authors and overestimate White authors. We
conclude with recommendations to avoid potential biases. This article lays the
foundation for more systematic and less biased investigations into racial
disparities in science.
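To make the threshold-versus-distribution contrast concrete, here is a minimal Python sketch. The probability table and the helpers `threshold_counts` and `distributed_counts` are hypothetical illustrations, not the paper's code; a real analysis would estimate P(race | name) from the U.S. Census surname file and mortgage-application records, as the paper does.

```python
from collections import defaultdict

# Hypothetical P(race | surname) table in the style of the U.S. Census
# surname file; every number here is made up for illustration.
NAME_PROBS = {
    "washington": {"Black": 0.87, "White": 0.05, "Hispanic": 0.03, "Asian": 0.01, "Other": 0.04},
    "jackson":    {"Black": 0.53, "White": 0.39, "Hispanic": 0.02, "Asian": 0.01, "Other": 0.05},
    "smith":      {"White": 0.71, "Black": 0.23, "Hispanic": 0.02, "Asian": 0.01, "Other": 0.03},
}

def threshold_counts(surnames, threshold=0.7):
    """Hard assignment: count an author only when the most probable
    group clears the threshold; all other authors are silently dropped."""
    counts = defaultdict(float)
    for s in surnames:
        probs = NAME_PROBS.get(s.lower())
        if probs is None:
            continue  # unmatched names would require imputation
        race, p = max(probs.items(), key=lambda kv: kv[1])
        if p >= threshold:
            counts[race] += 1
    return dict(counts)

def distributed_counts(surnames):
    """Continuous assignment: spread each author fractionally across
    all groups, preserving the full inferred distribution."""
    counts = defaultdict(float)
    for s in surnames:
        probs = NAME_PROBS.get(s.lower())
        if probs is None:
            continue
        for race, p in probs.items():
            counts[race] += p
    return dict(counts)

authors = ["Smith", "Smith", "Washington", "Jackson"]
print(threshold_counts(authors))    # {'White': 2.0, 'Black': 1.0} -- Jackson is dropped
print(distributed_counts(authors))  # Black receives ~1.86 fractional authors, not 1.0
```

Under the threshold rule, authors whose most likely group falls below the cutoff vanish from the counts entirely; this is the mechanism by which hard thresholds can undercount groups whose name-race associations are less concentrated, as the paper reports for Black authors.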
Related papers
- What's in a Name? Auditing Large Language Models for Race and Gender
Bias [49.28899492966893]
We employ an audit design to investigate biases in state-of-the-art large language models, including GPT-4.
We find that the models' advice systematically disadvantages names that are commonly associated with racial minorities and women.
arXiv Detail & Related papers (2024-02-21T18:25:25Z)
- A Novel Method for Analysing Racial Bias: Collection of Person Level References [6.345851712811529]
We propose a novel method to analyze the differences in representation between two groups.
We examine the representation of African Americans and White Americans in books published between 1850 and 2000 using the Google Books dataset.
arXiv Detail & Related papers (2023-10-24T14:00:01Z)
- Towards Fair Face Verification: An In-depth Analysis of Demographic Biases [11.191375513738361]
Deep learning-based person identification and verification systems have improved remarkably in accuracy in recent years.
However, such systems have been found to exhibit significant biases related to race, age, and gender.
This paper presents an in-depth analysis, with a particular emphasis on the intersectionality of these demographic factors.
arXiv Detail & Related papers (2023-07-19T14:49:14Z)
- Estimating Racial Disparities When Race is Not Observed [3.0931877196387196]
We introduce a new class of models that produce racial disparity estimates by using surnames as an instrumental variable for race.
A validation study based on the North Carolina voter file shows that BISG (Bayesian Improved Surname Geocoding) combined with the proposed BIRDiE model reduces error by up to 84% when estimating racial differences in party registration.
We apply the proposed methodology to estimate racial differences in who benefits from the home mortgage interest deduction using individual-level tax data from the U.S. Internal Revenue Service.
arXiv Detail & Related papers (2023-03-05T04:46:16Z)
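For context on the entry above: BISG combines a surname-conditional race distribution with geographic information via Bayes' rule. Below is a minimal sketch under that standard formulation; the tables `p_race_given_surname` and `p_geo_given_race` and all numbers are hypothetical, with real implementations drawing them from the Census surname file and block- or tract-level counts.

```python
RACES = ["White", "Black", "Hispanic", "Asian", "Other"]

# P(race | surname), e.g. from the Census surname file (made-up values).
p_race_given_surname = {"jones": [0.55, 0.38, 0.02, 0.01, 0.04]}

# P(geography | race): the share of each group's national population living
# in a given tract, derivable from tract-level Census counts (made-up values).
p_geo_given_race = {"tract_081": [0.0002, 0.0015, 0.0001, 0.0001, 0.0003]}

def bisg_posterior(surname: str, tract: str) -> dict:
    """Posterior P(race | surname, tract), proportional to
    P(race | surname) * P(tract | race)."""
    prior = p_race_given_surname[surname]
    likelihood = p_geo_given_race[tract]
    unnormalized = [p * q for p, q in zip(prior, likelihood)]
    z = sum(unnormalized)
    return {r: u / z for r, u in zip(RACES, unnormalized)}

# Residence in a heavily Black tract flips the Jones posterior toward Black.
print(bisg_posterior("jones", "tract_081"))
```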
- Diversity matters: Robustness of bias measurements in Wikidata [4.950095974653716]
We reveal data biases that surface in Wikidata for thirteen different demographics selected from seven continents.
We conduct extensive experiments on a large number of occupations sampled from the thirteen demographics, using gender as the sensitive attribute.
We show that the choice of the state-of-the-art KG embedding algorithm has a strong impact on the ranking of biased occupations irrespective of gender.
arXiv Detail & Related papers (2023-02-27T18:38:10Z)
- Inside the Black Box: Detecting and Mitigating Algorithmic Bias across Racialized Groups in College Student-Success Prediction [1.124958340749622]
We examine how the accuracy of college student success predictions differs between racialized groups, signaling algorithmic bias.
We demonstrate how models incorporating commonly used features to predict college-student success are less accurate when predicting success for racially minoritized students.
Common approaches to mitigating algorithmic bias are generally ineffective at eliminating disparities in prediction outcomes and accuracy between racialized groups.
arXiv Detail & Related papers (2023-01-10T04:48:51Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, such as weakening or deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Anatomizing Bias in Facial Analysis [86.79402670904338]
Existing facial analysis systems have been shown to yield biased results against certain demographic subgroups.
It has become imperative to ensure that these systems do not discriminate based on an individual's gender, identity, or skin tone.
This has led to research in the identification and mitigation of bias in AI systems.
arXiv Detail & Related papers (2021-12-13T09:51:13Z)
- Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and the agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a simple but highly effective method for countering bias using instance reweighting (a generic sketch follows the entry).
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
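As a generic illustration of instance reweighting (not necessarily the paper's exact scheme), one common recipe weights each training example inversely to the frequency of its (demographic, label) cell, so a model cannot profit from the correlation between author demographics and the target. The helper `balance_weights` below is a hypothetical name:

```python
import numpy as np

def balance_weights(demographics: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Weight each instance by n / (num_cells * count(cell)) so that every
    (demographic, label) combination carries equal total weight."""
    cells, counts = np.unique(
        np.stack([demographics, labels], axis=1), axis=0, return_counts=True
    )
    freq = {tuple(cell): count for cell, count in zip(cells, counts)}
    n = len(labels)
    return np.array(
        [n / (len(freq) * freq[(d, y)]) for d, y in zip(demographics, labels)]
    )

# Example: demographic codes and binary labels for six documents.
demo = np.array([0, 0, 0, 0, 1, 1])
y = np.array([1, 1, 1, 0, 1, 0])
w = balance_weights(demo, y)  # rare cells receive large weights
# Pass w as sample_weight to most scikit-learn estimators, e.g.:
# LogisticRegression().fit(X, y, sample_weight=w)
```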
- One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision [75.82110684355979]
We study the racial system encoded by computer vision datasets supplying categorical race labels for face images.
We find that each dataset encodes a substantially unique racial system, despite nominally equivalent racial categories.
We find evidence that racial categories encode stereotypes, and exclude ethnic groups from categories on the basis of nonconformity to stereotypes.
arXiv Detail & Related papers (2021-02-03T22:50:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.