Inferring gender from name: a large scale performance evaluation study
- URL: http://arxiv.org/abs/2308.12381v1
- Date: Tue, 22 Aug 2023 13:38:45 GMT
- Title: Inferring gender from name: a large scale performance evaluation study
- Authors: Kriste Krstovski, Yao Lu, Ye Xu
- Abstract summary: Researchers need to infer gender from readily available information, primarily from persons' names.
Name-to-gender inference has generated an ever-growing domain of algorithmic approaches and software products.
We conduct a large scale performance evaluation of existing approaches for name-to-gender inference.
We propose two new hybrid approaches that achieve better performance than any single existing approach.
- Score: 4.934579134540613
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A person's gender is a crucial piece of information when performing research
across a wide range of scientific disciplines, such as medicine, sociology,
political science, and economics, to name a few. However, in increasing
instances, especially given the proliferation of big data, gender information
is not readily available. In such cases researchers need to infer gender from
readily available information, primarily from persons' names. While inferring
gender from name may raise some ethical questions, the lack of viable
alternatives means that researchers have to resort to such approaches when the
goal justifies the means - in the majority of such studies the goal is to
examine patterns and determinants of gender disparities. The necessity of
name-to-gender inference has generated an ever-growing domain of algorithmic
approaches and software products. These approaches have been used throughout
the world in academia, industry, governmental and non-governmental
organizations. Nevertheless, the existing approaches have yet to be
systematically evaluated and compared, making it challenging to determine the
optimal approach for future research. In this work, we conducted a large scale
performance evaluation of existing approaches for name-to-gender inference.
Analyses are performed using a variety of large annotated datasets of names. We
further propose two new hybrid approaches that achieve better performance than
any single existing approach.
Related papers
- The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [58.130894823145205]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias.
Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning.
We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z) - GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.
GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z) - Unveiling Gender Bias in Terms of Profession Across LLMs: Analyzing and
Addressing Sociological Implications [0.0]
The study examines existing research on gender bias in AI language models and identifies gaps in the current knowledge.
The findings shed light on gendered word associations, language usage, and biased narratives present in the outputs of Large Language Models.
The paper presents strategies for reducing gender bias in LLMs, including algorithmic approaches and data augmentation techniques.
arXiv Detail & Related papers (2023-07-18T11:38:45Z) - Much Ado About Gender: Current Practices and Future Recommendations for
Appropriate Gender-Aware Information Access [3.3903891679981593]
Information access research (and development) sometimes makes use of gender.
This work makes a variety of assumptions about gender that are not aligned with current understandings of what gender is.
Most papers we review rely on a binary notion of gender, even if they acknowledge that gender cannot be split into two categories.
arXiv Detail & Related papers (2023-01-12T01:21:02Z) - Gender Bias in Big Data Analysis [0.0]
This article measures gender bias when gender prediction software tools are used in historical big data research.
Gender bias is measured by contrasting personally identified computer science authors in the well-regarded DBLP dataset.
arXiv Detail & Related papers (2022-11-17T20:13:04Z) - Temporal Analysis and Gender Bias in Computing [0.0]
Many names change ascribed gender over decades: the "Leslie problem".
This article identifies 300 given names with measurable "gender shifts" across 1925-1975.
This article demonstrates quantitatively that there is a net "female shift" that likely results in the overcounting of women (and undercounting of men) in earlier decades.
arXiv Detail & Related papers (2022-09-29T00:29:43Z) - Theories of "Gender" in NLP Bias Research [0.0]
We survey nearly 200 articles concerning gender bias in NLP.
We find that the majority of the articles do not make their theorization of gender explicit.
Many conflate sex characteristics, social gender, and linguistic gender in ways that disregard the existence and experience of trans, nonbinary, and intersex people.
arXiv Detail & Related papers (2022-05-05T09:20:53Z) - Are Commercial Face Detection Models as Biased as Academic Models? [64.71318433419636]
We compare academic and commercial face detection systems, specifically examining robustness to noise.
We find that state-of-the-art academic face detection models exhibit demographic disparities in their noise robustness.
We conclude that commercial models are always as biased or more biased than an academic model.
arXiv Detail & Related papers (2022-01-25T02:21:42Z) - They, Them, Theirs: Rewriting with Gender-Neutral English [56.14842450974887]
We perform a case study on the singular they, a common way to promote gender inclusion in English.
We show how a model can be trained to produce gender-neutral English with 1% word error rate with no human-labeled data.
arXiv Detail & Related papers (2021-02-12T21:47:48Z) - Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by
Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first specifically tailored measure for Information Retrieval, capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.