Gender inference: can chatGPT outperform common commercial tools?
- URL: http://arxiv.org/abs/2312.00805v1
- Date: Fri, 24 Nov 2023 22:09:14 GMT
- Title: Gender inference: can chatGPT outperform common commercial tools?
- Authors: Michelle Alexopoulos, Kelly Lyons, Kaushar Mahetaji, Marcus Emmanuel
Barnes, Rogan Gutwillinger
- Abstract summary: We compare the performance of a generative Artificial Intelligence (AI) tool, ChatGPT, with three commercially available list-based and machine learning-based gender inference tools.
Specifically, we use a large Olympic athlete dataset and report how variations in the input (e.g., first name and first and last name) impact the accuracy of their predictions.
ChatGPT performs at least as well as Namsor and often outperforms it, especially for the female sample when country and/or last name information is available.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: An increasing number of studies use gender information to understand
phenomena such as gender bias, inequity in access and participation, or the
impact of the Covid pandemic response. Unfortunately, most datasets do not
include self-reported gender information, making it necessary for researchers
to infer gender from other information, such as names or names and country
information. An important limitation of these tools is that they fail to
appropriately capture the fact that gender exists on a non-binary scale;
however, it remains important to evaluate and compare how well these tools
perform in a variety of contexts. In this paper, we compare the performance of
a generative Artificial Intelligence (AI) tool, ChatGPT, with three commercially
available list-based and machine learning-based gender inference tools (Namsor,
Gender-API, and genderize.io) on a unique dataset. Specifically, we use a large
Olympic athlete dataset and report how variations in the input (e.g., first
name and first and last name, with and without country information) impact the
accuracy of their predictions. We report results for the full set, as well as
for the subsets: medal versus non-medal winners, athletes from the largest
English-speaking countries, and athletes from East Asia. On these sets, we find
that Namsor is the best traditional commercially available tool. However,
ChatGPT performs at least as well as Namsor and often outperforms it,
especially for the female sample when country and/or last name information is
available. All tools perform better on medalists versus non-medalists and on
names from English-speaking countries. Although not designed for this purpose,
ChatGPT may be a cost-effective tool for gender prediction. In the future, it
might even be possible for ChatGPT or other large-scale language models to
better identify self-reported gender rather than report gender on a binary
scale.
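Although not part of the paper itself, the comparison the abstract describes can be sketched in a few lines: send a first name (optionally with an ISO country code) to the public genderize.io endpoint, and pose the same question to an OpenAI chat model. This is a minimal illustration under stated assumptions; the prompt wording, model choice, and helper names are illustrative, not the authors' exact protocol, and Namsor and Gender-API are omitted because they require API keys.

```python
import os
import requests
from openai import OpenAI  # pip install openai>=1.0 requests


def genderize_guess(first_name: str, country_code: str | None = None) -> dict:
    """Query the public genderize.io endpoint for one first name.
    country_code is an optional ISO 3166-1 alpha-2 code, e.g. "JP"."""
    params = {"name": first_name}
    if country_code:
        params["country_id"] = country_code
    resp = requests.get("https://api.genderize.io", params=params, timeout=10)
    resp.raise_for_status()
    # Typical response: {"name": "kelly", "gender": "female", "probability": 0.9, "count": ...}
    return resp.json()


def chatgpt_guess(full_name: str, country: str | None = None,
                  model: str = "gpt-3.5-turbo") -> str:
    """Ask a chat model for a one-word gender label; the prompt is illustrative only."""
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    origin = f" from {country}" if country else ""
    prompt = (f"What is the most likely gender of the athlete named "
              f"'{full_name}'{origin}? Answer with exactly one word: "
              f"female, male, or unknown.")
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().lower()


if __name__ == "__main__":
    # First name only vs. first and last name with country, mirroring the paper's input variations.
    print(genderize_guess("Michelle", "CA"))
    print(chatgpt_guess("Michelle Alexopoulos", "Canada"))
```

Accuracy for each input variation can then be computed by comparing the returned labels against the recorded gender in the athlete dataset.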
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents AmbGIMT, a benchmark for Gender-Inclusive Machine Translation with Ambiguous attitude words.
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- Exploring the Impact of Training Data Distribution and Subword Tokenization on Gender Bias in Machine Translation [19.719314005149883]
We study the effect of tokenization on gender bias in machine translation.
We observe that female and non-stereotypical gender inflections of profession names tend to be split into subword tokens.
We show that analyzing subword splits provides good estimates of gender-form imbalance in the training data.
arXiv Detail & Related papers (2023-09-21T21:21:55Z)
- The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages [51.2321117760104]
This paper describes the Gender-GAP Pipeline, an automatic pipeline to characterize gender representation in large-scale datasets for 55 languages.
The pipeline uses a multilingual lexicon of gendered person-nouns to quantify the gender representation in text.
We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
arXiv Detail & Related papers (2023-08-31T17:20:50Z)
- For the Underrepresented in Gender Bias Research: Chinese Name Gender Prediction with Heterogeneous Graph Attention Network [1.13608321568471]
We design a Chinese Heterogeneous Graph Attention (CHGAT) model to capture the heterogeneity in component relationships and incorporate the pronunciations of characters.
Our model largely surpasses current tools and also outperforms the state-of-the-art algorithm.
We open-source a more balanced multi-character dataset from an official source together with our code, hoping to support future research that promotes gender equality.
arXiv Detail & Related papers (2023-02-01T13:08:50Z)
- Exploring Gender Bias in Retrieval Models [2.594412743115663]
Mitigating gender bias in information retrieval is important to avoid propagating stereotypes.
We employ a dataset consisting of two components: (1) relevance of a document to a query and (2) "gender" of a document.
We show that pre-trained models for IR do not perform well in zero-shot retrieval tasks when full fine-tuning of a large pre-trained BERT encoder is performed.
We also illustrate that pre-trained models have gender biases that result in retrieved articles tending to be more often male than female.
arXiv Detail & Related papers (2022-08-02T21:12:05Z)
- Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation [64.65911758042914]
We investigate how seniority impacts the degree of gender bias exhibited in pretrained neural generation models.
Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.
These results suggest that NLP applications built using GPT-2 may harm women in professional capacities.
arXiv Detail & Related papers (2022-05-19T20:05:02Z)
- What's in a Name? -- Gender Classification of Names with Character Based Machine Learning Models [6.805167389805055]
We consider the problem of predicting the gender of registered users based on their declared name.
By analyzing the first names of 100M+ users, we found that genders can be very effectively classified using the composition of the name strings.
(A minimal character n-gram sketch of this idea appears after the related-papers list below.)
arXiv Detail & Related papers (2021-02-07T01:01:32Z)
- Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z)
- Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus [20.766890957411132]
Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines.
Can audio provide additional information to reduce gender bias?
We present the first thorough investigation of gender bias in speech translation, contributing with the release of a benchmark useful for future studies.
arXiv Detail & Related papers (2020-06-10T09:55:38Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
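Picking up the "What's in a Name?" entry above, the core idea of character-based name gender classification can be illustrated with a tiny scikit-learn pipeline. The handful of training names, the labels, and the n-gram settings below are illustrative assumptions, not that paper's model or data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled name list; a real study would train on millions of labelled first names.
names = ["sophia", "olivia", "emma", "mia", "liam", "noah", "james", "ethan"]
labels = ["F", "F", "F", "F", "M", "M", "M", "M"]

# Character n-grams (2-3 characters, word-boundary aware) capture endings such as "-ia" or "-ah".
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 3)),
    LogisticRegression(max_iter=1000),
)
model.fit(names, labels)

print(model.predict(["amelia", "lucas"]))  # labels inferred purely from character composition
```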