Gender Gap in Natural Language Processing Research: Disparities in
Authorship and Citations
- URL: http://arxiv.org/abs/2005.00962v2
- Date: Thu, 3 Sep 2020 20:00:08 GMT
- Title: Gender Gap in Natural Language Processing Research: Disparities in
Authorship and Citations
- Authors: Saif M. Mohammad
- Abstract summary: Only about 29% of first authors are female and only about 25% of last authors are female.
On average, female first authors are cited less than male first authors, even when controlling for experience and area of research.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Disparities in authorship and citations across gender can have substantial
adverse consequences not just on the disadvantaged genders, but also on the
field of study as a whole. Measuring gender gaps is a crucial step towards
addressing them. In this work, we examine female first author percentages and
the citations to their papers in Natural Language Processing (1965 to 2019). We
determine aggregate-level statistics using an existing manually curated
author--gender list as well as first names strongly associated with a gender.
We find that only about 29% of first authors are female and only about 25% of
last authors are female. Notably, this percentage has not improved since the
mid 2000s. We also show that, on average, female first authors are cited less
than male first authors, even when controlling for experience and area of
research. Finally, we discuss the ethical considerations involved in automatic
demographic analysis.
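The aggregate statistic described in the abstract (the female first-author percentage, estimated from first names strongly associated with a gender) can be sketched as follows. This is a minimal illustration only: the name–gender list and paper records below are invented stand-ins, not the paper's actual curated data or method.

```python
# A tiny stand-in for a manually curated name->gender list.
# Real analyses use far larger curated lists and handle ambiguity.
NAME_GENDER = {
    "alice": "female",
    "mary": "female",
    "bob": "male",
    "john": "male",
}

def first_author_gender_share(papers, gender):
    """Share of papers whose first author's given name maps to `gender`,
    among papers whose given name appears in the curated list.
    Papers with unlisted (ambiguous) names are excluded, mirroring the
    idea of using only names strongly associated with a gender."""
    matched = [
        NAME_GENDER[p["first_author"].split()[0].lower()]
        for p in papers
        if p["first_author"].split()[0].lower() in NAME_GENDER
    ]
    if not matched:
        return 0.0
    return sum(g == gender for g in matched) / len(matched)

# Illustrative paper records (hypothetical).
papers = [
    {"first_author": "Alice Smith"},
    {"first_author": "Bob Jones"},
    {"first_author": "John Doe"},
    {"first_author": "Mary Major"},
]
print(first_author_gender_share(papers, "female"))  # 0.5
```

Excluding unmatched names, rather than guessing, keeps the estimate conservative at the cost of coverage, which is one reason such studies report aggregate-level rather than per-author statistics.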
Related papers
- Evaluating Gender Bias in Large Language Models [0.8636148452563583]
The study examines the extent to which Large Language Models (LLMs) exhibit gender bias in pronoun selection in occupational contexts.
The jobs considered include a range of occupations, from those with a significant male presence to those with a notable female concentration.
The results show a positive correlation between the models' pronoun choices and the gender distribution present in U.S. labor force data.
arXiv Detail & Related papers (2024-11-14T22:23:13Z)
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), a new benchmark.
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z)
- Voices of Her: Analyzing Gender Differences in the AI Publication World [26.702520904075044]
We identify several gender differences using the AI Scholar dataset of 78K researchers in the field of AI.
Female first-authored papers show distinct linguistic styles, such as longer text, more positive emotion words, and more catchy titles.
Our analysis provides a window into the current demographic trends in our AI community, and encourages more gender equality and diversity in the future.
arXiv Detail & Related papers (2023-05-24T00:40:49Z)
- Temporal Analysis and Gender Bias in Computing [0.0]
Many given names have changed their ascribed gender over the decades: the "Leslie problem".
This article identifies 300 given names with measurable "gender shifts" across 1925-1975.
This article demonstrates, quantitatively, that there is a net "female shift" that likely results in the overcounting of women (and undercounting of men) in earlier decades.
arXiv Detail & Related papers (2022-09-29T00:29:43Z)
- Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation [64.65911758042914]
We investigate how seniority impacts the degree of gender bias exhibited in pretrained neural generation models.
Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.
These results suggest that NLP applications built using GPT-2 may harm women in professional capacities.
arXiv Detail & Related papers (2022-05-19T20:05:02Z)
- Investigating writing style as a contributor to gender gaps in science and technology [0.0]
We find significant differences in writing style by gender, with women using more involved features in their writing.
Papers and patents with more involved features also tend to be cited more by women.
Our findings suggest that scientific text is not devoid of personal character, which could contribute to bias in evaluation.
arXiv Detail & Related papers (2022-04-28T22:33:36Z)
- The effect of the COVID-19 pandemic on gendered research productivity and its correlates [0.0]
This study examined how the proportion of female authors in academic journals on a global scale changed in 2020.
We observed a decrease in research productivity for female researchers in 2020, mostly as first authors, followed by last author position.
Female researchers were not necessarily excluded from research, but they were marginalised within it.
arXiv Detail & Related papers (2021-11-29T06:20:44Z)
- Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first specifically tailored measure for Information Retrieval, capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.