Welcome to the Modern World of Pronouns: Identity-Inclusive Natural
Language Processing beyond Gender
- URL: http://arxiv.org/abs/2202.11923v1
- Date: Thu, 24 Feb 2022 06:42:11 GMT
- Title: Welcome to the Modern World of Pronouns: Identity-Inclusive Natural
Language Processing beyond Gender
- Authors: Anne Lauscher, Archie Crowley, Dirk Hovy
- Abstract summary: We provide an overview of 3rd person pronoun issues for Natural Language Processing.
We evaluate existing and novel modeling approaches.
We quantify the impact of a more discrimination-free approach on established benchmark data.
- Score: 23.92148222207458
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The world of pronouns is changing, from a closed class of words with few
members to a much more open set of terms that reflect identities. However,
Natural Language Processing (NLP) is barely reflecting this linguistic shift,
even though recent work outlined the harms of gender-exclusive language
technology. Particularly problematic is the current modeling of 3rd person
pronouns, as it largely ignores various phenomena like neopronouns, i.e.,
pronoun sets that are novel and not (yet) widely established. This omission
contributes to discrimination against marginalized and underrepresented groups,
e.g., non-binary individuals. However, other identity-expression phenomena
beyond gender are also ignored by current NLP technology. In this paper, we
provide an overview of 3rd person pronoun issues for NLP. Based on our
observations and ethical considerations, we define a series of desiderata for
modeling pronouns in language technology. We evaluate existing and novel
modeling approaches w.r.t. these desiderata qualitatively, and quantify the
impact of a more discrimination-free approach on established benchmark data.
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- Transforming Dutch: Debiasing Dutch Coreference Resolution Systems for Non-binary Pronouns [5.5514102920271196]
Gender-neutral pronouns are increasingly being introduced across Western languages.
Recent evaluations have demonstrated that English NLP systems are unable to correctly process gender-neutral pronouns.
This paper examines a Dutch coreference resolution system's performance on gender-neutral pronouns.
arXiv Detail & Related papers (2024-04-30T18:31:19Z)
- Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies [75.85462924188076]
Gender-inclusive NLP research has documented the harmful limitations of gender binary-centric large language models (LLMs).
We find that misgendering is significantly influenced by Byte-Pair Encoding (BPE) tokenization.
We propose two techniques: (1) pronoun tokenization parity, a method to enforce consistent tokenization across gendered pronouns, and (2) utilizing pre-existing LLM pronoun knowledge to improve neopronoun proficiency.
arXiv Detail & Related papers (2023-12-19T01:28:46Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- MISGENDERED: Limits of Large Language Models in Understanding Pronouns [46.276320374441056]
We evaluate popular language models for their ability to correctly use English gender-neutral pronouns.
We introduce MISGENDERED, a framework for evaluating large language models' ability to correctly use preferred pronouns.
arXiv Detail & Related papers (2023-06-06T18:27:52Z)
- What about em? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns [26.28827649737955]
Wrong pronoun translations can discriminate against marginalized groups, e.g., non-binary individuals.
We study how three commercial machine translation systems translate 3rd-person pronouns.
Our error analysis shows that the presence of a gender-neutral pronoun often leads to grammatical and semantic translation errors.
arXiv Detail & Related papers (2023-05-25T13:34:09Z)
- "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life.
We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation.
We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z)
- How Conservative are Language Models? Adapting to the Introduction of Gender-Neutral Pronouns [0.15293427903448023]
We show that gender-neutral pronouns (in Swedish) are not associated with human processing difficulties.
We show that gender-neutral pronouns in Danish, English, and Swedish are associated with higher perplexity, more dispersed attention patterns, and worse downstream performance.
arXiv Detail & Related papers (2022-04-11T09:42:02Z)
- They, Them, Theirs: Rewriting with Gender-Neutral English [56.14842450974887]
We perform a case study on the singular they, a common way to promote gender inclusion in English.
We show how a model can be trained to produce gender-neutral English with 1% word error rate with no human-labeled data.
arXiv Detail & Related papers (2021-02-12T21:47:48Z)