Related papers: Welcome to the Modern World of Pronouns: Identity-Inclusive Natural Language Processing beyond Gender

URL: http://arxiv.org/abs/2202.11923v1
Date: Thu, 24 Feb 2022 06:42:11 GMT
Title: Welcome to the Modern World of Pronouns: Identity-Inclusive Natural Language Processing beyond Gender
Authors: Anne Lauscher, Archie Crowley, Dirk Hovy
Abstract summary: We provide an overview of 3rd person pronoun issues for Natural Language Processing. We evaluate existing and novel modeling approaches. We quantify the impact of a more discrimination-free approach on established benchmark data.
Score: 23.92148222207458
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The world of pronouns is changing: from a closed class of words with few members to a much more open set of terms to reflect identities. However, Natural Language Processing (NLP) is barely reflecting this linguistic shift, even though recent work outlined the harms of gender-exclusive language technology. Particularly problematic is the current modeling of 3rd person pronouns, as it largely ignores various phenomena like neopronouns, i.e., pronoun sets that are novel and not (yet) widely established. This omission contributes to the discrimination of marginalized and underrepresented groups, e.g., non-binary individuals. However, other identity-expression phenomena beyond gender are also ignored by current NLP technology. In this paper, we provide an overview of 3rd person pronoun issues for NLP. Based on our observations and ethical considerations, we define a series of desiderata for modeling pronouns in language technology. We evaluate existing and novel modeling approaches w.r.t. these desiderata qualitatively, and quantify the impact of a more discrimination-free approach on established benchmark data.

Related papers

A Bayesian account of pronoun and neopronoun acquisition [10.775624456460063]
We argue for explicitly modeling individual differences in pronoun selection. We present a probabilistic graphical modeling approach based on the nested Chinese Restaurant Franchise Process. We show that such a model can account for variability in how quickly pronouns or names are integrated into symbolic knowledge.
arXiv Detail & Related papers (2025-04-03T18:49:08Z)
Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders. This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words). We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
Transforming Dutch: Debiasing Dutch Coreference Resolution Systems for Non-binary Pronouns [5.5514102920271196]
Gender-neutral pronouns are increasingly being introduced across Western languages. Recent evaluations have demonstrated that English NLP systems are unable to correctly process gender-neutral pronouns. This paper examines a Dutch coreference resolution system's performance on gender-neutral pronouns.
arXiv Detail & Related papers (2024-04-30T18:31:19Z)
Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies [75.85462924188076]
Gender-inclusive NLP research has documented the harmful limitations of gender binary-centric large language models (LLMs). We find that misgendering is significantly influenced by Byte-Pair Encoding (BPE) tokenization. We propose two techniques: (1) pronoun tokenization parity, a method to enforce consistent tokenization across gendered pronouns, and (2) utilizing pre-existing LLM pronoun knowledge to improve neopronoun proficiency.
arXiv Detail & Related papers (2023-12-19T01:28:46Z)
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models. We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas. We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
MISGENDERED: Limits of Large Language Models in Understanding Pronouns [46.276320374441056]
We evaluate popular language models for their ability to correctly use English gender-neutral pronouns. We introduce MISGENDERED, a framework for evaluating large language models' ability to correctly use preferred pronouns.
arXiv Detail & Related papers (2023-06-06T18:27:52Z)
What about em? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns [26.28827649737955]
Wrong pronoun translations can discriminate against marginalized groups, e.g., non-binary individuals. We study how three commercial machine translation systems translate 3rd-person pronouns. Our error analysis shows that the presence of a gender-neutral pronoun often leads to grammatical and semantic translation errors.
arXiv Detail & Related papers (2023-05-25T13:34:09Z)
"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation. We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z)
How Conservative are Language Models? Adapting to the Introduction of Gender-Neutral Pronouns [0.15293427903448023]
We show that gender-neutral pronouns (in Swedish) are not associated with human processing difficulties. We show that gender-neutral pronouns in Danish, English, and Swedish are associated with higher perplexity, more dispersed attention patterns, and worse downstream performance.
arXiv Detail & Related papers (2022-04-11T09:42:02Z)
They, Them, Theirs: Rewriting with Gender-Neutral English [56.14842450974887]
We perform a case study on the singular they, a common way to promote gender inclusion in English. We show how a model can be trained to produce gender-neutral English with 1% word error rate with no human-labeled data.
arXiv Detail & Related papers (2021-02-12T21:47:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.