Harms of Gender Exclusivity and Challenges in Non-Binary Representation
in Language Technologies
- URL: http://arxiv.org/abs/2108.12084v1
- Date: Fri, 27 Aug 2021 01:58:58 GMT
- Title: Harms of Gender Exclusivity and Challenges in Non-Binary Representation
in Language Technologies
- Authors: Sunipa Dev and Masoud Monajatipoor and Anaelia Ovalle and Arjun
Subramonian and Jeff M Phillips and Kai-Wei Chang
- Abstract summary: We explain the complexity of gender and language around it.
We survey non-binary persons to understand harms associated with the treatment of gender as binary.
- Score: 30.096268927587214
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gender is widely discussed in the context of language tasks and when
examining the stereotypes propagated by language models. However, current
discussions primarily treat gender as binary, which can perpetuate harms such
as the cyclical erasure of non-binary gender identities. These harms are driven
by model and dataset biases, which are consequences of the non-recognition and
lack of understanding of non-binary genders in society. In this paper, we
explain the complexity of gender and language around it, and survey non-binary
persons to understand harms associated with the treatment of gender as binary
in English language technologies. We also detail how current language
representations (e.g., GloVe, BERT) capture and perpetuate these harms and
related challenges that need to be acknowledged and addressed for
representations to equitably encode gender information.
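The abstract notes that representations such as GloVe and BERT capture and perpetuate gendered associations. A minimal, hypothetical sketch of how such associations are commonly probed in embedding spaces (a WEAT-style association score over toy vectors; a real analysis would load pretrained GloVe or BERT vectors instead):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(target, attrs_a, attrs_b, emb):
    """WEAT-style association score: mean cosine similarity of `target`
    to attribute set A minus mean similarity to attribute set B.
    Positive values mean `target` sits closer to set A in embedding space."""
    t = emb[target]
    sim_a = np.mean([cosine(t, emb[w]) for w in attrs_a])
    sim_b = np.mean([cosine(t, emb[w]) for w in attrs_b])
    return float(sim_a - sim_b)

# Toy, hand-constructed vectors standing in for pretrained embeddings.
emb = {
    "she":    np.array([1.0, 0.0]),
    "he":     np.array([0.0, 1.0]),
    "nurse":  np.array([0.9, 0.1]),  # deliberately placed near "she"
    "person": np.array([0.5, 0.5]),  # equidistant from both attribute words
}
```

Note that the binary attribute sets (`["she"]` vs. `["he"]`) themselves encode exactly the limitation the paper critiques: a probe built this way cannot, by construction, surface associations with non-binary gender terms.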
Related papers
- EuroGEST: Investigating gender stereotypes in multilingual language models [58.871032460235575]
We introduce EuroGEST, a dataset designed to measure gender-stereotypical reasoning in LLMs across English and 29 European languages.
We show that the strongest stereotypes in all models across all languages are that women are 'beautiful', 'empathetic' and 'neat', and men are 'leaders', 'strong, tough' and 'professional'.
arXiv Detail & Related papers (2025-06-04T11:58:18Z)
- Gender Trouble in Language Models: An Empirical Audit Guided by Gender Performativity Theory [0.19116784879310028]
Language models encode and perpetuate harmful gendered stereotypes.
Gendered terms that do not neatly fall into one of these binary categories are erased and pathologized.
Our findings lead us to call for a re-evaluation of how gendered harms in language models are defined and addressed.
arXiv Detail & Related papers (2025-05-20T08:36:47Z)
- mGeNTE: A Multilingual Resource for Gender-Neutral Language and Translation [21.461095625903504]
mGeNTE is a dataset of English-Italian/German/Spanish language pairs.
It enables research in both automatic Gender-Neutral Translation (GNT) and language modelling for three grammatical gender languages.
arXiv Detail & Related papers (2025-01-16T09:35:15Z)
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents AmbGIMT, a benchmark for Gender-Inclusive Machine Translation with Ambiguous attitude words.
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora [9.959039325564744]
Gender bias in text corpora can perpetuate and amplify societal inequalities.
Existing methods to measure gender representation bias in text corpora have mainly been proposed for English.
This paper introduces a novel methodology to quantitatively measure gender representation bias in Spanish corpora.
arXiv Detail & Related papers (2024-06-19T16:30:58Z)
- Refusal as Silence: Gendered Disparities in Vision-Language Model Responses [0.4199844472131921]
This study investigates refusal as a sociotechnical outcome through a counterfactual persona design.
We find that transgender and non-binary personas experience significantly higher refusal rates, even in non-harmful contexts.
arXiv Detail & Related papers (2024-06-12T13:52:30Z)
- The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages [51.2321117760104]
This paper describes the Gender-GAP Pipeline, an automatic pipeline to characterize gender representation in large-scale datasets for 55 languages.
The pipeline uses a multilingual lexicon of gendered person-nouns to quantify the gender representation in text.
We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
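The Gender-GAP idea of matching a lexicon of gendered person-nouns against text can be illustrated with a small, purely hypothetical English-only lexicon (the actual pipeline's lexicon is far larger and covers 55 languages):

```python
import re
from collections import Counter

# Tiny illustrative lexicon; labels follow a masculine/feminine/unspecified split.
# The real Gender-GAP lexicon is multilingual and much more comprehensive.
LEXICON = {
    "he": "masculine", "him": "masculine", "man": "masculine", "father": "masculine",
    "she": "feminine", "her": "feminine", "woman": "feminine", "mother": "feminine",
    "they": "unspecified", "them": "unspecified", "person": "unspecified",
}

def gender_representation(text):
    """Share of matched gendered tokens per class, as a fraction of all matches."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(LEXICON[t] for t in tokens if t in LEXICON)
    total = sum(counts.values())
    if total == 0:
        return {g: 0.0 for g in ("masculine", "feminine", "unspecified")}
    return {g: counts.get(g, 0) / total
            for g in ("masculine", "feminine", "unspecified")}
```

Running this over a corpus and comparing the class shares is the kind of skew measurement the paper reports for WMT data; a simple token-level count like this misses coreference and context, which is why the real pipeline is more involved.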
arXiv Detail & Related papers (2023-08-31T17:20:50Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Participatory Research as a Path to Community-Informed, Gender-Fair Machine Translation [19.098548371499678]
We propose a method and case study building on participatory action research to include queer and non-binary people, translators, and MT experts.
The case study focuses on German, where central findings are the importance of context dependency to avoid identity invalidation.
arXiv Detail & Related papers (2023-06-15T07:20:14Z)
- "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life.
We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation.
We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z)
- Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study.
We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
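A common way to examine whether a gender distinction is encoded along a low-dimensional subspace is to take the top singular direction of difference vectors between gendered word pairs (in the style of Bolukbasi et al.). A toy sketch with hand-made vectors, not the paper's actual method:

```python
import numpy as np

def gender_direction(pairs, emb):
    """Top right-singular vector of the matrix of pairwise difference vectors,
    a rough one-dimensional proxy for a 'gender subspace'."""
    diffs = np.stack([emb[a] - emb[b] for a, b in pairs])
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[0]  # sign of an SVD direction is arbitrary

def projection(word, direction, emb):
    """Scalar projection of the normalised word vector onto the direction."""
    v = emb[word]
    return float(np.dot(v / np.linalg.norm(v), direction))

# Toy vectors: the first axis is a deliberately constructed 'gender' axis.
emb = {
    "she":       np.array([1.0, 0.0, 0.1]),
    "he":        np.array([-1.0, 0.0, 0.1]),
    "woman":     np.array([0.9, 0.2, 0.0]),
    "man":       np.array([-0.9, 0.2, 0.0]),
    "scientist": np.array([0.0, 1.0, 0.3]),
}
direction = gender_direction([("she", "he"), ("woman", "man")], emb)
```

With multilingual models, comparing such directions fitted per language (with seed pairs translated into each language) gives one view of how far the gender concept lives in a shared subspace; like the WEAT-style probe above, the pair-based construction assumes a binary and cannot represent non-binary gender.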
arXiv Detail & Related papers (2022-04-20T00:13:01Z)
- Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus [20.766890957411132]
Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines.
Can audio provide additional information to reduce gender bias?
We present the first thorough investigation of gender bias in speech translation, contributing with the release of a benchmark useful for future studies.
arXiv Detail & Related papers (2020-06-10T09:55:38Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.