Gender, names and other mysteries: Towards the ambiguous for gender-inclusive translation
- URL: http://arxiv.org/abs/2306.04573v1
- Date: Wed, 7 Jun 2023 16:21:59 GMT
- Title: Gender, names and other mysteries: Towards the ambiguous for gender-inclusive translation
- Authors: Danielle Saunders, Katrina Olsen
- Abstract summary: This paper explores the case where the source sentence lacks explicit gender markers, but the target sentence contains them due to richer grammatical gender.
We find that many name-gender co-occurrences in MT data are not resolvable with 'unambiguous gender' in the source language.
We discuss potential steps toward gender-inclusive translation which accepts the ambiguity in both gender and translation.
- Score: 7.322734499960981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The vast majority of work on gender in MT focuses on 'unambiguous' inputs,
where gender markers in the source language are expected to be resolved in the
output. Conversely, this paper explores the widespread case where the source
sentence lacks explicit gender markers, but the target sentence contains them
due to richer grammatical gender. We particularly focus on inputs containing
person names.
Investigating such sentence pairs casts a new light on research into MT
gender bias and its mitigation. We find that many name-gender co-occurrences in
MT data are not resolvable with 'unambiguous gender' in the source language,
and that gender-ambiguous examples can make up a large proportion of training
examples. From this, we discuss potential steps toward gender-inclusive
translation which accepts the ambiguity in both gender and translation.
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations focus primarily on male and female genders.
This study presents AmbGIMT, a benchmark for Gender-Inclusive Machine Translation with Ambiguous attitude words.
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which quantifies ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- Building Bridges: A Dataset for Evaluating Gender-Fair Machine Translation into German [17.924716793621627]
We study gender-fair language in English-to-German machine translation (MT)
We conduct the first benchmark study involving two commercial systems and six neural MT models.
Our findings show that most systems produce mainly masculine forms and rarely gender-neutral variants.
arXiv Detail & Related papers (2024-06-10T09:39:19Z) - Probing Explicit and Implicit Gender Bias through LLM Conditional Text
Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses.
We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z)
- The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages [51.2321117760104]
This paper describes the Gender-GAP Pipeline, an automatic pipeline to characterize gender representation in large-scale datasets for 55 languages.
The pipeline uses a multilingual lexicon of gendered person-nouns to quantify the gender representation in text.
We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
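The lexicon-based quantification described above can be illustrated with a minimal sketch. Note this is a hypothetical toy example, not the actual Gender-GAP Pipeline: the real pipeline uses a curated multilingual lexicon of gendered person-nouns across 55 languages, while the tiny English word sets below are invented for demonstration.

```python
from collections import Counter
import re

# Hypothetical mini-lexicon of gendered English person-nouns and pronouns;
# the real Gender-GAP Pipeline relies on a curated multilingual lexicon.
LEXICON = {
    "masculine": {"man", "men", "boy", "father", "he", "him", "his"},
    "feminine": {"woman", "women", "girl", "mother", "she", "her", "hers"},
}

def gender_representation(text: str) -> dict:
    """Count lexicon matches per gender class and return relative shares."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for gender, words in LEXICON.items():
        counts[gender] = sum(1 for tok in tokens if tok in words)
    total = sum(counts.values())
    # Shares sum to 1.0 when any gendered term is found, else all zeros.
    return {g: (c / total if total else 0.0) for g, c in counts.items()}

shares = gender_representation("He said the men thanked her.")
print(shares)
```

Aggregating such shares over a training corpus is one way to surface the kind of skew toward masculine representation that the paper reports for WMT data.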
arXiv Detail & Related papers (2023-08-31T17:20:50Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- MISGENDERED: Limits of Large Language Models in Understanding Pronouns [46.276320374441056]
We evaluate popular language models for their ability to correctly use English gender-neutral pronouns.
We introduce MISGENDERED, a framework for evaluating large language models' ability to correctly use preferred pronouns.
arXiv Detail & Related papers (2023-06-06T18:27:52Z)
- GATE: A Challenge Set for Gender-Ambiguous Translation Examples [0.31498833540989407]
When source gender is ambiguous, machine translation models typically default to stereotypical gender roles, perpetuating harmful bias.
Recent work has led to the development of "gender rewriters" that generate alternative gender translations on such ambiguous inputs, but such systems are plagued by poor linguistic coverage.
We present and release GATE, a linguistically diverse corpus of gender-ambiguous source sentences along with multiple alternative target language translations.
arXiv Detail & Related papers (2023-03-07T15:23:38Z)
- Extending Challenge Sets to Uncover Gender Bias in Machine Translation: Impact of Stereotypical Verbs and Adjectives [0.45687771576879593]
State-of-the-art machine translation (MT) systems are trained on large corpora of text, mostly generated by humans.
Recent research showed that MT systems are biased towards stereotypical translation of occupations.
In this paper we present WiBeMT, an extension of this challenge set that adds gender-biased adjectives and sentences with gender-biased verbs.
arXiv Detail & Related papers (2021-07-24T11:22:10Z)
- Neural Machine Translation Doesn't Translate Gender Coreference Right Unless You Make It [18.148675498274866]
We propose schemes for incorporating explicit word-level gender inflection tags into Neural Machine Translation.
We find that simple existing approaches can over-generalize a gender-feature to multiple entities in a sentence.
We also propose an extension to assess translations of gender-neutral entities from English given a corresponding linguistic convention.
arXiv Detail & Related papers (2020-10-11T20:05:42Z)
- Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus [20.766890957411132]
Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines.
Can audio provide additional information to reduce gender bias?
We present the first thorough investigation of gender bias in speech translation, contributing with the release of a benchmark useful for future studies.
arXiv Detail & Related papers (2020-06-10T09:55:38Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.