Mitigating Gender Bias in Machine Translation with Target Gender Annotations
- URL: http://arxiv.org/abs/2010.06203v2
- Date: Sun, 18 Oct 2020 16:41:56 GMT
- Title: Mitigating Gender Bias in Machine Translation with Target Gender Annotations
- Authors: Artūrs Stafanovičs, Toms Bergmanis, Mārcis Pinnis
- Abstract summary: When translating "The secretary asked for details" to a language with grammatical gender, it might be necessary to determine the gender of the subject "secretary".
When the sentence does not contain this information, machine translation systems select the most common translation option, which often corresponds to the stereotypical translation.
We argue that the information necessary for an adequate translation cannot always be deduced from the sentence being translated.
We present a method for training machine translation systems to use word-level annotations containing information about the subject's gender.
- Score: 3.3194866396158
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: When translating "The secretary asked for details." to a language with
grammatical gender, it might be necessary to determine the gender of the
subject "secretary". If the sentence does not contain the necessary
information, it is not always possible to disambiguate. In such cases, machine
translation systems select the most common translation option, which often
corresponds to the stereotypical translations, thus potentially exacerbating
prejudice and marginalisation of certain groups and people. We argue that the
information necessary for an adequate translation cannot always be deduced
from the sentence being translated and might even depend on external knowledge.
Therefore, in this work, we propose to decouple the task of acquiring the
necessary information from the task of learning to translate correctly when
such information is available. To that end, we present a method for training
machine translation systems to use word-level annotations containing
information about the subject's gender. To prepare training data, we annotate
regular source language words with grammatical gender information of the
corresponding target language words. Using such data to train machine
translation systems reduces their reliance on gender stereotypes when
information about the subject's gender is available. Our experiments on five
language pairs show that this allows improving accuracy on the WinoMT test set
by up to 25.8 percentage points.
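
The data preparation step described in the abstract (annotating source words with the grammatical gender of the corresponding target words) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `annotate_source`, the `token|LABEL` output format, the `U` label for words without gender information, and the German example with its alignment are all assumptions made for the example. In practice the alignment and the target-side gender tags would come from a word aligner and a morphological tagger.

```python
from typing import Dict, List, Tuple

UNKNOWN = "U"  # assumed label for source words with no gender information


def annotate_source(
    source_tokens: List[str],
    alignment: List[Tuple[int, int]],
    target_genders: Dict[int, str],
) -> List[str]:
    """Label each source token with the gender of its aligned target token.

    alignment holds (source_index, target_index) pairs.
    target_genders maps a target index to a gender tag such as "F" or "M";
    target words without grammatical gender are simply absent from the map.
    """
    labels = [UNKNOWN] * len(source_tokens)
    for src_idx, tgt_idx in alignment:
        gender = target_genders.get(tgt_idx)
        # Keep the first gender found if a source word has several alignments.
        if gender is not None and labels[src_idx] == UNKNOWN:
            labels[src_idx] = gender
    return [f"{tok}|{lab}" for tok, lab in zip(source_tokens, labels)]


# Hypothetical example: English source aligned one-to-one with the
# illustrative German target "Die Sekretärin bat um Details ."
source = ["The", "secretary", "asked", "for", "details", "."]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
target_genders = {0: "F", 1: "F"}  # "Die" and "Sekretärin" are feminine

annotated = annotate_source(source, alignment, target_genders)
print(" ".join(annotated))
```

At translation time the same labels could be supplied from an external source (for example, user input or a coreference system), which is the decoupling the abstract argues for.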
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages [51.2321117760104]
This paper describes the Gender-GAP Pipeline, an automatic pipeline to characterize gender representation in large-scale datasets for 55 languages.
The pipeline uses a multilingual lexicon of gendered person-nouns to quantify the gender representation in text.
We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
arXiv Detail & Related papers (2023-08-31T17:20:50Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in Multilingual Machine Translation [28.471506840241602]
Gender bias is a significant issue in machine translation, leading to ongoing research efforts in developing bias mitigation techniques.
We propose a bias mitigation method based on a novel approach.
Gender-Aware Contrastive Learning, GACL, encodes contextual gender information into the representations of non-explicit gender words.
arXiv Detail & Related papers (2023-05-23T12:53:39Z)
- The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
- Mitigating Gender Bias in Machine Translation through Adversarial Learning [0.8883733362171032]
We present an adversarial learning framework that addresses challenges to mitigate gender bias in seq2seq machine translation.
Our framework improves the disparity in translation quality for sentences with male vs. female entities by 86% for English-German translation and 91% for English-French translation.
arXiv Detail & Related papers (2022-03-20T23:35:09Z)
- Improving Gender Translation Accuracy with Filtered Self-Training [14.938401898546548]
Machine translation systems often output incorrect gender, even when the gender is clear from context.
We propose a gender-filtered self-training technique to improve gender translation accuracy on unambiguously gendered inputs.
arXiv Detail & Related papers (2021-04-15T18:05:29Z)
- They, Them, Theirs: Rewriting with Gender-Neutral English [56.14842450974887]
We perform a case study on the singular they, a common way to promote gender inclusion in English.
We show how a model can be trained to produce gender-neutral English with 1% word error rate with no human-labeled data.
arXiv Detail & Related papers (2021-02-12T21:47:48Z)
- Neural Machine Translation Doesn't Translate Gender Coreference Right Unless You Make It [18.148675498274866]
We propose schemes for incorporating explicit word-level gender inflection tags into Neural Machine Translation.
We find that simple existing approaches can over-generalize a gender-feature to multiple entities in a sentence.
We also propose an extension to assess translations of gender-neutral entities from English given a corresponding linguistic convention.
arXiv Detail & Related papers (2020-10-11T20:05:42Z)
- Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus [20.766890957411132]
Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines.
Can audio provide additional information to reduce gender bias?
We present the first thorough investigation of gender bias in speech translation, contributing with the release of a benchmark useful for future studies.
arXiv Detail & Related papers (2020-06-10T09:55:38Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)