Related papers: Mitigating Gender Bias in Machine Translation with Target Gender Annotations

Mitigating Gender Bias in Machine Translation with Target Gender Annotations

URL: http://arxiv.org/abs/2010.06203v2
Date: Sun, 18 Oct 2020 16:41:56 GMT
Title: Mitigating Gender Bias in Machine Translation with Target Gender Annotations
Authors: Art\=urs Stafanovi\v{c}s, Toms Bergmanis, M\=arcis Pinnis
Abstract summary: When translating "The secretary asked for details" to a language with grammatical gender, it might be necessary to determine the gender of the subject "secretary" In such cases, machine translation systems select the most common translation option, which often corresponds to the stereotypical translations. We argue that the information necessary for an adequate translation can not always be deduced from the sentence being translated. We present a method for training machine translation systems to use word-level annotations containing information about subject's gender.
Score: 3.3194866396158
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: When translating "The secretary asked for details." to a language with grammatical gender, it might be necessary to determine the gender of the subject "secretary". If the sentence does not contain the necessary information, it is not always possible to disambiguate. In such cases, machine translation systems select the most common translation option, which often corresponds to the stereotypical translations, thus potentially exacerbating prejudice and marginalisation of certain groups and people. We argue that the information necessary for an adequate translation can not always be deduced from the sentence being translated or even might depend on external knowledge. Therefore, in this work, we propose to decouple the task of acquiring the necessary information from the task of learning to translate correctly when such information is available. To that end, we present a method for training machine translation systems to use word-level annotations containing information about subject's gender. To prepare training data, we annotate regular source language words with grammatical gender information of the corresponding target language words. Using such data to train machine translation systems reduces their reliance on gender stereotypes when information about the subject's gender is available. Our experiments on five language pairs show that this allows improving accuracy on the WinoMT test set by up to 25.8 percentage points.

Related papers

Gender Bias in English-to-Greek Machine Translation [0.0]
We find persistent gender bias in translations by both Google Translate and DeepL.<n>GPT-4o shows promise, generating appropriate gendered and neutral alternatives for most ambiguous cases.
arXiv Detail & Related papers (2025-06-11T09:44:12Z)
Identifying Gender Stereotypes and Biases in Automated Translation from English to Italian using Similarity Networks [0.25049267048783647]
This paper is a collaborative effort between Linguistics, Law, and Computer Science to evaluate stereotypes and biases in automated translation systems. We advocate gender-neutral translation as a means to promote gender inclusion and improve the objectivity of machine translation.
arXiv Detail & Related papers (2025-02-17T09:55:32Z)
Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders. This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words) We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages [51.2321117760104]
This paper describes the Gender-GAP Pipeline, an automatic pipeline to characterize gender representation in large-scale datasets for 55 languages. The pipeline uses a multilingual lexicon of gendered person-nouns to quantify the gender representation in text. We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
arXiv Detail & Related papers (2023-08-31T17:20:50Z)
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models. We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas. We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in Multilingual Machine Translation [28.471506840241602]
Gender bias is a significant issue in machine translation, leading to ongoing research efforts in developing bias mitigation techniques. We propose a bias mitigation method based on a novel approach. Gender-Aware Contrastive Learning, GACL, encodes contextual gender information into the representations of non-explicit gender words.
arXiv Detail & Related papers (2023-05-23T12:53:39Z)
The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations. An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
Mitigating Gender Bias in Machine Translation through Adversarial Learning [0.8883733362171032]
We present an adversarial learning framework that addresses challenges to mitigate gender bias in seq2seq machine translation. Our framework improves the disparity in translation quality for sentences with male vs. female entities by 86% for English-German translation and 91% for English-French translation.
arXiv Detail & Related papers (2022-03-20T23:35:09Z)
Improving Gender Translation Accuracy with Filtered Self-Training [14.938401898546548]
Machine translation systems often output incorrect gender, even when the gender is clear from context. We propose a gender-filtered self-training technique to improve gender translation accuracy on unambiguously gendered inputs.
arXiv Detail & Related papers (2021-04-15T18:05:29Z)
They, Them, Theirs: Rewriting with Gender-Neutral English [56.14842450974887]
We perform a case study on the singular they, a common way to promote gender inclusion in English. We show how a model can be trained to produce gender-neutral English with 1% word error rate with no human-labeled data.
arXiv Detail & Related papers (2021-02-12T21:47:48Z)
Neural Machine Translation Doesn't Translate Gender Coreference Right Unless You Make It [18.148675498274866]
We propose schemes for incorporating explicit word-level gender inflection tags into Neural Machine Translation. We find that simple existing approaches can over-generalize a gender-feature to multiple entities in a sentence. We also propose an extension to assess translations of gender-neutral entities from English given a corresponding linguistic convention.
arXiv Detail & Related papers (2020-10-11T20:05:42Z)
Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus [20.766890957411132]
Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines. Can audio provide additional information to reduce gender bias? We present the first thorough investigation of gender bias in speech translation, contributing with the release of a benchmark useful for future studies.
arXiv Detail & Related papers (2020-06-10T09:55:38Z)
Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text. We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions. Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.