Related papers: Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in Multilingual Machine Translation

URL: http://arxiv.org/abs/2305.14016v2
Date: Fri, 10 Nov 2023 04:35:23 GMT
Title: Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in Multilingual Machine Translation
Authors: Minwoo Lee, Hyukhun Koh, Kang-il Lee, Dongdong Zhang, Minsung Kim, Kyomin Jung
Abstract summary: Gender bias is a significant issue in machine translation, leading to ongoing research efforts in developing bias mitigation techniques. We propose Gender-Aware Contrastive Learning (GACL), a bias mitigation method that encodes contextual gender information into the representations of non-explicit gender words.
Score: 28.471506840241602
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Gender bias is a significant issue in machine translation, leading to ongoing research efforts in developing bias mitigation techniques. However, most works focus on debiasing bilingual models without much consideration for multilingual systems. In this paper, we specifically target the gender bias issue of multilingual machine translation models for unambiguous cases where there is a single correct translation, and propose a bias mitigation method based on a novel approach. Specifically, we propose Gender-Aware Contrastive Learning, GACL, which encodes contextual gender information into the representations of non-explicit gender words. Our method is target language-agnostic and is applicable to pre-trained multilingual machine translation models via fine-tuning. Through multilingual evaluation, we show that our approach improves gender accuracy by a wide margin without hampering translation performance. We also observe that incorporated gender information transfers and benefits other target languages regarding gender accuracy. Finally, we demonstrate that our method is applicable and beneficial to models of various sizes.

Related papers

EuroGEST: Investigating gender stereotypes in multilingual language models [53.88459905621724]
Large language models increasingly support multiple languages, yet most benchmarks for gender bias remain English-centric. We introduce EuroGEST, a dataset designed to measure gender-stereotypical reasoning in LLMs across English and 29 European languages.
arXiv Detail & Related papers (2025-06-04T11:58:18Z)
Dual Debiasing: Remove Stereotypes and Keep Factual Gender for Fair Language Modeling and Translation [5.482673673984126]
Mitigation of biases, such as language models' reliance on gender stereotypes, is a crucial endeavor required for the creation of reliable and useful language technology. We introduce a streamlined Dual Debiasing Algorithm through Model Adaptation (2DAMA). We show that 2DAMA effectively reduces gender bias in English and is one of the first approaches to the mitigation of stereotypical tendencies in translation.
arXiv Detail & Related papers (2025-01-17T12:23:30Z)
Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders. This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words). We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora [9.959039325564744]
Gender bias in text corpora can lead to perpetuation and amplification of societal inequalities. Existing methods to measure gender representation bias in text corpora have mainly been proposed for English. This paper introduces a novel methodology to quantitatively measure gender representation bias in Spanish corpora.
arXiv Detail & Related papers (2024-06-19T16:30:58Z)
What is Your Favorite Gender, MLM? Gender Bias Evaluation in Multilingual Masked Language Models [8.618945530676614]
This paper proposes an approach to estimate gender bias in multilingual lexicons from 5 languages: Chinese, English, German, Portuguese, and Spanish. A novel model-based method is presented to generate sentence pairs for a more robust analysis of gender bias. Our results suggest that gender bias should be studied on a large dataset using multiple evaluation metrics for best practice.
arXiv Detail & Related papers (2024-04-09T21:12:08Z)
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do. We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models. Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation [35.44115368160656]
We investigate whether and to what extent machine translation models exhibit gender bias. We find that instruction fine-tuned (IFT) models default to male-inflected translations, even disregarding female occupational stereotypes. We propose an easy-to-implement and effective bias mitigation solution.
arXiv Detail & Related papers (2023-10-18T17:36:55Z)
Gender Lost In Translation: How Bridging The Gap Between Languages Affects Gender Bias in Zero-Shot Multilingual Translation [12.376309678270275]
We study how bridging the gap between languages for which parallel data is not available affects gender bias in multilingual NMT. We study the effect of encouraging language-agnostic hidden representations on models' ability to preserve gender. We find that language-agnostic representations mitigate zero-shot models' masculine bias. With increased levels of gender inflection in the bridge language, pivoting surpasses zero-shot translation regarding fairer gender preservation for speaker-related gender agreement.
arXiv Detail & Related papers (2023-05-26T13:51:50Z)
The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations. An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study. We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
arXiv Detail & Related papers (2022-04-20T00:13:01Z)
Improving Gender Translation Accuracy with Filtered Self-Training [14.938401898546548]
Machine translation systems often output incorrect gender, even when the gender is clear from context. We propose a gender-filtered self-training technique to improve gender translation accuracy on unambiguously gendered inputs.
arXiv Detail & Related papers (2021-04-15T18:05:29Z)
Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications. We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
arXiv Detail & Related papers (2020-05-02T04:34:37Z)
Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text. We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions. Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.