Related papers: Investigating Markers and Drivers of Gender Bias in Machine Translations

Investigating Markers and Drivers of Gender Bias in Machine Translations

URL: http://arxiv.org/abs/2403.11896v2
Date: Tue, 2 Apr 2024 10:07:39 GMT
Title: Investigating Markers and Drivers of Gender Bias in Machine Translations
Authors: Peter J Barclay, Ashkan Sami,
Abstract summary: Implicit gender bias in large language models (LLMs) is a well-documented problem. We use the DeepL translation API to investigate the bias evinced when repeatedly translating a set of 56 Software Engineering tasks. We find that some languages display similar patterns of pronoun use, falling into three loose groups. We identify the main verb appearing in a sentence as a likely significant driver of implied gender in the translations.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Implicit gender bias in Large Language Models (LLMs) is a well-documented problem, and implications of gender introduced into automatic translations can perpetuate real-world biases. However, some LLMs use heuristics or post-processing to mask such bias, making investigation difficult. Here, we examine bias in LLMss via back-translation, using the DeepL translation API to investigate the bias evinced when repeatedly translating a set of 56 Software Engineering tasks used in a previous study. Each statement starts with 'she', and is translated first into a 'genderless' intermediate language then back into English; we then examine pronoun-choice in the back-translated texts. We expand prior research in the following ways: (1) by comparing results across five intermediate languages, namely Finnish, Indonesian, Estonian, Turkish and Hungarian; (2) by proposing a novel metric for assessing the variation in gender implied in the repeated translations, avoiding the over-interpretation of individual pronouns, apparent in earlier work; (3) by investigating sentence features that drive bias; (4) and by comparing results from three time-lapsed datasets to establish the reproducibility of the approach. We found that some languages display similar patterns of pronoun use, falling into three loose groups, but that patterns vary between groups; this underlines the need to work with multiple languages. We also identify the main verb appearing in a sentence as a likely significant driver of implied gender in the translations. Moreover, we see a good level of replicability in the results, and establish that our variation metric proves robust despite an obvious change in the behaviour of the DeepL translation API during the course of the study. These results show that the back-translation method can provide further insights into bias in language models.

Related papers

Exploring Gender Bias in Large Language Models: An In-depth Dive into the German Language [21.87606488958834]
We present five German datasets for gender bias evaluation in large language models (LLMs)<n>The datasets are grounded in well-established concepts of gender bias and are accessible through multiple methodologies.<n>Our findings, reported for eight multilingual LLM models, reveal unique challenges associated with gender bias in German.
arXiv Detail & Related papers (2025-07-22T13:09:41Z)
The Lou Dataset -- Exploring the Impact of Gender-Fair Language in German Text Classification [57.06913662622832]
Gender-fair language fosters inclusion by addressing all genders or using neutral forms. Gender-fair language substantially impacts predictions by flipping labels, reducing certainty, and altering attention patterns. While we offer initial insights on the effect on German text classification, the findings likely apply to other languages.
arXiv Detail & Related papers (2024-09-26T15:08:17Z)
Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders. This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words) We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs [58.27353205269664]
Social biases can manifest in language agency in Large Language Model (LLM)-generated content.<n>We introduce the Language Agency Bias Evaluation benchmark, which comprehensively evaluates biases in LLMs.<n>Using LABE, we unveil language agency social biases in 3 recent LLMs: ChatGPT, Llama3, and Mistral.
arXiv Detail & Related papers (2024-04-16T12:27:54Z)
What is Your Favorite Gender, MLM? Gender Bias Evaluation in Multilingual Masked Language Models [8.618945530676614]
This paper proposes an approach to estimate gender bias in multilingual lexicons from 5 languages: Chinese, English, German, Portuguese, and Spanish. A novel model-based method is presented to generate sentence pairs for a more robust analysis of gender bias. Our results suggest that gender bias should be studied on a large dataset using multiple evaluation metrics for best practice.
arXiv Detail & Related papers (2024-04-09T21:12:08Z)
Gender Bias in Large Language Models across Multiple Languages [10.068466432117113]
We examine gender bias in large language models (LLMs) generated for different languages. We use three measurements: 1) gender bias in selecting descriptive words given the gender-related context. 2) gender bias in selecting gender-related pronouns (she/he) given the descriptive words.
arXiv Detail & Related papers (2024-03-01T04:47:16Z)
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do. We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models. Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation [35.44115368160656]
We investigate whether and to what extent machine translation models exhibit gender bias. We find that IFT models default to male-inflected translations, even disregarding female occupational stereotypes. We propose an easy-to-implement and effective bias mitigation solution.
arXiv Detail & Related papers (2023-10-18T17:36:55Z)
Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in Multilingual Machine Translation [28.471506840241602]
Gender bias is a significant issue in machine translation, leading to ongoing research efforts in developing bias mitigation techniques. We propose a bias mitigation method based on a novel approach. Gender-Aware Contrastive Learning, GACL, encodes contextual gender information into the representations of non-explicit gender words.
arXiv Detail & Related papers (2023-05-23T12:53:39Z)
A Survey on Zero Pronoun Translation [69.09774294082965]
Zero pronouns (ZPs) are frequently omitted in pro-drop languages, but should be recalled in non-pro-drop languages. This survey paper highlights the major works that have been undertaken in zero pronoun translation (ZPT) after the neural revolution. We uncover a number of insightful findings such as: 1) ZPT is in line with the development trend of large language model; 2) data limitation causes learning bias in languages and domains; 3) performance improvements are often reported on single benchmarks, but advanced methods are still far from real-world use.
arXiv Detail & Related papers (2023-05-17T13:19:01Z)
Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications. We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
arXiv Detail & Related papers (2020-05-02T04:34:37Z)
Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text. We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions. Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.