Whose wife is it anyway? Assessing bias against same-gender relationships in machine translation
- URL: http://arxiv.org/abs/2401.04972v2
- Date: Fri, 12 Jul 2024 06:48:09 GMT
- Title: Whose wife is it anyway? Assessing bias against same-gender relationships in machine translation
- Authors: Ian Stewart, Rada Mihalcea,
- Abstract summary: Machine translation often suffers from biased data and algorithms that can lead to unacceptable errors in system output.
We investigate the degree of bias against same-gender relationships in MT systems.
We find that three popular MT services consistently fail to accurately translate sentences concerning relationships between entities of the same gender.
- Score: 26.676686759877597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine translation often suffers from biased data and algorithms that can lead to unacceptable errors in system output. While bias in gender norms has been investigated, less is known about whether MT systems encode bias about social relationships, e.g., "the lawyer kissed her wife." We investigate the degree of bias against same-gender relationships in MT systems, using generated template sentences drawn from several noun-gender languages (e.g., Spanish) and comprised of popular occupation nouns. We find that three popular MT services consistently fail to accurately translate sentences concerning relationships between entities of the same gender. The error rate varies considerably based on the context, and same-gender sentences referencing high female-representation occupations are translated with lower accuracy. We provide this work as a case study in the evaluation of intrinsic bias in NLP systems with respect to social relationships.
Related papers
- Watching the Watchers: Exposing Gender Disparities in Machine Translation Quality Estimation [28.01631390361754]
Masculine-inflected translations score higher than feminine-inflected ones, and gender-neutral translations are penalized.
context-aware QE metrics reduce errors for masculine-inflected references but fail to address feminine referents.
Our findings underscore the need to address gender bias in QE metrics to ensure equitable and unbiased machine translation systems.
arXiv Detail & Related papers (2024-10-14T18:24:52Z) - Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words)
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z) - Don't Overlook the Grammatical Gender: Bias Evaluation for Hindi-English
Machine Translation [0.0]
Existing evaluation benchmarks primarily focus on English as the source language of translation.
For source languages other than English, studies often employ gender-neutral sentences for bias evaluation.
We emphasise the significance of tailoring bias evaluation test sets to account for grammatical gender markers in the source language.
arXiv Detail & Related papers (2023-11-11T09:28:43Z) - Gender Inflected or Bias Inflicted: On Using Grammatical Gender Cues for
Bias Evaluation in Machine Translation [0.0]
We use Hindi as the source language and construct two sets of gender-specific sentences to evaluate different Hindi-English (HI-EN) NMT systems.
Our work highlights the importance of considering the nature of language when designing such extrinsic bias evaluation datasets.
arXiv Detail & Related papers (2023-11-07T07:09:59Z) - ''Fifty Shades of Bias'': Normative Ratings of Gender Bias in GPT
Generated English Text [11.085070600065801]
Language serves as a powerful tool for the manifestation of societal belief systems.
Gender bias is one of the most pervasive biases in our society.
We create the first dataset of GPT-generated English text with normative ratings of gender bias.
arXiv Detail & Related papers (2023-10-26T14:34:06Z) - A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for
Fairer Instruction-Tuned Machine Translation [35.44115368160656]
We investigate whether and to what extent machine translation models exhibit gender bias.
We find that IFT models default to male-inflected translations, even disregarding female occupational stereotypes.
We propose an easy-to-implement and effective bias mitigation solution.
arXiv Detail & Related papers (2023-10-18T17:36:55Z) - VisoGender: A dataset for benchmarking gender bias in image-text pronoun
resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z) - "I'm fully who I am": Towards Centering Transgender and Non-Binary
Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life.
We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation.
We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z) - Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous
Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
arXiv Detail & Related papers (2023-02-11T12:11:03Z) - Towards Understanding Gender-Seniority Compound Bias in Natural Language
Generation [64.65911758042914]
We investigate how seniority impacts the degree of gender bias exhibited in pretrained neural generation models.
Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.
These results suggest that NLP applications built using GPT-2 may harm women in professional capacities.
arXiv Detail & Related papers (2022-05-19T20:05:02Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.