GOSt-MT: A Knowledge Graph for Occupation-related Gender Biases in Machine Translation
- URL: http://arxiv.org/abs/2409.10989v2
- Date: Fri, 4 Oct 2024 12:13:42 GMT
- Title: GOSt-MT: A Knowledge Graph for Occupation-related Gender Biases in Machine Translation
- Authors: Orfeas Menis Mastromichalakis, Giorgos Filandrianos, Eva Tsouparopoulou, Dimitris Parsanoglou, Maria Symeonaki, Giorgos Stamou
- Abstract summary: Gender bias in machine translation (MT) systems poses significant challenges that often result in the reinforcement of harmful stereotypes.
This paper introduces a novel approach to studying occupation-related gender bias through the creation of the GOSt-MT Knowledge Graph.
- Score: 2.3154290513589784
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Gender bias in machine translation (MT) systems poses significant challenges that often result in the reinforcement of harmful stereotypes. Especially in the labour domain, where occupations are frequently and inaccurately associated with specific genders, such biases perpetuate traditional gender stereotypes with a significant impact on society. Addressing these issues is crucial for ensuring equitable and accurate MT systems. This paper introduces a novel approach to studying occupation-related gender bias through the creation of the GOSt-MT (Gender and Occupation Statistics for Machine Translation) Knowledge Graph. GOSt-MT integrates comprehensive gender statistics from real-world labour data and from textual corpora used in MT training. This Knowledge Graph allows for a detailed analysis of gender bias across English, French, and Greek, facilitating the identification of persistent stereotypes and areas requiring intervention. By providing a structured framework for understanding how occupations are gendered in both labour markets and MT systems, GOSt-MT contributes to efforts aimed at making MT systems more equitable and reducing gender biases in automated translations.
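To make the idea concrete, here is a minimal sketch of how occupation-gender statistics might be encoded as a knowledge graph in the spirit of GOSt-MT. The namespace, property names, and numbers below are illustrative assumptions, not the paper's actual schema; it uses the rdflib library.

```python
# Hypothetical sketch: linking one occupation to gender statistics from two
# sources (labour-market data vs. an MT training corpus), as the abstract
# describes. All identifiers and figures are made up for illustration.
# Requires: pip install rdflib
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import XSD

GOST = Namespace("http://example.org/gost-mt/")  # hypothetical namespace

g = Graph()
g.bind("gost", GOST)

# An occupation node with labels in two of the three covered languages.
nurse = GOST["occupation/nurse"]
g.add((nurse, RDF.type, GOST.Occupation))
g.add((nurse, GOST.label_en, Literal("nurse", lang="en")))
g.add((nurse, GOST.label_fr, Literal("infirmier / infirmière", lang="fr")))

# Gender statistic from real-world labour data (illustrative number).
labour_stat = GOST["stat/nurse/labour"]
g.add((labour_stat, RDF.type, GOST.GenderStatistic))
g.add((labour_stat, GOST.source, Literal("labour-market")))
g.add((labour_stat, GOST.femaleShare, Literal(0.87, datatype=XSD.float)))
g.add((nurse, GOST.hasStatistic, labour_stat))

# Gender statistic from an MT training corpus (illustrative number).
corpus_stat = GOST["stat/nurse/corpus"]
g.add((corpus_stat, RDF.type, GOST.GenderStatistic))
g.add((corpus_stat, GOST.source, Literal("mt-training-corpus")))
g.add((corpus_stat, GOST.femaleShare, Literal(0.95, datatype=XSD.float)))
g.add((nurse, GOST.hasStatistic, corpus_stat))

# Comparing the two shares per occupation surfaces where the training
# corpus exaggerates (or inverts) real-world gender distributions.
for stat in g.objects(nurse, GOST.hasStatistic):
    source = g.value(stat, GOST.source)
    share = g.value(stat, GOST.femaleShare)
    print(f"{source}: female share = {share}")
```

Contrasting the labour-market share with the corpus share per occupation is the kind of gap analysis the abstract describes for identifying persistent stereotypes.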
Related papers
- Assumed Identities: Quantifying Gender Bias in Machine Translation of Gender-Ambiguous Occupational Terms [2.5764960393034615]
We introduce GRAPE, a probability-based metric designed to evaluate gender bias.
We present GAMBIT-MT, a benchmarking dataset in English with gender-ambiguous occupational terms.
arXiv Detail & Related papers (2025-03-06T12:16:14Z)
- Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs) [82.57490175399693]
We study gender bias in 22 popular image-to-text vision-language assistants (VLAs).
Our results show that VLAs replicate human biases likely present in the data, such as real-world occupational imbalances.
Among approaches for mitigating gender bias in these models, we find that finetuning-based debiasing methods achieve the best tradeoff between debiasing and retaining performance on downstream tasks.
arXiv Detail & Related papers (2024-10-25T05:59:44Z)
- What the Harm? Quantifying the Tangible Impact of Gender Bias in Machine Translation with a Human-centered Study [18.464888281674806]
Gender bias in machine translation (MT) is recognized as an issue that can harm people and society.
We conduct an extensive human-centered study to examine whether, and to what extent, bias in MT causes harms with tangible costs.
arXiv Detail & Related papers (2024-10-01T09:38:34Z)
- Generating Gender Alternatives in Machine Translation [13.153018685139413]
Machine translation systems often translate terms with ambiguous gender into the gendered form that is most prevalent in the systems' training data.
This often reflects and perpetuates harmful stereotypes present in society.
We study the problem of generating all grammatically correct gendered translation alternatives.
arXiv Detail & Related papers (2024-07-29T22:10:51Z)
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), a new benchmark.
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which quantifies ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation [35.44115368160656]
We investigate whether and to what extent machine translation models exhibit gender bias.
We find that instruction-tuned (IFT) models default to male-inflected translations, even disregarding female occupational stereotypes.
We propose an easy-to-implement and effective bias mitigation solution.
arXiv Detail & Related papers (2023-10-18T17:36:55Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation [64.65911758042914]
We investigate how seniority impacts the degree of gender bias exhibited in pretrained neural generation models.
Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.
These results suggest that NLP applications built using GPT-2 may harm women in professional capacities.
arXiv Detail & Related papers (2022-05-19T20:05:02Z)
- Examining Covert Gender Bias: A Case Study in Turkish and English Machine Translation Models [7.648784748888186]
We examine cases of both overt and covert gender bias in Machine Translation models.
Specifically, we introduce a method to investigate asymmetrical gender markings.
We also assess bias in the attribution of personhood and examine occupational and personality stereotypes.
arXiv Detail & Related papers (2021-08-23T19:25:56Z)
- Extending Challenge Sets to Uncover Gender Bias in Machine Translation: Impact of Stereotypical Verbs and Adjectives [0.45687771576879593]
State-of-the-art machine translation (MT) systems are trained on large corpora of text, mostly generated by humans.
Recent research showed that MT systems are biased towards stereotypical translation of occupations.
In this paper we present WiBeMT, an extension of this challenge set that adds gender-biased adjectives and sentences with gender-biased verbs.
arXiv Detail & Related papers (2021-07-24T11:22:10Z)
- How True is GPT-2? An Empirical Analysis of Intersectional Occupational Biases [50.591267188664666]
Downstream applications are at risk of inheriting biases contained in natural language models.
We analyze the occupational biases of a popular generative language model, GPT-2.
For a given job, GPT-2 reflects the societal skew of gender and ethnicity in the US, and in some cases, pulls the distribution towards gender parity.
arXiv Detail & Related papers (2021-02-08T11:10:27Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the list (including all information) and is not responsible for any consequences.