Gender Inflected or Bias Inflicted: On Using Grammatical Gender Cues for
Bias Evaluation in Machine Translation
- URL: http://arxiv.org/abs/2311.03767v1
- Date: Tue, 7 Nov 2023 07:09:59 GMT
- Title: Gender Inflected or Bias Inflicted: On Using Grammatical Gender Cues for
Bias Evaluation in Machine Translation
- Authors: Pushpdeep Singh
- Abstract summary: We use Hindi as the source language and construct two sets of gender-specific sentences to evaluate different Hindi-English (HI-EN) NMT systems.
Our work highlights the importance of considering the nature of language when designing such extrinsic bias evaluation datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Machine Translation (NMT) models are state-of-the-art for machine
translation. However, these models are known to have various social biases,
especially gender bias. Most of the work on evaluating gender bias in NMT has
focused primarily on English as the source language. For source languages
different from English, most of the studies use gender-neutral sentences to
evaluate gender bias. However, practically, many sentences that we encounter do
have gender information. Therefore, it makes more sense to evaluate for bias
using such sentences. This allows us to determine if NMT models can identify
the correct gender based on the grammatical gender cues in the source sentence
rather than relying on biased correlations with, say, occupation terms. To
demonstrate our point, in this work, we use Hindi as the source language and
construct two sets of gender-specific sentences: OTSC-Hindi and WinoMT-Hindi
that we use to evaluate different Hindi-English (HI-EN) NMT systems
automatically for gender bias. Our work highlights the importance of
considering the nature of language when designing such extrinsic bias
evaluation datasets.
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words)
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z) - Don't Overlook the Grammatical Gender: Bias Evaluation for Hindi-English
Machine Translation [0.0]
Existing evaluation benchmarks primarily focus on English as the source language of translation.
For source languages other than English, studies often employ gender-neutral sentences for bias evaluation.
We emphasise the significance of tailoring bias evaluation test sets to account for grammatical gender markers in the source language.
arXiv Detail & Related papers (2023-11-11T09:28:43Z) - Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender
Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z) - The Impact of Debiasing on the Performance of Language Models in
Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide-range of benchmark datasets.
Experiments show that the effects of debiasing are consistently emphunderestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z) - VisoGender: A dataset for benchmarking gender bias in image-text pronoun
resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z) - Efficient Gender Debiasing of Pre-trained Indic Language Models [0.0]
The gender bias present in the data on which language models are pre-trained gets reflected in the systems that use these models.
In our paper, we measure gender bias associated with occupations in Hindi language models.
Our results reflect that the bias is reduced post-introduction of our proposed mitigation techniques.
arXiv Detail & Related papers (2022-09-08T09:15:58Z) - Towards Understanding Gender-Seniority Compound Bias in Natural Language
Generation [64.65911758042914]
We investigate how seniority impacts the degree of gender bias exhibited in pretrained neural generation models.
Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.
These results suggest that NLP applications built using GPT-2 may harm women in professional capacities.
arXiv Detail & Related papers (2022-05-19T20:05:02Z) - Mitigating Gender Stereotypes in Hindi and Marathi [1.2891210250935146]
This paper evaluates the gender stereotypes in Hindi and Marathi languages.
We create a dataset of neutral and gendered occupation words, emotion words and measure bias with the help of Embedding Coherence Test (ECT) and Relative Norm Distance (RND)
Experiments show that our proposed debiasing techniques reduce gender bias in these languages.
arXiv Detail & Related papers (2022-05-12T06:46:53Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.