Evaluating Gender Bias in Speech Translation
- URL: http://arxiv.org/abs/2010.14465v4
- Date: Sat, 14 May 2022 10:44:05 GMT
- Title: Evaluating Gender Bias in Speech Translation
- Authors: Marta R. Costa-jussà and Christine Basta and Gerard I. Gállego
- Abstract summary: This paper introduces WinoST, a new freely available challenge set for evaluating gender bias in speech translation.
Using a state-of-the-art end-to-end speech translation system, we report the gender bias evaluation on four language pairs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The scientific community is increasingly aware of the necessity to embrace
pluralism and consistently represent major and minor social groups. Currently,
there are no standard evaluation techniques for different types of biases.
Accordingly, there is an urgent need to provide evaluation sets and protocols
to measure existing biases in our automatic systems. Evaluating these biases
should be an essential step towards mitigating them in such systems.
This paper introduces WinoST, a new freely available challenge set for
evaluating gender bias in speech translation. WinoST is the speech version of
WinoMT, an MT challenge set, and both follow the same evaluation protocol to
measure gender accuracy. Using a state-of-the-art end-to-end speech translation
system, we report a gender bias evaluation on four language pairs and show
that gender accuracy in speech translation is more than 23% lower than in MT.
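To make the shared protocol concrete: WinoMT (and hence WinoST) aligns each annotated source entity to its translation, reads the grammatical gender off the translated form, and reports gender accuracy as the fraction of sentences where that gender matches the gold annotation. Below is a minimal sketch of the accuracy computation; the `detect_gender` helper is a toy stand-in for the real alignment-plus-morphology step, not the authors' code.

```python
from dataclasses import dataclass

@dataclass
class Example:
    entity: str       # target-side word aligned to the annotated source entity
    gold_gender: str  # "male" or "female", from the WinoMT/WinoST annotation

def detect_gender(word: str) -> str:
    """Toy stand-in for WinoMT's target-language morphological analysis:
    a real implementation aligns the source entity to the translation and
    reads grammatical gender off the inflected form."""
    feminine_suffixes = ("a", "in", "euse")  # illustrative markers only
    return "female" if word.lower().endswith(feminine_suffixes) else "male"

def gender_accuracy(examples: list[Example]) -> float:
    """Fraction of entities translated with the annotated gender."""
    hits = sum(detect_gender(ex.entity) == ex.gold_gender for ex in examples)
    return hits / len(examples)

# e.g. German outputs for two sentences whose gold entity gender is female:
data = [Example("Ärztin", "female"), Example("Arzt", "female")]
print(f"gender accuracy: {gender_accuracy(data):.0%}")  # 50%
```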
Related papers
- Watching the Watchers: Exposing Gender Disparities in Machine Translation Quality Estimation [28.01631390361754]
This paper is the first to investigate gender bias in quality estimation (QE) metrics and its downstream impact on machine translation (MT).
Masculine-inflected translations score higher than feminine-inflected ones, and gender-neutral translations are penalized.
We show that QE metrics can perpetuate gender bias in MT systems when used in quality-aware decoding.
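As an illustration of how such a score gap can be measured, here is a hedged sketch assuming a reference-free QE callable you supply; the `qe_score` parameter is a hypothetical placeholder (e.g. a wrapper around a COMET-style estimator), not the paper's evaluation code.

```python
from statistics import mean
from typing import Callable

def gender_score_gap(
    triples: list[tuple[str, str, str]],
    qe_score: Callable[[str, str], float],  # hypothetical QE callable you plug in
) -> float:
    """Mean score advantage of masculine- over feminine-inflected output.

    `triples` holds (source, masculine_mt, feminine_mt) items; `qe_score`
    is any reference-free quality-estimation metric mapping
    (source, translation) -> score. A consistently positive gap reproduces
    the disparity the paper reports."""
    return mean(qe_score(s, m) - qe_score(s, f) for s, m, f in triples)
```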
arXiv Detail & Related papers (2024-10-14T18:24:52Z)
- GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.
GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z)
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents the AmbGIMT benchmark (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- Don't Overlook the Grammatical Gender: Bias Evaluation for Hindi-English Machine Translation [0.0]
Existing evaluation benchmarks primarily focus on English as the source language of translation.
For source languages other than English, studies often employ gender-neutral sentences for bias evaluation.
We emphasise the significance of tailoring bias evaluation test sets to account for grammatical gender markers in the source language.
arXiv Detail & Related papers (2023-11-11T09:28:43Z)
- Gender Inflected or Bias Inflicted: On Using Grammatical Gender Cues for Bias Evaluation in Machine Translation [0.0]
We use Hindi as the source language and construct two sets of gender-specific sentences to evaluate different Hindi-English (HI-EN) NMT systems.
Our work highlights the importance of considering the nature of language when designing such extrinsic bias evaluation datasets.
arXiv Detail & Related papers (2023-11-07T07:09:59Z)
- Social Biases in Automatic Evaluation Metrics for NLG [53.76118154594404]
We propose an evaluation method based on the Word Embeddings Association Test (WEAT) and the Sentence Embeddings Association Test (SEAT) to quantify social biases in evaluation metrics.
We construct gender-swapped meta-evaluation datasets to explore the potential impact of gender bias in image captioning and text summarization tasks.
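WEAT itself is a fixed statistic (Caliskan et al., 2017), so a sketch is straightforward: given two target sets X, Y and two attribute sets A, B as lists of embedding vectors, compute the effect size below; SEAT applies the same formula to sentence embeddings. This is a generic implementation, not the paper's released code.

```python
import numpy as np

def cos(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def assoc(w, A, B) -> float:
    """s(w, A, B): how much closer w sits to attribute set A than to B."""
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B) -> float:
    """WEAT effect size d over target sets X, Y and attribute sets A, B,
    each a list of embedding vectors."""
    sx = [assoc(x, A, B) for x in X]
    sy = [assoc(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)
```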
arXiv Detail & Related papers (2022-10-17T08:55:26Z)
- Decoding and Diversity in Machine Translation [90.33636694717954]
We characterize the cost in diversity paid for the BLEU scores enjoyed by NMT.
Our study implicates search as a salient source of known bias when translating gender pronouns.
arXiv Detail & Related papers (2020-11-26T21:09:38Z)
- Towards Debiasing Sentence Representations [109.70181221796469]
We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks.
We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
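The core move in Sent-Debias is to estimate a gender subspace from embeddings of gender-swapped sentence pairs and project it out of each representation. A rough single-component sketch of that idea (an illustration of the approach, not the released implementation):

```python
import numpy as np

def bias_direction(pairs: np.ndarray) -> np.ndarray:
    """Estimate a gender direction from an (n, 2, d) array holding sentence
    embeddings of gender-swapped pairs, via the top principal component of
    their centered differences."""
    diffs = pairs[:, 0, :] - pairs[:, 1, :]
    diffs -= diffs.mean(axis=0)
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)  # PCA via SVD
    return vt[0]

def debias(emb: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Remove the component of an embedding along the bias direction v."""
    v = v / np.linalg.norm(v)
    return emb - (emb @ v) * v
```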
arXiv Detail & Related papers (2020-07-16T04:22:30Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender-biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
- Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem [21.44025591721678]
Training data for NLP tasks often exhibits gender bias in that fewer sentences refer to women than to men.
The recent WinoMT challenge set allows us to measure this effect directly.
We use transfer learning on a small set of trusted, gender-balanced examples.
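A minimal sketch of that adaptation recipe, assuming a HuggingFace MarianMT checkpoint as the pretrained system and a toy two-sentence balanced set (an illustrative stand-in; the paper fine-tunes its own NMT models):

```python
import torch
from transformers import MarianMTModel, MarianTokenizer

# Illustrative checkpoint; the paper adapts its own trained NMT systems.
name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

# Toy stand-in for the "small set of trusted, gender-balanced examples":
# each profession appears with both genders, in both languages.
src = ["The doctor finished her shift.", "The doctor finished his shift."]
tgt = ["Die Ärztin beendete ihre Schicht.", "Der Arzt beendete seine Schicht."]

batch = tokenizer(src, text_target=tgt, return_tensors="pt", padding=True)
batch["labels"][batch["labels"] == tokenizer.pad_token_id] = -100  # ignore pads

optim = torch.optim.AdamW(model.parameters(), lr=1e-5)  # small LR on purpose
model.train()
for _ in range(3):  # only a few steps: adapt, don't catastrophically forget
    loss = model(**batch).loss
    loss.backward()
    optim.step()
    optim.zero_grad()
```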
arXiv Detail & Related papers (2020-04-09T11:55:13Z)