How To Build Competitive Multi-gender Speech Translation Models For
Controlling Speaker Gender Translation
- URL: http://arxiv.org/abs/2310.15114v1
- Date: Mon, 23 Oct 2023 17:21:32 GMT
- Title: How To Build Competitive Multi-gender Speech Translation Models For
Controlling Speaker Gender Translation
- Authors: Marco Gaido, Dennis Fucci, Matteo Negri and Luisa Bentivogli
- Abstract summary: When translating from notional gender languages into grammatical gender languages, the generated translation requires explicit gender assignments for various words, including those referring to the speaker.
To avoid such biased and non-inclusive behavior, the gender assignment of speaker-related expressions should be guided by externally-provided metadata about the speaker's gender.
This paper aims to achieve the same results by integrating the speaker's gender metadata into a single "multi-gender" neural ST model, which is easier to maintain.
- Score: 21.125217707038356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When translating from notional gender languages (e.g., English) into
grammatical gender languages (e.g., Italian), the generated translation
requires explicit gender assignments for various words, including those
referring to the speaker. When the source sentence does not convey the
speaker's gender, speech translation (ST) models either rely on the
possibly-misleading vocal traits of the speaker or default to the masculine
gender, the most frequent in existing training corpora. To avoid such biased
and non-inclusive behavior, the gender assignment of speaker-related
expressions should be guided by externally-provided metadata about the
speaker's gender. While previous work has shown that the most effective
solution is to use separate, dedicated gender-specific models, the goal of
this paper is to achieve the same results by integrating the speaker's gender
metadata into a single "multi-gender" neural ST model, which is easier to
maintain. Our experiments demonstrate that a single multi-gender model
outperforms gender-specialized ones when trained from scratch (with gender
accuracy gains up to 12.9 for feminine forms), while fine-tuning from existing
ST models does not lead to competitive results.
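The abstract states that the speaker's gender metadata is integrated into a single multi-gender ST model, but it does not specify the injection mechanism. Below is a minimal, hypothetical sketch (not the authors' implementation) of one common conditioning strategy: adding a learned gender embedding to the decoder's token embeddings. All class names, dimensions, and the injection point are illustrative assumptions.

```python
# Minimal sketch (hypothetical, not the paper's code): condition a single
# "multi-gender" ST decoder on externally provided speaker-gender metadata
# by adding a learned gender embedding to the target token embeddings.
import torch
import torch.nn as nn


class GenderConditionedDecoderInput(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 256, num_genders: int = 2):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        # One learned vector per gender label (e.g., 0 = feminine, 1 = masculine).
        self.gender_emb = nn.Embedding(num_genders, d_model)

    def forward(self, target_tokens: torch.Tensor, gender_id: torch.Tensor) -> torch.Tensor:
        # target_tokens: (batch, seq_len); gender_id: (batch,)
        tok = self.token_emb(target_tokens)            # (batch, seq_len, d_model)
        gen = self.gender_emb(gender_id).unsqueeze(1)  # (batch, 1, d_model), broadcast over positions
        return tok + gen


if __name__ == "__main__":
    module = GenderConditionedDecoderInput(vocab_size=1000)
    tokens = torch.randint(0, 1000, (2, 7))  # dummy target prefixes
    genders = torch.tensor([0, 1])           # per-utterance speaker-gender metadata
    print(module(tokens, genders).shape)     # torch.Size([2, 7, 256])
```

An equally common alternative is to prepend a gender tag token (e.g., a feminine/masculine marker) to the target sequence; either way, a single model consumes the metadata instead of maintaining separate gender-specific models.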
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z) - Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do.
We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models.
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z) - Integrating Language Models into Direct Speech Translation: An
Inference-Time Solution to Control Gender Inflection [23.993869026482415]
We propose the first inference-time solution to control speaker-related gender inflections in speech translation.
Our solution partially replaces the (biased) internal language model (LM) implicitly learned by the ST decoder with gender-specific external LMs.
arXiv Detail & Related papers (2023-10-24T11:55:16Z) - Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender
- Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z) - VisoGender: A dataset for benchmarking gender bias in image-text pronoun
resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z) - Gender Lost In Translation: How Bridging The Gap Between Languages
Affects Gender Bias in Zero-Shot Multilingual Translation [12.376309678270275]
We study how bridging the gap between languages for which parallel data is not available affects gender bias in multilingual NMT.
We study the effect of encouraging language-agnostic hidden representations on models' ability to preserve gender.
We find that language-agnostic representations mitigate the masculine bias of zero-shot models, and that as the bridge language becomes more gender-inflected, pivoting surpasses zero-shot translation in preserving speaker-related gender agreement.
arXiv Detail & Related papers (2023-05-26T13:51:50Z) - Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study.
We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
arXiv Detail & Related papers (2022-04-20T00:13:01Z) - Breeding Gender-aware Direct Speech Translation Systems [14.955696163410254]
We show that gender-aware direct ST solutions can significantly outperform strong - but gender-unaware - direct ST models.
Accuracy on the translation of gender-marked words increases by up to 30 points while overall translation quality is preserved.
arXiv Detail & Related papers (2020-12-09T10:18:03Z) - Gender in Danger? Evaluating Speech Translation Technology on the
MuST-SHE Corpus [20.766890957411132]
Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines.
Can audio provide additional information to reduce gender bias?
We present the first thorough investigation of gender bias in speech translation, contributing with the release of a benchmark useful for future studies.
arXiv Detail & Related papers (2020-06-10T09:55:38Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)