No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech
Recognition through Pitch Manipulation
- URL: http://arxiv.org/abs/2310.06590v1
- Date: Tue, 10 Oct 2023 12:55:22 GMT
- Authors: Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa
Bentivogli
- Abstract summary: We propose a data augmentation technique that manipulates the fundamental frequency (f0) and formants.
This technique reduces the data unbalance among genders by simulating voices of the under-represented female speakers.
Experiments on spontaneous English speech show that our technique yields a relative WER improvement up to 9.87% for utterances by female speakers.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Automatic speech recognition (ASR) systems are known to be sensitive to the
sociolinguistic variability of speech data, in which gender plays a crucial
role. This can result in disparities in recognition accuracy between male and
female speakers, primarily due to the under-representation of the latter group
in the training data. While in the context of hybrid ASR models several
solutions have been proposed, the gender bias issue has not been explicitly
addressed in end-to-end neural architectures. To fill this gap, we propose a
data augmentation technique that manipulates the fundamental frequency (f0) and
formants. This technique reduces the data unbalance among genders by simulating
voices of the under-represented female speakers and increases the variability
within each gender group. Experiments on spontaneous English speech show that
our technique yields a relative WER improvement up to 9.87% for utterances by
female speakers, with larger gains for the least-represented f0 ranges.
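The "relative WER improvement" quoted above is a ratio of word error rates rather than a difference in percentage points. The following is a minimal sketch of how such a figure is usually computed, assuming the standard word-level edit-distance definition of WER. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # At the start of row i, prev[j] is the edit distance
    # between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction, as a percentage of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Illustrative values only (not results from the paper):
# a baseline WER of 20.0 reduced to 18.0 is a 10% relative improvement.
example_gain = relative_improvement(20.0, 18.0)
```

Under this definition, a relative improvement of 9.87% means the WER on the affected utterances dropped by 9.87% of its baseline value, not by 9.87 absolute points.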
Related papers
- GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.
GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z)
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- Twists, Humps, and Pebbles: Multilingual Speech Recognition Models Exhibit Gender Performance Gaps [25.95711246919163]
Current automatic speech recognition (ASR) models are designed to be used across many languages and tasks without substantial changes.
Our study systematically evaluates the performance of two widely used multilingual ASR models on three datasets.
Our findings reveal clear gender disparities, with the advantaged group varying across languages and models.
arXiv Detail & Related papers (2024-02-28T00:24:29Z)
- Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection [23.993869026482415]
We propose the first inference-time solution to control speaker-related gender inflections in speech translation.
Our solution partially replaces the (biased) internal language model (LM) implicitly learned by the ST decoder with gender-specific external LMs.
arXiv Detail & Related papers (2023-10-24T11:55:16Z)
- How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation [21.125217707038356]
When translating from notional gender languages into grammatical gender languages, the generated translation requires explicit gender assignments for various words, including those referring to the speaker.
To avoid such biased and non-inclusive behaviors, the gender assignment of speaker-related expressions should be guided by externally provided metadata about the speaker's gender.
This paper aims to achieve the same results by integrating the speaker's gender metadata into a single "multi-gender" neural ST model, easier to maintain.
arXiv Detail & Related papers (2023-10-23T17:21:32Z)
- The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages [51.2321117760104]
This paper describes the Gender-GAP Pipeline, an automatic pipeline to characterize gender representation in large-scale datasets for 55 languages.
The pipeline uses a multilingual lexicon of gendered person-nouns to quantify the gender representation in text.
We showcase it to report gender representation in WMT training data and development data for the News task, confirming that current data is skewed towards masculine representation.
arXiv Detail & Related papers (2023-08-31T17:20:50Z)
- Elucidate Gender Fairness in Singing Voice Transcription [5.434559527051845]
We investigate whether gender-based characteristics lead to a performance disparity in singing voice transcription (SVT).
We find that different pitch distributions, rather than gender data imbalance, contribute to this disparity.
To address this issue, we propose using an attribute predictor to predict gender labels and adversarially training the SVT system to enforce the gender-invariance of acoustic representations.
arXiv Detail & Related papers (2023-08-05T15:15:01Z)
- Auditing Gender Presentation Differences in Text-to-Image Models [54.16959473093973]
We study how gender is presented differently in text-to-image models.
By probing gender indicators in the input text, we quantify the frequency differences of presentation-centric attributes.
We propose an automatic method to estimate such differences.
arXiv Detail & Related papers (2023-02-07T18:52:22Z)
- Overlapped speech and gender detection with WavLM pre-trained features [6.054285771277486]
This article focuses on overlapped speech and gender detection in order to study interactions between women and men in French audiovisual media.
We propose using the WavLM model, which has the advantage of being pre-trained on a huge amount of speech data.
A neural gender detection (GD) system is trained with WavLM inputs on a gender-balanced subset of the French broadcast news ALLIES data and obtains an accuracy of 97.9%.
arXiv Detail & Related papers (2022-09-09T08:00:47Z)
- Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first specifically tailored measure for Information Retrieval, capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender-biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large-scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.