The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese
- URL: http://arxiv.org/abs/2402.07513v1
- Date: Mon, 12 Feb 2024 09:35:13 GMT
- Title: The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese
- Authors: Ajinkya Kulkarni, Anna Tokareva, Rameez Qureshi, Miguel Couceiro
- Abstract summary: This study is dedicated to a comprehensive exploration of the Whisper and MMS systems.
Our investigation encompasses various categories, including gender, age, skin tone, and geo-location.
We empirically show that oversampling techniques alleviate such stereotypical biases.
- Score: 5.308321515594125
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In the field of spoken language understanding, systems like Whisper and
Massively Multilingual Speech (MMS) have shown state-of-the-art performance.
This study is dedicated to a comprehensive exploration of the Whisper and MMS
systems, with a focus on assessing biases in automatic speech recognition (ASR)
on casual conversational speech in Portuguese. Our
investigation encompasses various categories, including gender, age, skin
tone, and geo-location. Alongside traditional ASR evaluation metrics such as
Word Error Rate (WER), we have incorporated p-value-based statistical significance
testing for gender bias analysis. Furthermore, we extensively examine the impact of
data distribution and empirically show that oversampling techniques alleviate
such stereotypical biases. This research represents a pioneering effort in
quantifying biases in the Portuguese language context through the application
of MMS and Whisper, contributing to a better understanding of ASR systems'
performance in multilingual settings.
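The two quantitative ideas named in the abstract, Word Error Rate per demographic group and oversampling of under-represented groups, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the group labels, the toy transcripts, and the choice of random duplication with replacement as the oversampling strategy are all assumptions made for the example. WER is computed here as the word-level edit distance divided by the number of reference words, pooled over each group.

```python
import random
from collections import defaultdict


def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # Single-row dynamic programme for the Levenshtein distance over words.
    row = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        diag, row[0] = row[0], i
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            # Deletion, insertion, or substitution/match, whichever is cheapest.
            diag, row[j] = row[j], min(row[j] + 1, row[j - 1] + 1, diag + cost)
    return row[-1], len(ref)


def wer_by_group(samples):
    """Pooled WER per group; samples are (group, reference, hypothesis) tuples."""
    errors, words = defaultdict(int), defaultdict(int)
    for group, reference, hypothesis in samples:
        e, n = word_errors(reference, hypothesis)
        errors[group] += e
        words[group] += n
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


def oversample(samples, seed=0):
    """Duplicate samples at random until every group matches the largest one.

    Assumption: plain random oversampling with replacement; the paper may
    use a different scheme.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample in samples:
        by_group[sample[0]].append(sample)
    if not by_group:
        return []
    target = max(len(items) for items in by_group.values())
    balanced = []
    for items in by_group.values():
        balanced.extend(items)
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced


# Hypothetical toy data: (group, reference transcript, ASR hypothesis).
samples = [
    ("female", "o gato dorme", "o gato dorme"),
    ("female", "bom dia", "bom dia"),
    ("female", "ate logo", "ate logo"),
    ("male", "boa noite a todos", "boa noite todos"),
]

print(wer_by_group(samples))
print(len(oversample(samples)))
```

In practice the balanced sample list would feed a fine-tuning or evaluation run, and the per-group WER values would be compared with a significance test before drawing conclusions about bias.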
Related papers
- Reexamining Racial Disparities in Automatic Speech Recognition Performance: The Role of Confounding by Provenance [7.882996636086014]
It is important that automatic speech recognition (ASR) models and their use are fair and equitable.
The current study seeks to understand the factors underlying this disparity by examining the performance of the current state-of-the-art neural network based ASR system.
arXiv Detail & Related papers (2024-07-19T02:14:17Z) - Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models [38.64792118903994]
We evaluate gender bias in SILLMs across four semantic-related tasks.
Our analysis reveals that bias levels are language-dependent and vary with different evaluation methods.
arXiv Detail & Related papers (2024-07-09T15:35:43Z) - An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios [76.11409260727459]
This paper explores the language adaptation capability of ZMM-TTS, a recent SSL-based multilingual TTS system.
We demonstrate that the similarity in phonetics between the pre-training and target languages, as well as the language category, affects the target language's adaptation performance.
arXiv Detail & Related papers (2024-06-13T08:16:52Z) - Multi-EuP: The Multilingual European Parliament Dataset for Analysis of
Bias in Information Retrieval [62.82448161570428]
This dataset is designed to investigate fairness in a multilingual information retrieval context.
It boasts an authentic multilingual corpus, featuring topics translated into all 24 languages.
It offers rich demographic information associated with its documents, facilitating the study of demographic bias.
arXiv Detail & Related papers (2023-11-03T12:29:11Z) - Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work will lay the foundation for furthering the field of dialectal NLP by laying out evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z) - A Deep Dive into the Disparity of Word Error Rates Across Thousands of NPTEL MOOC Videos [4.809236881780707]
We describe the curation of a massive speech dataset of 8740 hours consisting of $\sim 9.8$K technical lectures in the English language along with their transcripts delivered by instructors representing various parts of Indian demography.
We use the curated dataset to measure the existing disparity in YouTube Automatic Captions and OpenAI Whisper model performance across the diverse demographic traits of speakers in India.
arXiv Detail & Related papers (2023-07-20T05:03:00Z) - BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric [66.73705349465207]
End-to-end speech-to-speech translation (S2ST) is generally evaluated with text-based metrics.
We propose a text-free evaluation metric for end-to-end S2ST, named BLASER, to avoid the dependency on ASR systems.
arXiv Detail & Related papers (2022-12-16T14:00:26Z) - Language Dependencies in Adversarial Attacks on Speech Recognition
Systems [0.0]
We compare the attackability of a German and an English ASR system.
We investigate if one of the language models is more susceptible to manipulations than the other.
arXiv Detail & Related papers (2022-02-01T13:27:40Z) - Quantifying Bias in Automatic Speech Recognition [28.301997555189462]
This paper quantifies the bias of a Dutch SotA ASR system against gender, age, regional accents and non-native accents.
Based on our findings, we suggest bias mitigation strategies for ASR development.
arXiv Detail & Related papers (2021-03-28T12:52:03Z) - Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first specifically tailored measure for Information Retrieval, capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z) - Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications.
We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
arXiv Detail & Related papers (2020-05-02T04:34:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.