The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese
- URL: http://arxiv.org/abs/2402.07513v1
- Date: Mon, 12 Feb 2024 09:35:13 GMT
- Title: The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese
- Authors: Ajinkya Kulkarni, Anna Tokareva, Rameez Qureshi, Miguel Couceiro
- Abstract summary: This study is dedicated to a comprehensive exploration of the Whisper and MMS systems.
Our investigation encompasses various categories, including gender, age, skin tone color, and geo-location.
We empirically show that oversampling techniques alleviate such stereotypical biases.
- Score: 5.308321515594125
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In the field of spoken language understanding, systems like Whisper and
Multilingual Massive Speech (MMS) have shown state-of-the-art performances.
This study is dedicated to a comprehensive exploration of the Whisper and MMS
systems, with a focus on assessing biases in automatic speech recognition (ASR)
inherent to casual conversation speech specific to the Portuguese language. Our
investigation encompasses various categories, including gender, age, skin tone
color, and geo-location. Alongside traditional ASR evaluation metrics such as
Word Error Rate (WER), we have incorporated p-value statistical significance
for gender bias analysis. Furthermore, we extensively examine the impact of
data distribution and empirically show that oversampling techniques alleviate
such stereotypical biases. This research represents a pioneering effort in
quantifying biases in the Portuguese language context through the application
of MMS and Whisper, contributing to a better understanding of ASR systems'
performance in multilingual settings.
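The abstract names three methodological pieces: WER as the evaluation metric, p-value statistical significance for gender bias analysis, and oversampling as a mitigation. A minimal stdlib-only sketch of how each might look (illustrative, not the authors' code; the sample transcripts and group labels are invented):

```python
import random

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    d = list(range(len(hyp) + 1))  # single-row DP table
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(prev + (r != h),  # substitution
                                   d[j] + 1,          # deletion
                                   d[j - 1] + 1)      # insertion
    return d[len(hyp)] / max(len(ref), 1)

def permutation_p_value(wers_a, wers_b, trials=10_000, seed=0):
    """Two-sided permutation test on the difference of group mean WERs."""
    rng = random.Random(seed)
    diff = lambda a, b: abs(sum(a) / len(a) - sum(b) / len(b))
    observed = diff(wers_a, wers_b)
    pooled = list(wers_a) + list(wers_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        if diff(pooled[:len(wers_a)], pooled[len(wers_a):]) >= observed:
            hits += 1
    return hits / trials

def oversample(samples, group_of, seed=0):
    """Random oversampling: duplicate minority-group samples until every
    group matches the largest group's size."""
    rng = random.Random(seed)
    groups = {}
    for s in samples:
        groups.setdefault(group_of(s), []).append(s)
    target = max(len(g) for g in groups.values())
    out = []
    for members in groups.values():
        out += members + rng.choices(members, k=target - len(members))
    return out

# One substituted word out of four:
print(wer("o gato preto dorme", "o gato prato dorme"))  # 0.25
```

A small permutation p-value between per-speaker WER lists of two demographic groups would then serve the same role as the paper's significance testing, without assuming any particular parametric distribution.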
Related papers
- Advocating Character Error Rate for Multilingual ASR Evaluation [1.2597747768235845]
We document the limitations of the word error rate (WER) as an evaluation metric and advocate for the character error rate (CER) as the primary metric.
We show that CER avoids many of the challenges WER faces and exhibits greater consistency across writing systems.
Our findings suggest that CER should be prioritized, or at least supplemented, in multilingual ASR evaluations to account for the varying linguistic characteristics of different languages.
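The WER-vs-CER contrast can be made concrete with one generic edit distance applied at word and at character level (a sketch with made-up Portuguese strings): a single inflectional error counts as a whole wrong word under WER but only one wrong character under CER.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance over any sequence (list of words or a string)."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(prev + (r != h), d[j] + 1, d[j - 1] + 1)
    return d[len(hyp)]

def wer(ref, hyp):
    words = ref.split()
    return edit_distance(words, hyp.split()) / max(len(words), 1)

def cer(ref, hyp):
    return edit_distance(ref, hyp) / max(len(ref), 1)

ref, hyp = "reconhecimento de fala", "reconhecimentos de fala"
print(round(wer(ref, hyp), 3))  # 0.333 -- one of three words is wrong
print(round(cer(ref, hyp), 3))  # 0.045 -- one of 22 characters is wrong
```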
arXiv Detail & Related papers (2024-10-09T19:57:07Z)
- The Lou Dataset -- Exploring the Impact of Gender-Fair Language in German Text Classification [57.06913662622832]
Gender-fair language fosters inclusion by addressing all genders or using neutral forms.
Gender-fair language substantially impacts predictions by flipping labels, reducing certainty, and altering attention patterns.
While we offer initial insights on the effect on German text classification, the findings likely apply to other languages.
arXiv Detail & Related papers (2024-09-26T15:08:17Z) - Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs).
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z) - Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models [38.64792118903994]
We evaluate gender bias in SILLMs across four semantic-related tasks.
Our analysis reveals that bias levels are language-dependent and vary with different evaluation methods.
arXiv Detail & Related papers (2024-07-09T15:35:43Z) - An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios [76.11409260727459]
This paper explores the language adaptation capability of ZMM-TTS, a recent SSL-based multilingual TTS system.
We demonstrate that the similarity in phonetics between the pre-training and target languages, as well as the language category, affects the target language's adaptation performance.
arXiv Detail & Related papers (2024-06-13T08:16:52Z) - Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work lays the foundation for furthering the field of dialectal NLP by documenting evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z) - A Deep Dive into the Disparity of Word Error Rates Across Thousands of
NPTEL MOOC Videos [4.809236881780707]
We describe the curation of a massive speech dataset of 8740 hours consisting of ~9.8K technical lectures in the English language along with their transcripts delivered by instructors representing various parts of Indian demography.
We use the curated dataset to measure the existing disparity in YouTube Automatic Captions and OpenAI Whisper model performance across the diverse demographic traits of speakers in India.
arXiv Detail & Related papers (2023-07-20T05:03:00Z) - Language Dependencies in Adversarial Attacks on Speech Recognition
Systems [0.0]
We compare the attackability of a German and an English ASR system.
We investigate if one of the language models is more susceptible to manipulations than the other.
arXiv Detail & Related papers (2022-02-01T13:27:40Z) - Quantifying Bias in Automatic Speech Recognition [28.301997555189462]
This paper quantifies the bias of a Dutch SotA ASR system against gender, age, regional accents and non-native accents.
Based on our findings, we suggest bias mitigation strategies for ASR development.
arXiv Detail & Related papers (2021-03-28T12:52:03Z) - Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by
Ranking Algorithms [68.85295025020942]
We propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a search engine to support gender stereotypes.
GSR is the first specifically tailored measure for Information Retrieval, capable of quantifying representational harms.
arXiv Detail & Related papers (2020-09-02T20:45:04Z) - Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications.
We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
arXiv Detail & Related papers (2020-05-02T04:34:37Z)
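One common way to quantify gender bias in word embeddings, of the general kind the last paper studies, is to compare a target word's cosine similarity to gendered anchor words. A generic illustration with invented toy 3-d vectors, not the paper's specific measures:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def gender_association(word_vec, male_vecs, female_vecs):
    """Mean cosine similarity to male anchors minus mean similarity to
    female anchors; positive values lean male, negative lean female."""
    mean = lambda vecs: sum(cosine(word_vec, v) for v in vecs) / len(vecs)
    return mean(male_vecs) - mean(female_vecs)

# Toy anchor and target vectors (invented for illustration)
he, him = (1.0, 0.1, 0.0), (0.9, 0.2, 0.1)
she, her = (0.1, 1.0, 0.0), (0.2, 0.9, 0.1)
engineer = (0.8, 0.3, 0.2)
print(gender_association(engineer, [he, him], [she, her]))  # positive: leans male
```

With aligned multilingual embeddings, the same association score could be computed per language to compare how strongly a given concept is gendered across languages.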
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.