Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models
- URL: http://arxiv.org/abs/2408.07665v1
- Date: Wed, 14 Aug 2024 16:55:06 GMT
- Title: Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models
- Authors: Yi-Cheng Lin, Wei-Chih Chen, Hung-yi Lee
- Abstract summary: This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs).
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
- Score: 50.40276881893513
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Warning: This paper may contain texts with uncomfortable content. Large Language Models (LLMs) have achieved remarkable performance in various tasks, including those involving multimodal data like speech. However, these models often exhibit biases due to the nature of their training data. Recently, more Speech Large Language Models (SLLMs) have emerged, underscoring the urgent need to address these biases. This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in SLLMs. By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases. Our experiments reveal significant insights into their performance and bias levels. The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
Related papers
- Gender Bias in Instruction-Guided Speech Synthesis Models [55.2480439325792]
This study investigates the potential gender bias in how models interpret occupation-related prompts.
We explore whether these models exhibit tendencies to amplify gender stereotypes when interpreting such prompts.
Our experimental results reveal the model's tendency to exhibit gender bias for certain occupations.
arXiv Detail & Related papers (2025-02-08T17:38:24Z)
- How far can bias go? -- Tracing bias from pretraining data to alignment [54.51310112013655]
This study examines the correlation between gender-occupation bias in pre-training data and its manifestation in LLMs.
Our findings reveal that biases present in pre-training data are amplified in model outputs.
arXiv Detail & Related papers (2024-11-28T16:20:25Z)
- Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models [38.64792118903994]
We evaluate gender bias in SILLMs across four semantic-related tasks.
Our analysis reveals that bias levels are language-dependent and vary with different evaluation methods.
arXiv Detail & Related papers (2024-07-09T15:35:43Z)
- Pre-trained Speech Processing Models Contain Human-Like Biases that Propagate to Speech Emotion Recognition [4.4212441764241]
We present the Speech Embedding Association Test (SpEAT), a method for detecting bias in one type of model used for many speech tasks: pre-trained models.
Using the SpEAT, we test for six types of bias in 16 English speech models.
Our work provides evidence that, like text- and image-based models, pre-trained speech-based models frequently learn human-like biases.
arXiv Detail & Related papers (2023-10-29T02:27:56Z)
- Exposing Bias in Online Communities through Large-Scale Language Models [3.04585143845864]
This work uses the flaw of bias in language models to explore the biases of six different online communities.
The bias of the resulting models is evaluated by prompting the models with different demographics and comparing the sentiment and toxicity values of these generations.
This work not only affirms how easily bias is absorbed from training data but also presents a scalable method to identify and compare the bias of different datasets or communities.
arXiv Detail & Related papers (2023-06-04T08:09:26Z)
- Analyzing the Limits of Self-Supervision in Handling Bias in Language [52.26068057260399]
We evaluate how well language models capture the semantics of four tasks for bias: diagnosis, identification, extraction and rephrasing.
Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
arXiv Detail & Related papers (2021-12-16T05:36:08Z)
- Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
- ASR4REAL: An extended benchmark for speech models [19.348785785921446]
We introduce a set of benchmarks matching real-life conditions, aimed at spotting possible biases and weaknesses in models.
We found that even though recent models do not seem to exhibit a gender bias, they usually show substantial performance discrepancies by accent.
All tested models show a strong performance drop when tested on conversational speech.
arXiv Detail & Related papers (2021-10-16T14:34:25Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models [17.90351661475405]
This work extends text-based bias analysis methods to investigate multimodal language models.
We demonstrate that VL-BERT exhibits gender biases, often preferring to reinforce a stereotype over faithfully describing the visual scene.
arXiv Detail & Related papers (2021-04-18T00:02:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.