Acoustic-based Gender Differentiation in Speech-aware Language Models
- URL: http://arxiv.org/abs/2509.21125v1
- Date: Thu, 25 Sep 2025 13:15:01 GMT
- Title: Acoustic-based Gender Differentiation in Speech-aware Language Models
- Authors: Junhyuk Choi, Jihwan Seol, Nayeon Kim, Chanhee Cho, EunBin Cho, Bugeun Kim
- Abstract summary: Speech-aware Language Models (SpeechLMs) have fundamentally transformed human-AI interaction by enabling voice-based communication. This paper proposes a new dataset that enables systematic analysis of this phenomenon, containing 9,208 speech samples across three categories: Gender-Independent, Gender-Stereotypical, and Gender-Dependent.
- Score: 3.9845890275228277
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Speech-aware Language Models (SpeechLMs) have fundamentally transformed human-AI interaction by enabling voice-based communication, yet they may exhibit acoustic-based gender differentiation, where identical questions lead to different responses depending on the speaker's gender. This paper proposes a new dataset that enables systematic analysis of this phenomenon, containing 9,208 speech samples across three categories: Gender-Independent, Gender-Stereotypical, and Gender-Dependent. We further evaluated the LLaMA-Omni series and discovered a paradoxical pattern: while overall responses seem identical regardless of gender, they are far from unbiased. Specifically, in Gender-Stereotypical questions, all models consistently exhibited male-oriented responses; meanwhile, in Gender-Dependent questions, where gender differentiation would be contextually appropriate, models instead produced responses independent of gender. We also confirm that this pattern results neither from neutral response options nor from the perceived gender of a voice. When we allow neutral responses, models tend to respond neutrally even in Gender-Dependent questions. The paradoxical pattern persists even when we apply gender-neutralization methods to the speech. By comparing SpeechLMs with their corresponding backbone LLMs, we confirmed that these paradoxical patterns primarily stem from the Whisper speech encoders, which generate male-oriented acoustic tokens. These findings reveal that current SpeechLMs do not successfully remove gender bias; rather, they prioritize general fairness principles over contextual appropriateness, highlighting the need for more sophisticated techniques to utilize gender information properly in speech technology.
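The core measurement the abstract describes is comparing a model's responses to the same question spoken by male and female voices, broken down by question category. A minimal sketch of that protocol, assuming a hypothetical `respond` callable standing in for a SpeechLM (here stubbed with canned answers, not the actual LLaMA-Omni models or the paper's dataset):

```python
from collections import defaultdict

def differentiation_rate(samples, respond):
    """Fraction of question pairs whose responses differ by speaker
    gender, grouped by question category.

    samples: iterable of (category, question_id, male_clip, female_clip)
    respond: callable mapping an audio clip to a text response
    """
    diff = defaultdict(int)
    total = defaultdict(int)
    for category, qid, male_clip, female_clip in samples:
        total[category] += 1
        if respond(male_clip) != respond(female_clip):
            diff[category] += 1
    return {c: diff[c] / total[c] for c in total}

# Toy stand-in for a SpeechLM: one canned answer per clip identifier.
canned = {
    "m1": "yes",  "f1": "yes",   # identical answers -> no differentiation
    "m2": "blue", "f2": "pink",  # differing answers -> differentiation
}
samples = [
    ("Gender-Independent", 1, "m1", "f1"),
    ("Gender-Dependent",   2, "m2", "f2"),
]
rates = differentiation_rate(samples, canned.get)
print(rates)  # {'Gender-Independent': 0.0, 'Gender-Dependent': 1.0}
```

Under this framing, the paper's paradoxical finding corresponds to a low rate on Gender-Dependent questions (where differentiation would be appropriate) combined with systematically male-oriented answers on Gender-Stereotypical ones.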
Related papers
- Voice, Bias, and Coreference: An Interpretability Study of Gender in Speech Translation [25.126933196101703]
We investigate the mechanisms ST models use to assign gender to speaker-referring terms across three language pairs. We find that models do not simply replicate term-specific gender associations from training data, but learn broader patterns of masculine prevalence. Using contrastive feature attribution on spectrograms, we reveal that the model with higher gender accuracy relies on a previously unknown mechanism.
arXiv Detail & Related papers (2025-11-26T15:48:04Z) - Who Gets the Mic? Investigating Gender Bias in the Speaker Assignment of a Speech-LLM [4.12691471378072]
This study proposes a methodology leveraging speaker assignment as an analytic tool for bias investigation. We evaluate Bark, a Text-to-Speech (TTS) model, analyzing its default speaker assignments for textual prompts. If Bark's speaker selection systematically aligns with gendered associations, it may reveal patterns in its training data or model design.
arXiv Detail & Related papers (2025-08-19T08:10:55Z) - Gender Bias in Instruction-Guided Speech Synthesis Models [55.2480439325792]
This study investigates the potential gender bias in how models interpret occupation-related prompts. We explore whether these models exhibit tendencies to amplify gender stereotypes when interpreting such prompts. Our experimental results reveal the models' tendency to exhibit gender bias for certain occupations.
arXiv Detail & Related papers (2025-02-08T17:38:24Z) - Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words)
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z) - Speech After Gender: A Trans-Feminine Perspective on Next Steps for Speech Science and Technology [1.7126708168238125]
Trans-feminine gender-affirming voice teachers have unique perspectives on voice that confound current understandings of speaker identity.
We present the Versatile Voice dataset (VVD), a collection of three speakers modifying their voices along gendered axes.
arXiv Detail & Related papers (2024-07-09T21:19:49Z) - Disclosure and Mitigation of Gender Bias in LLMs [64.79319733514266]
Large Language Models (LLMs) can generate biased responses.
We propose an indirect probing framework based on conditional generation.
We explore three distinct strategies to disclose explicit and implicit gender bias in LLMs.
arXiv Detail & Related papers (2024-02-17T04:48:55Z) - Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses.
We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z) - Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection [23.993869026482415]
We propose the first inference-time solution to control speaker-related gender inflections in speech translation.
Our solution partially replaces the (biased) internal language model (LM) implicitly learned by the ST decoder with gender-specific external LMs.
arXiv Detail & Related papers (2023-10-24T11:55:16Z) - How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation [21.125217707038356]
When translating from notional gender languages into grammatical gender languages, the generated translation requires explicit gender assignments for various words, including those referring to the speaker.
To avoid such biased and non-inclusive behaviors, the gender assignment of speaker-related expressions should be guided by externally provided metadata about the speaker's gender. This paper aims to achieve the same results by integrating the speaker's gender metadata into a single "multi-gender" neural ST model, which is easier to maintain.
arXiv Detail & Related papers (2023-10-23T17:21:32Z) - VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z) - Generating Multilingual Gender-Ambiguous Text-to-Speech Voices [4.005334718121374]
This work addresses the task of generating novel gender-ambiguous TTS voices in a multi-speaker, multilingual setting.
To our knowledge, this is the first systematic and validated approach that can reliably generate a variety of gender-ambiguous voices.
arXiv Detail & Related papers (2022-11-01T10:40:24Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.