Elucidate Gender Fairness in Singing Voice Transcription
- URL: http://arxiv.org/abs/2308.02898v1
- Date: Sat, 5 Aug 2023 15:15:01 GMT
- Title: Elucidate Gender Fairness in Singing Voice Transcription
- Authors: Xiangming Gu and Wei Zeng and Ye Wang
- Abstract summary: We investigate whether gender-based characteristics lead to a performance disparity in singing voice transcription (SVT).
We find that different pitch distributions, rather than gender data imbalance, contribute to this disparity.
To address this issue, we propose using an attribute predictor to predict gender labels and adversarially training the SVT system to enforce the gender-invariance of acoustic representations.
- Score: 5.434559527051845
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is widely known that males and females typically possess different sound
characteristics when singing, such as timbre and pitch, but it has never been
explored whether these gender-based characteristics lead to a performance
disparity in singing voice transcription (SVT), whose target includes pitch.
Such a disparity could cause fairness issues and severely affect the user
experience of downstream SVT applications. Motivated by this, we first
demonstrate the female superiority of SVT systems, which is observed across
different models and datasets. We find that different pitch distributions,
rather than gender data imbalance, contribute to this disparity. To address
this issue, we propose using an attribute predictor to predict gender labels
and adversarially training the SVT system to enforce the gender-invariance of
acoustic representations. Leveraging the prior knowledge that pitch
distributions may contribute to the gender bias, we propose conditionally
aligning acoustic representations between demographic groups by feeding note
events to the attribute predictor. Empirical experiments on multiple benchmark
SVT datasets show that our method significantly reduces gender bias (up to more
than 50%) with negligible degradation of overall SVT performance, on both
in-domain and out-of-domain singing data, thus offering a better
fairness-utility trade-off.
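The adversarial idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the loss, so the min-max form below (the SVT model minimizes its task loss minus a weighted gender-prediction loss, while the predictor minimizes that prediction loss) is an assumed, commonly used formulation. The names `attribute_predictor`, `adversarial_objectives`, the linear predictor, and the weight `lam` are all hypothetical. Conditioning is shown by concatenating a note-event vector to the acoustic representation before predicting gender.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy between predicted probability p and binary label y."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def attribute_predictor(representation, note_event, weights, bias):
    """Hypothetical linear gender predictor (assumption, not from the paper).

    The note event is concatenated to the acoustic representation, so the
    prediction is conditioned on what is being sung.
    """
    features = list(representation) + list(note_event)
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score)


def adversarial_objectives(svt_loss, p_gender, gender_label, lam=0.5):
    """Return (objective for the SVT model, objective for the predictor).

    Assumed min-max form: the predictor minimizes its classification loss,
    and the SVT model is rewarded when that loss is high, which pushes its
    representations toward carrying less gender information.
    """
    adv_loss = binary_cross_entropy(p_gender, gender_label)
    svt_objective = svt_loss - lam * adv_loss
    predictor_objective = adv_loss
    return svt_objective, predictor_objective


# Toy usage with made-up numbers (illustrative only).
representation = [0.2, -0.1, 0.4]   # hypothetical acoustic features
note_event = [1.0, 0.0]             # hypothetical note-event encoding
weights = [0.0] * 5                 # untrained predictor
p = attribute_predictor(representation, note_event, weights, bias=0.0)
svt_obj, pred_obj = adversarial_objectives(svt_loss=1.0, p_gender=p, gender_label=1)
```

With an untrained predictor the gender probability is 0.5, so the predictor cannot tell the groups apart; in this assumed formulation that is the state the SVT model is pushed toward during training.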
Related papers
- Everyone deserves their voice to be heard: Analyzing Predictive Gender Bias in ASR Models Applied to Dutch Speech Data [13.91630413828167]
This study focuses on identifying the performance disparities of Whisper models on Dutch speech data.
We analyzed the word error rate, character error rate and a BERT-based semantic similarity across gender groups.
arXiv Detail & Related papers (2024-11-14T13:29:09Z)
- Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs) [82.57490175399693]
We study gender bias in 22 popular image-to-text vision-language assistants (VLAs).
Our results show that VLAs replicate human biases likely present in the data, such as real-world occupational imbalances.
To eliminate the gender bias in these models, we find that finetuning-based debiasing methods achieve the best tradeoff between debiasing and retaining performance on downstream tasks.
arXiv Detail & Related papers (2024-10-25T05:59:44Z)
- GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z)
- Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models [23.65626682262062]
We quantify bias amplification in pretraining and after fine-tuning on three families of vision-and-language models.
Overall, we find that bias amplification in pretraining and after fine-tuning are independent.
arXiv Detail & Related papers (2023-10-26T16:19:19Z)
- No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation [20.731375136671605]
We propose a data augmentation technique that manipulates the fundamental frequency (f0) and formants.
This technique reduces the data unbalance among genders by simulating voices of the under-represented female speakers.
Experiments on spontaneous English speech show that our technique yields a relative WER improvement up to 9.87% for utterances by female speakers.
arXiv Detail & Related papers (2023-10-10T12:55:22Z)
- The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems [25.468558523679363]
We train and compare gender-specific wav2vec 2.0 models against models containing different degrees of gender balance in pre-training data.
We observe lower overall performance using gender-specific pre-training before fine-tuning an end-to-end ASR system.
arXiv Detail & Related papers (2022-04-04T11:28:19Z)
- Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting [88.83117372793737]
Forgetting information in the original training data may damage the model's downstream performance by a large margin.
We propose GEnder Equality Prompt (GEEP) to improve gender fairness of pre-trained models with less forgetting.
arXiv Detail & Related papers (2021-10-11T15:52:16Z)
- Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and imposter sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
- Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z)