Beats of Bias: Analyzing Lyrics with Topic Modeling and Gender Bias Measurements
- URL: http://arxiv.org/abs/2409.15949v1
- Date: Tue, 24 Sep 2024 10:24:53 GMT
- Title: Beats of Bias: Analyzing Lyrics with Topic Modeling and Gender Bias Measurements
- Authors: Danqing Chen, Adithi Satish, Rasul Khanbayov, Carolin M. Schuster, Georg Groh
- Abstract summary: This paper uses topic modeling and bias measurement techniques to analyze and determine gender bias in English song lyrics.
We observe large amounts of profanity and misogynistic lyrics on various topics, especially in the overall biggest cluster.
We find that words related to intelligence and strength tend to show a male bias across genres, as opposed to appearance and weakness words, which are more female-biased.
- Score: 1.5379084885764847
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper uses topic modeling and bias measurement techniques to analyze and determine gender bias in English song lyrics. We utilize BERTopic to cluster 537,553 English songs into distinct topics and chart their development over time. Our analysis shows the thematic shift in song lyrics over the years, from themes of romance to the increasing sexualization of women in songs. We observe large amounts of profanity and misogynistic lyrics on various topics, especially in the overall biggest cluster. Furthermore, to analyze gender bias across topics and genres, we employ the Single Category Word Embedding Association Test (SC-WEAT) to compute bias scores for the word embeddings trained on the most popular topics as well as for each genre. We find that words related to intelligence and strength tend to show a male bias across genres, as opposed to appearance and weakness words, which are more female-biased; however, a closer look also reveals differences in biases across topics.
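As a rough illustration of the pipeline described in the abstract, the clustering step can be sketched with the BERTopic library. The snippet below is a minimal sketch, not the authors' exact configuration: their preprocessing, embedding model, and hyperparameters are not given in the abstract, and `load_lyrics_and_years()` is a hypothetical placeholder for the 537,553-song corpus.

```python
# Minimal sketch: clustering English lyrics into topics with BERTopic and
# tracking topic prevalence over time. The data loader, min_topic_size, and
# other settings are illustrative assumptions, not the authors' published setup.
from bertopic import BERTopic

lyrics, years = load_lyrics_and_years()  # hypothetical: list of lyric strings, list of release years

topic_model = BERTopic(language="english", min_topic_size=50, verbose=True)
topics, probs = topic_model.fit_transform(lyrics)

print(topic_model.get_topic_info().head(10))   # largest clusters and their top terms

# Chart topic development over the years
topics_over_time = topic_model.topics_over_time(lyrics, years)
topic_model.visualize_topics_over_time(topics_over_time, top_n_topics=10)
```

The bias measurement can likewise be sketched as the standard SC-WEAT effect size: the difference between a target word's mean cosine similarity to male attribute words and to female attribute words, divided by the pooled standard deviation. The attribute word lists and the `train_word2vec_on_topic` helper below are illustrative assumptions, not the authors' exact lists or embedding-training setup.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def sc_weat_effect_size(target_vec, male_vecs, female_vecs):
    """Single-Category WEAT: association of one target word vector with a
    male attribute set versus a female attribute set. Positive values
    indicate a male bias, negative values a female bias."""
    sims_m = np.array([cosine(target_vec, m) for m in male_vecs])
    sims_f = np.array([cosine(target_vec, f) for f in female_vecs])
    pooled = np.concatenate([sims_m, sims_f])
    return (sims_m.mean() - sims_f.mean()) / pooled.std(ddof=1)

# Usage sketch (embeddings, helper, and word lists are illustrative assumptions):
# emb = train_word2vec_on_topic(lyrics_for_topic)  # hypothetical helper
# male_vecs   = [emb[w] for w in ["he", "him", "man", "boy"] if w in emb]
# female_vecs = [emb[w] for w in ["she", "her", "woman", "girl"] if w in emb]
# print(sc_weat_effect_size(emb["smart"], male_vecs, female_vecs))
```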
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents AmbGIMT, a benchmark for Gender-Inclusive Machine Translation with Ambiguous attitude words.
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- "Fifty Shades of Bias": Normative Ratings of Gender Bias in GPT Generated English Text [11.085070600065801]
Language serves as a powerful tool for the manifestation of societal belief systems.
Gender bias is one of the most pervasive biases in our society.
We create the first dataset of GPT-generated English text with normative ratings of gender bias.
arXiv Detail & Related papers (2023-10-26T14:34:06Z)
- Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Are Fairy Tales Fair? Analyzing Gender Bias in Temporal Narrative Event Chains of Children's Fairy Tales [46.65377334112404]
Social biases and stereotypes are embedded in our culture in part through their presence in our stories.
We propose a computational pipeline that automatically extracts a story's temporal narrative verb-based event chain for each of its characters.
We also present a verb-based event annotation scheme that can facilitate bias analysis by including categories such as those that align with traditional stereotypes.
arXiv Detail & Related papers (2023-05-26T05:29:37Z)
- Large scale analysis of gender bias and sexism in song lyrics [3.437656066916039]
We identify sexist lyrics at a larger scale than previous studies, which relied on small samples of manually annotated popular songs.
We find sexist content to increase across time, especially from male artists and for popular songs appearing in Billboard charts.
This is the first large scale analysis of this type, giving insights into language usage in such an influential part of popular culture.
arXiv Detail & Related papers (2022-08-03T13:18:42Z)
- Gender Bias in Word Embeddings: A Comprehensive Analysis of Frequency, Syntax, and Semantics [3.4048739113355215]
We provide a comprehensive analysis of group-based biases in widely-used static English word embeddings trained on internet corpora.
Using the Single-Category Word Embedding Association Test, we demonstrate the widespread prevalence of gender biases.
We find that, of the 1,000 most frequent words in the vocabulary, 77% are more associated with men than women.
arXiv Detail & Related papers (2022-06-07T15:35:10Z)
- Quantifying Gender Bias in Consumer Culture [0.0]
Song lyrics may help drive shifts in societal stereotypes towards women, and these lyrical shifts are driven primarily by male artists.
Natural language processing of a quarter of a million songs over 50 years quantifies misogyny.
Women are less likely to be associated with desirable traits (e.g., competence); while this bias has decreased over time, it persists.
arXiv Detail & Related papers (2022-01-10T05:44:54Z)
- Gender bias in magazines oriented to men and women: a computational approach [58.720142291102135]
We compare the content of a women-oriented magazine with that of a men-oriented one, both produced by the same editorial group over a decade.
With Topic Modelling techniques we identify the main themes discussed in the magazines and quantify how much the presence of these topics differs between magazines over time.
Our results show that the frequency of appearance of the topics Family, Business, and Women as sex objects presents an initial bias that tends to disappear over time.
arXiv Detail & Related papers (2020-11-24T14:02:49Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.