QueerBench: Quantifying Discrimination in Language Models Toward Queer Identities
- URL: http://arxiv.org/abs/2406.12399v1
- Date: Tue, 18 Jun 2024 08:40:29 GMT
- Title: QueerBench: Quantifying Discrimination in Language Models Toward Queer Identities
- Authors: Mae Sosto, Alberto Barrón-Cedeño,
- Abstract summary: We assess the potential harm caused by sentence completions generated by English large language models concerning LGBTQIA+ individuals.
The analysis indicates that large language models tend to exhibit discriminatory behaviour more frequently towards individuals within the LGBTQIA+ community.
- Score: 4.82206141686275
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: With the increasing role of Natural Language Processing (NLP) in various applications, challenges concerning bias and stereotype perpetuation are accentuated, which often leads to hate speech and harm. Despite existing studies on sexism and misogyny, issues like homophobia and transphobia remain underexplored and often adopt binary perspectives, putting the safety of LGBTQIA+ individuals at high risk in online spaces. In this paper, we assess the potential harm caused by sentence completions generated by English large language models (LLMs) concerning LGBTQIA+ individuals. This is achieved using QueerBench, our new assessment framework, which employs a template-based approach and a Masked Language Modeling (MLM) task. The analysis indicates that large language models tend to exhibit discriminatory behaviour more frequently towards individuals within the LGBTQIA+ community, reaching a difference gap of 7.2% in the QueerBench score of harmfulness.
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words)
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z) - Harmful Speech Detection by Language Models Exhibits Gender-Queer Dialect Bias [8.168722337906148]
This study investigates the presence of bias in harmful speech classification of gender-queer dialect online.
We introduce a novel dataset, QueerLex, based on 109 curated templates exemplifying non-derogatory uses of LGBTQ+ slurs.
We systematically evaluate the performance of five off-the-shelf language models in assessing the harm of these texts.
arXiv Detail & Related papers (2024-05-23T18:07:28Z) - Laissez-Faire Harms: Algorithmic Biases in Generative Language Models [0.0]
We show that synthetically generated texts from five of the most pervasive LMs perpetuate harms of omission, subordination, and stereotyping for minoritized individuals.
We find widespread evidence of bias to an extent that such individuals are hundreds to thousands of times more likely to encounter LM-generated outputs.
Our findings highlight the urgent need to protect consumers from discriminatory harms caused by language models.
arXiv Detail & Related papers (2024-04-11T05:09:03Z) - Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do.
We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models.
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z) - Stereotypes and Smut: The (Mis)representation of Non-cisgender
Identities by Text-to-Image Models [6.92043136971035]
We investigate how multimodal models handle diverse gender identities.
We find certain non-cisgender identities are consistently (mis)represented as less human, more stereotyped and more sexualised.
These improvements could pave the way for a future where change is led by the affected community.
arXiv Detail & Related papers (2023-05-26T16:28:49Z) - "I'm fully who I am": Towards Centering Transgender and Non-Binary
Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life.
We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation.
We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z) - Detection of Homophobia & Transphobia in Dravidian Languages: Exploring
Deep Learning Methods [1.5687561161428403]
Homophobia and transphobia constitute offensive comments against LGBT+ community.
The paper attempts to explore applicability of different deep learning mod-els for classification of the social media comments in Malayalam and Tamil lan-guages.
arXiv Detail & Related papers (2023-04-03T12:15:27Z) - Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z) - How True is GPT-2? An Empirical Analysis of Intersectional Occupational
Biases [50.591267188664666]
Downstream applications are at risk of inheriting biases contained in natural language models.
We analyze the occupational biases of a popular generative language model, GPT-2.
For a given job, GPT-2 reflects the societal skew of gender and ethnicity in the US, and in some cases, pulls the distribution towards gender parity.
arXiv Detail & Related papers (2021-02-08T11:10:27Z) - A Framework for the Computational Linguistic Analysis of Dehumanization [52.735780962665814]
We analyze discussions of LGBTQ people in the New York Times from 1986 to 2015.
We find increasingly humanizing descriptions of LGBTQ people over time.
The ability to analyze dehumanizing language at a large scale has implications for automatically detecting and understanding media bias as well as abusive language online.
arXiv Detail & Related papers (2020-03-06T03:02:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.