Related papers: QueerGen: How LLMs Reflect Societal Norms on Gender and Sexuality in Sentence Completion Tasks

QueerGen: How LLMs Reflect Societal Norms on Gender and Sexuality in Sentence Completion Tasks

URL: http://arxiv.org/abs/2601.20731v1
Date: Wed, 28 Jan 2026 16:06:04 GMT
Title: QueerGen: How LLMs Reflect Societal Norms on Gender and Sexuality in Sentence Completion Tasks
Authors: Mae Sosto, Delfina Sol Martinez Pandiani, Laura Hollink,
Abstract summary: We investigate whether explicit information about a subject's gender or sexuality influences responses across three subject categories.<n>Our findings show that Masked Language Models (MLMs) produce the least favorable sentiment, higher toxicity, and more negative regard for queer-marked subjects.
Score: 0.38887448816036313
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This paper examines how Large Language Models (LLMs) reproduce societal norms, particularly heterocisnormativity, and how these norms translate into measurable biases in their text generations. We investigate whether explicit information about a subject's gender or sexuality influences LLM responses across three subject categories: queer-marked, non-queer-marked, and the normalized "unmarked" category. Representational imbalances are operationalized as measurable differences in English sentence completions across four dimensions: sentiment, regard, toxicity, and prediction diversity. Our findings show that Masked Language Models (MLMs) produce the least favorable sentiment, higher toxicity, and more negative regard for queer-marked subjects. Autoregressive Language Models (ARLMs) partially mitigate these patterns, while closed-access ARLMs tend to produce more harmful outputs for unmarked subjects. Results suggest that LLMs reproduce normative social assumptions, though the form and degree of bias depend strongly on specific model characteristics, which may redistribute, but not eliminate, representational harms.

Related papers

More Women, Same Stereotypes: Unpacking the Gender Bias Paradox in Large Language Models [3.154053412440065]
Large Language Models (LLMs) have revolutionized natural language processing, yet concerns persist regarding their tendency to reflect or amplify social biases.<n>This study introduces a novel evaluation framework to uncover gender biases in LLMs.<n>A systematic analysis of ten prominent LLMs shows a consistent pattern of overrepresenting female characters across occupations.
arXiv Detail & Related papers (2025-03-20T07:15:45Z)
GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.<n>GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z)
White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs [58.27353205269664]
Social biases can manifest in language agency in Large Language Model (LLM)-generated content.<n>We introduce the Language Agency Bias Evaluation benchmark, which comprehensively evaluates biases in LLMs.<n>Using LABE, we unveil language agency social biases in 3 recent LLMs: ChatGPT, Llama3, and Mistral.
arXiv Detail & Related papers (2024-04-16T12:27:54Z)
Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting [87.30837365008931]
Large language models (LLMs) equipped with Chain-of-Thought (CoT) prompting are able to make accurate incremental predictions even on unscalable tasks. This study examines the impact of LLMs' step-by-step predictions on gender bias in unscalable tasks.
arXiv Detail & Related papers (2024-01-28T06:50:10Z)
Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses. We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z)
"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters [97.11173801187816]
Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content. This paper critically examines gender biases in LLM-generated reference letters.
arXiv Detail & Related papers (2023-10-13T16:12:57Z)
In-Contextual Gender Bias Suppression for Large Language Models [47.246504807946884]
Large Language Models (LLMs) have been reported to encode worrying-levels of gender biases. We propose bias suppression that prevents biased generations of LLMs by providing preambles constructed from manually designed templates. We find that bias suppression has acceptable adverse effect on downstream task performance with HellaSwag and COPA.
arXiv Detail & Related papers (2023-09-13T18:39:08Z)
Queer People are People First: Deconstructing Sexual Identity Stereotypes in Large Language Models [3.974379576408554]
Large Language Models (LLMs) are trained primarily on minimally processed web text. LLMs can inadvertently perpetuate stereotypes towards marginalized groups, like the LGBTQIA+ community.
arXiv Detail & Related papers (2023-06-30T19:39:01Z)
Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models [33.157279170602784]
We present Marked Personas, a prompt-based method to measure stereotypes in large language models (LLMs) We find that portrayals generated by GPT-3.5 and GPT-4 contain higher rates of racial stereotypes than human-written portrayals using the same prompts. An intersectional lens reveals tropes that dominate portrayals of marginalized groups, such as tropicalism and the hypersexualization of minoritized women.
arXiv Detail & Related papers (2023-05-29T16:29:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.