The Effects of Demographic Instructions on LLM Personas
- URL: http://arxiv.org/abs/2505.11795v1
- Date: Sat, 17 May 2025 02:49:15 GMT
- Title: The Effects of Demographic Instructions on LLM Personas
- Authors: Angel Felipe Magnossão de Paula, J. Shane Culpepper, Alistair Moffat, Sachin Pathiyan Cherumanal, Falk Scholer, Johanne Trippas
- Abstract summary: Social media platforms must filter sexist content in compliance with governmental regulations. Current machine learning approaches can reliably detect sexism based on standardized definitions. We adopt a perspectivist approach, retaining diverse annotations rather than enforcing gold-standard labels.
- Score: 14.283869154967835
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Social media platforms must filter sexist content in compliance with governmental regulations. Current machine learning approaches can reliably detect sexism based on standardized definitions, but often neglect the subjective nature of sexist language and fail to consider individual users' perspectives. To address this gap, we adopt a perspectivist approach, retaining diverse annotations rather than enforcing gold-standard labels or their aggregations, allowing models to account for personal or group-specific views of sexism. Using demographic data from Twitter, we employ large language models (LLMs) to personalize the identification of sexism.
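The personalization step the abstract describes can be sketched as building a demographic instruction that conditions the LLM before it classifies a post. The sketch below is a minimal illustration under stated assumptions: the profile field names and the prompt wording are hypothetical, not the paper's actual template.

```python
# Sketch of demographic-instruction ("persona") prompting for perspectivist
# sexism classification. Field names and wording are illustrative assumptions.

def build_persona_prompt(demographics: dict, tweet: str) -> str:
    """Prepend a demographic persona instruction to a classification query."""
    persona = ", ".join(f"{key}: {value}"
                        for key, value in sorted(demographics.items()))
    return (
        f"You are an annotator with this profile: {persona}. "
        "From this perspective, decide whether the tweet below is sexist. "
        "Answer YES or NO.\n"
        f"Tweet: {tweet}"
    )

# One prompt per annotator profile; the same tweet can receive different
# labels under different personas, which is the perspectivist point.
prompt = build_persona_prompt(
    {"gender": "woman", "age": "25-34", "country": "US"},
    "Example tweet text.",
)
print(prompt)
```

Under this framing, the model is queried once per demographic profile rather than once per tweet, so disagreement between personas is preserved instead of being aggregated away.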
Related papers
- MuSeD: A Multimodal Spanish Dataset for Sexism Detection in Social Media Videos [12.555579923843641]
We introduce MuSeD, a new Multimodal Spanish dataset for Sexism Detection consisting of approx. 11 hours of videos extracted from TikTok and BitChute. We find that visual information plays a key role in labeling sexist content for both humans and models.
arXiv Detail & Related papers (2025-04-15T13:16:46Z)
- The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [91.86718720024825]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias. Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning. We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z)
- Adaptable Moral Stances of Large Language Models on Sexist Content: Implications for Society and Gender Discourse [17.084339235658085]
We show that all eight models produce comprehensible and contextually relevant text.
Based on our observations, we caution against the potential misuse of LLMs to justify sexist language.
arXiv Detail & Related papers (2024-09-30T19:27:04Z)
- GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z)
- Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning [0.7874708385247352]
This study aims to improve sexism identification in bilingual contexts (English and Spanish) by leveraging natural language processing models. We fine-tuned the XLM-RoBERTa model and separately used GPT-3.5 with few-shot learning prompts to classify sexist content.
arXiv Detail & Related papers (2024-06-11T14:15:33Z)
- On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z)
- Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses.
We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z)
- "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters [97.11173801187816]
Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content.
This paper critically examines gender biases in LLM-generated reference letters.
arXiv Detail & Related papers (2023-10-13T16:12:57Z)
- Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models [33.157279170602784]
We present Marked Personas, a prompt-based method to measure stereotypes in large language models (LLMs).
We find that portrayals generated by GPT-3.5 and GPT-4 contain higher rates of racial stereotypes than human-written portrayals using the same prompts.
An intersectional lens reveals tropes that dominate portrayals of marginalized groups, such as tropicalism and the hypersexualization of minoritized women.
arXiv Detail & Related papers (2023-05-29T16:29:22Z)
- "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life.
We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation.
We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
- "Call me sexist, but...": Revisiting Sexism Detection Using Psychological Scales and Adversarial Samples [2.029924828197095]
We outline the different dimensions of sexism by grounding them in their implementation in psychological scales.
From the scales, we derive a codebook for sexism in social media, which we use to annotate existing and novel datasets.
Results indicate that current machine learning models pick up on a very narrow set of linguistic markers of sexism and do not generalize well to out-of-domain examples.
arXiv Detail & Related papers (2020-04-27T13:07:46Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.