To Generate or Discriminate? Methodological Considerations for Measuring Cultural Alignment in LLMs
- URL: http://arxiv.org/abs/2601.02858v1
- Date: Tue, 06 Jan 2026 09:42:03 GMT
- Title: To Generate or Discriminate? Methodological Considerations for Measuring Cultural Alignment in LLMs
- Authors: Saurabh Kumar Pandey, Sougata Saha, Monojit Choudhury
- Abstract summary: Socio-demographic prompting (SDP) often shows Large Language Model (LLM) responses to be stereotypical and biased. To address this, we use inverse socio-demographic prompting (ISDP), where we prompt LLMs to discriminate and predict the demographic proxy from actual and simulated user behavior. Results show that models perform better with actual behaviors than with simulated ones, contrary to what SDP suggests.
- Score: 19.492952437281005
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Socio-demographic prompting (SDP) - prompting Large Language Models (LLMs) using demographic proxies to generate culturally aligned outputs - often shows LLM responses to be stereotypical and biased. While effective in assessing LLMs' cultural competency, SDP is prone to confounding factors such as prompt sensitivity, decoding parameters, and the inherent difficulty of generation over discrimination tasks due to larger output spaces. These factors complicate interpretation, making it difficult to determine whether poor performance is due to bias or to the task design. To address this, we use inverse socio-demographic prompting (ISDP), where we prompt LLMs to discriminate and predict the demographic proxy from the actual and simulated behavior of different users. We use the Goodreads-CSI dataset (Saha et al., 2025), which captures the difficulty users from India, Mexico, and the USA face in understanding English book reviews, and test four LLMs (Aya-23, Gemma-2, GPT-4o, and LLaMA-3.1) with ISDP. Results show that models perform better with actual behaviors than with simulated ones, contrary to what SDP suggests. However, performance with both behavior types diminishes and becomes nearly equal at the individual level, indicating limits to personalization.
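To make the SDP/ISDP contrast concrete, below is a minimal Python sketch of how the two prompt types could be constructed for the book-review difficulty setting described in the abstract. The templates, rating scale, and helper names are illustrative assumptions for exposition, not the authors' actual prompts or code.

```python
# Illustrative sketch of SDP vs. ISDP prompt construction.
# All templates and names here are assumptions, not the paper's artifacts.

SDP_TEMPLATE = (
    "You are a reader from {country}. Rate how difficult the following "
    "English book review is for you to understand, on a scale of 1 (easy) "
    "to 5 (hard).\n\nReview: {review}"
)

ISDP_TEMPLATE = (
    "A reader rated the difficulty of understanding the following English "
    "book review as {rating} out of 5.\n\nReview: {review}\n\n"
    "Which country is the reader most likely from: India, Mexico, or the "
    "USA? Answer with the country name only."
)


def build_sdp_prompt(review: str, country: str) -> str:
    """SDP: condition generation on a demographic proxy (a generation task)."""
    return SDP_TEMPLATE.format(country=country, review=review)


def build_isdp_prompt(review: str, rating: int) -> str:
    """ISDP: recover the demographic proxy from behavior (a discrimination task)."""
    return ISDP_TEMPLATE.format(rating=rating, review=review)


if __name__ == "__main__":
    review = "The prose meanders, but the denouement redeems it."
    print(build_sdp_prompt(review, country="India"))
    print(build_isdp_prompt(review, rating=4))
```

The key inversion is that SDP conditions generation on the demographic proxy, while ISDP asks the model to recover that proxy from observed behavior, shrinking the output space to a small label set and sidestepping several generation-side confounds.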
Related papers
- Visualizing token importance for black-box language models [48.747801442240565]
We consider the problem of auditing black-box large language models (LLMs) to ensure they behave reliably when deployed in production settings. We propose Distribution-Based Sensitivity Analysis (DBSA) to evaluate the sensitivity of the output of a language model for each input token.
arXiv Detail & Related papers (2025-12-12T14:01:43Z)
- Linear socio-demographic representations emerge in Large Language Models from indirect cues [0.0]
LLMs encode sociodemographic attributes of human conversational partners inferred from indirect cues such as names and occupations. We show that LLMs develop linear representations of user demographics within activation space, wherein stereotypically associated attributes are encoded along interpretable geometric directions. Our study further highlights that models that pass bias benchmark tests may still harbor and leverage implicit biases, with implications for fairness when applied at scale.
arXiv Detail & Related papers (2025-12-10T20:36:36Z)
- An Analysis of Large Language Models for Simulating User Responses in Surveys [8.614942349812429]
Using Large Language Models to simulate user opinions has received growing attention. LLMs are known to exhibit biases toward dominant viewpoints, raising concerns about their ability to represent users from diverse demographic and cultural backgrounds.
arXiv Detail & Related papers (2025-12-07T15:03:09Z)
- The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models [7.819021910077221]
We examine how different persona prompt strategies influence large language models (LLMs). We find that the choice of demographic priming and role adoption strategy significantly impacts their portrayal. Specifically, we find that prompting in an interview-style format and name-based priming can help reduce stereotyping.
arXiv Detail & Related papers (2025-07-21T21:23:29Z)
- Revisiting LLM Value Probing Strategies: Are They Robust and Expressive? [81.49470136653665]
We evaluate the robustness and expressiveness of value representations across three widely used probing strategies. We show that the demographic context has little effect on the free-text generation, and the models' values only weakly correlate with their preference for value-based actions.
arXiv Detail & Related papers (2025-07-17T18:56:41Z)
- Reading Between the Prompts: How Stereotypes Shape LLM's Implicit Personalization [13.034294029448338]
Generative Large Language Models (LLMs) infer a user's demographic information from subtle cues in the conversation. Our results highlight the need for greater transparency and control in how LLMs represent user identity.
arXiv Detail & Related papers (2025-05-22T09:48:51Z)
- Hate Personified: Investigating the role of LLMs in content moderation [64.26243779985393]
For subjective tasks such as hate detection, where people perceive hate differently, the ability of Large Language Models (LLMs) to represent diverse groups is unclear.
By including additional context in prompts, we analyze LLM's sensitivity to geographical priming, persona attributes, and numerical information to assess how well the needs of various groups are reflected.
arXiv Detail & Related papers (2024-10-03T16:43:17Z)
- Aligning Language Models with Demonstrated Feedback [58.834937450242975]
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors. We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
arXiv Detail & Related papers (2024-06-02T23:13:56Z)
- The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition [74.04775677110179]
In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLMs).
We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions.
Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain.
arXiv Detail & Related papers (2024-03-25T19:07:32Z)
- Large language models that replace human participants can harmfully misportray and flatten identity groups [36.36009232890876]
We show that there are two inherent limitations in the way current LLMs are trained that prevent this. We argue analytically for why LLMs are likely to both misportray and flatten the representations of demographic groups. We also discuss a third limitation about how identity prompts can essentialize identities.
arXiv Detail & Related papers (2024-02-02T21:21:06Z)
- Sociodemographic Prompting is Not Yet an Effective Approach for Simulating Subjective Judgments with LLMs [13.744746481528711]
Large Language Models (LLMs) are widely used to simulate human responses across diverse contexts. We evaluate nine popular LLMs on their ability to understand demographic differences in two subjective judgment tasks: politeness and offensiveness. We find that in zero-shot settings, most models' predictions for both tasks align more closely with labels from White participants than those from Asian or Black participants.
arXiv Detail & Related papers (2023-11-16T10:02:24Z)
- On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z)
- Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all.
We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires.
Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)