Related papers: UnWEIRDing LLM Entity Recommendations

URL: http://arxiv.org/abs/2511.18403v1
Date: Sun, 23 Nov 2025 11:14:32 GMT
Title: UnWEIRDing LLM Entity Recommendations
Authors: Aayush Kumar, Sanket Mhatre,
Abstract summary: We use the WEIRD framework to evaluate recommendations by various Large Language Models across a dataset of fine-grained entities. Our results indicate that while such prompting strategies do reduce such biases, this reduction is not consistent across different models.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large Language Models have been widely adopted by users for writing tasks such as sentence completions. While this can improve writing efficiency, prior research shows that LLM-generated suggestions may exhibit cultural biases which may be difficult for users to detect, especially in educational contexts for non-native English speakers. While such prior work has studied the biases in LLM moral value alignment, we aim to investigate cultural biases in LLM recommendations for real-world entities. To do so, we use the WEIRD (Western, Educated, Industrialized, Rich and Democratic) framework to evaluate recommendations by various LLMs across a dataset of fine-grained entities, and apply pluralistic prompt-based strategies to mitigate these biases. Our results indicate that while such prompting strategies do reduce such biases, this reduction is not consistent across different models, and recommendations for some types of entities are more biased than others.

Related papers

LLM-Specific Utility: A New Perspective for Retrieval-Augmented Generation [110.610512800947]
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating external knowledge. Existing studies often treat utility as a generic attribute, ignoring the fact that different LLMs may benefit differently from the same passage.
arXiv Detail & Related papers (2025-10-13T12:57:45Z)
Investigating and Mitigating Stereotype-aware Unfairness in LLM-based Recommendations [18.862841015556995]
Large Language Models (LLMs) have demonstrated unprecedented language understanding and reasoning capabilities. Recent studies have revealed that LLMs are likely to inherit stereotypes that are embedded ubiquitously in word embeddings. This study reveals a new variant of fairness between stereotype groups containing both users and items, to quantify discrimination against stereotypes in LLM-RS.
arXiv Detail & Related papers (2025-04-05T15:09:39Z)
Evaluating how LLM annotations represent diverse views on contentious topics [3.405231040967506]
We show that generative large language models (LLMs) tend to be biased in the same directions on the same demographic categories within the same datasets. We conclude with a discussion of the implications for researchers and practitioners using LLMs for automated data annotation tasks.
arXiv Detail & Related papers (2025-03-29T22:53:15Z)
Bayesian Teaching Enables Probabilistic Reasoning in Large Language Models [54.38054999271322]
We show that large language models (LLMs) don't update their beliefs as expected from the Bayesian framework. We teach the LLMs to reason in a Bayesian manner by training them to mimic the predictions of the normative Bayesian model. More generally, our results indicate that LLMs can effectively learn reasoning skills from examples and generalize those skills to new domains.
arXiv Detail & Related papers (2025-03-21T20:13:04Z)
The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs [21.97227334180969]
"LLM-as-an-annotator" and "LLM-as-a-judge" paradigms employ Large Language Models (LLMs) as annotators, judges, and evaluators in tasks traditionally performed by humans. Despite their role in shaping study results and insights, there is no standard or rigorous procedure to determine whether LLMs can replace human annotators. We propose a novel statistical procedure, the Alternative Annotator Test (alt-test), that requires only a modest subset of annotated examples to justify using LLM annotations.
arXiv Detail & Related papers (2025-01-19T07:09:11Z)
A Comprehensive Survey of Bias in LLMs: Current Landscape and Future Directions [0.0]
Large Language Models (LLMs) have revolutionized various applications in natural language processing (NLP) by providing unprecedented text generation, translation, and comprehension capabilities. Their widespread deployment has brought to light significant concerns regarding biases embedded within these models. This paper presents a comprehensive survey of biases in LLMs, aiming to provide an extensive review of the types, sources, impacts, and mitigation strategies related to these biases.
arXiv Detail & Related papers (2024-09-24T19:50:38Z)
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective [66.34066553400108]
We conduct a rigorous evaluation of large language models' implicit bias towards certain demographics. Inspired by psychometric principles, we propose three attack approaches, i.e., Disguise, Deception, and Teaching. Our methods can elicit LLMs' inner bias more effectively than competitive baselines.
arXiv Detail & Related papers (2024-06-20T06:42:08Z)
Rethinking the Roles of Large Language Models in Chinese Grammatical Error Correction [62.409807640887834]
Chinese Grammatical Error Correction (CGEC) aims to correct all potential grammatical errors in the input sentences. LLMs' performance as correctors on CGEC remains unsatisfactory due to its challenging task focus. We rethink the roles of LLMs in the CGEC task so that they can be better utilized and explored in CGEC.
arXiv Detail & Related papers (2024-02-18T01:40:34Z)
A Theory of Response Sampling in LLMs: Part Descriptive and Part Prescriptive [53.08398658452411]
Large Language Models (LLMs) are increasingly utilized in autonomous decision-making. We show that this sampling behavior resembles that of human decision-making. We show that this deviation of a sample from the statistical norm towards a prescriptive component consistently appears in concepts across diverse real-world domains.
arXiv Detail & Related papers (2024-02-16T18:28:43Z)
Exploring the Impact of Large Language Models on Recommender Systems: An Extensive Review [2.780460221321639]
The paper underscores the significance of Large Language Models in reshaping recommender systems. LLMs exhibit exceptional proficiency in recommending items, showcasing their adeptness in comprehending intricacies of language. Despite their transformative potential, challenges persist, including sensitivity to input prompts, occasional misinterpretations, and unforeseen recommendations.
arXiv Detail & Related papers (2024-02-11T00:24:17Z)
LLM-Rec: Personalized Recommendation via Prompting Large Language Models [62.481065357472964]
Recent advances in large language models (LLMs) have showcased their remarkable ability to harness commonsense knowledge and reasoning. This study introduces a novel approach, coined LLM-Rec, which incorporates four distinct prompting strategies of text enrichment for improving personalized text-based recommendations.
arXiv Detail & Related papers (2023-07-24T18:47:38Z)
A Survey on Large Language Models for Recommendation [77.91673633328148]
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP). This survey presents a taxonomy that categorizes these models into two major paradigms, respectively Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec).
arXiv Detail & Related papers (2023-05-31T13:51:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.