Stereotype or Personalization? User Identity Biases Chatbot Recommendations
- URL: http://arxiv.org/abs/2410.05613v1
- Date: Tue, 8 Oct 2024 01:51:55 GMT
- Title: Stereotype or Personalization? User Identity Biases Chatbot Recommendations
- Authors: Anjali Kantharuban, Jeremiah Milbauer, Emma Strubell, Graham Neubig
- Abstract summary: We show that large language models (LLMs) produce recommendations that reflect both what the user wants and who the user is.
We find that models generate racially stereotypical recommendations regardless of whether the user revealed their identity intentionally.
Our experiments show that even though a user's revealed identity significantly influences model recommendations, model responses obfuscate this fact in response to user queries.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We demonstrate that when people use large language models (LLMs) to generate recommendations, the LLMs produce responses that reflect both what the user wants and who the user is. While personalized recommendations are often desired by users, it can be difficult in practice to distinguish cases of bias from cases of personalization: we find that models generate racially stereotypical recommendations regardless of whether the user revealed their identity intentionally through explicit indications or unintentionally through implicit cues. We argue that chatbots ought to transparently indicate when recommendations are influenced by a user's revealed identity characteristics, but observe that they currently fail to do so. Our experiments show that even though a user's revealed identity significantly influences model recommendations (p < 0.001), model responses obfuscate this fact in response to user queries. This bias and lack of transparency occurs consistently across multiple popular consumer LLMs (gpt-4o-mini, gpt-4-turbo, llama-3-70B, and claude-3.5) and for four American racial groups.
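To make the measurement concrete, here is a minimal sketch of this kind of bias probe, assuming a hypothetical `query_model` wrapper and an illustrative request, and showing only the explicit-cue condition; it is not the authors' protocol or code. It asks for the same recommendation while varying the stated identity, then tests whether the recommended items are independent of identity (the abstract reports p < 0.001 for this effect).

```python
# Minimal sketch of an identity-bias probe (illustrative only, not the
# paper's protocol). Vary the user's explicitly stated identity while
# holding the request fixed, then test whether the distribution of
# recommended items is independent of identity.
from collections import Counter

from scipy.stats import chi2_contingency

GROUPS = ["White", "Black", "Asian", "Hispanic"]  # four American racial groups
REQUEST = "Can you recommend some music for a road trip?"  # example query

def query_model(prompt: str) -> str:
    """Hypothetical wrapper: send `prompt` to a chat model and return the
    top recommended item parsed from its response."""
    raise NotImplementedError("plug in an LLM client here")

def collect(group: str, n: int = 50) -> Counter:
    # Explicit cue: the user states their identity in the request itself.
    return Counter(query_model(f"I am a {group} user. {REQUEST}") for _ in range(n))

def identity_effect(counts: dict[str, Counter]) -> float:
    # Groups-by-items contingency table; a small p-value means the
    # recommendations depend on the revealed identity.
    items = sorted({item for c in counts.values() for item in c})
    table = [[counts[g].get(item, 0) for item in items] for g in GROUPS]
    _, p_value, _, _ = chi2_contingency(table)
    return p_value
```

Testing implicit cues would swap the explicit sentence for identity-correlated signals (e.g., dialect or names), which is exactly where distinguishing personalization from stereotyping becomes hard.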
Related papers
- ComPO: Community Preferences for Language Model Personalization
ComPO is a method to personalize preference optimization in language models.
We collect and release ComPRed, a question answering dataset with community-level preferences from Reddit.
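As a rough sketch of what community-conditioned preference optimization could look like (my reading; the actual ComPO objective and code may differ), one can prefix each example with its source community and apply a standard DPO-style loss:

```python
# Community-conditioned preference optimization, sketched with a standard
# DPO loss (illustrative; not the ComPO implementation).
import torch.nn.functional as F

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss over per-sequence log-probabilities (1-D tensors)."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -F.logsigmoid(beta * margin).mean()

def with_community(prompt: str, community: str) -> str:
    # Hypothetical format: tag each example with the community (e.g. a
    # subreddit from ComPRed) whose members expressed the preference.
    return f"[community: {community}] {prompt}"
```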
arXiv Detail & Related papers (2024-10-21T14:02:40Z)
- Personalized Language Modeling from Personalized Human Feedback
Reinforcement Learning from Human Feedback (RLHF) is commonly used to fine-tune large language models to better align with human preferences.
Standard RLHF, however, aligns models to aggregate human preferences rather than to any individual user; in this work, we aim to address this by developing methods for building personalized language models.
arXiv Detail & Related papers (2024-02-06T04:18:58Z)
- Separating and Learning Latent Confounders to Enhancing User Preferences Modeling
We propose a novel framework, Separating and Learning Latent Confounders For Recommendation (SLFR).
SLFR disentangles user preferences from unmeasured confounders and recovers a representation of those confounders to identify counterfactual feedback.
Experiments on five real-world datasets validate the advantages of our method.
arXiv Detail & Related papers (2023-11-02T08:42:50Z)
- Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems
We study "persona biases", which we define to be the sensitivity of dialogue models' harmful behaviors contingent upon the personas they adopt.
We categorize persona biases into biases in harmful expression and harmful agreement, and establish a comprehensive evaluation framework to measure persona biases in five aspects: Offensiveness, Toxic Continuation, Regard, Stereotype Agreement, and Toxic Agreement.
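A skeletal harness for that kind of evaluation might look like the sketch below; `chat` and the per-aspect scorers are hypothetical placeholders for a dialogue-model client and off-the-shelf classifiers (e.g. toxicity or regard models).

```python
# Skeleton of a persona-bias evaluation (illustrative; scorers are stubs).
from statistics import mean
from typing import Callable

ASPECTS: dict[str, Callable[[str], float]] = {
    "offensiveness": lambda resp: 0.0,         # plug in an offensiveness classifier
    "toxic_continuation": lambda resp: 0.0,    # toxicity of the model's continuation
    "regard": lambda resp: 0.0,                # regard toward demographic groups
    "stereotype_agreement": lambda resp: 0.0,  # agreement with stereotyped statements
    "toxic_agreement": lambda resp: 0.0,       # agreement with toxic statements
}

def chat(persona: str, probe: str) -> str:
    """Hypothetical call to a dialogue model prompted to adopt `persona`."""
    raise NotImplementedError

def persona_scores(personas: list[str], probes: list[str]) -> dict:
    # Persona bias shows up as harm scores shifting with the adopted persona.
    return {p: {aspect: mean(score(chat(p, q)) for q in probes)
                for aspect, score in ASPECTS.items()}
            for p in personas}
```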
arXiv Detail & Related papers (2023-10-08T21:03:18Z)
- Aligning Language Models to User Opinions
We find that a user's opinions and their demographics and ideology do not reliably predict one another.
We use this insight to align LLMs by modeling both user opinions as well as user demographics and ideology.
In addition to the typical approach of prompting LLMs with demographics and ideology, we discover that utilizing the most relevant past opinions from individual users enables the model to predict user opinions more accurately.
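The retrieval step could be sketched as below; the encoder model, prompt fields, and format are illustrative choices of mine, not the paper's.

```python
# Retrieve a user's most relevant past opinions to condition the LLM
# (illustrative sketch; encoder and prompt format are assumptions).
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def top_k_opinions(question: str, past_opinions: list[str], k: int = 3) -> list[str]:
    q = encoder.encode([question])[0]
    ops = encoder.encode(past_opinions)
    sims = ops @ q / (np.linalg.norm(ops, axis=1) * np.linalg.norm(q))
    return [past_opinions[i] for i in np.argsort(-sims)[:k]]

def build_prompt(question, demographics, ideology, past_opinions):
    relevant = "\n".join(top_k_opinions(question, past_opinions))
    return (f"Demographics: {demographics}\nIdeology: {ideology}\n"
            f"Relevant past opinions:\n{relevant}\n\nQuestion: {question}")
```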
arXiv Detail & Related papers (2023-05-24T09:11:11Z)
- Latent User Intent Modeling for Sequential Recommenders
Sequential recommender models learn to predict the next items a user is likely to interact with based on their interaction history on the platform.
Most sequential recommenders, however, lack a higher-level understanding of user intents, which often drive user behavior online.
Intent modeling is thus critical for understanding users and optimizing long-term user experience.
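As a toy illustration of conditioning next-item prediction on an inferred latent intent (a simplification of mine, not the paper's architecture):

```python
# Toy sequential recommender with a discrete latent-intent layer (sketch).
import torch
import torch.nn as nn

class IntentAwareRecommender(nn.Module):
    def __init__(self, n_items: int, dim: int = 64, n_intents: int = 8):
        super().__init__()
        self.item_emb = nn.Embedding(n_items, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.intent_head = nn.Linear(dim, n_intents)  # p(intent | history)
        self.intent_emb = nn.Embedding(n_intents, dim)
        self.out = nn.Linear(2 * dim, n_items)        # next-item logits

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (batch, seq_len) item ids
        _, h = self.encoder(self.item_emb(history))
        h = h.squeeze(0)                               # (batch, dim)
        intent_probs = self.intent_head(h).softmax(-1) # soft intent assignment
        intent = intent_probs @ self.intent_emb.weight # expected intent vector
        return self.out(torch.cat([h, intent], dim=-1))
```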
arXiv Detail & Related papers (2022-11-17T19:00:24Z)
- Recommendation with User Active Disclosing Willingness
We study a novel recommendation paradigm in which users are allowed to indicate their "willingness" to disclose different behaviors.
We conduct extensive experiments to demonstrate the effectiveness of our model at balancing recommendation quality against users' disclosing willingness.
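In the simplest reading, the paradigm gates which behavior signals ever reach the ranking model; a bare-bones sketch (structure and names are mine, not the paper's model):

```python
# Gate behavior signals by per-user disclosure willingness (sketch).
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    behaviors: dict[str, list[str]]  # e.g. {"clicks": [...], "purchases": [...]}
    willingness: dict[str, bool] = field(default_factory=dict)

def disclosed_signals(user: UserProfile) -> dict[str, list[str]]:
    # Undisclosed behavior types never reach the recommender, trading
    # some accuracy for user control over their data.
    return {kind: items for kind, items in user.behaviors.items()
            if user.willingness.get(kind, False)}
```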
arXiv Detail & Related papers (2022-10-25T04:43:40Z)
- Causal Disentanglement with Network Information for Debiased Recommendations
Recent research proposes to debias by modeling a recommender system from a causal perspective.
The critical challenge in this setting is accounting for the hidden confounders.
We propose to leverage network information (i.e., user-social and user-item networks) to better approximate hidden confounders.
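One loose way to picture this is a network-derived proxy feature, e.g. averaging the embeddings of a user's social neighbors as a stand-in for the hidden confounder (my illustration, not the paper's estimator):

```python
# Approximate a user's hidden confounder from the social network (sketch).
import numpy as np

def confounder_proxy(user: int, social_adj: dict[int, list[int]],
                     user_emb: np.ndarray) -> np.ndarray:
    """Average the embeddings of `user`'s social neighbors; feed the result
    to the recommender alongside user/item features."""
    neighbors = social_adj.get(user, [])
    if not neighbors:
        return np.zeros(user_emb.shape[1])
    return user_emb[neighbors].mean(axis=0)
```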
arXiv Detail & Related papers (2022-04-14T20:55:11Z)
- PURS: Personalized Unexpected Recommender System for Improving User Satisfaction
We describe a novel Personalized Unexpected Recommender System (PURS) model that incorporates unexpectedness into the recommendation process.
Extensive offline experiments on three real-world datasets illustrate that the proposed PURS model significantly outperforms the state-of-the-art baseline approaches.
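An illustrative way to operationalize unexpectedness (PURS itself learns this end to end; the clustering and blending below are my simplification): score a candidate by its distance from clusters of the user's past items and blend that with base relevance.

```python
# Unexpectedness as distance from the user's interest clusters (sketch).
import numpy as np
from sklearn.cluster import KMeans

def unexpectedness(candidate: np.ndarray, history: np.ndarray, k: int = 3) -> float:
    # history: (n_items, dim) embeddings of past interactions, n_items >= k.
    centers = KMeans(n_clusters=k, n_init=10).fit(history).cluster_centers_
    return float(np.min(np.linalg.norm(centers - candidate, axis=1)))

def hybrid_score(relevance: float, unexp: float, weight: float = 0.3) -> float:
    # Blend base relevance with personalized unexpectedness.
    return relevance + weight * unexp
```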
arXiv Detail & Related papers (2021-06-05T01:33:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.