Language Models Change Facts Based on the Way You Talk
- URL: http://arxiv.org/abs/2507.14238v1
- Date: Thu, 17 Jul 2025 13:21:17 GMT
- Title: Language Models Change Facts Based on the Way You Talk
- Authors: Matthew Kearney, Reuben Binns, Yarin Gal,
- Abstract summary: We find that large language models (LLMs) are extremely sensitive to markers of identity in user queries.<n>These biases mean that the use of off-the-shelf LLMs for these applications may cause harmful differences in medical care, foster wage gaps, and create different political factual realities for people of different identities.
- Score: 38.44076602344941
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Large language models (LLMs) are increasingly being used in user-facing applications, from providing medical consultations to job interview advice. Recent research suggests that these models are becoming increasingly proficient at inferring identity information about the author of a piece of text from linguistic patterns as subtle as the choice of a few words. However, little is known about how LLMs use this information in their decision-making in real-world applications. We perform the first comprehensive analysis of how identity markers present in a user's writing bias LLM responses across five different high-stakes LLM applications in the domains of medicine, law, politics, government benefits, and job salaries. We find that LLMs are extremely sensitive to markers of identity in user queries and that race, gender, and age consistently influence LLM responses in these applications. For instance, when providing medical advice, we find that models apply different standards of care to individuals of different ethnicities for the same symptoms; we find that LLMs are more likely to alter answers to align with a conservative (liberal) political worldview when asked factual questions by older (younger) individuals; and that LLMs recommend lower salaries for non-White job applicants and higher salaries for women compared to men. Taken together, these biases mean that the use of off-the-shelf LLMs for these applications may cause harmful differences in medical care, foster wage gaps, and create different political factual realities for people of different identities. Beyond providing an analysis, we also provide new tools for evaluating how subtle encoding of identity in users' language choices impacts model decisions. Given the serious implications of these findings, we recommend that similar thorough assessments of LLM use in user-facing applications are conducted before future deployment.
Related papers
- Fairness Evaluation of Large Language Models in Academic Library Reference Services [6.335631290002225]
We evaluate whether large language models (LLMs) differentiate responses across user identities by prompting six state-of-the-art LLMs to assist patrons differing in sex, race/ethnicity, and institutional role.<n>We found no evidence of differentiation by race or ethnicity, and only minor evidence of stereotypical bias against women in one model.<n>These findings suggest that current LLMs show a promising degree of readiness to support equitable and contextually appropriate communication in academic library reference services.
arXiv Detail & Related papers (2025-07-06T03:28:24Z) - Evaluating how LLM annotations represent diverse views on contentious topics [3.405231040967506]
We show that generative large language models (LLMs) tend to be biased in the same directions on the same demographic categories within the same datasets.<n>We conclude with a discussion of the implications for researchers and practitioners using LLMs for automated data annotation tasks.
arXiv Detail & Related papers (2025-03-29T22:53:15Z) - Poor Alignment and Steerability of Large Language Models: Evidence from College Admission Essays [19.405531377930977]
We investigate the use of large language models (LLM) in high-stakes admissions contexts.<n>We find that both types of LLM-generated essays are linguistically distinct from human-authored essays.<n>The demographically prompted and unprompted synthetic texts were also more similar to each other than to the human text.
arXiv Detail & Related papers (2025-03-25T20:54:50Z) - Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language.<n>This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z) - Are Large Language Models Chameleons? An Attempt to Simulate Social Surveys [1.5727456947901746]
We conducted millions of simulations in which large language models (LLMs) were asked to answer subjective questions.
A comparison of different LLM responses with the European Social Survey (ESS) data suggests that the effect of prompts on bias and variability is fundamental.
arXiv Detail & Related papers (2024-05-29T17:54:22Z) - Whose Side Are You On? Investigating the Political Stance of Large Language Models [56.883423489203786]
We investigate the political orientation of Large Language Models (LLMs) across a spectrum of eight polarizing topics.
Our investigation delves into the political alignment of LLMs across a spectrum of eight polarizing topics, spanning from abortion to LGBTQ issues.
The findings suggest that users should be mindful when crafting queries, and exercise caution in selecting neutral prompt language.
arXiv Detail & Related papers (2024-03-15T04:02:24Z) - Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models.
We show that models give substantively different answers when not forced.
We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z) - "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in
LLM-Generated Reference Letters [97.11173801187816]
Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content.
This paper critically examines gender biases in LLM-generated reference letters.
arXiv Detail & Related papers (2023-10-13T16:12:57Z) - The Unequal Opportunities of Large Language Models: Revealing
Demographic Bias through Job Recommendations [5.898806397015801]
We propose a simple method for analyzing and comparing demographic bias in Large Language Models (LLMs)
We demonstrate the effectiveness of our method by measuring intersectional biases within ChatGPT and LLaMA.
We identify distinct biases in both models toward various demographic identities, such as both models consistently suggesting low-paying jobs for Mexican workers.
arXiv Detail & Related papers (2023-08-03T21:12:54Z) - Check Your Facts and Try Again: Improving Large Language Models with
External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes a LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.