Who's Asking? Investigating Bias Through the Lens of Disability Framed Queries in LLMs
- URL: http://arxiv.org/abs/2508.15831v2
- Date: Wed, 22 Oct 2025 02:49:01 GMT
- Title: Who's Asking? Investigating Bias Through the Lens of Disability Framed Queries in LLMs
- Authors: Vishnu Hari, Kalpana Panda, Srikant Panda, Amit Agarwal, Hitesh Laxmichand Patel,
- Abstract summary: Large Language Models (LLMs) routinely infer users' demographic traits from phrasing alone. The role of disability cues in shaping these inferences remains largely uncharted. We present the first systematic audit of disability-conditioned demographic bias across eight state-of-the-art instruction-tuned LLMs.
- Score: 2.722784054643991
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) routinely infer users' demographic traits from phrasing alone, which can result in biased responses even when no explicit demographic information is provided. The role of disability cues in shaping these inferences remains largely uncharted. Thus, we present the first systematic audit of disability-conditioned demographic bias across eight state-of-the-art instruction-tuned LLMs ranging from 3B to 72B parameters. Using a balanced template corpus that pairs nine disability categories with six real-world business domains, we prompt each model to predict five demographic attributes (gender, socioeconomic status, education, cultural background, and locality) under both neutral and disability-aware conditions. Across a varied set of prompts, models deliver a definitive demographic guess in up to 97% of cases, exposing a strong tendency to make arbitrary inferences with no clear justification. Disability context heavily shifts predicted attribute distributions, and domain context can further amplify these deviations. We observe that larger models are simultaneously more sensitive to disability cues and more prone to biased reasoning, indicating that scale alone does not mitigate stereotype amplification. Our findings reveal persistent intersections between ableism and other demographic stereotypes, pinpointing critical blind spots in current alignment strategies. We release our evaluation framework and results to encourage disability-inclusive benchmarking and recommend integrating abstention calibration and counterfactual fine-tuning to curb unwarranted demographic inference. Code and data will be released on acceptance.
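To make the audit design concrete, below is a minimal sketch of how neutral vs. disability-aware prompt pairs over the disability x domain x attribute grid could be constructed. The category names, domain names, and template wording are illustrative assumptions, not the paper's released corpus.

```python
# Minimal sketch of the prompt-pair construction described in the abstract.
# All category/domain names and template text are illustrative placeholders.
from itertools import product

DISABILITY_CATEGORIES = ["visual impairment", "hearing impairment", "mobility impairment"]  # paper uses 9
BUSINESS_DOMAINS = ["healthcare", "finance", "education"]                                    # paper uses 6
ATTRIBUTES = ["gender", "socioeconomic status", "education", "cultural background", "locality"]

NEUTRAL_TEMPLATE = (
    "A customer writes to a {domain} service: \"{query}\"\n"
    "Predict the customer's {attribute}, or say 'cannot be determined'."
)
DISABILITY_TEMPLATE = (
    "A customer with a {disability} writes to a {domain} service: \"{query}\"\n"
    "Predict the customer's {attribute}, or say 'cannot be determined'."
)

def build_prompt_pairs(query: str):
    """Yield matched (neutral, disability-aware) prompts for every
    domain x disability x attribute combination."""
    for domain, disability, attribute in product(BUSINESS_DOMAINS, DISABILITY_CATEGORIES, ATTRIBUTES):
        neutral = NEUTRAL_TEMPLATE.format(domain=domain, query=query, attribute=attribute)
        aware = DISABILITY_TEMPLATE.format(domain=domain, disability=disability,
                                           query=query, attribute=attribute)
        yield {"domain": domain, "disability": disability, "attribute": attribute,
               "neutral": neutral, "disability_aware": aware}

if __name__ == "__main__":
    pairs = list(build_prompt_pairs("I need help rescheduling my appointment."))
    print(len(pairs), "prompt pairs")   # 3 * 3 * 5 = 45 with the toy lists above
    print(pairs[0]["neutral"])
    print(pairs[0]["disability_aware"])
```

Comparing a model's attribute predictions (or abstentions) across each matched pair is what surfaces the disability-conditioned shifts reported above.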
Related papers
- Interpretable Debiasing of Vision-Language Models for Social Fairness [55.85977929985967]
We introduce an interpretable, model-agnostic bias mitigation framework, DeBiasLens, that localizes social attribute neurons in Vision-Language models. We train SAEs on facial image or caption datasets without corresponding social attribute labels to uncover neurons highly responsive to specific demographics. Our research lays the groundwork for future auditing tools, prioritizing social fairness in emerging real-world AI systems.
arXiv Detail & Related papers (2026-02-27T13:37:11Z) - Demographic Probing of Large Language Models Lacks Construct Validity [16.29607362682272]
We study how large language models adapt their behavior to demographic attributes. This approach typically uses a single demographic cue in isolation as a signal for group membership. We find that cues intended to represent the same demographic group induce only partially overlapping changes in model behavior.
arXiv Detail & Related papers (2026-01-26T13:41:35Z) - Auditing Disability Representation in Vision-Language Models [0.6987503477818553]
We study disability-aware descriptions for person-centric images. We introduce a benchmark based on paired Neutral Prompts (NP) and Disability-Contextualised Prompts (DP). We evaluate 15 state-of-the-art open- and closed-source vision-language models under a zero-shot setting across 9 disability categories.
arXiv Detail & Related papers (2026-01-24T07:25:43Z) - AccessEval: Benchmarking Disability Bias in Large Language Models [3.160274015679566]
Large Language Models (LLMs) are increasingly deployed across diverse domains but often exhibit disparities in how they handle real-life queries. We introduce AccessEval (Accessibility Evaluation), a benchmark evaluating 21 closed- and open-source LLMs across 6 real-world domains and 9 disability types. Our analysis reveals that responses to disability-aware queries tend to have a more negative tone, increased stereotyping, and higher factual error compared to neutral queries.
arXiv Detail & Related papers (2025-09-22T17:49:03Z) - Investigating Intersectional Bias in Large Language Models using Confidence Disparities in Coreference Resolution [5.061421107401101]
Large language models (LLMs) have achieved impressive performance, leading to their widespread adoption as decision-support tools in resource-constrained contexts like hiring and admissions. There is, however, scientific consensus that AI systems can reflect and exacerbate societal biases, raising concerns about identity-based harm when used in critical social contexts. In this work, we extend single-axis fairness evaluations to examine intersectional bias, recognizing that when multiple axes of discrimination intersect, they create distinct patterns of disadvantage.
arXiv Detail & Related papers (2025-08-09T22:24:40Z) - Assessing the Reliability of LLMs Annotations in the Context of Demographic Bias and Model Explanation [5.907945985868999]
This study investigates the extent to which annotator demographic features influence labeling decisions compared to text content. Using a Generalized Linear Mixed Model, we quantify this influence, finding that demographic factors account for a minor fraction (about 8%) of the observed variance. We then assess the reliability of Generative AI (GenAI) models as annotators, specifically evaluating if guiding them with demographic personas improves alignment with human judgments.
arXiv Detail & Related papers (2025-07-17T14:00:13Z) - Fair Deepfake Detectors Can Generalize [51.21167546843708]
We show that controlling for confounders (data distribution and model capacity) enables improved generalization via fairness interventions. Motivated by this insight, we propose Demographic Attribute-insensitive Intervention Detection (DAID), a plug-and-play framework composed of: i) demographic-aware data rebalancing, which employs inverse-propensity weighting and subgroup-wise feature normalization to neutralize distributional biases; and ii) demographic-agnostic feature aggregation, which uses a novel alignment loss to suppress sensitive-attribute signals. DAID consistently achieves superior performance in both fairness and generalization compared to several state-of-the-art
arXiv Detail & Related papers (2025-07-03T14:10:02Z) - Robustly Improving LLM Fairness in Realistic Settings via Interpretability [0.16843915833103415]
Anti-bias prompts fail when realistic contextual details are introduced. We find that adding realistic context such as company names, culture descriptions from public careers pages, and selective hiring constraints induces significant racial and gender biases. Our internal bias mitigation identifies race- and gender-correlated directions and applies affine concept editing at inference time.
arXiv Detail & Related papers (2025-06-12T17:34:38Z) - Actions Speak Louder than Words: Agent Decisions Reveal Implicit Biases in Language Models [10.565316815513235]
Large language models (LLMs) may still exhibit implicit biases when simulating human behavior. We show that state-of-the-art LLMs exhibit significant sociodemographic disparities in nearly all simulations. When comparing our findings to real-world disparities reported in empirical studies, we find that the biases we uncovered are directionally aligned but markedly amplified.
arXiv Detail & Related papers (2025-01-29T05:21:31Z) - Who Does the Giant Number Pile Like Best: Analyzing Fairness in Hiring Contexts [5.111540255111445]
Race-based differences appear in approximately 10% of generated summaries, while gender-based differences occur in only 1%. Retrieval models demonstrate comparable sensitivity to non-demographic changes, suggesting that fairness issues may stem from general brittleness issues.
arXiv Detail & Related papers (2025-01-08T07:28:10Z) - The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [91.86718720024825]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias. Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning. We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z) - Identifying and Mitigating Social Bias Knowledge in Language Models [52.52955281662332]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases. FAST surpasses state-of-the-art baselines with superior debiasing performance. This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z) - Measuring Fairness Under Unawareness of Sensitive Attributes: A Quantification-Based Approach [131.20444904674494]
We tackle the problem of measuring group fairness under unawareness of sensitive attributes.
We show that quantification approaches are particularly suited to tackle the fairness-under-unawareness problem.
arXiv Detail & Related papers (2021-09-17T13:45:46Z) - Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and impostor sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z)