Related papers: ABLEIST: Intersectional Disability Bias in LLM-Generated Hiring Scenarios

ABLEIST: Intersectional Disability Bias in LLM-Generated Hiring Scenarios

URL: http://arxiv.org/abs/2510.10998v1
Date: Mon, 13 Oct 2025 04:18:23 GMT
Title: ABLEIST: Intersectional Disability Bias in LLM-Generated Hiring Scenarios
Authors: Mahika Phutane, Hayoung Jung, Matthew Kim, Tanushree Mitra, Aditya Vashistha,
Abstract summary: Large language models (LLMs) are under scrutiny for perpetuating identity-based discrimination in high-stakes domains such as hiring, particularly against people with disabilities (PwD)<n>We conduct a comprehensive audit of six LLMs across 2,820 hiring scenarios spanning diverse disability, gender, nationality, and caste profiles.<n>To capture subtle intersectional harms and biases, we introduce ABLEIST, a set of five ableism-specific and three intersectional harm metrics grounded in disability studies literature.
Score: 17.536416969288798
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) are increasingly under scrutiny for perpetuating identity-based discrimination in high-stakes domains such as hiring, particularly against people with disabilities (PwD). However, existing research remains largely Western-centric, overlooking how intersecting forms of marginalization--such as gender and caste--shape experiences of PwD in the Global South. We conduct a comprehensive audit of six LLMs across 2,820 hiring scenarios spanning diverse disability, gender, nationality, and caste profiles. To capture subtle intersectional harms and biases, we introduce ABLEIST (Ableism, Inspiration, Superhumanization, and Tokenism), a set of five ableism-specific and three intersectional harm metrics grounded in disability studies literature. Our results reveal significant increases in ABLEIST harms towards disabled candidates--harms that many state-of-the-art models failed to detect. These harms were further amplified by sharp increases in intersectional harms (e.g., Tokenism) for gender and caste-marginalized disabled candidates, highlighting critical blind spots in current safety tools and the need for intersectional safety evaluations of frontier models in high-stakes domains like hiring.

Related papers

Large Language Models' Complicit Responses to Illicit Instructions across Socio-Legal Contexts [54.15982476754607]
Large language models (LLMs) are now deployed at unprecedented scale, assisting millions of users in daily tasks.<n>This study defines complicit facilitation as the provision of guidance or support that enables illicit user instructions.<n>Using real-world legal cases and established legal frameworks, we construct an evaluation benchmark spanning 269 illicit scenarios and 50 illicit intents.
arXiv Detail & Related papers (2025-11-25T16:01:31Z)
Who's Asking? Investigating Bias Through the Lens of Disability Framed Queries in LLMs [2.722784054643991]
Large Language Models (LLMs) routinely infer users demographic traits from phrasing alone.<n>Disability cues in shaping these inferences remains largely uncharted.<n>We present the first systematic audit of disability-conditioned demographic bias across eight state-of-the-art instruction-tuned LLMs.
arXiv Detail & Related papers (2025-08-18T21:03:09Z)
Investigating Intersectional Bias in Large Language Models using Confidence Disparities in Coreference Resolution [5.061421107401101]
Large language models (LLMs) have achieved impressive performance, leading to their widespread adoption as decision-support tools in resource-constrained contexts like hiring and admissions.<n>There is, however, scientific consensus that AI systems can reflect and exacerbate societal biases, raising concerns about identity-based harm when used in critical social contexts.<n>In this work, we extend single-axis fairness evaluations to examine intersectional bias, recognizing that when multiple axes of discrimination intersect, they create distinct patterns of disadvantage.
arXiv Detail & Related papers (2025-08-09T22:24:40Z)
DECASTE: Unveiling Caste Stereotypes in Large Language Models through Multi-Dimensional Bias Analysis [20.36241144630387]
Large language models (LLMs) have revolutionized natural language processing (NLP)<n>LLMs have been shown to reflect and perpetuate harmful societal biases, including those based on ethnicity, gender, and religion.<n>We propose DECASTE, a novel framework designed to detect and assess both implicit and explicit caste biases in LLMs.
arXiv Detail & Related papers (2025-05-20T23:19:13Z)
Mitigating Subgroup Disparities in Multi-Label Speech Emotion Recognition: A Pseudo-Labeling and Unsupervised Learning Approach [53.824673312331626]
Implicit Demography Inference (IDI) module uses k-means clustering to mitigate bias in Speech Emotion Recognition (SER)<n>Experiments show that pseudo-labeling IDI reduces subgroup disparities, improving fairness metrics by over 28%.<n>Unsupervised IDI yields more than a 4.6% improvement in fairness metrics with a drop of less than 3.6% in SER performance.
arXiv Detail & Related papers (2025-05-20T14:50:44Z)
The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [91.86718720024825]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias.<n>Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning.<n>We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z)
GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.<n>GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z)
"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation. We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z)
Unpacking the Interdependent Systems of Discrimination: Ableist Bias in NLP Systems through an Intersectional Lens [20.35460711907179]
We report on various analyses based on word predictions of a large-scale BERT language model. Statistically significant results demonstrate that people with disabilities can be disadvantaged. Findings also explore overlapping forms of discrimination related to interconnected gender and race identities.
arXiv Detail & Related papers (2021-10-01T16:40:58Z)
Towards Robust Fine-grained Recognition by Maximal Separation of Discriminative Features [72.72840552588134]
We identify the proximity of the latent representations of different classes in fine-grained recognition networks as a key factor to the success of adversarial attacks. We introduce an attention-based regularization mechanism that maximally separates the discriminative latent features of different classes.
arXiv Detail & Related papers (2020-06-10T18:34:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.