Once Upon a Team: Investigating Bias in LLM-Driven Software Team Composition and Task Allocation
- URL: http://arxiv.org/abs/2601.03857v1
- Date: Wed, 07 Jan 2026 12:13:22 GMT
- Title: Once Upon a Team: Investigating Bias in LLM-Driven Software Team Composition and Task Allocation
- Authors: Alessandra Parziale, Gianmario Voria, Valeria Pontillo, Amleto Di Salle, Patrizio Pelliccione, Gemma Catolino, Fabio Palomba
- Abstract summary: This study investigates whether LLMs exhibit bias in team composition and task assignment. Using three LLMs and 3,000 simulated decisions, we find systematic disparities. Our findings indicate that LLMs exacerbate demographic inequities in software engineering contexts.
- Score: 48.2168236140771
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LLMs are increasingly used to boost productivity and support software engineering tasks. However, when applied to socially sensitive decisions such as team composition and task allocation, they raise concerns of fairness. Prior studies have revealed that LLMs may reproduce stereotypes; however, these analyses remain exploratory and examine sensitive attributes in isolation. This study investigates whether LLMs exhibit bias in team composition and task assignment by analyzing the combined effects of candidates' country and pronouns. Using three LLMs and 3,000 simulated decisions, we find systematic disparities: demographic attributes significantly shaped both selection likelihood and task allocation, even when accounting for expertise-related factors. Task distributions further reflected stereotypes, with technical and leadership roles unevenly assigned across groups. Our findings indicate that LLMs exacerbate demographic inequities in software engineering contexts, underscoring the need for fairness-aware assessment.
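The audit design the abstract describes (counterfactual candidate profiles that vary country and pronouns while expertise is held fixed, repeated over thousands of simulated selection decisions) can be sketched as follows. This is a minimal illustration, not the paper's actual protocol: the profile fields, attribute values, and the `mock_llm_select` placeholder (which stands in for a real LLM call) are all assumptions for demonstration.

```python
from itertools import product
from collections import Counter
import random

# Illustrative attribute values; the paper's actual countries and
# pronoun sets are not reproduced here.
COUNTRIES = ["USA", "India", "Germany", "Nigeria"]
PRONOUNS = ["he/him", "she/her", "they/them"]

def make_profiles():
    # Cross the demographic attributes while holding expertise-related
    # fields constant, so any selection disparity can be attributed to
    # demographics rather than qualifications.
    return [
        {"country": c, "pronouns": p,
         "skills": ["Python", "CI/CD"], "years_experience": 5}
        for c, p in product(COUNTRIES, PRONOUNS)
    ]

def mock_llm_select(profiles, rng):
    # Placeholder for prompting an LLM to pick a candidate for a team.
    # A uniform random choice serves as the unbiased baseline; a real
    # audit would replace this with the model's answer.
    return rng.choice(profiles)

def audit(n_trials=3000, seed=0):
    rng = random.Random(seed)
    profiles = make_profiles()
    counts = Counter()
    for _ in range(n_trials):
        chosen = mock_llm_select(profiles, rng)
        counts[(chosen["country"], chosen["pronouns"])] += 1
    return counts

if __name__ == "__main__":
    for group, n in sorted(audit().items()):
        print(group, n)
```

Comparing per-group selection counts against the uniform baseline (here, 3000 / 12 = 250 per group) is the point of the exercise: a real LLM in place of the placeholder would reveal whether some country-pronoun combinations are systematically over- or under-selected.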
Related papers
- Evaluating LLM Behavior in Hiring: Implicit Weights, Fairness Across Groups, and Alignment with Human Preferences [0.8155575318208629]
We propose a framework to evaluate an LLM's decision logic in recruitment. We build synthetic datasets from real freelancer profiles and project descriptions from a major European online freelance marketplace. We identify which attributes the LLM prioritizes and analyze how these weights vary across project contexts and demographic subgroups.
arXiv Detail & Related papers (2026-01-16T15:38:03Z) - How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective [64.00022624183781]
Large language models (LLMs) can assess relevance and support information retrieval (IR) tasks. We investigate how different LLM modules contribute to relevance judgment through the lens of mechanistic interpretability.
arXiv Detail & Related papers (2025-04-10T16:14:55Z) - Evaluating how LLM annotations represent diverse views on contentious topics [3.405231040967506]
We show that generative large language models (LLMs) tend to be biased in the same directions on the same demographic categories within the same datasets. We conclude with a discussion of the implications for researchers and practitioners using LLMs for automated data annotation tasks.
arXiv Detail & Related papers (2025-03-29T22:53:15Z) - Evaluating Bias in LLMs for Job-Resume Matching: Gender, Race, and Education [8.235367170516769]
Large Language Models (LLMs) offer the potential to automate hiring by matching job descriptions with candidate resumes. However, biases inherent in these models may lead to unfair hiring practices, reinforcing societal prejudices and undermining workplace diversity. This study examines the performance and fairness of LLMs in job-resume matching tasks within the English language and U.S. context.
arXiv Detail & Related papers (2025-03-24T22:11:22Z) - The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead? [60.01746782465275]
Large Language Models (LLMs) have shown capabilities close to human performance in various analytical tasks.
This paper investigates the efficiency and accuracy of LLMs in specialized tasks through a structured user study focusing on Human-LLM partnership.
arXiv Detail & Related papers (2024-10-07T02:30:18Z) - Hate Personified: Investigating the role of LLMs in content moderation [64.26243779985393]
For subjective tasks such as hate detection, where people perceive hate differently, the Large Language Model's (LLM) ability to represent diverse groups is unclear.
By including additional context in prompts, we analyze LLM's sensitivity to geographical priming, persona attributes, and numerical information to assess how well the needs of various groups are reflected.
arXiv Detail & Related papers (2024-10-03T16:43:17Z) - Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvement of Large Language Models. It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop. Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z) - Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z) - Fairness of ChatGPT [30.969927447499405]
This work aims to provide a systematic evaluation of the effectiveness and fairness of LLMs using ChatGPT as a study case.
We focus on assessing ChatGPT's performance in high-stakes fields including education, criminology, finance, and healthcare.
This work contributes to a deeper understanding of LLMs' fairness performance, facilitates bias mitigation and fosters the development of responsible AI systems.
arXiv Detail & Related papers (2023-05-22T17:51:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all associated information) and is not responsible for any consequences of their use.