Investigations of Performance and Bias in Human-AI Teamwork in Hiring
- URL: http://arxiv.org/abs/2202.11812v1
- Date: Mon, 21 Feb 2022 17:58:07 GMT
- Title: Investigations of Performance and Bias in Human-AI Teamwork in Hiring
- Authors: Andi Peng, Besmira Nushi, Emre Kiciman, Kori Inkpen, Ece Kamar
- Abstract summary: In AI-assisted decision-making, effective hybrid (human-AI) teamwork does not depend on AI performance alone.
We investigate how both a model's predictive performance and bias may transfer to humans in a recommendation-aided decision task.
- Score: 30.046502708053097
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In AI-assisted decision-making, effective hybrid (human-AI) teamwork depends
not on AI performance alone, but also on its impact on human
decision-making. While prior work studies the effects of model accuracy on
humans, we endeavour here to investigate the complex dynamics of how both a
model's predictive performance and bias may transfer to humans in a
recommendation-aided decision task. We consider the domain of ML-assisted
hiring, where humans -- operating in a constrained selection setting -- can
choose whether they wish to utilize a trained model's inferences to help select
candidates from written biographies. We conduct a large-scale user study
leveraging a re-created dataset of real bios from prior work, where humans
predict the ground truth occupation of given candidates with and without the
help of three different NLP classifiers (random, bag-of-words, and deep neural
network). Our results demonstrate that while high-performance models
significantly improve human performance in a hybrid setting, some models
mitigate hybrid bias while others accentuate it. We examine these findings
through the lens of decision conformity and observe that our model architecture
choices have an impact on human-AI conformity and bias, motivating the explicit
need to assess these complex dynamics prior to deployment.
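
To make the study setup more concrete, here is a minimal sketch of the kind of pipeline the abstract describes: a bag-of-words occupation classifier trained on short biographies, followed by a crude per-gender accuracy gap as a bias probe. This is not the authors' code; the toy records, the scikit-learn pipeline, and the gap metric are illustrative assumptions standing in for the re-created bios dataset and the conformity and bias analyses reported in the paper.

```python
# Minimal sketch (not the authors' code): a bag-of-words occupation classifier
# in the spirit of the study, plus a rough per-gender accuracy gap as a bias probe.
# The toy records below are illustrative stand-ins for a bios-style dataset of
# (biography text, occupation, gender) triples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

records = [
    ("She has practiced family medicine for ten years.", "physician", "F"),
    ("He sees patients at a rural clinic and teaches residents.", "physician", "M"),
    ("He argues appellate cases and drafts contracts.", "attorney", "M"),
    ("She litigates civil rights cases in federal court.", "attorney", "F"),
    ("She teaches algebra and coaches the math team.", "teacher", "F"),
    ("He teaches high-school chemistry and runs the science fair.", "teacher", "M"),
]
bios, occupations, genders = map(list, zip(*records))

# Bag-of-words features + logistic regression stand in for the paper's
# bag-of-words classifier (a deep model would replace this pipeline).
model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(bios, occupations)
preds = model.predict(bios)

# Rough bias probe: accuracy gap between gender groups on the model's own
# predictions (the paper studies richer conformity/bias measures with humans).
def group_accuracy(group):
    idx = [i for i, g in enumerate(genders) if g == group]
    return sum(preds[i] == occupations[i] for i in idx) / len(idx)

overall = sum(p == y for p, y in zip(preds, occupations)) / len(preds)
gap = group_accuracy("F") - group_accuracy("M")
print(f"overall acc={overall:.2f}, F-vs-M accuracy gap={gap:+.2f}")
```

In the actual study, a random baseline and a deep neural network classifier play the same role as this pipeline, and the bias of interest is measured on the joint human-AI decisions rather than on the model alone.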
Related papers
- Towards Objective and Unbiased Decision Assessments with LLM-Enhanced Hierarchical Attention Networks [6.520709313101523]
This work investigates the identification of cognitive bias in high-stakes decision-making processes carried out by human experts.
We propose a bias-aware, AI-augmented workflow that surpasses human judgment.
In our experiments, both the proposed model and the agentic workflow significantly improve on human judgment and alternative models.
arXiv Detail & Related papers (2024-11-13T10:42:11Z)
- How Aligned are Generative Models to Humans in High-Stakes Decision-Making? [10.225573060836478]
Large generative models (LMs) are increasingly being considered for high-stakes decision-making.
This work considers how such models compare to humans and predictive AI models on a specific case of recidivism prediction.
arXiv Detail & Related papers (2024-10-20T19:00:59Z)
- On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models are showing increasingly better performance in reasoning and planning tasks.
We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly.
In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z)
- Using LLMs to Model the Beliefs and Preferences of Targeted Populations [4.0849074543032105]
We consider the problem of aligning a large language model (LLM) to model the preferences of a human population.
Modeling the beliefs, preferences, and behaviors of a specific population can be useful for a variety of applications.
arXiv Detail & Related papers (2024-03-29T15:58:46Z)
- Offline Risk-sensitive RL with Partial Observability to Enhance Performance in Human-Robot Teaming [1.3980986259786223]
We propose a method to incorporate model uncertainty, thus enabling risk-sensitive sequential decision-making.
Experiments were conducted with a group of twenty-six human participants within a simulated robot teleoperation environment.
arXiv Detail & Related papers (2024-02-08T14:27:34Z)
- Secrets of RLHF in Large Language Models Part II: Reward Modeling [134.97964938009588]
We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset.
We also introduce contrastive learning to enhance the ability of reward models to distinguish between chosen and rejected responses.
arXiv Detail & Related papers (2024-01-11T17:56:59Z)
- Modeling Boundedly Rational Agents with Latent Inference Budgets [56.24971011281947]
We introduce a latent inference budget model (L-IBM) that models agents' computational constraints explicitly.
L-IBMs make it possible to learn agent models using data from diverse populations of suboptimal actors.
We show that L-IBMs match or outperform Boltzmann models of decision-making under uncertainty.
arXiv Detail & Related papers (2023-12-07T03:55:51Z)
- Towards Personalized Federated Learning via Heterogeneous Model Reassembly [84.44268421053043]
pFedHR is a framework that leverages heterogeneous model reassembly to achieve personalized federated learning.
pFedHR dynamically generates diverse personalized models in an automated manner.
arXiv Detail & Related papers (2023-08-16T19:36:01Z)
- On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study [65.17429512679695]
In adversarial data collection (ADC), a human workforce interacts with a model in real time, attempting to produce examples that elicit incorrect predictions.
Despite ADC's intuitive appeal, it remains unclear when training on adversarial datasets produces more robust models.
arXiv Detail & Related papers (2021-06-02T00:48:33Z)
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)
- Predicting human decisions with behavioral theories and machine learning [13.000185375686325]
We introduce BEAST Gradient Boosting (BEAST-GB), a novel hybrid model that synergizes behavioral theories with machine learning techniques.
We show that BEAST-GB achieves state-of-the-art performance on the largest publicly available dataset of human risky choice.
We also show BEAST-GB displays robust domain generalization capabilities as it effectively predicts choice behavior in new experimental contexts.
arXiv Detail & Related papers (2019-04-15T06:12:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.