A Dynamic Model of Performative Human-ML Collaboration: Theory and Empirical Evidence
- URL: http://arxiv.org/abs/2405.13753v3
- Date: Mon, 07 Oct 2024 08:20:55 GMT
- Title: A Dynamic Model of Performative Human-ML Collaboration: Theory and Empirical Evidence
- Authors: Tom Sühr, Samira Samadi, Chiara Farronato
- Abstract summary: We present a novel framework for thinking about the deployment of machine learning models in a performative, human-ML collaborative system.
In our framework, the introduction of ML recommendations changes the data-generating process of human decisions.
We find that for many levels of ML performance, humans can improve upon the ML predictions.
- Score: 2.498836880652668
- License:
- Abstract: Machine learning (ML) models are increasingly used in various applications, from recommendation systems in e-commerce to diagnosis prediction in healthcare. In this paper, we present a novel dynamic framework for thinking about the deployment of ML models in a performative, human-ML collaborative system. In our framework, the introduction of ML recommendations changes the data-generating process of human decisions, which are only a proxy for the ground truth and which are then used to train future versions of the model. We show that this dynamic process can, in principle, converge to different stable points, i.e., points where the ML model and the Human+ML system have the same performance. Some of these stable points are suboptimal with respect to the actual ground truth. As a proof of concept, we conduct an empirical user study with 1,408 participants. In the study, humans solve instances of the knapsack problem with the help of machine learning predictions of varying performance. This is an ideal setting because we can identify the actual ground truth and evaluate the performance of human decisions supported by ML recommendations. We find that for many levels of ML performance, humans can improve upon the ML predictions. We also find that the improvement could be even higher if humans rationally followed the ML recommendations. Finally, we test whether monetary incentives can increase the quality of human decisions, but we fail to find any positive effect. Using our empirical data to approximate our collaborative system suggests that the learning process would dynamically reach an equilibrium performance that is around 92% of the maximum knapsack value. Our results have practical implications for the deployment of ML models in contexts where human decisions may deviate from the indisputable ground truth.
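To make the dynamic concrete, the following is a minimal simulation sketch of the loop described in the abstract: the deployed model's performance induces a Human+ML performance through a human response curve, the next model is trained on those human decisions and so inherits that performance, and the process repeats until the two coincide at a stable point. The linear response curve, its coefficients, and the starting point are illustrative assumptions (chosen so the fixed point lands near the 92% equilibrium reported above); the paper instead estimates the response from the knapsack user study.

```python
# Illustrative sketch of the performative retraining loop; the functional form
# below is assumed for illustration, not the paper's estimated model.

def human_response(ml_perf: float) -> float:
    """Hypothetical Human+ML performance (fraction of the optimal knapsack
    value) when the deployed model achieves ml_perf. Placeholder linear form
    chosen so the fixed point sits near 0.92."""
    return min(1.0, 0.46 + 0.5 * ml_perf)

def iterate_to_stable_point(ml_perf: float = 0.60, tol: float = 1e-6) -> float:
    """Stylized retraining loop: the next model is trained on human decisions,
    so its performance equals the current Human+ML performance; stop once the
    ML model and the Human+ML system perform equally well."""
    for _ in range(1000):
        next_perf = human_response(ml_perf)
        if abs(next_perf - ml_perf) < tol:
            break
        ml_perf = next_perf
    return ml_perf

print(f"stable Human+ML performance: {iterate_to_stable_point():.3f}")  # ~0.920
```

A response curve that crosses the 45-degree line below 1.0, as this placeholder does, is precisely the case in which the process settles at a stable point that is suboptimal with respect to the ground truth.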
Related papers
- Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback [87.37721254914476]
We introduce a routing framework that combines inputs from humans and LMs to achieve better annotation quality.
We train a performance prediction model to predict a reward model's performance on an arbitrary combination of human and LM annotations.
We show that the selected hybrid mixture achieves better reward model performance compared to using either one exclusively.
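A stylized sketch of such routing is below; the per-instance gain predictor, the budget, and the top-k rule are assumptions made for illustration and are a simplification of the paper's performance-prediction approach.

```python
# Stylized instance routing: send the instances where human annotation is
# predicted to help the reward model most to humans, and the rest to an LM
# annotator. The gain scores and budget are hypothetical inputs.
import numpy as np

def route(predicted_gain: np.ndarray, human_budget: int) -> np.ndarray:
    """predicted_gain[i]: predicted improvement in reward-model performance
    from a human (vs. LM) annotation on instance i. Returns a boolean mask;
    True means "route to a human annotator"."""
    to_human = np.zeros(len(predicted_gain), dtype=bool)
    to_human[np.argsort(predicted_gain)[::-1][:human_budget]] = True
    return to_human

rng = np.random.default_rng(0)
print(route(rng.normal(size=10), human_budget=3))
```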
arXiv Detail & Related papers (2024-10-24T20:04:15Z)
- How Aligned are Generative Models to Humans in High-Stakes Decision-Making? [10.225573060836478]
Large generative models (LMs) are increasingly being considered for high-stakes decision-making.
This work considers how such models compare to humans and predictive AI models on a specific case of recidivism prediction.
arXiv Detail & Related papers (2024-10-20T19:00:59Z)
- Predicting and Understanding Human Action Decisions: Insights from Large Language Models and Cognitive Instance-Based Learning [0.0]
Large Language Models (LLMs) have demonstrated their capabilities across various tasks.
This paper exploits the reasoning and generative capabilities of the LLMs to predict human behavior in two sequential decision-making tasks.
We compare the performance of LLMs with a cognitive instance-based learning model, which imitates human experiential decision-making.
arXiv Detail & Related papers (2024-07-12T14:13:06Z)
- Automating Data Annotation under Strategic Human Agents: Risks and Potential Solutions [10.448052192725168]
This paper investigates the long-term impacts when machine learning models are retrained with model-annotated samples.
We find that agents are increasingly likely to receive positive decisions as the model gets retrained.
We propose a refined retraining process to stabilize the dynamics.
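The toy simulation below sketches this phenomenon; every modeling choice (a logistic scorer, a single feature, the agents' shift-toward-the-boundary response) is assumed for illustration rather than taken from the paper. A classifier is repeatedly retrained on its own annotations while agents just below the acceptance threshold nudge their features upward, and the positive-decision rate tends to drift up across rounds.

```python
# Toy sketch of retraining on model-annotated samples under strategic agents.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_agents(n: int):
    """One informative feature; the true label is simply feature > 0."""
    x = rng.normal(size=(n, 1))
    return x, (x[:, 0] > 0).astype(int)

# Round 0: train on ground-truth (human-annotated) labels.
x0, y0 = sample_agents(500)
model = LogisticRegression().fit(x0, y0)

for t in range(1, 6):
    x, _ = sample_agents(500)
    # Strategic response: agents just below the acceptance threshold shift up.
    p = model.predict_proba(x)[:, 1]
    x[:, 0] += 0.3 * ((p > 0.3) & (p < 0.5))
    # Retrain on the model's own annotations instead of fresh human labels.
    y_model = model.predict(x)
    model = LogisticRegression().fit(x, y_model)
    print(f"round {t}: positive-decision rate = {y_model.mean():.2f}")
```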
arXiv Detail & Related papers (2024-05-12T13:36:58Z)
- Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models [115.501751261878]
Fine-tuning language models (LMs) on human-generated data remains a prevalent practice.
We investigate whether we can go beyond human data on tasks where we have access to scalar feedback.
We find that ReST$^{EM}$ scales favorably with model size and significantly surpasses fine-tuning only on human data.
arXiv Detail & Related papers (2023-12-11T18:17:43Z)
- CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z)
- SALMON: Self-Alignment with Instructable Reward Models [80.83323636730341]
This paper presents a novel approach, namely SALMON, to align base language models with minimal human supervision.
We develop an AI assistant named Dromedary-2 with only 6 exemplars for in-context learning and 31 human-defined principles.
arXiv Detail & Related papers (2023-10-09T17:56:53Z)
- Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism [91.52263068880484]
We study offline Reinforcement Learning with Human Feedback (RLHF).
We aim to learn the human's underlying reward and the MDP's optimal policy from a set of trajectories induced by human choices.
RLHF is challenging for multiple reasons: a large state space but limited human feedback, the bounded rationality of human decisions, and off-policy distribution shift.
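A minimal sketch of the reward-learning piece under strong simplifications: static pairwise choices instead of dynamic trajectory-level choices, a linear reward, and a plain Bradley-Terry/logistic maximum-likelihood fit; the pessimistic policy optimization that gives the paper its title is not shown.

```python
# Fit a linear reward r(x) = theta . x from pairwise human choices via the
# Bradley-Terry / logistic likelihood P(choose a over b) = sigmoid(r(a)-r(b)).
# This is a simplified stand-in, not the paper's algorithm.
import numpy as np

def fit_reward(feat_a, feat_b, chose_a, lr=0.1, steps=2000):
    """feat_a, feat_b: (n, d) features of the two options; chose_a: (n,) 0/1
    indicating that the human picked option a."""
    theta = np.zeros(feat_a.shape[1])
    for _ in range(steps):
        diff = feat_a - feat_b                                # (n, d)
        p = 1.0 / (1.0 + np.exp(-diff @ theta))               # P(choose a | theta)
        theta += lr * diff.T @ (chose_a - p) / len(chose_a)   # gradient ascent
    return theta

# Synthetic check: choices generated from a known reward are roughly recovered.
rng = np.random.default_rng(0)
true_theta = np.array([1.0, -2.0])
A, B = rng.normal(size=(1000, 2)), rng.normal(size=(1000, 2))
choices = rng.binomial(1, 1.0 / (1.0 + np.exp(-(A - B) @ true_theta)))
print(fit_reward(A, B, choices))  # close to [1.0, -2.0]
```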
arXiv Detail & Related papers (2023-05-29T01:18:39Z)
- Quantifying Human Bias and Knowledge to guide ML models during Training [0.0]
We introduce an experimental approach to dealing with skewed datasets by including humans in the training process.
We ask humans to rank the importance of features of the dataset, and through rank aggregation, determine the initial weight bias for the model.
We show that collective human bias can allow ML models to learn insights about the true population instead of the biased sample.
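A small sketch of this mechanism is below; Borda count is used as the rank-aggregation rule and the features and rankings are hypothetical, since the summary does not specify the paper's exact aggregation method.

```python
# Aggregate several humans' feature-importance rankings into initial weights
# for a model, using Borda count (one common aggregation rule; an assumption).
import numpy as np

features = ["age", "income", "tenure", "clicks"]        # hypothetical features

rankings = [                                            # most to least important
    ["income", "age", "clicks", "tenure"],
    ["income", "clicks", "age", "tenure"],
    ["age", "income", "tenure", "clicks"],
]

def borda_weights(rankings, features):
    """A feature ranked r-th (0-based) among m features earns m-1-r points;
    points are summed over annotators and normalized to sum to 1."""
    m = len(features)
    points = {f: 0 for f in features}
    for ranking in rankings:
        for r, f in enumerate(ranking):
            points[f] += m - 1 - r
    total = sum(points.values())
    return np.array([points[f] / total for f in features])

init_w = borda_weights(rankings, features)
print(dict(zip(features, init_w.round(3))))
```

Such weights could then seed a linear model, for example via the coef_init argument of scikit-learn's SGDClassifier.fit.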
arXiv Detail & Related papers (2022-11-19T20:49:07Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Insights into Performance Fitness and Error Metrics for Machine Learning [1.827510863075184]
Machine learning (ML) is the field of training machines to achieve a high level of cognition and perform human-like analysis.
This paper examines a number of the most commonly-used performance fitness and error metrics for regression and classification algorithms.
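As a quick reference, the snippet below implements the standard textbook definitions of a few of the metrics such a survey covers (RMSE, MAE, R-squared, F1); the selection is ours for illustration, and the paper discusses a broader set.

```python
# Standard definitions of a few common regression and classification metrics.
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def f1(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_cls, yhat_cls = np.array([1, 0, 1, 1]), np.array([1, 0, 0, 1])
y_reg, yhat_reg = np.array([3.0, 5.0, 2.0]), np.array([2.5, 5.5, 2.0])
print(f1(y_cls, yhat_cls))                          # 0.8
print(rmse(y_reg, yhat_reg), r2(y_reg, yhat_reg))   # about 0.41 and 0.89
```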
arXiv Detail & Related papers (2020-05-17T22:59:04Z)