How Aligned are Generative Models to Humans in High-Stakes Decision-Making?
- URL: http://arxiv.org/abs/2410.15471v1
- Date: Sun, 20 Oct 2024 19:00:59 GMT
- Title: How Aligned are Generative Models to Humans in High-Stakes Decision-Making?
- Authors: Sarah Tan, Keri Mallari, Julius Adebayo, Albert Gordo, Martin T. Wells, Kori Inkpen
- Abstract summary: Large generative models (LMs) are increasingly being considered for high-stakes decision-making.
This work considers how such models compare to humans and predictive AI models on a specific case of recidivism prediction.
- Score: 10.225573060836478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large generative models (LMs) are increasingly being considered for high-stakes decision-making. This work considers how such models compare to humans and predictive AI models on a specific case of recidivism prediction. We combine three datasets -- COMPAS predictive AI risk scores, human recidivism judgements, and photos -- into a dataset on which we study the properties of several state-of-the-art, multimodal LMs. Beyond accuracy and bias, we focus on studying human-LM alignment on the task of recidivism prediction. We investigate whether these models can be steered towards human decisions, the impact of adding photos, and whether anti-discrimination prompting is effective. We find that LMs can be steered to outperform humans and COMPAS using in-context learning (a toy prompt sketch follows below). We find anti-discrimination prompting to have unintended effects, causing some models to inhibit themselves and significantly reduce their number of positive predictions.
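As a rough, hedged illustration of the in-context steering idea in the abstract, the sketch below assembles a few-shot prompt from human-labelled cases before querying a generative model. The field names, example records, and instruction wording are assumptions made for illustration only; they are not the paper's actual prompts, data schema, or model interface.

```python
# Illustrative sketch only: build a few-shot ("in-context learning") prompt that
# steers a generative model toward prior human recidivism judgements.
# The records and labels below are hypothetical, not drawn from the paper's data.

FEW_SHOT_EXAMPLES = [
    # (hypothetical defendant description, human judgement from a prior study)
    ("Age 24, 3 prior non-violent offenses, charged with burglary.", "will reoffend"),
    ("Age 52, no prior offenses, charged with driving without a license.", "will not reoffend"),
]

def build_prompt(new_case: str) -> str:
    """Assemble an in-context-learning prompt from human-labelled examples."""
    lines = [
        "You will predict whether a defendant will reoffend within 2 years.",
        "Answer with exactly 'will reoffend' or 'will not reoffend'.",
        "",
    ]
    for description, human_label in FEW_SHOT_EXAMPLES:
        lines.append(f"Defendant: {description}")
        lines.append(f"Prediction: {human_label}")
        lines.append("")
    lines.append(f"Defendant: {new_case}")
    lines.append("Prediction:")
    return "\n".join(lines)

if __name__ == "__main__":
    prompt = build_prompt("Age 31, 1 prior offense, charged with drug possession.")
    print(prompt)  # the resulting string would then be sent to a (multimodal) LM
```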
Related papers
- Predicting Emergent Capabilities by Finetuning [98.9684114851891]
We find that finetuning language models can shift the point in scaling at which emergence occurs towards less capable models.
We validate this approach using four standard NLP benchmarks.
We find that, in some cases, we can accurately predict whether models trained with up to 4x more compute have emerged.
arXiv Detail & Related papers (2024-11-25T01:48:09Z) - Mind the Uncertainty in Human Disagreement: Evaluating Discrepancies between Model Predictions and Human Responses in VQA [26.968874222330978]
This study focuses on the Visual Question Answering (VQA) task.
We evaluate how well vision-language models correlate with the distribution of human responses.
arXiv Detail & Related papers (2024-09-17T13:44:25Z) - Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models [79.76293901420146]
Large Language Models (LLMs) are employed across various high-stakes domains, where the reliability of their outputs is crucial.
Our research investigates the fragility of uncertainty estimation and explores potential attacks.
We demonstrate that an attacker can embed a backdoor in LLMs, which, when activated by a specific trigger in the input, manipulates the model's uncertainty without affecting the final output.
arXiv Detail & Related papers (2024-07-15T23:41:11Z) - Self-Recognition in Language Models [10.649471089216489]
We propose a novel approach for assessing self-recognition in LMs using model-generated "security questions".
We use our test to examine self-recognition in ten of the most capable open- and closed-source LMs currently publicly available.
Our results suggest that given a set of alternatives, LMs seek to pick the "best" answer, regardless of its origin.
arXiv Detail & Related papers (2024-07-09T15:23:28Z) - Large Language Models Must Be Taught to Know What They Don't Know [97.90008709512921]
We show that fine-tuning on a small dataset of correct and incorrect answers can create an uncertainty estimate with good generalization and small computational overhead.
We also investigate the mechanisms that enable reliable uncertainty estimation, finding that many models can be used as general-purpose uncertainty estimators.
arXiv Detail & Related papers (2024-06-12T16:41:31Z) - Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models [115.501751261878]
Fine-tuning language models (LMs) on human-generated data remains a prevalent practice.
We investigate whether we can go beyond human data on tasks where we have access to scalar feedback.
We find that ReST$^{EM}$ scales favorably with model size and significantly surpasses fine-tuning only on human data.
arXiv Detail & Related papers (2023-12-11T18:17:43Z) - Making Pre-trained Language Models both Task-solvers and Self-calibrators [52.98858650625623]
Pre-trained language models (PLMs) serve as backbones for various real-world systems.
Previous work shows that introducing an extra calibration task can mitigate unreliable confidence estimates.
We propose a training algorithm LM-TOAST to tackle the challenges.
arXiv Detail & Related papers (2023-07-21T02:51:41Z) - Ecosystem-level Analysis of Deployed Machine Learning Reveals Homogeneous Outcomes [72.13373216644021]
We study the societal impact of machine learning by considering the collection of models that are deployed in a given context.
We find deployed machine learning is prone to systemic failure, meaning some users are exclusively misclassified by all models available.
These examples demonstrate ecosystem-level analysis has unique strengths for characterizing the societal impact of machine learning.
arXiv Detail & Related papers (2023-07-12T01:11:52Z) - Scaling Laws Do Not Scale [54.72120385955072]
Recent work has argued that as the size of a dataset increases, the performance of a model trained on that dataset will increase.
We argue that this scaling law relationship depends on metrics used to measure performance that may not correspond with how different groups of people perceive the quality of models' output.
Different communities may also have values in tension with each other, leading to difficult, potentially irreconcilable choices about metrics used for model evaluations.
arXiv Detail & Related papers (2023-07-05T15:32:21Z) - Less Likely Brainstorming: Using Language Models to Generate Alternative Hypotheses [45.720065723998225]
We introduce a new task, "less likely brainstorming," that asks a model to generate outputs that humans think are relevant but less likely to happen.
We find that a baseline approach of training with less likely hypotheses as targets generates outputs that humans evaluate as either likely or irrelevant nearly half of the time.
We propose a controlled text generation method that uses a novel contrastive learning strategy to encourage models to differentiate between generating likely and less likely outputs according to humans.
arXiv Detail & Related papers (2023-05-30T18:05:34Z) - Large Language Models as Zero-Shot Human Models for Human-Robot Interaction [12.455647753787442]
Large language models (LLMs) can act as zero-shot human models for human-robot interaction.
LLMs achieve performance comparable to purpose-built models.
We present one case study on a simulated trust-based table-clearing task.
arXiv Detail & Related papers (2023-03-06T23:16:24Z) - Ground(less) Truth: A Causal Framework for Proxy Labels in Human-Algorithm Decision-Making [29.071173441651734]
We identify five sources of target variable bias that can impact the validity of proxy labels in human-AI decision-making tasks.
We develop a causal framework to disentangle the relationship between each bias.
We conclude by discussing opportunities to better address target variable bias in future research.
arXiv Detail & Related papers (2023-02-13T16:29:11Z) - Evidence > Intuition: Transferability Estimation for Encoder Selection [16.490047604583882]
We generate quantitative evidence to predict which LM will perform best on a target task without having to fine-tune all candidates.
We adopt the state-of-the-art Logarithm of Maximum Evidence (LogME) measure from Computer Vision (CV) and find that it positively correlates with final LM performance in 94% of setups.
arXiv Detail & Related papers (2022-10-20T13:25:21Z) - Investigations of Performance and Bias in Human-AI Teamwork in Hiring [30.046502708053097]
In AI-assisted decision-making, effective human-AI teamwork does not depend on AI performance alone.
We investigate how both a model's predictive performance and bias may transfer to humans in a recommendation-aided decision task.
arXiv Detail & Related papers (2022-02-21T17:58:07Z) - Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z) - Probabilistic Human Motion Prediction via A Bayesian Neural Network [71.16277790708529]
We propose a probabilistic model for human motion prediction in this paper.
Our model could generate several future motions when given an observed motion sequence.
We extensively validate our approach on the large-scale benchmark dataset Human3.6M.
arXiv Detail & Related papers (2021-07-14T09:05:33Z) - On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study [65.17429512679695]
In adversarial data collection (ADC), a human workforce interacts with a model in real time, attempting to produce examples that elicit incorrect predictions.
Despite ADC's intuitive appeal, it remains unclear when training on adversarial datasets produces more robust models.
arXiv Detail & Related papers (2021-06-02T00:48:33Z) - How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations [2.7708222692419735]
Explanations are seldom evaluated based on their true practical impact on decision-making tasks.
This study proposes XAI Test, an application-grounded evaluation methodology tailored to isolate the impact of providing the end-user with different levels of information.
Using strong statistical analysis, we show that, in general, popular explainers have a worse impact than desired.
arXiv Detail & Related papers (2021-01-21T18:15:13Z) - When Does Uncertainty Matter?: Understanding the Impact of Predictive Uncertainty in ML Assisted Decision Making [68.19284302320146]
We carry out user studies to assess how people with differing levels of expertise respond to different types of predictive uncertainty.
We found that showing posterior predictive distributions led to smaller disagreements with the ML model's predictions.
This suggests that posterior predictive distributions can serve as useful decision aids, though they should be used with caution, taking into account the type of distribution and the expertise of the human.
arXiv Detail & Related papers (2020-11-12T02:23:53Z) - An Information-Theoretic Approach to Personalized Explainable Machine Learning [92.53970625312665]
We propose a simple probabilistic model for the predictions and user knowledge.
We quantify the effect of an explanation by the conditional mutual information between the explanation and prediction (see the sketch after this entry).
arXiv Detail & Related papers (2020-03-01T13:06:29Z)
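As a hedged sketch of the information-theoretic quantity named in the entry above: assuming notation in which $u$ is the user's background knowledge, $e$ the explanation, and $\hat{y}$ the prediction (these symbols are illustrative assumptions, not necessarily the paper's), the effect of an explanation can be written as a conditional mutual information.

```latex
% Conditional mutual information between explanation e and prediction \hat{y},
% given the user's background knowledge u (notation assumed for illustration):
\[
  I(e; \hat{y} \mid u)
  = \mathbb{E}\!\left[\log \frac{p(e, \hat{y} \mid u)}{p(e \mid u)\, p(\hat{y} \mid u)}\right]
\]
```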
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences.