Related papers: (De)Noise: Moderating the Inconsistency Between Human Decision-Makers

URL: http://arxiv.org/abs/2407.11225v1
Date: Mon, 15 Jul 2024 20:24:36 GMT
Title: (De)Noise: Moderating the Inconsistency Between Human Decision-Makers
Authors: Nina Grgić-Hlača, Junaid Ali, Krishna P. Gummadi, Jennifer Wortman Vaughan
Abstract summary: We study whether algorithmic decision aids can be used to moderate the degree of inconsistency in human decision-making in the context of real estate appraisal. We find that both (i) asking respondents to review their estimates in a series of algorithmically chosen pairwise comparisons and (ii) providing respondents with traditional machine advice are effective strategies for influencing human responses.
Score: 15.291993233528526
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Prior research in psychology has found that people's decisions are often inconsistent. An individual's decisions vary across time, and decisions vary even more across people. Inconsistencies have been identified not only in subjective matters, like matters of taste, but also in settings one might expect to be more objective, such as sentencing, job performance evaluations, or real estate appraisals. In our study, we explore whether algorithmic decision aids can be used to moderate the degree of inconsistency in human decision-making in the context of real estate appraisal. In a large-scale human-subject experiment, we study how different forms of algorithmic assistance influence the way that people review and update their estimates of real estate prices. We find that both (i) asking respondents to review their estimates in a series of algorithmically chosen pairwise comparisons and (ii) providing respondents with traditional machine advice are effective strategies for influencing human responses. Compared to simply reviewing initial estimates one by one, the aforementioned strategies lead to (i) a higher propensity to update initial estimates, (ii) a higher accuracy of post-review estimates, and (iii) a higher degree of consistency between the post-review estimates of different respondents. While these effects are more pronounced with traditional machine advice, the approach of reviewing algorithmically chosen pairs can be implemented in a wider range of settings, since it does not require access to ground truth data.

Related papers

Treatment Effect Estimation for Optimal Decision-Making [65.30942348196443]
We study optimal decision-making based on two-stage CATE estimators. We propose a novel two-stage learning objective that retargets the CATE to balance CATE estimation error and decision performance.
arXiv Detail & Related papers (2025-05-19T13:24:57Z)
Towards Objective and Unbiased Decision Assessments with LLM-Enhanced Hierarchical Attention Networks [6.520709313101523]
This work investigates cognitive bias identification in high-stakes decision-making processes by human experts. We propose a bias-aware, AI-augmented workflow that surpasses human judgment. In our experiments, both the proposed model and the agentic workflow significantly improve on both human judgment and alternative models.
arXiv Detail & Related papers (2024-11-13T10:42:11Z)
Diverging Preferences: When do Annotators Disagree and do Models Know? [92.24651142187989]
We develop a taxonomy of disagreement sources spanning 10 categories across four high-level classes. We find that the majority of disagreements are in opposition with standard reward modeling approaches. We develop methods for identifying diverging preferences to mitigate their influence on evaluation and training.
arXiv Detail & Related papers (2024-10-18T17:32:22Z)
Mitigating Cognitive Biases in Multi-Criteria Crowd Assessment [22.540544209683592]
We focus on cognitive biases associated with a multi-criteria assessment in crowdsourcing. Crowdworkers who rate targets with multiple different criteria simultaneously may provide biased responses due to prominence of some criteria or global impressions of the evaluation targets. We propose two specific model structures for Bayesian opinion aggregation models that consider inter-criteria relations.
arXiv Detail & Related papers (2024-07-10T16:00:23Z)
Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori. In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty. We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
arXiv Detail & Related papers (2024-04-29T08:16:30Z)
Decision Theoretic Foundations for Experiments Evaluating Human Decisions [18.27590643693167]
We argue that to attribute loss in human performance to forms of bias, an experiment must provide participants with the information that a rational agent would need to identify the utility-maximizing decision. As a demonstration, we evaluate the extent to which recent evaluations of decision-making from the literature on AI-assisted decisions achieve these criteria.
arXiv Detail & Related papers (2024-01-25T16:21:37Z)
Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior. In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation [92.51773744318119]
This paper empirically investigates the strengths and weaknesses of different model selection criteria. We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them.
arXiv Detail & Related papers (2023-02-06T16:55:37Z)
Robust Design and Evaluation of Predictive Algorithms under Unobserved Confounding [2.8498944632323755]
We propose a unified framework for the robust design and evaluation of predictive algorithms in selectively observed data. We impose general assumptions on how much the outcome may vary on average between unselected and selected units. We develop debiased machine learning estimators for the bounds on a large class of predictive performance estimands.
arXiv Detail & Related papers (2022-12-19T20:41:44Z)
Explainability's Gain is Optimality's Loss? -- How Explanations Bias Decision-making [0.0]
Explanations help to facilitate communication between the algorithm and the human decision-maker. Feature-based explanations' semantics of causal models induce leakage from the decision-maker's prior beliefs. Such differences can lead to sub-optimal and biased decision outcomes.
arXiv Detail & Related papers (2022-06-17T11:43:42Z)
Homophily and Incentive Effects in Use of Algorithms [17.55279695774825]
We present a crowdsourcing vignette study designed to assess the impacts of two plausible factors on AI-informed decision-making. First, we examine homophily -- do people defer more to models that tend to agree with them? Second, we consider incentives -- how do people incorporate a (known) cost structure in the hybrid decision-making setting?
arXiv Detail & Related papers (2022-05-19T17:11:04Z)
The Impact of Algorithmic Risk Assessments on Human Predictions and its Analysis via Crowdsourcing Studies [79.66833203975729]
We conduct a vignette study in which laypersons are tasked with predicting future re-arrests. Our key findings are as follows: Participants often predict that an offender will be rearrested even when they deem the likelihood of re-arrest to be well below 50%. Judicial decisions, unlike participants' predictions, depend in part on factors that are orthogonal to the likelihood of re-arrest.
arXiv Detail & Related papers (2021-09-03T11:09:10Z)
Learning Overlapping Representations for the Estimation of Individualized Treatment Effects [97.42686600929211]
Estimating the likely outcome of alternatives from observational data is a challenging problem. We show that algorithms that learn domain-invariant representations of inputs are often inappropriate. We develop a deep kernel regression algorithm and posterior regularization framework that substantially outperforms the state-of-the-art on a variety of benchmark data sets.
arXiv Detail & Related papers (2020-01-14T12:56:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.