On Modeling Human Perceptions of Allocation Policies with Uncertain Outcomes
- URL: http://arxiv.org/abs/2103.05827v1
- Date: Wed, 10 Mar 2021 02:22:08 GMT
- Title: On Modeling Human Perceptions of Allocation Policies with Uncertain Outcomes
- Authors: Hoda Heidari, Solon Barocas, Jon Kleinberg, and Karen Levy
- Abstract summary: We show that probability weighting can be used to make predictions about preferences over probabilistic distributions of harm and benefit.
We identify optimal policies for minimizing perceived total harm and maximizing perceived total benefit that take the distorting effects of probability weighting into account.
- Score: 6.729250803621849
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many policies allocate harms or benefits that are uncertain in nature: they
produce distributions over the population in which individuals have different
probabilities of incurring harm or benefit. Comparing different policies thus
involves a comparison of their corresponding probability distributions, and we
observe that in many instances the policies selected in practice are hard to
explain by preferences based only on the expected value of the total harm or
benefit they produce. In cases where the expected value analysis is not a
sufficient explanatory framework, what would be a reasonable model for societal
preferences over these distributions? Here we investigate explanations based on
the framework of probability weighting from the behavioral sciences, which over
several decades has identified systematic biases in how people perceive
probabilities. We show that probability weighting can be used to make
predictions about preferences over probabilistic distributions of harm and
benefit that function quite differently from expected-value analysis, and in a
number of cases provide potential explanations for policy preferences that
appear hard to motivate by other means. In particular, we identify optimal
policies for minimizing perceived total harm and maximizing perceived total
benefit that take the distorting effects of probability weighting into account,
and we discuss a number of real-world policies that resemble such allocational
strategies. Our analysis does not provide specific recommendations for policy
choices, but is instead fundamentally interpretive in nature, seeking to
describe observed phenomena in policy choices.
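To make the probability-weighting idea concrete, here is a minimal sketch (not taken from the paper) that scores an allocation policy by a perceived total harm of the form sum_i w(p_i), using the standard Tversky-Kahneman weighting function with their 1992 parameter estimate (gamma = 0.61). The function names and the two toy policies are illustrative assumptions; the paper's own model and its optimal policies may differ in form.

```python
import numpy as np

def tk_weight(p, gamma=0.61):
    # Tversky-Kahneman (1992) probability weighting function:
    # w(p) = p^gamma / (p^gamma + (1 - p)^gamma)^(1 / gamma).
    # With gamma < 1, small probabilities are over-weighted and
    # moderate-to-large probabilities are under-weighted.
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

def perceived_total_harm(probabilities, gamma=0.61):
    # Perceived total harm: sum of weighted per-person harm probabilities,
    # in contrast to the expected total harm sum(p_i).
    p = np.asarray(probabilities, dtype=float)
    return tk_weight(p, gamma).sum()

# Two toy policies over 100 people, each with expected total harm of 10:
concentrated = np.array([1.0] * 10 + [0.0] * 90)  # 10 people harmed with certainty
spread = np.full(100, 0.10)                        # every person faces a 10% risk

print(concentrated.sum(), spread.sum())            # 10.0 and 10.0 (equal expected harm)
print(perceived_total_harm(concentrated))          # 10.0
print(perceived_total_harm(spread))                # ~18.6: small risks are over-weighted
```

In this toy example, spreading a fixed expected harm thinly across many people inflates its perceived total, which illustrates the abstract's point that policies minimizing perceived total harm can differ markedly from those minimizing expected harm.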
Related papers
- Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori.
In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty.
We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
arXiv Detail & Related papers (2024-04-29T08:16:30Z)
- Policy Learning with Distributional Welfare [1.0742675209112622]
Most literature on treatment choice has considered utilitarian welfare based on the conditional average treatment effect (ATE).
This paper proposes an optimal policy that allocates the treatment based on the conditional quantile of individual treatment effects (QoTE); an illustrative quantile-based allocation rule is sketched after this entry.
arXiv Detail & Related papers (2023-11-27T14:51:30Z)
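The blurb above names the decision criterion but not its implementation. Below is a minimal, hypothetical sketch in that spirit: treat a unit when a chosen lower quantile of its estimated individual-treatment-effect distribution is positive. The function name, the tau value, and the use of posterior/bootstrap draws are assumptions for illustration, not the paper's estimator.

```python
import numpy as np

def quantile_treatment_rule(ite_draws, tau=0.25):
    # ite_draws: array of shape (n_units, n_draws) holding posterior or
    # bootstrap draws of each unit's individual treatment effect.
    # Treat unit i only if the tau-quantile of its effect distribution is
    # positive, i.e. the effect is positive with reasonable confidence.
    q = np.quantile(ite_draws, tau, axis=1)
    return (q > 0).astype(int)  # 1 = treat, 0 = do not treat

# Toy usage: 3 units, 1000 draws each.
rng = np.random.default_rng(0)
draws = np.stack([
    rng.normal(0.5, 0.2, 1000),   # clearly positive effect -> treated
    rng.normal(0.1, 0.5, 1000),   # uncertain effect -> not treated at tau=0.25
    rng.normal(-0.3, 0.2, 1000),  # negative effect -> not treated
])
print(quantile_treatment_rule(draws))  # e.g. [1 0 0]
```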
- Statistical Inference Under Constrained Selection Bias [20.862583584531322]
We propose a framework that enables statistical inference in the presence of selection bias.
The output is high-probability bounds on the value of an estimand for the target distribution.
We analyze the computational and statistical properties of methods to estimate these bounds and show that our method can produce informative bounds on a variety of simulated and semisynthetic tasks.
arXiv Detail & Related papers (2023-06-05T23:05:26Z)
- Policy Dispersion in Non-Markovian Environment [53.05904889617441]
This paper aims to learn diverse policies from the history of state-action pairs in a non-Markovian environment.
We first adopt a transformer-based method to learn policy embeddings.
Then, we stack the policy embeddings to construct a dispersion matrix to induce a set of diverse policies.
arXiv Detail & Related papers (2023-02-28T11:58:39Z)
- A Risk-Sensitive Approach to Policy Optimization [21.684251937825234]
Standard deep reinforcement learning (DRL) aims to maximize expected reward, considering collected experiences equally in formulating a policy.
We propose a more direct approach whereby risk-sensitive objectives, specified in terms of the cumulative distribution function (CDF) of the distribution of full-episode rewards, are optimized.
We demonstrate that the use of moderately "pessimistic" risk profiles, which emphasize scenarios where the agent performs poorly, leads to enhanced exploration and a continual focus on addressing deficiencies (a toy CDF-weighting sketch follows this entry).
arXiv Detail & Related papers (2022-08-19T00:55:05Z)
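As a rough illustration of a risk-sensitive objective defined on the CDF of full-episode returns (a distortion-style reweighting, not necessarily the paper's exact formulation), the hypothetical sketch below averages sorted episode returns with weights that emphasize the worst-performing episodes.

```python
import numpy as np

def pessimistic_objective(episode_returns, power=2.0):
    # Sort sampled full-episode returns from worst to best and weight them by
    # the mass a distortion g(u) = 1 - (1 - u)^power of the empirical CDF
    # assigns to each slice. power > 1 places extra weight on low-return
    # (pessimistic) outcomes; power = 1 recovers the ordinary mean.
    r = np.sort(np.asarray(episode_returns, dtype=float))
    n = len(r)
    u = np.arange(n + 1) / n                      # empirical CDF grid points
    g = 1.0 - (1.0 - u) ** power                  # distorted CDF
    weights = np.diff(g)                          # mass assigned to each return
    return float(np.dot(weights, r))

returns = [10.0, 9.0, 8.0, -5.0, 7.0]             # one bad episode among good ones
print(np.mean(returns))                           # plain expected return: 5.8
print(pessimistic_objective(returns, power=2.0))  # 3.24: bad episodes weigh more
```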
- Cross-model Fairness: Empirical Study of Fairness and Ethics Under Model Multiplicity [10.144058870887061]
We argue that individuals can be harmed when one predictor is chosen ad hoc from a group of equally well performing models.
Our findings suggest that such unfairness can be readily found in real life and it may be difficult to mitigate by technical means alone.
arXiv Detail & Related papers (2022-03-14T14:33:39Z)
- Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation [60.71312668265873]
We develop a method to balance the need for personalization with confident predictions.
We show that our method can be used to form accurate predictions of heterogeneous treatment effects.
arXiv Detail & Related papers (2021-11-28T23:19:12Z)
- Safe Policy Learning through Extrapolation: Application to Pre-trial Risk Assessment [0.4999814847776098]
We examine a particular case of algorithmic pre-trial risk assessments in the US criminal justice system.
We analyze data from a unique field experiment on an algorithmic pre-trial risk assessment to investigate whether the scores and recommendations can be improved.
arXiv Detail & Related papers (2021-09-22T00:52:03Z)
- Offline Policy Selection under Uncertainty [113.57441913299868]
We consider offline policy selection as learning preferences over a set of policy prospects given a fixed experience dataset.
Access to the full belief distribution over a policy's value enables more flexible selection algorithms under a wider range of downstream evaluation metrics.
We show how BayesDICE may be used to rank policies with respect to any arbitrary downstream policy selection metric.
arXiv Detail & Related papers (2020-12-12T23:09:21Z)
- Reliable Off-policy Evaluation for Reinforcement Learning [53.486680020852724]
In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy.
We propose a novel framework that provides robust and optimistic cumulative reward estimates using one or multiple logged datasets.
arXiv Detail & Related papers (2020-11-08T23:16:19Z)
- Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies [80.42316902296832]
We study the estimation of policy value and gradient of a deterministic policy from off-policy data when actions are continuous.
In this setting, standard importance sampling and doubly robust estimators for policy value and gradient fail because the density ratio does not exist.
We propose several new doubly robust estimators based on different kernelization approaches.
arXiv Detail & Related papers (2020-06-06T15:52:05Z)