Externally Valid Policy Evaluation Combining Trial and Observational Data
- URL: http://arxiv.org/abs/2310.14763v3
- Date: Tue, 29 Oct 2024 16:20:19 GMT
- Title: Externally Valid Policy Evaluation Combining Trial and Observational Data
- Authors: Sofia Ek, Dave Zachariah
- Abstract summary: We seek to use trial data to draw valid inferences about the outcome of a policy on the target population.
We develop a method that yields certifiably valid trial-based policy evaluations under any specified range of model miscalibrations.
- Abstract: Randomized trials are widely considered the gold standard for evaluating the effects of decision policies. Trial data, however, is drawn from a population that may differ from the intended target population, which raises a problem of external validity (also known as generalizability). In this paper we seek to use trial data to draw valid inferences about the outcome of a policy on the target population. Additional covariate data from the target population is used to model the sampling of individuals in the trial study. We develop a method that yields certifiably valid trial-based policy evaluations under any specified range of model miscalibrations. The method is nonparametric and its validity is assured even with finite samples. The certified policy evaluations are illustrated using both simulated and real data.
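As a loose illustration of the sensitivity-analysis flavor described above, here is a minimal sketch assuming a hypothetical trial-participation model `pi_hat` and an odds-ratio miscalibration bound `gamma`; this is not the paper's certified construction:

```python
import numpy as np

def bounded_policy_value(y, pi_hat, gamma):
    """Crude sketch: bound a trial-to-target reweighted policy value when
    the participation model pi_hat may be miscalibrated by an odds factor
    of at most gamma (gamma = 1 recovers the nominal estimate).

    y      : outcomes observed in the trial under the evaluated policy
    pi_hat : modeled probability that each trial participant would be
             sampled from the target population (hypothetical model)
    gamma  : assumed bound on the participation model's miscalibration
    """
    w = (1.0 - pi_hat) / pi_hat  # nominal inverse-odds weights
    # With each true weight boxed in [w / gamma, w * gamma], the
    # worst-case weighted means follow from the sign of each outcome.
    lo = np.mean(np.where(y > 0, w / gamma, w * gamma) * y)
    hi = np.mean(np.where(y > 0, w * gamma, w / gamma) * y)
    return lo, hi
```

Any gamma above 1 widens the interval; the paper's contribution is to make such evaluations certifiably valid, nonparametrically and with finite samples.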
Related papers
- On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework that exploits the different responses of normal and adversarial samples to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
- Conformal Off-Policy Evaluation in Markov Decision Processes [53.786439742572995]
Reinforcement Learning aims at identifying and evaluating efficient control policies from data.
Most methods for this learning task, referred to as Off-Policy Evaluation (OPE), do not come with accuracy and certainty guarantees.
We present a novel OPE method based on Conformal Prediction that outputs an interval containing the true reward of the target policy with a prescribed level of certainty.
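For intuition, here is a generic split-conformal interval of the sort such guarantees are built from; the paper's construction for off-policy MDP evaluation is substantially more involved, and the names below are illustrative assumptions:

```python
import numpy as np

def split_conformal_interval(cal_residuals, point_estimate, alpha=0.1):
    """Generic split-conformal interval: with n calibration residuals,
    the returned interval contains a fresh outcome with probability at
    least 1 - alpha (finite-sample, distribution-free).
    """
    n = len(cal_residuals)
    # Finite-sample-valid quantile level: ceil((n + 1)(1 - alpha)) / n.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(cal_residuals, level, method="higher")
    return point_estimate - q, point_estimate + q
```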
arXiv Detail & Related papers (2023-04-05T16:45:11Z)
- Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation [54.72195809248172]
We present a new estimator built on a novel concept: retrospectively reshuffling participants across experimental arms at the end of an RCT.
We prove theoretically that such an estimator is more accurate than common estimators based on sample means.
arXiv Detail & Related papers (2023-02-06T05:17:22Z)
- Off-Policy Evaluation with Out-of-Sample Guarantees [21.527138355664174]
We consider the problem of evaluating the performance of a decision policy using past observational data.
We show that such inferences can be drawn with finite-sample coverage guarantees for the entire loss distribution.
The evaluation method can be used to certify the performance of a policy using observational data under a specified range of credible model assumptions.
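A minimal sketch of the flavor of guarantee, using a weighted-conformal-style upper limit on the loss; the function name and simplified form below are assumptions, not the paper's actual method:

```python
import numpy as np

def loss_limit(losses, weights, alpha=0.1):
    """Sketch: a weighted empirical quantile serving as an upper limit
    that a new sample's loss under the policy stays below with
    probability at least 1 - alpha (simplified, assumed form).

    losses  : losses observed in the observational data
    weights : importance weights relating behavior and target policies
    """
    order = np.argsort(losses)
    l, w = np.asarray(losses)[order], np.asarray(weights)[order]
    # Reserve one unit of worst-case mass for the unseen test point,
    # as in weighted conformal prediction.
    cdf = np.cumsum(w) / (w.sum() + w.max())
    idx = np.searchsorted(cdf, 1.0 - alpha)
    # If the weighted mass never reaches 1 - alpha, no finite limit
    # can be certified.
    return l[idx] if idx < len(l) else float("inf")
```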
arXiv Detail & Related papers (2023-01-20T15:56:39Z)
- Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem.
We examine the performance of various debiasing methods across multiple tasks.
We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z)
- Externally Valid Policy Choice [0.0]
We consider the problem of learning personalized treatment policies that are externally valid or generalizable.
We first show that welfare-maximizing policies for the experimental population are robust to shifts in the distribution of outcomes.
We then develop new methods for learning policies that are robust to shifts in outcomes and characteristics.
arXiv Detail & Related papers (2022-05-11T15:19:22Z)
- Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation [60.71312668265873]
We develop a method to balance the need for personalization with confident predictions.
We show that our method can be used to form accurate predictions of heterogeneous treatment effects.
arXiv Detail & Related papers (2021-11-28T23:19:12Z)
- Case-based off-policy policy evaluation using prototype learning [8.550140109387467]
We propose estimating the behavior policy for off-policy policy evaluation using prototype learning.
We show how the prototypes give a condensed summary of differences between the target and behavior policies.
We also describe estimated values in terms of the prototypes to better understand which parts of the target policies have the most impact on the estimates.
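A toy version of the idea, using k-means prototypes and smoothed action frequencies; the paper's prototype learning is more refined, and all names below are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def prototype_behavior_policy(states, actions, n_prototypes=10, n_actions=4):
    """Toy sketch: cluster states into prototypes, then estimate the
    behavior policy as smoothed per-prototype action frequencies.
    The prototype centers double as a condensed, inspectable summary
    of where the target and behavior policies differ.
    """
    km = KMeans(n_clusters=n_prototypes, n_init=10).fit(states)
    counts = np.ones((n_prototypes, n_actions))  # Laplace smoothing
    for k, a in zip(km.labels_, actions):
        counts[k, a] += 1.0
    probs = counts / counts.sum(axis=1, keepdims=True)
    return km.cluster_centers_, probs  # prototypes, pi_b(a | prototype)
```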
arXiv Detail & Related papers (2021-11-22T11:03:45Z)
- Off-Policy Evaluation of Bandit Algorithm from Dependent Samples under Batch Update Policy [8.807587076209566]
The goal of off-policy evaluation (OPE) is to evaluate a new policy using historical data obtained via a behavior policy.
Because a contextual bandit algorithm updates its policy based on past observations, the logged samples are not independent and identically distributed.
This paper tackles this problem by constructing an estimator from a martingale difference sequence (MDS) for the dependent samples.
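Roughly, the construction rests on using, for each logged sample, the behavior probability that was actually in force at that time, so each centered importance-weighted term is a martingale difference with respect to the past. A hedged sketch, not the paper's exact estimator:

```python
import numpy as np

def mds_ipw_value(rewards, actions, contexts, target_prob, logged_probs):
    """Sketch: inverse-probability-weighted value estimate whose centered
    terms form a martingale difference sequence, because logged_probs[t]
    is the probability under the behavior policy in force at time t,
    before it was updated on later data.

    target_prob(x, a) : probability the evaluated policy takes a at x
    logged_probs[t]   : behavior probability of the logged action a_t
    """
    terms = [
        target_prob(contexts[t], actions[t]) / logged_probs[t] * rewards[t]
        for t in range(len(rewards))
    ]
    return float(np.mean(terms))
```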
arXiv Detail & Related papers (2020-10-23T15:22:57Z)
- Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies [80.42316902296832]
We study the estimation of policy value and gradient of a deterministic policy from off-policy data when actions are continuous.
In this setting, standard importance sampling and doubly robust estimators for policy value and gradient fail because the density ratio does not exist.
We propose several new doubly robust estimators based on different kernelization approaches.
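The core trick can be sketched with a Gaussian kernel that relaxes the zero-probability event {a = pi(x)}; this is illustrative only, as the paper's doubly robust estimators add an outcome-regression correction on top:

```python
import numpy as np

def kernel_ipw_deterministic(x, a, r, pi, q, h=0.1):
    """Sketch: kernelized IPW value estimate for a deterministic target
    policy pi with continuous actions. The indicator 1{a = pi(x)} has
    probability zero, so a kernel of bandwidth h smooths the match.

    x, a, r : logged contexts, continuous actions, rewards (arrays)
    pi(x)   : deterministic target policy, vectorized over contexts
    q(x, a) : behavior policy's action density at the logged pairs
    """
    u = (a - pi(x)) / h
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)  # Gaussian kernel
    w = k / (h * q(x, a))  # kernel weight over the behavior density
    return float(np.mean(w * r))  # consistent as h -> 0 with n*h -> inf
```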
arXiv Detail & Related papers (2020-06-06T15:52:05Z)