Treatment Effect Estimation for Optimal Decision-Making
- URL: http://arxiv.org/abs/2505.13092v2
- Date: Thu, 22 May 2025 12:15:03 GMT
- Title: Treatment Effect Estimation for Optimal Decision-Making
- Authors: Dennis Frauen, Valentyn Melnychuk, Jonas Schweisthal, Mihaela van der Schaar, Stefan Feuerriegel
- Abstract summary: We study optimal decision-making based on two-stage CATE estimators. We propose a novel two-stage learning objective that retargets the CATE to balance CATE estimation error and decision performance.
- Score: 65.30942348196443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision-making across various fields, such as medicine, heavily relies on conditional average treatment effects (CATEs). Practitioners commonly make decisions by checking whether the estimated CATE is positive, even though the decision-making performance of modern CATE estimators is poorly understood from a theoretical perspective. In this paper, we study optimal decision-making based on two-stage CATE estimators (e.g., DR-learner), which are considered state-of-the-art and widely used in practice. We prove that, while such estimators may be optimal for estimating CATE, they can be suboptimal when used for decision-making. Intuitively, this occurs because such estimators prioritize CATE accuracy in regions far away from the decision boundary, which is ultimately irrelevant to decision-making. As a remedy, we propose a novel two-stage learning objective that retargets the CATE to balance CATE estimation error and decision performance. We then propose a neural method that optimizes an adaptively-smoothed approximation of our learning objective. Finally, we confirm the effectiveness of our method both empirically and theoretically. In sum, our work is the first to show how two-stage CATE estimators can be adapted for optimal decision-making.
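For context, the sketch below shows the baseline pipeline the abstract critiques: a standard two-stage DR-learner fitted with sample splitting, followed by the decision rule "treat whenever the estimated CATE is positive". The simulated data, the gradient-boosting nuisance models, and the propensity clipping are illustrative assumptions; the paper's retargeted objective and adaptively-smoothed loss are not implemented here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

rng = np.random.default_rng(0)

# Illustrative observational data: covariates X, binary treatment A, outcome Y.
n = 2000
X = rng.normal(size=(n, 5))
propensity = 1 / (1 + np.exp(-X[:, 0]))          # true e(x), used only to simulate
A = rng.binomial(1, propensity)
tau = X[:, 1]                                     # true CATE (unknown in practice)
Y = X[:, 0] + A * tau + rng.normal(size=n)

# Stage 1: nuisance models (propensity and outcome regressions), fit on one half.
half = n // 2
e_hat = GradientBoostingClassifier().fit(X[:half], A[:half])
mu0 = GradientBoostingRegressor().fit(X[:half][A[:half] == 0], Y[:half][A[:half] == 0])
mu1 = GradientBoostingRegressor().fit(X[:half][A[:half] == 1], Y[:half][A[:half] == 1])

# DR pseudo-outcome on the other half:
# phi = (A - e) / (e * (1 - e)) * (Y - mu_A(X)) + mu_1(X) - mu_0(X)
Xe, Ae, Ye = X[half:], A[half:], Y[half:]
e = np.clip(e_hat.predict_proba(Xe)[:, 1], 0.01, 0.99)
mu_a = np.where(Ae == 1, mu1.predict(Xe), mu0.predict(Xe))
phi = (Ae - e) / (e * (1 - e)) * (Ye - mu_a) + mu1.predict(Xe) - mu0.predict(Xe)

# Stage 2: regress the pseudo-outcome on X to obtain the CATE estimator.
cate_model = GradientBoostingRegressor().fit(Xe, phi)

# Decision rule studied in the paper: treat whenever the estimated CATE is positive.
treat = cate_model.predict(X) > 0
print(f"Decision accuracy vs. oracle sign(tau): {np.mean(treat == (tau > 0)):.3f}")
```

The paper's point is that minimizing the CATE estimation error of the stage-2 regression is not the same as maximizing the accuracy of the final thresholding decision, which depends only on behavior near the decision boundary.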
Related papers
- A Principled Approach to Randomized Selection under Uncertainty: Applications to Peer Review and Grant Funding [68.43987626137512]
We propose a principled framework for randomized decision-making based on interval estimates of the quality of each item. We introduce MERIT, an optimization-based method that maximizes the worst-case expected number of top candidates selected. We prove that MERIT satisfies desirable axiomatic properties not guaranteed by existing approaches.
arXiv Detail & Related papers (2025-06-23T19:59:30Z) - Sufficient Decision Proxies for Decision-Focused Learning [2.7143637678944454]
Decision-focused learning aims at learning a predictive model such that decision quality, instead of prediction accuracy, is maximized. This paper investigates for the first time problem properties that justify using either assumption. We show the effectiveness of the presented approaches in experiments on problems with continuous and discrete variables, as well as uncertainty in the objective function and in the constraints.
arXiv Detail & Related papers (2025-05-06T20:10:17Z) - Uplift modeling with continuous treatments: A predict-then-optimize approach [4.132346971686944]
The goal of uplift modeling is to recommend actions that optimize specific outcomes by determining which entities should receive treatment. While uplift modeling typically focuses on binary treatments, many real-world applications are characterized by continuous-valued treatments. This paper presents a predict-then-optimize framework to allow for continuous treatments in uplift modeling.
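As a rough illustration of the predict-then-optimize pattern for a continuous treatment (a dose), the sketch below first fits an outcome model with the dose as an extra feature, then picks, per individual, the dose on a candidate grid that maximizes the predicted outcome. The simulated data, model class, and grid are assumptions made for illustration; they are not the formulation proposed in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Illustrative logged data: covariates X, a continuous treatment (dose) t, outcome y.
n = 3000
X = rng.normal(size=(n, 4))
t = rng.uniform(0, 1, size=n)
y = -(t - (0.5 + 0.3 * X[:, 0])) ** 2 + rng.normal(scale=0.1, size=n)  # best dose depends on x

# Predict step: learn an outcome model y_hat = f(x, t), treating the dose as a feature.
f = RandomForestRegressor(n_estimators=200).fit(np.column_stack([X, t]), y)

# Optimize step: for a new individual, choose the dose that maximizes the predicted outcome.
def recommend_dose(x, grid=np.linspace(0, 1, 101)):
    preds = f.predict(np.column_stack([np.tile(x, (len(grid), 1)), grid]))
    return grid[np.argmax(preds)]

x_new = rng.normal(size=4)
print("recommended dose:", recommend_dose(x_new))
```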
arXiv Detail & Related papers (2024-12-12T12:43:42Z) - Asymptotically Optimal Regret for Black-Box Predict-then-Optimize [7.412445894287709]
We study new black-box predict-then-optimize problems that lack special structure and where we only observe the reward from the action taken.
We present a novel loss function, which we call Empirical Soft Regret (ESR), designed to significantly improve reward when used in training.
We also show our approach significantly outperforms state-of-the-art algorithms on real-world decision-making problems in news recommendation and personalized healthcare.
arXiv Detail & Related papers (2024-06-12T04:46:23Z) - Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
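For intuition, here is a minimal sketch of a baseline-corrected inverse-propensity (IPS) estimator of an online reward metric: subtracting a baseline from the rewards and adding it back preserves unbiasedness while changing the variance. The simple baseline used here (the mean logged reward) and the synthetic propensities are illustrative assumptions; the sketch does not reproduce the variance-optimal correction derived in the paper.

```python
import numpy as np

def ips_with_baseline(rewards, logged_probs, target_probs, baseline=0.0):
    """Baseline-corrected IPS estimate of the target policy's expected reward.

    Since the importance weights have expectation 1 under the logging policy,
    shifting rewards by a constant baseline and adding it back keeps the
    estimator unbiased while changing its variance.
    """
    w = target_probs / logged_probs                  # importance weights
    return np.mean(w * (rewards - baseline)) + baseline

rng = np.random.default_rng(2)
n = 10_000
logged_probs = rng.uniform(0.1, 0.9, size=n)         # propensities of the logging policy
target_probs = rng.uniform(0.1, 0.9, size=n)         # propensities of the evaluated policy
rewards = rng.binomial(1, 0.3, size=n).astype(float)

print("plain IPS      :", ips_with_baseline(rewards, logged_probs, target_probs))
print("baseline = mean:", ips_with_baseline(rewards, logged_probs, target_probs,
                                            baseline=rewards.mean()))
```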
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Unveiling the Potential of Robustness in Selecting Conditional Average Treatment Effect Estimators [19.053826145863113]
This paper introduces a Distributionally Robust Metric (DRM) for CATE estimator selection.
DRM is nuisance-free, eliminating the need to fit models for nuisance parameters.
It effectively prioritizes the selection of a distributionally robust CATE estimator.
arXiv Detail & Related papers (2024-02-28T15:12:24Z) - CATE Estimation With Potential Outcome Imputation From Local Regression [24.97657507206549]
We propose a model-agnostic data augmentation method for Conditional Average Treatment Effect estimation. Inspired by this idea, we propose a contrastive learning approach that reliably imputes missing potential outcomes. We provide both theoretical guarantees and extensive numerical studies demonstrating the effectiveness of our approach.
arXiv Detail & Related papers (2023-11-07T00:36:51Z) - B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding.
We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z) - Model-Free Reinforcement Learning with the Decision-Estimation Coefficient [79.30248422988409]
We consider the problem of interactive decision making, encompassing structured bandits and reinforcement learning with general function approximation.
We use this approach to derive regret bounds for model-free reinforcement learning with value function approximation, and give structural results showing when it can and cannot help more generally.
arXiv Detail & Related papers (2022-11-25T17:29:40Z) - Policy-Adaptive Estimator Selection for Off-Policy Evaluation [12.1655494876088]
Off-policy evaluation (OPE) aims to accurately evaluate the performance of counterfactual policies using only offline logged data.
This paper studies this challenging problem of estimator selection for OPE for the first time.
In particular, we enable estimator selection that is adaptive to a given OPE task by appropriately subsampling the available logged data and constructing pseudo policies.
arXiv Detail & Related papers (2022-11-25T05:31:42Z) - Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation [87.54604263202941]
We propose a tiny deep neural network whose partial layers are iteratively exploited to refine its previous estimations.
We employ learned gating criteria to decide whether to exit the weight-sharing loop, allowing per-sample adaptation in our model.
Our method consistently outperforms state-of-the-art 2D/3D hand pose estimation approaches in both accuracy and efficiency on widely used benchmarks.
arXiv Detail & Related papers (2021-11-11T23:31:34Z)
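To make the iterative-refinement idea in the entry above concrete, below is a small PyTorch sketch of a weight-sharing refinement loop with a learned exit gate. All dimensions, the residual update, and the batch-level (rather than strictly per-sample) exit test are simplifying assumptions for illustration and do not reproduce the paper's architecture.

```python
import torch
import torch.nn as nn

class IterativeRefiner(nn.Module):
    """Illustrative weight-sharing refinement loop with a learned exit gate."""

    def __init__(self, feat_dim=64, pose_dim=63, max_iters=4, exit_threshold=0.5):
        super().__init__()
        # Shared refinement block, reused at every iteration (weight sharing).
        self.refine = nn.Sequential(
            nn.Linear(feat_dim + pose_dim, 128), nn.ReLU(), nn.Linear(128, pose_dim))
        # Gate predicting the probability that further refinement is unnecessary.
        self.gate = nn.Sequential(
            nn.Linear(feat_dim + pose_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
        self.max_iters = max_iters
        self.exit_threshold = exit_threshold

    def forward(self, feats, init_pose):
        pose = init_pose
        for _ in range(self.max_iters):
            h = torch.cat([feats, pose], dim=-1)
            pose = pose + self.refine(h)                      # residual refinement
            if self.gate(h).mean() > self.exit_threshold:     # early exit from the loop
                break
        return pose

model = IterativeRefiner()
feats = torch.randn(8, 64)       # image features from some backbone (assumed)
init_pose = torch.zeros(8, 63)   # e.g. 21 joints x 3 coordinates, illustrative
print(model(feats, init_pose).shape)
```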
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.