Policy-Aligned Estimation of Conditional Average Treatment Effects
- URL: http://arxiv.org/abs/2512.13400v1
- Date: Mon, 15 Dec 2025 14:51:02 GMT
- Title: Policy-Aligned Estimation of Conditional Average Treatment Effects
- Authors: Artem Timoshenko, Caio Waisman
- Abstract summary: We propose an approach to estimate the conditional average treatment effects (CATEs) of marketing actions. By modifying the firm's objective function in the standard profit maximization problem, our method yields a near-optimal targeting policy.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Firms often develop targeting policies to personalize marketing actions and improve incremental profits. Effective targeting depends on accurately separating customers with positive versus negative treatment effects. We propose an approach to estimate the conditional average treatment effects (CATEs) of marketing actions that aligns their estimation with the firm's profit objective. The method recognizes that, for many customers, treatment effects are so extreme that additional accuracy is unlikely to change the recommended actions. However, accuracy matters near the decision boundary, as small errors can alter targeting decisions. By modifying the firm's objective function in the standard profit maximization problem, our method yields a near-optimal targeting policy while simultaneously estimating CATEs. This introduces a new perspective on CATE estimation, reframing it as a problem of profit optimization rather than prediction accuracy. We establish the theoretical properties of the proposed method and demonstrate its performance and trade-offs using synthetic data.
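The abstract's central idea — that estimation error only matters for profit when it flips a targeting decision near the CATE decision boundary — can be illustrated with a small sketch. This is not the authors' exact objective; it is a minimal two-stage heuristic on synthetic randomized data, where a pilot CATE fit is re-estimated with sample weights concentrated near the boundary $\tau(x) = 0$. The bandwidth parameter and the Gaussian weighting scheme are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic example: the true CATE is linear in a single covariate x,
# so customers with x > 0.5 have positive treatment effects.
n = 2000
x = rng.uniform(-1, 1, n)
tau_true = 2.0 * (x - 0.5)            # true CATE; decision boundary at x = 0.5
w = rng.integers(0, 2, n)             # randomized 50/50 treatment assignment
y = 1.0 + 0.5 * x + w * tau_true + rng.normal(0, 0.5, n)

# IPW pseudo-outcome under 50/50 randomization: E[y_ipw | x] = tau(x),
# so regressing y_ipw on x estimates the CATE.
y_ipw = (w / 0.5 - (1 - w) / 0.5) * y

def fit_weighted_linear(x, y, sample_w):
    """Weighted least squares of y on [1, x] (rows scaled by sqrt(weight))."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(sample_w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Stage 1: unweighted pilot fit of the CATE.
beta0 = fit_weighted_linear(x, y_ipw, np.ones(n))
tau_hat0 = beta0[0] + beta0[1] * x

# Stage 2: refit with weights concentrated near the decision boundary
# tau = 0. Errors far from the boundary rarely change the recommended
# action, so those observations receive little weight.
bandwidth = 0.5                        # hypothetical tuning parameter
weights = np.exp(-(tau_hat0 / bandwidth) ** 2)
beta1 = fit_weighted_linear(x, y_ipw, weights)
tau_hat1 = beta1[0] + beta1[1] * x

# Targeting policy: treat when the estimated CATE is positive.
policy = tau_hat1 > 0
optimal = tau_true > 0
agreement = (policy == optimal).mean()
print(f"policy agreement with oracle: {agreement:.3f}")
```

The point of the reweighting step is that squared-error loss is traded for decision quality: the stage-2 fit tolerates larger CATE errors for customers whose treatment effects are clearly extreme, in exchange for accuracy where it determines the action.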
Related papers
- Direct Profit Estimation Using Uplift Modeling under Clustered Network Interference
Uplift modeling is a key technique for promotion optimization in recommender systems. Recent developments in interference-aware estimators such as Additive Inverse Propensity Weighting have not found their way into the uplift modeling literature yet.
arXiv Detail & Related papers (2025-09-01T15:38:13Z) - Treatment Effect Estimation for Optimal Decision-Making [65.30942348196443]
We study optimal decision-making based on two-stage CATE estimators. We propose a novel two-stage learning objective that retargets the CATE to balance CATE estimation error and decision performance.
arXiv Detail & Related papers (2025-05-19T13:24:57Z) - Achieving Fairness in Predictive Process Analytics via Adversarial Learning [50.31323204077591]
This paper addresses the challenge of integrating a debiasing phase into predictive business process analytics.
Our framework leverages adversarial debiasing and is evaluated on four case studies, showing a significant reduction in the contribution of biased variables to the predicted value.
arXiv Detail & Related papers (2024-10-03T15:56:03Z) - Metalearners for Ranking Treatment Effects [1.469168639465869]
We show how learning to rank can maximize the area under a policy's incremental profit curve.
arXiv Detail & Related papers (2024-05-03T15:31:18Z) - Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori.
In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty.
We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
arXiv Detail & Related papers (2024-04-29T08:16:30Z) - Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation [46.61909578101735]
Adversarial Policy Optimization (AdvPO) is a novel solution to the pervasive issue of reward over-optimization in Reinforcement Learning from Human Feedback.
In this paper, we introduce a lightweight way to quantify uncertainties in rewards, relying solely on the last layer embeddings of the reward model.
arXiv Detail & Related papers (2024-03-08T09:20:12Z) - Off-Policy Evaluation for Large Action Spaces via Policy Convolution [60.6953713877886]
The Policy Convolution family of estimators uses latent structure within actions to strategically convolve the logging and target policies.
Experiments on synthetic and benchmark datasets demonstrate remarkable mean squared error (MSE) improvements when using PC.
arXiv Detail & Related papers (2023-10-24T01:00:01Z) - A predict-and-optimize approach to profit-driven churn prevention [1.03590082373586]
We frame the task of targeting customers for a retention campaign as a regret minimization problem.
Our proposed model aligns with the guidelines of Predict-and-optimize (PnO) frameworks and can be efficiently solved using gradient descent methods.
Results underscore the effectiveness of our approach, which achieves the best average profit compared to other well-established strategies.
arXiv Detail & Related papers (2023-10-10T22:21:16Z) - Off-Policy Evaluation with Policy-Dependent Optimization Response [90.28758112893054]
We develop a new framework for off-policy evaluation with a policy-dependent linear optimization response.
We construct unbiased estimators for the policy-dependent estimand by a perturbation method.
We provide a general algorithm for optimizing causal interventions.
arXiv Detail & Related papers (2022-02-25T20:25:37Z) - Variance-Aware Off-Policy Evaluation with Linear Function Approximation [85.75516599931632]
We study the off-policy evaluation problem in reinforcement learning with linear function approximation.
We propose an algorithm, VA-OPE, which uses the estimated variance of the value function to reweight the Bellman residual in Fitted Q-Iteration.
arXiv Detail & Related papers (2021-06-22T17:58:46Z) - To do or not to do: cost-sensitive causal decision-making [3.492636597449942]
We introduce a cost-sensitive decision boundary for double binary causal classification.
The boundary allows causally classifying instances into the positive and negative treatment classes to maximize the expected causal profit.
We introduce the expected causal profit ranker which ranks instances for maximizing the expected causal profit.
arXiv Detail & Related papers (2021-01-05T08:36:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.