Overlap-Adaptive Regularization for Conditional Average Treatment Effect Estimation
- URL: http://arxiv.org/abs/2509.24962v1
- Date: Mon, 29 Sep 2025 15:56:24 GMT
- Title: Overlap-Adaptive Regularization for Conditional Average Treatment Effect Estimation
- Authors: Valentyn Melnychuk, Dennis Frauen, Jonas Schweisthal, Stefan Feuerriegel
- Abstract summary: State-of-the-art methods for CATE estimation often perform poorly in the presence of low overlap. We introduce Overlap-Adaptive Regularization (OAR) that regularizes target models proportionally to overlap weights. Our OAR significantly improves CATE estimation in low-overlap settings in comparison to constant regularization.
- Score: 59.153491256972806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The conditional average treatment effect (CATE) is widely used in personalized medicine to inform therapeutic decisions. However, state-of-the-art methods for CATE estimation (so-called meta-learners) often perform poorly in the presence of low overlap. In this work, we introduce a new approach to tackle this issue and improve the performance of existing meta-learners in the low-overlap regions. Specifically, we introduce Overlap-Adaptive Regularization (OAR) that regularizes target models proportionally to overlap weights so that, informally, the regularization is higher in regions with low overlap. To the best of our knowledge, our OAR is the first approach to leverage overlap weights in the regularization terms of the meta-learners. Our OAR approach is flexible and works with any existing CATE meta-learner: we demonstrate how OAR can be applied to both parametric and non-parametric second-stage models. Furthermore, we propose debiased versions of our OAR that preserve the Neyman-orthogonality of existing meta-learners and thus ensure more robust inference. Through a series of (semi-)synthetic experiments, we demonstrate that our OAR significantly improves CATE estimation in low-overlap settings in comparison to constant regularization.
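The abstract describes the core idea: scale the regularization strength per sample so that it is larger in regions where overlap is low (propensity scores near 0 or 1). The paper's exact objective is not given here, so the following is only a minimal illustrative sketch of that idea for a linear second-stage model: the classic overlap weight `e(x)(1 - e(x))` is small where overlap is low, and dividing a base penalty by it yields a per-sample penalty on the model's predictions that grows in low-overlap regions. The function name, the pseudo-outcome input, and the closed-form weighted-ridge penalty are all assumptions for illustration, not the paper's actual OAR estimator.

```python
import numpy as np

def overlap_adaptive_ridge(X, pseudo_outcomes, propensity, base_lam=1.0):
    """Illustrative overlap-adaptive penalty for a linear second-stage model.

    Solves  min_b ||y - X b||^2 + sum_i w_i (x_i' b)^2,
    where w_i = base_lam / (e_i (1 - e_i)) is large when the
    propensity e_i is close to 0 or 1 (i.e., when overlap is low),
    so predictions are shrunk harder exactly in low-overlap regions.
    """
    eps = 1e-6  # guard against division by zero at e in {0, 1}
    overlap = np.clip(propensity * (1.0 - propensity), eps, None)
    w = base_lam / overlap  # per-sample penalty strength

    # Closed form: b = (X'X + X' diag(w) X)^{-1} X' y
    XtX = X.T @ X
    XtWX = X.T @ (w[:, None] * X)
    return np.linalg.solve(XtX + XtWX, X.T @ pseudo_outcomes)
```

With `base_lam = 0` this reduces to ordinary least squares on the pseudo-outcomes; increasing `base_lam` shrinks the fitted coefficients, and the shrinkage is concentrated on samples with extreme propensities, which is the qualitative behavior the abstract attributes to OAR.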
Related papers
- Mitigating Forgetting in Low Rank Adaptation [17.859306837144732]
We present LaLoRA, a weight-space regularization technique that applies a Laplace approximation to Low-Rank Adaptation. Our approach estimates the model's confidence in each parameter and constrains updates in high-curvature directions. We evaluate LaLoRA by fine-tuning a Llama model for mathematical reasoning and demonstrate an improved learning-forgetting trade-off.
arXiv Detail & Related papers (2025-12-19T15:54:36Z)
- Overlap-weighted orthogonal meta-learner for treatment effect estimation over time [90.46786193198744]
We introduce a novel overlap-weighted meta-learner for estimating heterogeneous treatment effects (HTEs). Our WO-learner has the favorable property of Neyman-orthogonality, meaning that it is robust against misspecification in the nuisance functions. We show that our WO-learner is fully model-agnostic and can be applied to any machine learning model.
arXiv Detail & Related papers (2025-10-22T14:47:57Z)
- Hybrid Meta-learners for Estimating Heterogeneous Treatment Effects [1.9506923346234724]
Estimating conditional average treatment effects (CATE) from observational data involves modeling decisions that differ from supervised learning. Previous approaches can be grouped into two primary "meta-learner" paradigms that impose distinct inductive biases. We introduce the Hybrid Learner (H-learner), a novel regularization strategy that interpolates between the direct and indirect regularizations depending on the dataset.
arXiv Detail & Related papers (2025-06-16T16:37:20Z)
- A Meta-learner for Heterogeneous Effects in Difference-in-Differences [17.361857058902494]
We propose a doubly robust meta-learner for the estimation of the Conditional Average Treatment Effect on the Treated (CATT). Our framework allows for the flexible estimation of the CATT, when conditioning on any subset of variables of interest using generic machine learning.
arXiv Detail & Related papers (2025-02-07T07:04:37Z)
- Improving the Estimation of Lifetime Effects in A/B Testing via Treatment Locality [16.36651676133996]
We develop optimal inference techniques for general A/B testing in Markov Decision Processes. We propose methods to harness the localized structure by sharing information on the non-targeted states. We show that all such estimators can benefit from variance reduction through information sharing without increasing their bias.
arXiv Detail & Related papers (2024-07-29T00:41:11Z)
- Conformal Meta-learners for Predictive Inference of Individual Treatment Effects [0.0]
We investigate the problem of machine learning-based (ML) predictive inference on individual treatment effects (ITEs).
We develop conformal meta-learners, a general framework for issuing predictive intervals for ITEs by applying the standard conformal prediction (CP) procedure on top of CATE meta-learners.
arXiv Detail & Related papers (2023-08-28T20:32:22Z)
- Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- An Investigation of the Bias-Variance Tradeoff in Meta-Gradients [53.28925387487846]
Hessian estimation always adds bias and can also add variance to meta-gradient estimation.
We study the bias and variance tradeoff arising from truncated backpropagation and sampling correction.
arXiv Detail & Related papers (2022-09-22T20:33:05Z)
- Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality [65.67315418971688]
Nearest Orthogonal Gradient (NOG) and Optimal Learning Rate (OLR) are proposed.
Experiments on visual recognition demonstrate that our methods can simultaneously improve the covariance conditioning and generalization.
arXiv Detail & Related papers (2022-07-05T15:39:29Z)
- Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework by a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.