Doubly robust inference via calibration
- URL: http://arxiv.org/abs/2411.02771v2
- Date: Fri, 27 Jun 2025 18:36:47 GMT
- Title: Doubly robust inference via calibration
- Authors: Lars van der Laan, Alex Luedtke, Marco Carone
- Abstract summary: We show that calibrating the nuisance estimators within a doubly robust procedure yields doubly robust asymptotic normality for linear functionals. Our theoretical analysis shows that the calibrated DML estimator remains asymptotically normal if either the regression or the Riesz representer of the functional is estimated sufficiently well. Our method can be integrated into existing DML pipelines by adding just a few lines of code to calibrate cross-fitted estimates via isotonic regression.
- Score: 0.9694940903078658
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Doubly robust estimators are widely used for estimating average treatment effects and other linear summaries of regression functions. While consistency requires only one of two nuisance functions to be estimated consistently, asymptotic normality typically requires sufficiently fast convergence of both. In this work, we correct this mismatch: we show that calibrating the nuisance estimators within a doubly robust procedure yields doubly robust asymptotic normality for linear functionals. We introduce a general framework, calibrated debiased machine learning (calibrated DML), and propose a specific estimator that augments standard DML with a simple isotonic regression adjustment. Our theoretical analysis shows that the calibrated DML estimator remains asymptotically normal if either the regression or the Riesz representer of the functional is estimated sufficiently well, allowing the other to converge arbitrarily slowly or even inconsistently. We further propose a simple bootstrap method for constructing confidence intervals, enabling doubly robust inference without additional nuisance estimation. In a range of semi-synthetic benchmark datasets, calibrated DML reduces bias and improves coverage relative to standard DML. Our method can be integrated into existing DML pipelines by adding just a few lines of code to calibrate cross-fitted estimates via isotonic regression.
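As an illustration of that last point, below is a minimal sketch (not the authors' reference implementation) of how isotonic calibration can be added to a cross-fitted AIPW/DML estimator of the average treatment effect. The simulated data, the random-forest nuisance models, and all variable names are illustrative assumptions; only the calibrate-then-debias structure follows the abstract, and the paper's bootstrap confidence interval is replaced here by a standard Wald interval.

```python
# Sketch: isotonic calibration of cross-fitted nuisance estimates inside a
# standard AIPW/DML pipeline for the average treatment effect (ATE).
# Data-generating process and model choices are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.normal(size=(n, d))
propensity = 1 / (1 + np.exp(-X[:, 0]))
A = rng.binomial(1, propensity)            # binary treatment
Y = X[:, 0] + 2 * A + rng.normal(size=n)   # outcome with true ATE = 2

# Step 1: cross-fitted nuisance estimates, exactly as in standard DML.
mu0, mu1, pi = np.zeros(n), np.zeros(n), np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    reg = RandomForestRegressor(n_estimators=200, random_state=0)
    reg.fit(np.column_stack([X[train], A[train]]), Y[train])
    mu0[test] = reg.predict(np.column_stack([X[test], np.zeros(len(test))]))
    mu1[test] = reg.predict(np.column_stack([X[test], np.ones(len(test))]))
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[train], A[train])
    pi[test] = clf.predict_proba(X[test])[:, 1]

# Step 2: the extra calibration step -- isotonic regression of the observed
# outcome/treatment on the cross-fitted predictions.
iso_mu0 = IsotonicRegression(out_of_bounds="clip").fit(mu0[A == 0], Y[A == 0])
iso_mu1 = IsotonicRegression(out_of_bounds="clip").fit(mu1[A == 1], Y[A == 1])
iso_pi = IsotonicRegression(y_min=1e-3, y_max=1 - 1e-3,
                            out_of_bounds="clip").fit(pi, A)
mu0_c, mu1_c, pi_c = iso_mu0.predict(mu0), iso_mu1.predict(mu1), iso_pi.predict(pi)

# Step 3: doubly robust (AIPW) score with calibrated nuisances; its mean
# estimates the ATE. A Wald interval is shown; the paper proposes a bootstrap.
psi = (mu1_c - mu0_c
       + A * (Y - mu1_c) / pi_c
       - (1 - A) * (Y - mu0_c) / (1 - pi_c))
ate = psi.mean()
se = psi.std(ddof=1) / np.sqrt(n)
print(f"calibrated DML ATE estimate: {ate:.3f} +/- {1.96 * se:.3f}")
```

In this sketch the only departure from a vanilla cross-fitted AIPW pipeline is the three IsotonicRegression fits in Step 2, which is meant to mirror the "few lines of code" the abstract describes.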
Related papers
- Multivariate Latent Recalibration for Conditional Normalizing Flows [2.3020018305241337]
Latent recalibration learns a transformation of the latent space with finite-sample bounds on latent calibration. LR consistently improves latent calibration error and the negative log-likelihood of the recalibrated models.
arXiv Detail & Related papers (2025-05-22T13:08:20Z) - Calibration Strategies for Robust Causal Estimation: Theoretical and Empirical Insights on Propensity Score Based Estimators [0.6562256987706128]
partitioning of data for estimation and calibration critically impacts the performance of propensity score based estimators.
We extend recent advances in calibration techniques for propensity score estimation, improving the robustness of propensity scores in challenging settings.
arXiv Detail & Related papers (2025-03-21T16:41:10Z) - Automatic Double Reinforcement Learning in Semiparametric Markov Decision Processes with Applications to Long-Term Causal Inference [33.14076284663493]
Markov Decision Processes (MDPs) offer a principled framework for modeling outcomes as sequences of states, actions, and rewards over time.
We introduce a semiparametric extension of Double Reinforcement Learning (DRL) for statistically efficient, model-robust inference on linear functionals of the Q-function.
We develop a novel debiased plug-in estimator based on isotonic Bellman calibration, which integrates fitted Q-iteration with an isotonic regression step.
arXiv Detail & Related papers (2025-01-12T20:35:28Z) - Automatic debiasing of neural networks via moment-constrained learning [0.0]
Naively learning the regression function and taking a sample mean of the target functional results in biased estimators.
We propose moment-constrained learning as a new RR learning approach that addresses some shortcomings in automatic debiasing.
arXiv Detail & Related papers (2024-09-29T20:56:54Z) - Improving the Finite Sample Performance of Double/Debiased Machine Learning with Propensity Score Calibration [0.0]
Double/debiased machine learning (DML) uses a double-robust score function that relies on the prediction of nuisance functions.
Estimators relying on double-robust score functions are highly sensitive to errors in propensity score predictions.
This paper investigates the use of probability calibration approaches within the DML framework.
arXiv Detail & Related papers (2024-09-07T17:44:01Z) - Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate. We stress that our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Orthogonal Causal Calibration [55.28164682911196]
We develop general algorithms for reducing the task of causal calibration to that of calibrating a standard (non-causal) predictive model. Our results are exceedingly general, showing that essentially any existing calibration algorithm can be used in causal settings.
arXiv Detail & Related papers (2024-06-04T03:35:25Z) - Doubly Robust Proximal Causal Learning for Continuous Treatments [56.05592840537398]
We propose a kernel-based doubly robust causal learning estimator for continuous treatments.
We show that its oracle form is a consistent approximation of the influence function.
We then provide a comprehensive convergence analysis in terms of the mean square error.
arXiv Detail & Related papers (2023-09-22T12:18:53Z) - Spectrum-Aware Debiasing: A Modern Inference Framework with Applications to Principal Components Regression [1.342834401139078]
We introduce Spectrum-Aware Debiasing, a novel method for high-dimensional regression.
Our approach applies to problems with structured, heavy tails, and low-rank structures.
We demonstrate our method through simulated and real data experiments.
arXiv Detail & Related papers (2023-09-14T15:58:30Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z) - Automatic Debiased Machine Learning for Dynamic Treatment Effects and
General Nested Functionals [23.31865419578237]
We extend the idea of automated debiased machine learning to the dynamic treatment regime and more generally to nested functionals.
We show that the multiply robust formula for the dynamic treatment regime with discrete treatments can be re-stated in terms of a Riesz representer characterization of nested mean regressions.
arXiv Detail & Related papers (2022-03-25T19:54:17Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Mostly Harmless Machine Learning: Learning Optimal Instruments in Linear
IV Models [3.7599363231894176]
We offer theoretical results that justify incorporating machine learning in the standard linear instrumental variable setting.
We use machine learning, combined with sample-splitting, to predict the treatment variable from the instrument.
This allows the researcher to extract non-linear co-variation between the treatment and instrument.
arXiv Detail & Related papers (2020-11-12T01:55:11Z) - Machine learning for causal inference: on the use of cross-fit
estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z) - Distributional Robustness and Regularization in Reinforcement Learning [62.23012916708608]
We introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributionally robust value function.
It suggests using regularization as a practical tool for dealing with external uncertainty in reinforcement learning.
arXiv Detail & Related papers (2020-03-05T19:56:23Z) - Localized Debiased Machine Learning: Efficient Inference on Quantile
Treatment Effects and Beyond [69.83813153444115]
We consider an efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference.
Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisances.
We propose localized debiased machine learning (LDML), which avoids this burdensome step.
arXiv Detail & Related papers (2019-12-30T14:42:52Z)