Optimal Nuisance Function Tuning for Estimating a Doubly Robust Functional under Proportional Asymptotics
- URL: http://arxiv.org/abs/2509.25536v2
- Date: Fri, 24 Oct 2025 20:45:06 GMT
- Title: Optimal Nuisance Function Tuning for Estimating a Doubly Robust Functional under Proportional Asymptotics
- Authors: Sean McGrath, Debarghya Mukherjee, Rajarshi Mukherjee, Zixiao Jolene Wang,
- Abstract summary: We evaluate three existing ECC estimators and two sample splitting strategies for estimating the required nuisance functions.<n>We show that our bias correction strategy yields $sqrtn$-consistent estimators across different sample splitting strategies and estimator choices.<n>Our analysis reveals that prediction- tuning parameters (i.e., those that optimally estimate the nuisance functions) may not lead to the lowest variance of the ECC estimator.
- Score: 9.86496801565209
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we explore the asymptotically optimal tuning parameter choice in ridge regression for estimating nuisance functions of a statistical functional that has recently gained prominence in conditional independence testing and causal inference. Given a sample of size $n$, we study estimators of the Expected Conditional Covariance (ECC) between variables $Y$ and $A$ given a high-dimensional covariate $X \in \mathbb{R}^p$. Under linear regression models for $Y$ and $A$ on $X$ and the proportional asymptotic regime $p/n \to c \in (0, \infty)$, we evaluate three existing ECC estimators and two sample splitting strategies for estimating the required nuisance functions. Since no consistent estimator of the nuisance functions exists in the proportional asymptotic regime without imposing further structure on the problem, we first derive debiased versions of the ECC estimators that utilize the ridge regression nuisance function estimators. We show that our bias correction strategy yields $\sqrt{n}$-consistent estimators of the ECC across different sample splitting strategies and estimator choices. We then derive the asymptotic variances of these debiased estimators to illustrate the nuanced interplay between the sample splitting strategy, estimator choice, and tuning parameters of the nuisance function estimators for optimally estimating the ECC. Our analysis reveals that prediction-optimal tuning parameters (i.e., those that optimally estimate the nuisance functions) may not lead to the lowest asymptotic variance of the ECC estimator -- thereby demonstrating the need to be careful in selecting tuning parameters based on the final goal of inference. Finally, we verify our theoretical results through extensive numerical experiments.
Related papers
- Near-Efficient and Non-Asymptotic Multiway Inference [3.131740922192115]
We establish non-asymptotic efficiency guarantees for tensor decomposition-based inference in count data models.<n>A rank-constrained maximum-likelihood estimator achieves multiway analysis with variance matching the Cram'er-Rao Lower Bound.<n>For higher ranks, we illustrate that our multiway estimator may not attain the CRLB; nevertheless, CP-based parametric inference remains nearly minimax optimal.
arXiv Detail & Related papers (2025-11-07T15:54:31Z) - On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization [57.179679246370114]
A potential limitation of existing methods is the bias inherent in most perturbation estimators unless a stepsize is proposed.<n>We propose a novel family of unbiased gradient scaling estimators that eliminate bias while maintaining favorable construction.
arXiv Detail & Related papers (2025-10-22T18:25:43Z) - Debiased Ill-Posed Regression [8.495265117285223]
We propose a debiased estimation strategy based on the influence function of a modification of the projected error.<n>Our proposed estimator possesses a second-order bias with respect to the involved nuisance functions.
arXiv Detail & Related papers (2025-05-27T06:47:33Z) - Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $sqrt n $-rate.<n>We stress that our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size dependent parameters smoothing.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $mathcalF$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Nuisance Function Tuning and Sample Splitting for Optimal Doubly Robust Estimation [5.018363990542611]
We consider the problem of how to estimate nuisance functions to obtain optimal rates of convergence for a doubly robust nonparametric functional.
We show that plug-in and first-order biased-corrected estimators can achieve minimax rates of convergence across all H"older smoothness classes of the nuisance functions.
arXiv Detail & Related papers (2022-12-30T18:17:06Z) - Off-policy estimation of linear functionals: Non-asymptotic theory for
semi-parametric efficiency [59.48096489854697]
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.
We prove non-asymptotic upper bounds on the mean-squared error of such procedures.
We establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds.
arXiv Detail & Related papers (2022-09-26T23:50:55Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - On the Estimation of Derivatives Using Plug-in Kernel Ridge Regression
Estimators [4.392844455327199]
We propose a simple plug-in kernel ridge regression (KRR) estimator in nonparametric regression.
We provide a non-asymotic analysis to study the behavior of the proposed estimator in a unified manner.
The proposed estimator achieves the optimal rate of convergence with the same choice of tuning parameter for any order of derivatives.
arXiv Detail & Related papers (2020-06-02T02:32:39Z) - SUMO: Unbiased Estimation of Log Marginal Probability for Latent
Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.