Automatic Debiased Machine Learning for Smooth Functionals of Nonparametric M-Estimands
- URL: http://arxiv.org/abs/2501.11868v1
- Date: Tue, 21 Jan 2025 03:50:51 GMT
- Title: Automatic Debiased Machine Learning for Smooth Functionals of Nonparametric M-Estimands
- Authors: Lars van der Laan, Aurelien Bibaut, Nathan Kallus, Alex Luedtke
- Abstract summary: We propose a unified framework for automatic debiased machine learning (autoDML) to perform inference on smooth functionals of infinite-dimensional M-estimands.
We introduce three autoDML estimators based on one-step estimation, targeted minimum loss-based estimation, and the method of sieves.
For data-driven model selection, we derive a novel decomposition of model approximation error for smooth functionals of M-estimands.
- Score: 34.30497962430375
- Abstract: We propose a unified framework for automatic debiased machine learning (autoDML) to perform inference on smooth functionals of infinite-dimensional M-estimands, defined as population risk minimizers over Hilbert spaces. By automating debiased estimation and inference procedures in causal inference and semiparametric statistics, our framework enables practitioners to construct valid estimators for complex parameters without requiring specialized expertise. The framework supports Neyman-orthogonal loss functions with unknown nuisance parameters requiring data-driven estimation, as well as vector-valued M-estimands involving simultaneous loss minimization across multiple Hilbert space models. We formalize the class of parameters efficiently estimable by autoDML as a novel class of nonparametric projection parameters, defined via orthogonal minimum loss objectives. We introduce three autoDML estimators based on one-step estimation, targeted minimum loss-based estimation, and the method of sieves. For data-driven model selection, we derive a novel decomposition of model approximation error for smooth functionals of M-estimands and propose adaptive debiased machine learning estimators that are superefficient and adaptive to the functional form of the M-estimand. Finally, we illustrate the flexibility of our framework by constructing autoDML estimators for the long-term survival under a beta-geometric model.
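The one-step estimation idea mentioned in the abstract can be illustrated with a minimal sketch on a much simpler problem than the paper's general setting. The example below is not the authors' autoDML implementation. It assumes a squared-error loss, a randomized binary treatment with known propensity 0.5, and the average treatment effect as the smooth functional, for which the Riesz representer has the closed form `a / p - (1 - a) / (1 - p)`. The function names (`simulate`, `fit_outcome`, `one_step_ate`) and the simulated data are hypothetical.

```python
import numpy as np


def simulate(n, rng):
    """Hypothetical data: randomized treatment, true average treatment effect = 1."""
    x = rng.normal(size=n)
    a = rng.binomial(1, 0.5, size=n)            # known propensity 0.5 (assumption)
    y = 1.0 * a + x**2 + rng.normal(size=n)
    return x, a, y


def fit_outcome(x, a, y):
    """Least-squares fit of y on (1, a, x); deliberately misspecified (omits x**2)."""
    design = np.column_stack([np.ones_like(x), a, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    def predict(x_new, a_new):
        return np.column_stack([np.ones_like(x_new), a_new, x_new]) @ coef

    return predict


def one_step_ate(x, a, y, propensity=0.5, n_folds=2, seed=0):
    """Cross-fitted one-step estimator: plug-in plus Riesz-weighted residual."""
    n = len(y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    phi = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        mu = fit_outcome(x[train], a[train], y[train])   # nuisance fit on other folds
        xt, at, yt = x[test], a[test], y[test]
        plug_in = mu(xt, np.ones_like(xt)) - mu(xt, np.zeros_like(xt))
        riesz = at / propensity - (1 - at) / (1 - propensity)
        phi[test] = plug_in + riesz * (yt - mu(xt, at))  # debiasing correction
    estimate = phi.mean()
    std_error = phi.std(ddof=1) / np.sqrt(n)
    return estimate, std_error


if __name__ == "__main__":
    x, a, y = simulate(4000, np.random.default_rng(1))
    est, se = one_step_ate(x, a, y)
    print(f"estimate={est:.3f}, 95% CI=({est - 1.96 * se:.3f}, {est + 1.96 * se:.3f})")
```

In this sketch the correction term keeps the estimator centered on the true effect even though the outcome model is misspecified, because the propensity is known. In the paper's framework the representer is instead learned from data for general losses and functionals.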
Related papers
- Statistical learning for constrained functional parameters in infinite-dimensional models with applications in fair machine learning [4.974815773537217]
We study the general problem of constrained statistical machine learning through a statistical functional lens.
We characterize the constrained functional parameter as the minimizer of a penalized risk criterion using a Lagrange multiplier formulation.
Our results suggest natural estimators of the constrained parameter that can be constructed by combining estimates of unconstrained parameters.
arXiv Detail & Related papers (2024-04-15T14:59:21Z)
- Data-free Weight Compress and Denoise for Large Language Models [101.53420111286952]
We propose a novel approach termed Data-free Joint Rank-k Approximation for compressing the parameter matrices.
We achieve a model pruning of 80% parameters while retaining 93.43% of the original performance without any calibration data.
arXiv Detail & Related papers (2024-02-26T05:51:47Z)
- End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Optimize (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
- Hyperparameter Tuning for Causal Inference with Double Machine Learning: A Simulation Study [4.526082390949313]
We empirically assess the relationship between the predictive performance of machine learning methods and the resulting causal estimation.
We conduct an extensive simulation study using data from the 2019 Atlantic Causal Inference Conference Data Challenge.
arXiv Detail & Related papers (2024-02-07T09:01:51Z)
- Adaptive debiased machine learning using data-driven model selection techniques [0.5735035463793007]
Adaptive Debiased Machine Learning (ADML) is a nonparametric framework that combines data-driven model selection and debiased machine learning techniques.
ADML avoids the bias introduced by model misspecification and remains free from the restrictions of parametric and semiparametric models.
We provide a broad class of ADML estimators for estimating the average treatment effect in adaptive partially linear regression models.
arXiv Detail & Related papers (2023-07-24T06:16:17Z)
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
- Revisiting minimum description length complexity in overparameterized models [38.21167656112762]
We provide an extensive theoretical characterization of MDL-COMP for linear models and kernel methods.
For kernel methods, we show that MDL-COMP informs minimax in-sample error, and can decrease as the dimensionality of the input increases.
We also prove that MDL-COMP bounds the in-sample mean squared error (MSE).
arXiv Detail & Related papers (2020-06-17T22:45:14Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
- SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
- Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond [69.83813153444115]
We consider an efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference.
Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisances.
We propose localized debiased machine learning (LDML), which avoids this burdensome step.
arXiv Detail & Related papers (2019-12-30T14:42:52Z)
- Selective machine learning of doubly robust functionals [6.880360838661036]
We propose a selective machine learning framework for making inferences about a finite-dimensional functional defined on a semiparametric model.
We introduce a new selection criterion aimed at bias reduction in estimating the functional of interest based on a novel definition of pseudo-risk.
arXiv Detail & Related papers (2019-11-05T19:00:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.