Finite-Sample Guarantees for High-Dimensional DML
- URL: http://arxiv.org/abs/2206.07386v1
- Date: Wed, 15 Jun 2022 08:48:58 GMT
- Title: Finite-Sample Guarantees for High-Dimensional DML
- Authors: Victor Quintas-Martinez
- Abstract summary: This paper gives novel finite-sample guarantees for joint inference on high-dimensional DML.
These guarantees are useful to applied researchers, as they are informative about how far off the coverage of joint confidence bands can be from the nominal level.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Debiased machine learning (DML) offers an attractive way to estimate
treatment effects in observational settings, where identification of causal
parameters requires a conditional independence or unconfoundedness assumption,
since it allows one to control flexibly for a potentially very large number of
covariates. This paper gives novel finite-sample guarantees for joint inference
on high-dimensional DML, bounding how far the finite-sample distribution of the
estimator is from its asymptotic Gaussian approximation. These guarantees are
useful to applied researchers, as they are informative about how far off the
coverage of joint confidence bands can be from the nominal level. There are
many settings where high-dimensional causal parameters may be of interest, such
as the ATE of many treatment profiles, or the ATE of a treatment on many
outcomes. We also cover infinite-dimensional parameters, such as impacts on the
entire marginal distribution of potential outcomes. The finite-sample
guarantees in this paper complement the existing results on consistency and
asymptotic normality of DML estimators, which are either asymptotic or treat
only the one-dimensional case.
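For orientation, here is a minimal sketch of the kind of estimator these guarantees concern: cross-fitted DML with the doubly robust (AIPW) orthogonal score for a scalar average treatment effect. The learners, fold count, and propensity trimming below are illustrative assumptions rather than choices made in the paper; for a high-dimensional parameter one stacks such scores and builds joint bands from the joint Gaussian approximation whose finite-sample accuracy the paper bounds.

```python
# Minimal sketch (not the paper's exact procedure): cross-fitted DML/AIPW
# for a scalar ATE under unconfoundedness. Learners, fold count, and the
# propensity clipping are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

def dml_ate(y, d, X, n_folds=5, seed=0):
    n = len(y)
    psi = np.zeros(n)  # per-observation orthogonal (AIPW) score
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        # Fit nuisances on the training folds only (cross-fitting).
        m1 = GradientBoostingRegressor().fit(X[train][d[train] == 1], y[train][d[train] == 1])
        m0 = GradientBoostingRegressor().fit(X[train][d[train] == 0], y[train][d[train] == 0])
        ps = GradientBoostingClassifier().fit(X[train], d[train])
        mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
        p = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)  # trimmed propensity
        # Doubly robust score: first-order insensitive to nuisance errors.
        psi[test] = (mu1 - mu0
                     + d[test] * (y[test] - mu1) / p
                     - (1 - d[test]) * (y[test] - mu0) / (1 - p))
    ate = psi.mean()
    se = psi.std(ddof=1) / np.sqrt(n)  # plug-in SE from the Gaussian approximation
    return ate, se  # e.g. ate +/- 1.96 * se for a pointwise 95% interval
```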
Related papers
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for this purpose.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
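As a point of reference for the algorithm this summary refers to, below is a generic sketch of TD(0) with linear function approximation and Polyak-Ruppert averaging of the iterates; the step size, discount factor, and feature map are placeholder assumptions, not the paper's setup.

```python
import numpy as np

def td0_polyak_ruppert(transitions, phi, dim, gamma=0.9, alpha=0.05):
    """TD(0) with linear value approximation V(s) ~ phi(s) @ theta,
    returning the Polyak-Ruppert average of the iterates.
    transitions: iterable of (state, reward, next_state) tuples.
    phi: feature map from states to R^dim. Hyperparameters here are
    illustrative; the paper's step-size schedule may differ."""
    theta = np.zeros(dim)
    theta_bar = np.zeros(dim)
    for t, (s, r, s_next) in enumerate(transitions, start=1):
        f, f_next = phi(s), phi(s_next)
        td_error = r + gamma * (f_next @ theta) - f @ theta
        theta = theta + alpha * td_error * f   # TD(0) update
        theta_bar += (theta - theta_bar) / t   # running average of iterates
    return theta_bar  # averaged iterate, the object of the CLT-type inference
```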
arXiv Detail & Related papers (2024-10-21T15:34:44Z)
- Anytime-Valid Inference for Double/Debiased Machine Learning of Causal Parameters [27.333679232669823]
Double (debiased) machine learning (DML) has seen widespread use in recent years for learning causal/structural parameters.
The classic double-debiased framework is only valid for a predetermined sample size.
This can be of particular concern in large-scale experimental studies with huge financial costs or human lives at stake.
arXiv Detail & Related papers (2024-08-18T21:19:56Z)
- Robust Estimation of the Tail Index of a Single Parameter Pareto Distribution from Grouped Data [0.0]
This paper introduces a novel robust estimation technique, the Method of Truncated Moments (MTuM).
Inferential justification of MTuM is established by employing the central limit theorem and validating it through a comprehensive simulation study.
arXiv Detail & Related papers (2024-01-26T01:42:06Z)
- On the Consistency of Maximum Likelihood Estimation of Probabilistic Principal Component Analysis [1.0528389538549636]
PPCA has a broad spectrum of applications ranging from science and engineering to quantitative finance.
Despite this wide applicability in various fields, hardly any theoretical guarantees exist to justify the soundness of the maximum likelihood (ML) solution for this model.
We propose a novel approach using quotient topological spaces and, in particular, show that the maximum likelihood solution is consistent in an appropriate quotient Euclidean space.
arXiv Detail & Related papers (2023-11-08T22:40:45Z)
- A Targeted Accuracy Diagnostic for Variational Approximations [8.969208467611896]
Variational Inference (VI) is an attractive alternative to Markov Chain Monte Carlo (MCMC).
Existing methods characterize the quality of the whole variational distribution.
We propose the TArgeted Diagnostic for Distribution Approximation Accuracy (TADDAA).
arXiv Detail & Related papers (2023-02-24T02:50:18Z)
- Partial Identification with Noisy Covariates: A Robust Optimization Approach [94.10051154390237]
Causal inference from observational datasets often relies on measuring and adjusting for covariates.
We show that this robust optimization approach can extend a wide range of causal adjustment methods to perform partial identification.
Across synthetic and real datasets, we find that this approach provides ATE bounds with a higher coverage probability than existing methods.
arXiv Detail & Related papers (2022-02-22T04:24:26Z)
- Keep it Tighter -- A Story on Analytical Mean Embeddings [0.6445605125467574]
Kernel techniques are among the most popular and flexible approaches in data science.
The mean embedding gives rise to a divergence measure referred to as maximum mean discrepancy (MMD).
In this paper we focus on the problem of MMD estimation when the mean embedding of one of the underlying distributions is available analytically.
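To make that setting concrete, here is a hedged sketch of a semi-analytic MMD^2 estimator in which one distribution's mean embedding is available in closed form; the Gaussian target Q and the RBF kernel are illustrative assumptions, not necessarily the paper's choices.

```python
# Sketch under illustrative assumptions: Q = N(m, s2*I_d) and an RBF kernel
# k(x, y) = exp(-||x - y||^2 / (2*sigma2)), so Q's mean embedding is closed-form.
import numpy as np

def mmd2_semi_analytic(X, m, s2, sigma2=1.0):
    n, d = X.shape
    # Unbiased empirical estimate of E_{x,x'~P}[k(x, x')] over distinct pairs.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2.0 * sigma2))
    term_pp = (K.sum() - np.trace(K)) / (n * (n - 1))
    # Closed-form mean embedding mu_Q evaluated at the sample points.
    c = sigma2 / (sigma2 + s2)
    mu_q = c ** (d / 2.0) * np.exp(-((X - m) ** 2).sum(-1) / (2.0 * (sigma2 + s2)))
    term_pq = mu_q.mean()
    # Closed-form ||mu_Q||^2 = E_{y,y'~Q}[k(y, y')].
    term_qq = (sigma2 / (sigma2 + 2.0 * s2)) ** (d / 2.0)
    return term_pp - 2.0 * term_pq + term_qq
```

Replacing the empirical Q-terms with their analytic counterparts removes their sampling variance, which is the sense in which the analytic embedding "keeps the bound tighter."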
arXiv Detail & Related papers (2021-10-15T21:29:27Z)
- A Unified Joint Maximum Mean Discrepancy for Domain Adaptation [73.44809425486767]
This paper theoretically derives a unified form of JMMD that is easy to optimize.
From this unified form, we illustrate that JMMD degrades the feature-label dependence that benefits classification.
We propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift.
arXiv Detail & Related papers (2021-01-25T09:46:14Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- Learning, compression, and leakage: Minimising classification error via meta-universal compression principles [87.054014983402]
A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding.
Here we consider a NML-based decision strategy for supervised classification problems, and show that it attains PAC learning when applied to a wide variety of models.
We show that the misclassification rate of our method is upper bounded by the maximal leakage, a recently proposed metric to quantify the potential of data leakage in privacy-sensitive scenarios.
arXiv Detail & Related papers (2020-10-14T20:03:58Z)
- Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond [69.83813153444115]
We consider an efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference.
Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisances.
We propose localized debiased machine learning (LDML), which avoids the burdensome step of estimating the nuisance at every candidate quantile value, requiring it only at a single initial rough guess.
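For orientation, a standard doubly robust (Neyman-orthogonal) moment condition for the tau-quantile of the potential outcome Y(1) under unconfoundedness is sketched below; this is a textbook form consistent with this summary, not necessarily the paper's exact estimating equation.

```latex
% Sketch: doubly robust moment for the \tau-quantile \theta of Y(1).
% Nuisances: propensity \pi(X) = P(D = 1 \mid X) and conditional CDF
% F(y \mid 1, X) of Y given D = 1 and covariates X.
\[
  \mathbb{E}\!\left[
    \frac{D \,\bigl( \mathbf{1}\{Y \le \theta\} - F(\theta \mid 1, X) \bigr)}{\pi(X)}
    + F(\theta \mid 1, X) - \tau
  \right] = 0 .
\]
% LDML's localization means F need only be estimated near a single initial
% guess for \theta, rather than at every candidate value during root search.
```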
arXiv Detail & Related papers (2019-12-30T14:42:52Z)