Debiased Machine Learning without Sample-Splitting for Stable Estimators
- URL: http://arxiv.org/abs/2206.01825v1
- Date: Fri, 3 Jun 2022 21:31:28 GMT
- Title: Debiased Machine Learning without Sample-Splitting for Stable Estimators
- Authors: Qizhao Chen, Vasilis Syrgkanis, Morgane Austern
- Abstract summary: Recent work on debiased machine learning shows how one can use generic machine learning estimators for auxiliary problems.
We show that when these auxiliary estimation algorithms satisfy natural leave-one-out stability properties, then sample splitting is not required.
- Score: 21.502538698559825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimation and inference on causal parameters is typically reduced to a
generalized method of moments problem, which involves auxiliary functions that
correspond to solutions to a regression or classification problem. A recent
line of work on debiased machine learning shows how one can use generic machine
learning estimators for these auxiliary problems while maintaining asymptotic
normality and root-$n$ consistency of the target parameter of interest,
requiring only mean-squared-error guarantees from the auxiliary estimation
algorithms. The literature typically requires that these auxiliary problems are
fitted on a separate sample or in a cross-fitting manner. We show that when
these auxiliary estimation algorithms satisfy natural leave-one-out stability
properties, then sample splitting is not required. This allows for sample
re-use, which can be beneficial in moderately sized sample regimes. For
instance, we show that the stability properties that we propose are satisfied
for ensemble bagged estimators, built via sub-sampling without replacement, a
popular technique in machine learning practice.
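As a concrete illustration of the abstract's claim, the following is a minimal numpy sketch (not the paper's code) of debiasing without sample splitting in a partially linear model: both nuisance regressions are fitted on the full sample with an ensemble-bagged estimator built by subsampling without replacement, and the target coefficient is recovered from the Neyman-orthogonal residual-on-residual moment. The bagged k-NN base learner and all constants are illustrative choices, not prescribed by the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def bagged_predict(X_train, y_train, X_eval, n_bags=50, frac=0.5, k=10):
    """Ensemble-bagged 1-D k-NN regressor, subsampling WITHOUT replacement.
    (Illustrative base learner; the paper's stability argument covers
    generic subsample-bagged estimators, not this k-NN choice.)"""
    n = len(X_train)
    m = int(frac * n)
    preds = np.zeros(len(X_eval))
    for _ in range(n_bags):
        idx = rng.choice(n, size=m, replace=False)  # subsample w/o replacement
        Xs, ys = X_train[idx], y_train[idx]
        d = np.abs(X_eval[:, None] - Xs[None, :])   # pairwise 1-D distances
        nn = np.argsort(d, axis=1)[:, :k]           # k nearest neighbours
        preds += ys[nn].mean(axis=1)
    return preds / n_bags

# Partially linear model: Y = theta*T + g(X) + eps, with T = m0(X) + eta
n, theta = 2000, 1.5
X = rng.uniform(-2, 2, n)
T = np.sin(X) + rng.normal(0, 1, n)
Y = theta * T + X**2 + rng.normal(0, 1, n)

# Nuisances fitted on the FULL sample (no cross-fitting), evaluated in-sample
m_hat = bagged_predict(X, T, X)  # estimate of E[T | X]
l_hat = bagged_predict(X, Y, X)  # estimate of E[Y | X]

# Neyman-orthogonal residual-on-residual moment for theta
Tres, Yres = T - m_hat, Y - l_hat
theta_hat = (Tres @ Yres) / (Tres @ Tres)
```

Because the moment is orthogonal, nuisance estimation error enters only at second order, which is what lets the leave-one-out stability of the bagged learners substitute for sample splitting.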
Related papers
- Concentration Inequalities for the Stochastic Optimization of Unbounded Objectives with Application to Denoising Score Matching [5.022028859839544]
We derive concentration inequalities that bound the statistical error for a large class of optimization problems.
Our results establish the benefit of sample reuse in algorithms that employ easily sampled auxiliary random variables.
arXiv Detail & Related papers (2025-02-12T18:30:36Z) - Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Learning to sample fibers for goodness-of-fit testing [0.0]
We consider the problem of constructing exact goodness-of-fit tests for discrete exponential family models.
We translate the problem into a Markov decision process and demonstrate a reinforcement learning approach for learning 'good moves' for sampling.
Our algorithm is based on an actor-critic sampling scheme, with provable convergence.
arXiv Detail & Related papers (2024-05-22T19:33:58Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Learning to Extrapolate: A Transductive Approach [44.74850954809099]
We tackle the problem of developing machine learning systems that retain the power of overparameterized function approximators.
We propose a simple strategy based on bilinear embeddings to enable this type of generalization.
We instantiate a simple, practical algorithm applicable to various supervised learning and imitation learning tasks.
arXiv Detail & Related papers (2023-04-27T17:00:51Z) - Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
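A minimal numpy sketch of the general idea in that paper, under toy assumptions: split conformal prediction with a normalized nonconformity score, where a simple auxiliary model of residual magnitude stands in for the self-supervised difficulty signal (the paper instead derives this signal from a pretext task trained on top of the predictive model).

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy heteroscedastic data: noise grows with |x|
n = 3000
x = rng.uniform(-3, 3, n)
y = x + rng.normal(0, 0.2 + 0.4 * np.abs(x), n)

# Proper three-way split: fit / calibration / test
fit, cal, test = np.split(rng.permutation(n), [1000, 2000])

# Base predictor: least-squares line on the fit split
w = np.linalg.lstsq(np.c_[x[fit], np.ones(len(fit))], y[fit], rcond=None)[0]
def predict(xs):
    return w[0] * xs + w[1]

# Stand-in for the self-supervised signal: a least-squares model of
# absolute residual size as a function of |x| (hypothetical choice)
res_fit = np.abs(y[fit] - predict(x[fit]))
v = np.linalg.lstsq(np.c_[np.abs(x[fit]), np.ones(len(fit))], res_fit,
                    rcond=None)[0]
def difficulty(xs):
    return np.maximum(v[0] * np.abs(xs) + v[1], 1e-3)

# Normalized nonconformity scores on the calibration split
scores = np.abs(y[cal] - predict(x[cal])) / difficulty(x[cal])
alpha = 0.1
q = np.quantile(scores, np.ceil((len(cal) + 1) * (1 - alpha)) / len(cal))

# Width-adaptive prediction intervals on the test split
lo = predict(x[test]) - q * difficulty(x[test])
hi = predict(x[test]) + q * difficulty(x[test])
coverage = np.mean((y[test] >= lo) & (y[test] <= hi))
```

The normalization makes intervals narrow where the difficulty signal is small and wide where it is large, which is the efficiency gain the paper measures, while the calibration quantile preserves the marginal coverage guarantee.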
arXiv Detail & Related papers (2023-02-23T18:57:14Z) - RF+clust for Leave-One-Problem-Out Performance Prediction [0.9281671380673306]
We study leave-one-problem-out (LOPO) performance prediction.
We analyze whether standard random forest (RF) model predictions can be improved by calibrating them with a weighted average of performance values.
arXiv Detail & Related papers (2023-01-23T16:14:59Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Automatic Debiased Machine Learning for Dynamic Treatment Effects and General Nested Functionals [23.31865419578237]
We extend the idea of automated debiased machine learning to the dynamic treatment regime and more generally to nested functionals.
We show that the multiply robust formula for the dynamic treatment regime with discrete treatments can be re-stated in terms of a Riesz representer characterization of nested mean regressions.
arXiv Detail & Related papers (2022-03-25T19:54:17Z) - Fairness constraint in Structural Econometrics and Application to fair estimation using Instrumental Variables [3.265773263570237]
A supervised machine learning algorithm determines a model from a learning sample that will be used to predict new observations.
This information aggregation does not consider any potential selection on unobservables and any status-quo biases which may be contained in the training sample.
The latter bias has raised concerns around the so-called "fairness" of machine learning algorithms, especially towards disadvantaged groups.
arXiv Detail & Related papers (2022-02-16T15:34:07Z) - Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
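A hedged numpy sketch of the classification scheme analyzed in that last paper, on synthetic Gaussian data: LDA discriminants are fitted in many random low-dimensional projections and combined by majority vote. Dimensions, sample sizes, and hyperparameters here are arbitrary, and the paper's actual contribution (a consistent estimator of the misclassification probability that avoids cross-validation) is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two high-dimensional Gaussian classes with a sparse mean shift
p, n_per = 100, 200
mu = np.zeros(p)
mu[:10] = 1.5
X = np.vstack([rng.normal(0, 1, (n_per, p)),
               rng.normal(0, 1, (n_per, p)) + mu])
y = np.r_[np.zeros(n_per), np.ones(n_per)]

def rp_lda_ensemble(Xtr, ytr, Xte, n_proj=25, d=20):
    """Majority vote over LDA discriminants fitted in random d-dim
    projections (sketch; hyperparameters are arbitrary)."""
    votes = np.zeros(len(Xte))
    for _ in range(n_proj):
        R = rng.normal(0, 1, (Xtr.shape[1], d))   # random projection matrix
        Z, Zt = Xtr @ R, Xte @ R
        m0, m1 = Z[ytr == 0].mean(0), Z[ytr == 1].mean(0)
        S = np.cov(np.vstack([Z[ytr == 0] - m0, Z[ytr == 1] - m1]).T)
        w = np.linalg.solve(S, m1 - m0)           # LDA direction
        b = w @ (m0 + m1) / 2                     # midpoint threshold
        votes += (Zt @ w > b).astype(float)
    return (votes > n_proj / 2).astype(float)

# Fresh test sample from the same two classes
Xte = np.vstack([rng.normal(0, 1, (n_per, p)),
                 rng.normal(0, 1, (n_per, p)) + mu])
yte = np.r_[np.zeros(n_per), np.ones(n_per)]
acc = np.mean(rp_lda_ensemble(X, y, Xte) == yte)
```

Each projection sidesteps the singular pooled covariance one would face in the ambient dimension, and voting across projections recovers much of the signal a single random projection discards.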
This list is automatically generated from the titles and abstracts of the papers on this site.