Theoretical Analysis of Leave-one-out Cross Validation for
Non-differentiable Penalties under High-dimensional Settings
- URL: http://arxiv.org/abs/2402.08543v2
- Date: Wed, 14 Feb 2024 16:28:59 GMT
- Title: Theoretical Analysis of Leave-one-out Cross Validation for
Non-differentiable Penalties under High-dimensional Settings
- Authors: Haolin Zou, Arnab Auddy, Kamiar Rahnama Rad, Arian Maleki
- Abstract summary: We provide finite sample upper bounds on the expected squared error of leave-one-out cross-validation (LO) in estimating the out-of-sample risk.
The theoretical framework presented here provides a solid foundation for elucidating empirical findings that show the accuracy of LO.
- Score: 12.029919627622954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite a large and significant body of recent work focused on estimating the
out-of-sample risk of regularized models in the high dimensional regime, a
theoretical understanding of this problem for non-differentiable penalties such
as generalized LASSO and nuclear norm is missing. In this paper we resolve this
challenge. We study this problem in the proportional high dimensional regime
where both the sample size n and number of features p are large, and n/p and
the signal-to-noise ratio (per observation) remain finite. We provide finite
sample upper bounds on the expected squared error of leave-one-out
cross-validation (LO) in estimating the out-of-sample risk. The theoretical
framework presented here provides a solid foundation for elucidating empirical
findings that show the accuracy of LO.
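As a rough illustration of the LO estimator analysed here, the sketch below computes the leave-one-out risk estimate for a plain LASSO fit with squared-error loss in a regime where n and p are of comparable size. The problem sizes, noise level, and use of scikit-learn's Lasso are assumptions made for the example, not the authors' setup.

```python
# Minimal sketch, assuming squared-error loss and a plain LASSO penalty:
# the leave-one-out (LO) estimate of out-of-sample risk refits the model
# n times, each time scoring the single held-out observation.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, lam = 200, 400, 0.5            # proportional regime: n/p stays bounded
beta = np.zeros(p); beta[:10] = 1.0  # sparse ground truth (illustrative)
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(scale=0.5, size=n)

errs = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    model = Lasso(alpha=lam, fit_intercept=False).fit(X[keep], y[keep])
    errs[i] = (y[i] - model.predict(X[i:i + 1])[0]) ** 2

print(f"LO estimate of the out-of-sample risk: {errs.mean():.3f}")
```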
Related papers
- Error Bounds of Supervised Classification from Information-Theoretic Perspective [5.281849820329249]
We show that errors are bounded by a complexity term, influenced by the smoothness of the distribution and the sample size, which constitutes an upper bound on the expected risk.
Our empirical verification confirms a significant positive correlation between the derived theoretical bounds and the practical expected risk.
arXiv Detail & Related papers (2024-06-07T01:07:35Z)
- Approximate Leave-one-out Cross Validation for Regression with $\ell_1$ Regularizers (extended version) [12.029919627622954]
We present a novel theory for a wide class of problems in the generalized linear model family with non-differentiable regularizers.
We show that |ALO - LO| goes to zero as p goes to infinity while n/p and SNR are fixed and bounded.
arXiv Detail & Related papers (2023-10-26T17:48:10Z)
- Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution [2.1146241717926664]
We show that the Wasserstein GAN, constrained to left-invertible push-forward maps, generates distributions that avoid replication and significantly deviate from the empirical distribution.
Our most important contribution provides a finite-sample lower bound on the Wasserstein-1 distance between the generative distribution and the empirical one.
We also establish a finite-sample upper bound on the distance between the generative distribution and the true data-generating one.
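For intuition only, the toy snippet below computes the empirical Wasserstein-1 distance between a generated sample and an observed sample in one dimension, i.e. the kind of quantity the bounds above control; the distributions and the use of scipy.stats.wasserstein_distance are illustrative assumptions, not the paper's construction.

```python
# Toy illustration: empirical Wasserstein-1 distance between a stand-in
# "generative" sample and a stand-in empirical sample (1-D case).
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(3)
data = rng.normal(loc=0.0, scale=1.0, size=1000)       # empirical distribution
generated = rng.normal(loc=0.2, scale=1.1, size=1000)  # generative distribution
print(f"empirical W1 distance: {wasserstein_distance(generated, data):.3f}")
```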
arXiv Detail & Related papers (2023-07-31T06:11:57Z)
- A Robustness Analysis of Blind Source Separation [91.3755431537592]
Blind source separation (BSS) aims to recover an unobserved signal $S$ from its mixture $X=f(S)$ under the condition that the transformation $f$ is invertible but unknown.
We present a general framework for analysing such violations and quantifying their impact on the blind recovery of $S$ from $X$.
We show that the response of a generic BSS solution to general deviations from its defining structural assumptions can be profitably analysed in the form of explicit continuity guarantees.
arXiv Detail & Related papers (2023-03-17T16:30:51Z)
- A New Central Limit Theorem for the Augmented IPW Estimator: Variance Inflation, Cross-Fit Covariance and Beyond [0.9172870611255595]
Augmented inverse probability weighting (AIPW) with cross-fitting is a popular choice in practice.
We study this cross-fit AIPW estimator under well-specified outcome regression and propensity score models in a high-dimensional regime.
Our work utilizes a novel interplay between three distinct tools--approximate message passing theory, the theory of deterministic equivalents, and the leave-one-out approach.
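As a rough sketch of the cross-fit AIPW construction this entry refers to, the code below estimates an average treatment effect with two-fold cross-fitting; the nuisance models (logistic regression and ridge) and the data-generating process are assumptions chosen for illustration, not the paper's high-dimensional setting.

```python
# Hypothetical sketch of the cross-fit AIPW estimator of the average treatment effect.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(2)
n, p = 500, 50
X = rng.normal(size=(n, p))
e = 1.0 / (1.0 + np.exp(-X[:, 0]))           # true propensity score
A = rng.binomial(1, e)                       # treatment assignment
Y = 2.0 * A + X[:, 1] + rng.normal(size=n)   # outcome; true ATE = 2

psi = np.empty(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    # Cross-fitting: nuisance models are fit on one fold and evaluated on the other.
    ps = LogisticRegression(max_iter=1000).fit(X[train], A[train]).predict_proba(X[test])[:, 1]
    m1 = Ridge().fit(X[train][A[train] == 1], Y[train][A[train] == 1]).predict(X[test])
    m0 = Ridge().fit(X[train][A[train] == 0], Y[train][A[train] == 0]).predict(X[test])
    psi[test] = (m1 - m0
                 + A[test] * (Y[test] - m1) / ps
                 - (1 - A[test]) * (Y[test] - m0) / (1 - ps))

print(f"cross-fit AIPW estimate of the ATE: {psi.mean():.3f}")
```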
arXiv Detail & Related papers (2022-05-20T14:17:53Z)
- Non-Linear Spectral Dimensionality Reduction Under Uncertainty [107.01839211235583]
We propose a new dimensionality reduction framework, called NGEU, which leverages uncertainty information and directly extends several traditional approaches.
We show that the proposed NGEU formulation exhibits a global closed-form solution, and we analyze, based on the Rademacher complexity, how the underlying uncertainties theoretically affect the generalization ability of the framework.
arXiv Detail & Related papers (2022-02-09T19:01:33Z)
- Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends this framework to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z)
- Divergence Frontiers for Generative Models: Sample Complexity, Quantization Level, and Frontier Integral [58.434753643798224]
Divergence frontiers have been proposed as an evaluation framework for generative models.
We establish non-asymptotic bounds on the sample complexity of the plug-in estimator of divergence frontiers.
We also augment the divergence frontier framework by investigating the statistical performance of smoothed distribution estimators.
arXiv Detail & Related papers (2021-06-15T06:26:25Z)
- Deconfounded Score Method: Scoring DAGs with Dense Unobserved Confounding [101.35070661471124]
We show that unobserved confounding leaves a characteristic footprint in the observed data distribution that allows for disentangling spurious and causal effects.
We propose an adjusted score-based causal discovery algorithm that may be implemented with general-purpose solvers and scales to high-dimensional problems.
arXiv Detail & Related papers (2021-03-28T11:07:59Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- Error bounds in estimating the out-of-sample prediction error using leave-one-out cross validation in high-dimensions [19.439945058410203]
We study the problem of out-of-sample risk estimation in the high dimensional regime.
Extensive empirical evidence confirms the accuracy of leave-one-out cross validation.
One technical advantage of the theory is that it can be used to clarify and connect some results from the recent literature on scalable approximate LO.
arXiv Detail & Related papers (2020-03-03T20:07:07Z)
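Two of the entries above (the $\ell_1$-regularizer paper and the last one, which connects the theory to scalable approximate LO) concern approximate leave-one-out (ALO). The sketch below contrasts the standard active-set leverage approximation for the LASSO with squared loss against exact LO; the problem sizes, the particular approximation, and the scikit-learn calls are assumptions made for illustration, not code from either paper.

```python
# Hypothetical sketch: active-set approximate leave-one-out (ALO) for the LASSO
# with squared loss, compared against exact LO (n refits).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p, lam = 150, 300, 0.5
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:10] = 1.0
y = X @ beta + rng.normal(scale=0.5, size=n)

# One full-data fit; S is the active set of the LASSO solution.
fit = Lasso(alpha=lam, fit_intercept=False).fit(X, y)
S = np.flatnonzero(fit.coef_)
Xs = X[:, S]
h = np.diag(Xs @ np.linalg.solve(Xs.T @ Xs, Xs.T))      # leverages on the active set
alo = np.mean(((y - fit.predict(X)) / (1.0 - h)) ** 2)  # one fit instead of n

# Exact LO for comparison: refit with each observation left out in turn.
lo = np.mean([
    (y[i] - Lasso(alpha=lam, fit_intercept=False)
              .fit(np.delete(X, i, axis=0), np.delete(y, i))
              .predict(X[i:i + 1])[0]) ** 2
    for i in range(n)
])
print(f"ALO = {alo:.3f}   exact LO = {lo:.3f}")
```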