Simultaneous analysis of approximate leave-one-out cross-validation and mean-field inference
- URL: http://arxiv.org/abs/2501.02624v1
- Date: Sun, 05 Jan 2025 18:34:14 GMT
- Title: Simultaneous analysis of approximate leave-one-out cross-validation and mean-field inference
- Authors: Pierre C. Bellec
- Abstract summary: Approximate Leave-One-Out Cross-Validation (ALO-CV) is a method that has been proposed to estimate the generalization error of a regularized estimator in the high-dimensional regime.
The paper proves that ALO-CV approximates the leave-one-out quantity up to negligible error terms.
- Score: 3.5353632767823506
- Abstract: Approximate Leave-One-Out Cross-Validation (ALO-CV) is a method that has been proposed to estimate the generalization error of a regularized estimator in the high-dimensional regime where dimension and sample size are of the same order, the so-called ``proportional regime''. A new analysis is developed to derive the consistency of ALO-CV for non-differentiable regularizers under Gaussian covariates and strong convexity of the regularizer. Using a conditioning argument, the difference between the ALO-CV weights and their counterparts in mean-field inference is shown to be small. Combined with upper bounds between the mean-field inference estimate and the leave-one-out quantity, this provides a proof that ALO-CV approximates the leave-one-out quantity as well, up to negligible error terms. Linear models with square loss, robust linear regression and single-index models are treated explicitly.
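For intuition, the simplest instance of this approximation is ridge regression with square loss, where the leave-one-out residuals follow from a single full-data fit via a leverage correction. The Python sketch below is a minimal illustration under an assumed Gaussian design with made-up problem sizes, not a construction from the paper; it contrasts the ALO-style shortcut with brute-force leave-one-out refitting.

```python
import numpy as np

# Minimal sketch of the ALO-CV idea for ridge regression (square loss,
# differentiable regularizer), where the leverage-corrected shortcut is exact.
# Assumed Gaussian design and arbitrary problem sizes, for illustration only.
rng = np.random.default_rng(0)
n, p, lam = 200, 100, 5.0                     # n and p of the same order (proportional regime)
X = rng.standard_normal((n, p))
beta_star = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_star + rng.standard_normal(n)

# Full-data ridge fit and its hat matrix H (y_hat = H y, h_ii are leverage scores).
G = np.linalg.inv(X.T @ X + lam * np.eye(p))
beta_hat = G @ X.T @ y
H = X @ G @ X.T
resid = y - X @ beta_hat

# ALO-CV: rescale each in-sample residual by 1 / (1 - h_ii) instead of refitting.
alo_risk = np.mean((resid / (1.0 - np.diag(H))) ** 2)

# Brute-force leave-one-out for comparison (n separate refits).
loo_resid = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    Gi = np.linalg.inv(X[mask].T @ X[mask] + lam * np.eye(p))
    loo_resid[i] = y[i] - X[i] @ (Gi @ X[mask].T @ y[mask])
loo_risk = np.mean(loo_resid ** 2)

print(f"ALO risk {alo_risk:.4f}  vs  LOO risk {loo_risk:.4f}")
```

For ridge the two estimates coincide; the paper's contribution is to control the analogous approximation for non-differentiable, strongly convex regularizers in the proportional regime.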
Related papers
- Bridging Internal Probability and Self-Consistency for Effective and Efficient LLM Reasoning [53.25336975467293]
We present the first theoretical error decomposition analysis of methods such as perplexity and self-consistency.
Our analysis reveals a fundamental trade-off: perplexity methods suffer from substantial model error due to the absence of a proper consistency function.
We propose Reasoning-Pruning Perplexity Consistency (RPC), which integrates perplexity with self-consistency, and Reasoning Pruning, which eliminates low-probability reasoning paths.
arXiv Detail & Related papers (2025-02-01T18:09:49Z)
- Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems.
We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z)
- Failures and Successes of Cross-Validation for Early-Stopped Gradient Descent [8.0225129190882]
We analyze the statistical properties of generalized cross-validation (GCV) and leave-one-out cross-validation (LOOCV) applied to early-stopped gradient descent (GD).
We prove that GCV is generically inconsistent as an estimator of the prediction risk of early-stopped GD, even for a well-specified linear model with isotropic features.
Our theory requires only mild assumptions on the data distribution and does not require the underlying regression function to be linear.
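As a rough illustration of the two quantities being compared (not the exact estimators analyzed in the paper), early-stopped gradient descent on least squares with zero initialization is a linear smoother, so the generic GCV and leverage-based LOO shortcut formulas for linear smoothers can be evaluated from its hat matrix. The sketch below builds that hat matrix explicitly under an assumed Gaussian design; the sizes and step size are arbitrary.

```python
import numpy as np

# Early-stopped GD on least squares (zero init) is a linear smoother y_hat = H_t y.
# Evaluate the generic GCV and leverage-based LOO shortcut formulas from H_t.
# Illustrative setup only; not the exact estimators defined in the paper.
rng = np.random.default_rng(1)
n, p, t_max = 300, 150, 50
X = rng.standard_normal((n, p))
y = X @ (rng.standard_normal(p) / np.sqrt(p)) + rng.standard_normal(n)

eta = 0.5 * n / np.linalg.norm(X, 2) ** 2     # constant step size, stable for this design
A = np.zeros((p, n))                          # beta_t = A_t y, tracked as a matrix
for _ in range(t_max):
    A = A + (eta / n) * X.T @ (np.eye(n) - X @ A)

H = X @ A                                     # hat matrix of the early-stopped smoother
resid = y - H @ y
gcv = np.mean(resid ** 2) / (1.0 - np.trace(H) / n) ** 2
loo_shortcut = np.mean((resid / (1.0 - np.diag(H))) ** 2)
print(f"GCV estimate: {gcv:.4f}   leverage-based LOO shortcut: {loo_shortcut:.4f}")
```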
arXiv Detail & Related papers (2024-02-26T18:07:27Z)
- Approximate Leave-one-out Cross Validation for Regression with $\ell_1$ Regularizers (extended version) [12.029919627622954]
We present a novel theory for a wide class of problems in the generalized linear model family with non-differentiable regularizers.
We show that |ALO - LO| goes to zero as p goes to infinity while n/p and SNR are fixed and bounded.
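A commonly used ALO recipe for the lasso with square loss restricts the hat matrix to the active set of the full-data solution and applies the usual leverage correction. The sketch below illustrates that recipe under an assumed sparse Gaussian setup; it is a hedged illustration in this spirit, not necessarily the exact expressions derived in the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso

# ALO-style risk estimate for the lasso: leverage correction restricted to the
# active set of the full-data fit. Assumed sparse Gaussian setup, illustration only.
rng = np.random.default_rng(2)
n, p = 250, 400
X = rng.standard_normal((n, p))
beta_star = np.zeros(p)
beta_star[:10] = 1.0                                   # sparse signal
y = X @ beta_star + rng.standard_normal(n)

fit = Lasso(alpha=0.1, fit_intercept=False, max_iter=5000).fit(X, y)
resid = y - X @ fit.coef_
S = np.flatnonzero(fit.coef_)                          # active set of the lasso solution
XS = X[:, S]
H = XS @ np.linalg.solve(XS.T @ XS, XS.T)              # hat matrix on the active set
alo_risk = np.mean((resid / (1.0 - np.diag(H))) ** 2)  # leverage-corrected residuals
print(f"ALO risk estimate for the lasso: {alo_risk:.4f}")
```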
arXiv Detail & Related papers (2023-10-26T17:48:10Z)
- Corrected generalized cross-validation for finite ensembles of penalized estimators [5.165142221427927]
Generalized cross-validation (GCV) is a widely-used method for estimating the squared out-of-sample prediction risk.
We show that GCV is inconsistent for any finite ensemble of size greater than one.
arXiv Detail & Related papers (2023-10-02T17:38:54Z)
- Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization.
arXiv Detail & Related papers (2022-10-08T08:26:22Z)
- Prediction Errors for Penalized Regressions based on Generalized Approximate Message Passing [0.0]
We derive the forms of estimators for the prediction errors: $C_p$ criterion, information criteria, and leave-one-out cross validation (LOOCV) error.
In the framework of GAMP, we show that the information criteria can be expressed by using the variance of the estimates.
arXiv Detail & Related papers (2022-06-26T09:42:39Z)
- Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
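A minimal sketch of the algorithm in question, under an assumed well-specified Gaussian design that is not the paper's configuration: one pass of constant-stepsize SGD with tail-averaging on a linear regression problem, contrasted with the ordinary least squares solution.

```python
import numpy as np

# One-pass constant-stepsize SGD with tail-averaging for linear regression,
# compared with OLS. Assumed Gaussian design and arbitrary constants; sketch only.
rng = np.random.default_rng(3)
n, p, step = 2000, 50, 0.01
X = rng.standard_normal((n, p))
beta_star = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_star + 0.5 * rng.standard_normal(n)

beta, tail = np.zeros(p), []
for i in range(n):                           # one pass, one observation per step
    grad = (X[i] @ beta - y[i]) * X[i]       # stochastic gradient of the squared loss
    beta -= step * grad
    if i >= n // 2:                          # average the second half of the iterates
        tail.append(beta.copy())
beta_sgd = np.mean(tail, axis=0)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

print("in-sample excess risk, SGD:", np.mean((X @ (beta_sgd - beta_star)) ** 2))
print("in-sample excess risk, OLS:", np.mean((X @ (beta_ols - beta_star)) ** 2))
```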
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
- A connection between the pattern classification problem and the General Linear Model for statistical inference [0.2320417845168326]
The two approaches, the GLM and the LRM, operate on different domains: the observation domain and the label domain, respectively.
We derive a statistical test based on a more refined predictive algorithm.
The MLE-based inference employs a residual score and an upper bound to compute a better estimate of the actual (real) error.
arXiv Detail & Related papers (2020-12-16T12:26:26Z)
- Minimax Optimal Estimation of KL Divergence for Continuous Distributions [56.29748742084386]
Estimating Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains.
One simple and effective estimator is based on the k nearest neighbor distances between these samples.
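For concreteness, here is a hedged sketch of a k-nearest-neighbor estimator of KL(P||Q) from two samples, in the bias-corrected form usually credited to Wang, Kulkarni and Verdú; it illustrates the kNN approach, not the minimax-optimal procedure constructed in the paper, and the Gaussian test case is made up.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_kl(x, y, k=1):
    """k-NN estimate of KL(P||Q); x is an (n, d) sample from P, y an (m, d) sample from Q."""
    n, d = x.shape
    m = y.shape[0]
    rho = cKDTree(x).query(x, k + 1)[0][:, -1]   # k-th NN distance within x (skip the point itself)
    nu = cKDTree(y).query(x, k)[0]               # k-th NN distance from each x_i to the y sample
    nu = nu if nu.ndim == 1 else nu[:, -1]
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))

rng = np.random.default_rng(4)
p_sample = rng.normal(0.0, 1.0, size=(2000, 1))  # P = N(0, 1)
q_sample = rng.normal(1.0, 1.0, size=(2000, 1))  # Q = N(1, 1), so KL(P||Q) = 0.5
print(knn_kl(p_sample, q_sample), "vs true KL = 0.5")
```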
arXiv Detail & Related papers (2020-02-26T16:37:37Z)
- GenDICE: Generalized Offline Estimation of Stationary Values [108.17309783125398]
We show that effective estimation can still be achieved in important applications.
Our approach is based on estimating a ratio that corrects for the discrepancy between the stationary and empirical distributions.
The resulting algorithm, GenDICE, is straightforward and effective.
arXiv Detail & Related papers (2020-02-21T00:27:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.