Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in
High Dimensions
- URL: http://arxiv.org/abs/2006.08917v2
- Date: Sun, 5 Jul 2020 23:54:29 GMT
- Title: Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in
High Dimensions
- Authors: Hossein Taheri, Ramtin Pedarsani, and Christos Thrampoulidis
- Abstract summary: Empirical Risk Minimization algorithms are widely used in a variety of estimation and prediction tasks.
In this paper, we characterize for the first time the fundamental limits on the statistical accuracy of convex ERM for inference.
- Score: 41.7567932118769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Empirical Risk Minimization (ERM) algorithms are widely used in a variety of
estimation and prediction tasks in signal-processing and machine learning
applications. Despite their popularity, a theory that explains their
statistical properties in modern regimes where both the number of measurements
and the number of unknown parameters are large is only recently emerging. In
this paper, we characterize for the first time the fundamental limits on the
statistical accuracy of convex ERM for inference in high-dimensional
generalized linear models. For a stylized setting with Gaussian features and
problem dimensions that grow large at a proportional rate, we start with sharp
performance characterizations and then derive tight lower bounds on the
estimation and prediction error that hold over a wide class of loss functions
and for any value of the regularization parameter. Our precise analysis has
several attributes. First, it leads to a recipe for optimally tuning the loss
function and the regularization parameter. Second, it allows to precisely
quantify the sub-optimality of popular heuristic choices: for instance, we show
that optimally-tuned least-squares is (perhaps surprisingly) approximately
optimal for standard logistic data, but the sub-optimality gap grows
drastically as the signal strength increases. Third, we use the bounds to
precisely assess the merits of ridge-regularization as a function of the
over-parameterization ratio. Notably, our bounds are expressed in terms of the
Fisher Information of random variables that are simple functions of the data
distribution, thus making ties to corresponding bounds in classical statistics.
Related papers
- On the design-dependent suboptimality of the Lasso [27.970033039287884]
We show that the Lasso estimator is provably minimax rate-suboptimal when the minimum singular value of the design matrix is small.
Our lower bound is strong enough to preclude the sparse statistical optimality of all forms of the Lasso.
arXiv Detail & Related papers (2024-02-01T07:01:54Z)
- Optimal Differentially Private PCA and Estimation for Spiked Covariance Matrices [10.377683220196873]
Estimating a covariance matrix and its associated principal components is a fundamental problem in contemporary statistics.
We study optimal differentially private Principal Component Analysis (PCA) and covariance estimation within the spiked covariance model.
We propose computationally efficient differentially private estimators and prove their minimax optimality for sub-Gaussian distributions.
arXiv Detail & Related papers (2024-01-08T11:18:14Z)
- Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression [12.443289202402761]
We show the benefits of batch-partitioning through the lens of a minimum-norm overparameterized linear regression model (a toy sketch appears after this list).
We characterize the optimal batch size and show it is inversely proportional to the noise level.
We also show that shrinking the batch minimum-norm estimator by a factor equal to the Wiener coefficient further stabilizes it and results in lower quadratic risk in all settings.
arXiv Detail & Related papers (2023-06-14T11:02:08Z)
- Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
- Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are empirically central in preventing overfitting.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD and that of ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
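As promised above, here is a toy sketch of the batched minimum-norm idea from "Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression". The dimensions and noise level are illustrative assumptions; the estimator below simply averages per-batch minimum-norm least-squares solutions, and the optimal batch-size choice and Wiener-coefficient shrinkage analyzed in that paper are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)

# Overparameterized linear regression with Gaussian design (sizes and noise
# level are illustrative only).
n, p, sigma = 120, 300, 1.0
beta_star = rng.normal(size=p) / np.sqrt(p)          # E||beta_star||^2 ~ 1
X = rng.normal(size=(n, p))
y = X @ beta_star + sigma * rng.normal(size=n)

def min_norm(X, y):
    # Minimum-norm least-squares solution: the pseudoinverse returns the
    # smallest-norm interpolator when p > n.
    return np.linalg.pinv(X) @ y

def batched_min_norm(X, y, num_batches):
    # Average the minimum-norm estimators fit on disjoint batches of the rows.
    parts = np.array_split(np.arange(len(y)), num_batches)
    return np.mean([min_norm(X[idx], y[idx]) for idx in parts], axis=0)

for m in [1, 2, 4, 8]:
    b = batched_min_norm(X, y, m)
    risk = np.sum((b - beta_star) ** 2)
    print(f"batches = {m}   quadratic risk ||b - beta*||^2 = {risk:.3f}")
```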