On the design-dependent suboptimality of the Lasso
- URL: http://arxiv.org/abs/2402.00382v1
- Date: Thu, 1 Feb 2024 07:01:54 GMT
- Title: On the design-dependent suboptimality of the Lasso
- Authors: Reese Pathak, Cong Ma
- Abstract summary: We show that the Lasso estimator is provably minimax rate-suboptimal when the minimum singular value is small.
Our lower bound is strong enough to preclude the sparse statistical optimality of all forms of the Lasso.
- Score: 27.970033039287884
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This paper investigates the effect of the design matrix on the ability (or
inability) to estimate a sparse parameter in linear regression. More
specifically, we characterize the optimal rate of estimation when the smallest
singular value of the design matrix is bounded away from zero. In addition to
this information-theoretic result, we provide and analyze a procedure which is
simultaneously statistically optimal and computationally efficient, based on
soft thresholding the ordinary least squares estimator. Most surprisingly, we
show that the Lasso estimator -- despite its widespread adoption for sparse
linear regression -- is provably minimax rate-suboptimal when the minimum
singular value is small. We present a family of design matrices and sparse
parameters for which we can guarantee that the Lasso with any choice of
regularization parameter -- including those which are data-dependent and
randomized -- would fail in the sense that its estimation rate is suboptimal by
polynomial factors in the sample size. Our lower bound is strong enough to
preclude the statistical optimality of all forms of the Lasso, including its
highly popular penalized, norm-constrained, and cross-validated variants.
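As a concrete illustration of the procedure the abstract describes, the sketch below soft thresholds the ordinary least squares estimator coordinatewise. The threshold level `tau` is a hypothetical input; the paper's specific calibration is not reproduced here.

```python
import numpy as np

def soft_threshold(v, tau):
    """Coordinatewise soft thresholding: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def thresholded_ols(X, y, tau):
    """Soft threshold the OLS estimator, as described in the abstract.

    Sketch only: the paper's actual choice of threshold is not
    reproduced here; tau is an assumed input.
    """
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    return soft_threshold(beta_ols, tau)
```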
Related papers
- Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization [29.24821214671497]
Training machine learning and statistical models often involves optimizing a data-driven risk criterion.
We propose a novel robust criterion by combining insights from Bayesian nonparametric (i.e., Dirichlet process) theory and a recent decision-theoretic model of smooth ambiguity-averse preferences.
For practical implementation, we propose and study tractable approximations of the criterion based on well-known Dirichlet process representations.
arXiv Detail & Related papers (2024-01-28T21:19:15Z)
- Error Reduction from Stacked Regressions [12.657895453939298]
Stacking regressions is an ensemble technique that forms linear combinations of different regression estimators to enhance predictive accuracy.
In this paper, we learn these combination weights by minimizing a regularized version of the empirical risk subject to a nonnegativity constraint.
Thanks to an adaptive shrinkage effect, the resulting stacked estimator has strictly smaller population risk than the best single estimator in the ensemble.
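As a rough sketch of this construction: given held-out predictions from the base regressors, nonnegative stacking weights can be fit by ridge-regularized least squares, which reduces to plain nonnegative least squares after a standard augmentation. The regularizer `lam` is an assumed choice, not necessarily the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import nnls

def stack_weights(preds, y, lam=1e-3):
    """Nonnegative stacking weights via regularized least squares.

    preds : (n, m) array; column j holds base estimator j's predictions.
    Solves min_w ||y - preds @ w||^2 + lam * ||w||^2  s.t.  w >= 0,
    by augmenting the system so plain NNLS applies (a standard trick;
    the paper's exact regularizer may differ).
    """
    n, m = preds.shape
    A = np.vstack([preds, np.sqrt(lam) * np.eye(m)])
    b = np.concatenate([y, np.zeros(m)])
    w, _ = nnls(A, b)
    return w
```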
arXiv Detail & Related papers (2023-09-18T15:42:12Z)
- Quantized Low-Rank Multivariate Regression with Random Dithering [23.81618208119832]
Low-rank multivariate regression (LRMR) is an important statistical learning model.
We focus on the estimation of the underlying coefficient matrix.
We employ uniform quantization with random dithering, i.e., we add appropriate random noise to the data before quantization.
We propose constrained Lasso and regularized Lasso estimators, and derive non-asymptotic error bounds for both.
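A minimal sketch of uniform quantization with random dithering as described above; the step size `delta` and the Uniform(-delta/2, delta/2) dither are standard assumptions rather than details taken from the paper.

```python
import numpy as np

def dithered_quantize(x, delta, rng=None):
    """Uniform quantizer with random dithering.

    Adds Uniform(-delta/2, delta/2) noise before rounding to the grid
    {k * delta}. With this dither the quantization error is unbiased,
    which is what makes downstream Lasso-type estimation tractable.
    """
    rng = np.random.default_rng() if rng is None else rng
    dither = rng.uniform(-delta / 2, delta / 2, size=np.shape(x))
    return delta * np.round((x + dither) / delta)
```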
arXiv Detail & Related papers (2023-02-22T08:14:24Z)
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandits, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in High Dimensions [41.7567932118769]
Empirical Risk Minimization algorithms are widely used in a variety of estimation and prediction tasks.
In this paper, we characterize for the first time the fundamental limits on the statistical accuracy of convex ERM for inference.
arXiv Detail & Related papers (2020-06-16T04:27:38Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that characterizes statistical accuracy in terms of the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
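The randomized-truncation device named in this summary can be sketched generically: draw a random truncation level and reweight each term of the series by the inverse probability that it survives. The geometric stopping rule below is one standard instantiation, not claimed to be SUMO's exact construction.

```python
import numpy as np

def russian_roulette(term, q=0.5, rng=None):
    """Unbiased single-sample estimate of sum_{k=0}^inf term(k).

    Draw K with P(K = k) = (1 - q)**k * q on {0, 1, 2, ...} and return
    sum_{k <= K} term(k) / P(K >= k). Unbiasedness follows by swapping
    the expectation and the sum, provided the series converges suitably.
    A generic sketch, not SUMO's exact estimator.
    """
    rng = np.random.default_rng() if rng is None else rng
    K = rng.geometric(q) - 1          # numpy's geometric starts at 1
    # P(K >= k) = (1 - q)**k for this shifted geometric.
    return sum(term(k) / (1 - q) ** k for k in range(K + 1))
```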
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
- Support recovery and sup-norm convergence rates for sparse pivotal estimation [79.13844065776928]
In high dimensional sparse regression, pivotal estimators are estimators for which the optimal regularization parameter is independent of the noise level.
We show minimax sup-norm convergence rates for non-smoothed and smoothed, single-task and multitask square-root Lasso-type estimators.
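The canonical pivotal estimator here is the square-root Lasso, whose non-squared fit term makes the optimal regularization level independent of the noise level. A sketch of the single-task, non-smoothed variant, assuming cvxpy is available:

```python
import numpy as np
import cvxpy as cp

def sqrt_lasso(X, y, lam):
    """Square-root Lasso: min ||y - X b||_2 / sqrt(n) + lam * ||b||_1.

    Because the fit term is a (non-squared) norm, the theoretically
    optimal lam depends only on the problem dimensions, not on the
    unknown noise level -- the pivotal property described above.
    Sketch only; the paper also treats smoothed and multitask variants.
    """
    n, p = X.shape
    b = cp.Variable(p)
    obj = cp.norm2(y - X @ b) / np.sqrt(n) + lam * cp.norm1(b)
    cp.Problem(cp.Minimize(obj)).solve()
    return b.value
```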
arXiv Detail & Related papers (2020-01-15T16:11:04Z)
- On Low-rank Trace Regression under General Sampling Distribution [9.699586426043885]
We show that cross-validated estimators satisfy near-optimal error bounds under general assumptions.
We also show that the cross-validated estimator outperforms the theory-inspired approach of selecting the regularization parameter explicitly.
arXiv Detail & Related papers (2019-04-18T02:56:00Z)