The Lasso with general Gaussian designs with applications to hypothesis testing
- URL: http://arxiv.org/abs/2007.13716v3
- Date: Tue, 19 Sep 2023 13:07:32 GMT
- Title: The Lasso with general Gaussian designs with applications to hypothesis testing
- Authors: Michael Celentano, Andrea Montanari, Yuting Wei
- Abstract summary: The Lasso is a method for high-dimensional regression.
We show that the Lasso estimator can be precisely characterized in the regime in which both $n$ and $p$ are large.
- Score: 21.342900543543816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Lasso is a method for high-dimensional regression, which is now commonly
used when the number of covariates $p$ is of the same order or larger than the
number of observations $n$. Classical asymptotic normality theory does not
apply to this model for two fundamental reasons: $(1)$ The regularized risk
is non-smooth; $(2)$ The distance between the estimator
$\widehat{\boldsymbol{\theta}}$ and the true parameter vector
$\boldsymbol{\theta}^*$ cannot be neglected. As a consequence, standard
perturbative arguments that are the traditional basis for asymptotic normality
fail.
On the other hand, the Lasso estimator can be precisely characterized in the
regime in which both $n$ and $p$ are large and $n/p$ is of order one. This
characterization was first obtained in the case of Gaussian designs with i.i.d.
covariates: here we generalize it to Gaussian correlated designs with
non-singular covariance structure. This is expressed in terms of a simpler
``fixed-design'' model. We establish non-asymptotic bounds on the distance
between the distribution of various quantities in the two models, which hold
uniformly over signals $\boldsymbol{\theta}^*$ in a suitable sparsity class and
over values of the regularization parameter.
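To make the reduction concrete, the ``fixed-design'' model mentioned above can be sketched as follows; the precise scaling of the effective parameters is an assumption here (standard in the AMP-style literature) rather than a quotation from the paper. One observes $\boldsymbol{y}^f = \boldsymbol{\theta}^* + \tau_*\,\boldsymbol{\Sigma}^{-1/2}\boldsymbol{g}$ with $\boldsymbol{g} \sim \mathsf{N}(\boldsymbol{0}, \mathbf{I}_p)$, and the fixed-design estimator solves
$$\widehat{\boldsymbol{\theta}}^f = \arg\min_{\boldsymbol{\theta}} \Big\{ \tfrac{\zeta_*}{2}\, \big\| \boldsymbol{y}^f - \boldsymbol{\theta} \big\|_{\boldsymbol{\Sigma}}^2 + \lambda \| \boldsymbol{\theta} \|_1 \Big\},$$
where the effective noise level $\tau_*$ and rescaling $\zeta_*$ are determined by a system of two scalar fixed-point equations.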
As an application, we study the distribution of the debiased Lasso and show
that a degrees-of-freedom correction is necessary for computing valid
confidence intervals.
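As an illustration of the degrees-of-freedom correction, here is a minimal Python sketch (not the authors' code). It assumes the design covariance $\boldsymbol{\Sigma}$ is known and estimates the degrees of freedom by the number of nonzero Lasso coefficients; the function name `debiased_lasso` and the toy data below are hypothetical.

```python
# Minimal sketch of a degrees-of-freedom corrected debiased Lasso
# (illustrative only; assumes a known design covariance Sigma).
import numpy as np
from sklearn.linear_model import Lasso

def debiased_lasso(X, y, Sigma, lam):
    """Return the Lasso fit and its degrees-of-freedom corrected debiasing."""
    n, p = X.shape
    # sklearn's Lasso minimizes (1/(2n))*||y - X w||^2 + lam*||w||_1
    theta_hat = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_
    residual = y - X @ theta_hat
    df_hat = np.count_nonzero(theta_hat)  # degrees-of-freedom estimate (should stay < n)
    # Naive debiasing divides the correction term by n; the
    # degrees-of-freedom correction divides by (n - df_hat) instead.
    theta_debiased = theta_hat + np.linalg.inv(Sigma) @ X.T @ residual / (n - df_hat)
    return theta_hat, theta_debiased

# Toy usage with a correlated (AR(1)) Gaussian design.
rng = np.random.default_rng(0)
n, p = 200, 400
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
theta_star = np.zeros(p)
theta_star[:10] = 1.0
y = X @ theta_star + rng.normal(size=n)
theta_hat, theta_d = debiased_lasso(X, y, Sigma, lam=0.1)
```

Under the fixed-design characterization, the coordinates of the corrected estimator are approximately Gaussian around the corresponding entries of $\boldsymbol{\theta}^*$, which is what yields valid coordinate-wise confidence intervals; per the abstract, omitting the $(n - \widehat{\mathrm{df}})$ adjustment leads to miscalibrated intervals.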
Related papers
- Scaling Laws in Linear Regression: Compute, Parameters, and Data [86.48154162485712]
We study the theory of scaling laws in an infinite dimensional linear regression setup.
We show that the reducible part of the test error is $\Theta(M^{-(a-1)} + N^{-(a-1)/a})$.
Our theory is consistent with the empirical neural scaling laws and verified by numerical simulation.
arXiv Detail & Related papers (2024-06-12T17:53:29Z) - Stability of a Generalized Debiased Lasso with Applications to Resampling-Based Variable Selection [11.490578151974285]
We propose an approximate formula for updating a debiased Lasso coefficient.
As applications, we show that the approximate formula allows us to reduce the complexity of variable selection algorithms.
arXiv Detail & Related papers (2024-05-05T22:05:02Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Dimension free ridge regression [10.434481202633458]
We revisit ridge regression on i.i.d. data, characterizing its bias and variance in terms of those of an ``equivalent'' sequence model.
As a new application, we obtain a completely explicit and sharp characterization of ridge regression for Hilbert covariates with regularly varying spectrum.
arXiv Detail & Related papers (2022-10-16T16:01:05Z) - $p$-Generalized Probit Regression and Scalable Maximum Likelihood
Estimation via Sketching and Coresets [74.37849422071206]
We study the $p$-generalized probit regression model, which is a generalized linear model for binary responses.
We show how the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data.
arXiv Detail & Related papers (2022-03-25T10:54:41Z) - On Model Selection Consistency of Lasso for High-Dimensional Ising
Models on Tree-like Graphs [13.14903445595385]
We consider the problem of high-dimensional Ising model selection using the neighborhood-based least absolute shrinkage and selection operator (Lasso).
It is rigorously proved that consistent model selection can be achieved with sample sizes $n=\Omega(d^3\log p)$ for any tree-like graph in the paramagnetic phase.
Given the popularity and efficiency of Lasso, our rigorous analysis provides a theoretical backing for its practical use in Ising model selection.
arXiv Detail & Related papers (2021-10-16T07:23:02Z) - Spectral clustering under degree heterogeneity: a case for the random
walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
arXiv Detail & Related papers (2021-05-03T16:36:27Z) - Optimal Sub-Gaussian Mean Estimation in $\mathbb{R}$ [5.457150493905064]
We present a novel estimator with sub-Gaussian convergence.
Our estimator does not require prior knowledge of the variance.
Our estimator construction and analysis gives a framework generalizable to other problems.
arXiv Detail & Related papers (2020-11-17T02:47:24Z) - The Generalized Lasso with Nonlinear Observations and Generative Priors [63.541900026673055]
We make the assumption of sub-Gaussian measurements, which is satisfied by a wide range of measurement models.
We show that our result can be extended to the uniform recovery guarantee under the assumption of a so-called local embedding property.
arXiv Detail & Related papers (2020-06-22T16:43:35Z) - A Random Matrix Analysis of Random Fourier Features: Beyond the Gaussian
Kernel, a Precise Phase Transition, and the Corresponding Double Descent [85.77233010209368]
This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of the feature space $N$ are all large and comparable.
This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$.
arXiv Detail & Related papers (2020-06-09T02:05:40Z) - A Precise High-Dimensional Asymptotic Theory for Boosting and
Minimum-$\ell_1$-Norm Interpolated Classifiers [3.167685495996986]
This paper establishes a precise high-dimensional theory for boosting on separable data.
Under a class of statistical models, we provide an exact analysis of the generalization error of boosting.
We also explicitly pin down the relation between the boosting test error and the optimal Bayes error.
arXiv Detail & Related papers (2020-02-05T00:24:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.