Feature Adaptation for Sparse Linear Regression
- URL: http://arxiv.org/abs/2305.16892v1
- Date: Fri, 26 May 2023 12:53:13 GMT
- Title: Feature Adaptation for Sparse Linear Regression
- Authors: Jonathan Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi
- Abstract summary: Sparse linear regression is a central problem in high-dimensional statistics.
We provide a polynomial-time algorithm that adapts the Lasso to tolerate a small number of approximate dependencies among the covariates.
Our algorithm fits into a broader framework of feature adaptation for sparse linear regression.
- Score: 20.923321050404827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse linear regression is a central problem in high-dimensional statistics.
We study the correlated random design setting, where the covariates are drawn
from a multivariate Gaussian $N(0,\Sigma)$, and we seek an estimator with small
excess risk.
If the true signal is $t$-sparse, information-theoretically, it is possible
to achieve strong recovery guarantees with only $O(t\log n)$ samples. However,
computationally efficient algorithms have sample complexity linear in (some
variant of) the condition number of $\Sigma$. Classical algorithms such as the
Lasso can require significantly more samples than necessary even if there is
only a single sparse approximate dependency among the covariates.
We provide a polynomial-time algorithm that, given $\Sigma$, automatically
adapts the Lasso to tolerate a small number of approximate dependencies. In
particular, we achieve near-optimal sample complexity for constant sparsity and
if $\Sigma$ has few "outlier" eigenvalues. Our algorithm fits into a broader
framework of feature adaptation for sparse linear regression with
ill-conditioned covariates. With this framework, we additionally provide the
first polynomial-factor improvement over brute-force search for constant
sparsity $t$ and arbitrary covariance $\Sigma$.
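The abstract describes a fit-transform-map-back pipeline: transform the features using knowledge of $\Sigma$, run the Lasso in the transformed basis, and map the estimate back. The snippet below is a minimal, hypothetical sketch of those mechanics only, not the paper's algorithm: it uses whitening by $\Sigma^{-1/2}$, the naive preconditioner, which in general can destroy sparsity of the target. All parameter choices are illustrative assumptions.

```python
# Minimal sketch (assumed parameters): preconditioned-Lasso mechanics on a
# Gaussian design whose covariance has a single tight approximate dependency.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples, n_features = 200, 50

# Covariance with one near-duplicate pair (x_0, x_1), giving Sigma one tiny
# "outlier" eigenvalue (~0.001) and hence a large condition number.
Sigma = np.eye(n_features)
Sigma[0, 1] = Sigma[1, 0] = 0.999

w_true = np.zeros(n_features)
w_true[[0, 1, 2]] = [1.0, -1.0, 0.5]  # t-sparse signal hitting the dependency

L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((n_samples, n_features)) @ L.T  # rows ~ N(0, Sigma)
y = X @ w_true + 0.1 * rng.standard_normal(n_samples)

# Vanilla Lasso on the raw features.
w_raw = Lasso(alpha=0.05).fit(X, y).coef_

# Naive preconditioning: fit in whitened coordinates Z = X @ Sigma^{-1/2},
# then map the coefficients back to the original basis.
evals, evecs = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
v = Lasso(alpha=0.05).fit(X @ Sigma_inv_sqrt, y).coef_
w_pre = Sigma_inv_sqrt @ v

for name, w_hat in [("raw Lasso", w_raw), ("whitened Lasso", w_pre)]:
    err = w_hat - w_true
    print(f"{name}: excess risk = {err @ Sigma @ err:.4f}")
```

The sketch shows only the pipeline; choosing a transform that keeps the design well-conditioned while keeping the transformed signal sparse is exactly the difficulty the paper's feature adaptation addresses.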
Related papers
- A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we study a robust phase retrieval problem where the task is to recover an unknown signal.
Our proposed oracle avoids the need for computationally expensive spectral initialization, using a simple gradient step while handling outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Lasso with Latents: Efficient Estimation, Covariate Rescaling, and
Computational-Statistical Gaps [29.13944209460543]
We propose a natural sparse linear regression setting where strong correlations arise from unobserved latent variables.
In this setting, we analyze the problem caused by strong correlations and design a surprisingly simple fix.
The sample complexity of the resulting "rescaled Lasso" algorithm incurs (in the worst case) quadratic dependence on the sparsity of the underlying signal (a toy sketch of covariate rescaling appears after this list).
arXiv Detail & Related papers (2024-02-23T16:16:38Z) - Computational-Statistical Gaps for Improper Learning in Sparse Linear Regression [4.396860522241307]
We show that an efficient learning algorithm for sparse linear regression can be used to solve sparse PCA problems with a negative spike.
We complement our reduction with low-degree and statistical query lower bounds for the sparse problems from which we reduce.
arXiv Detail & Related papers (2024-02-21T19:55:01Z) - Statistical-Computational Tradeoffs in Mixed Sparse Linear Regression [20.00109111254507]
We show that the problem suffers from a $\frac{k}{\mathrm{SNR}^2}$-to-$\frac{k^2}{\mathrm{SNR}^2}$ statistical-to-computational gap.
We also analyze a simple thresholding algorithm which, outside of the narrow regime where the problem is hard, solves the associated mixed regression detection problem.
arXiv Detail & Related papers (2023-03-03T18:03:49Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed-confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Efficient and robust high-dimensional sparse logistic regression via
nonlinear primal-dual hybrid gradient algorithms [0.0]
We propose an iterative algorithm that provably computes a solution to a logistic regression problem regularized by an elastic net penalty.
This result improves on the known complexity bound of $O(\min(m^2 n, mn^2)\log(1/\epsilon))$ for first-order optimization methods.
arXiv Detail & Related papers (2021-11-30T14:16:48Z) - On the Power of Preconditioning in Sparse Linear Regression [24.140675945592704]
We show that a preconditioned Lasso can solve a large class of sparse linear regression problems nearly optimally.
For the first time, we construct random-design instances which are provably hard for an optimally preconditioned Lasso.
arXiv Detail & Related papers (2021-06-17T02:12:01Z) - Conditional Uncorrelation and Efficient Non-approximate Subset Selection
in Sparse Regression [72.84177488527398]
We consider sparse regression from the viewpoint of correlation and propose a formula for conditional uncorrelation.
With the proposed method, the computational complexity is reduced from $O(\frac{1}{6}k^3+mk^2+mkd)$ to $O(\frac{1}{6}k^3+\frac{1}{2}mk^2)$ for each candidate subset in sparse regression.
arXiv Detail & Related papers (2020-09-08T20:32:26Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression, where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is ...
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
arXiv Detail & Related papers (2020-06-15T21:55:34Z)