Sparse Linear Regression and Lattice Problems
- URL: http://arxiv.org/abs/2402.14645v1
- Date: Thu, 22 Feb 2024 15:45:27 GMT
- Title: Sparse Linear Regression and Lattice Problems
- Authors: Aparna Gupte, Neekon Vafa, Vinod Vaikuntanathan
- Abstract summary: We give evidence of average-case hardness of SLR w.r.t. all efficient algorithms assuming the worst-case hardness of lattice problems.
Specifically, we give an instance-by-instance reduction from a variant of the bounded distance decoding (BDD) problem on lattices to SLR.
- Score: 9.50127504736299
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sparse linear regression (SLR) is a well-studied problem in statistics where
one is given a design matrix $X\in\mathbb{R}^{m\times n}$ and a response vector
$y=X\theta^*+w$ for a $k$-sparse vector $\theta^*$ (that is,
$\|\theta^*\|_0\leq k$) and small, arbitrary noise $w$, and the goal is to find
a $k$-sparse $\widehat{\theta} \in \mathbb{R}^n$ that minimizes the mean
squared prediction error $\frac{1}{m}\|X\widehat{\theta}-X\theta^*\|^2_2$.
While $\ell_1$-relaxation methods such as basis pursuit, Lasso, and the Dantzig
selector solve SLR when the design matrix is well-conditioned, no general
algorithm is known, nor is there any formal evidence of hardness in an
average-case setting with respect to all efficient algorithms.
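As a quick numerical illustration of the problem setup above (my own sketch, not from the paper), the following Python snippet generates a synthetic well-conditioned instance $y = X\theta^* + w$ with a $k$-sparse $\theta^*$, runs Lasso, and reports the mean squared prediction error; the dimensions, sparsity level, and regularization weight are arbitrary illustrative choices.

```python
# Minimal sketch: well-conditioned sparse linear regression solved by Lasso.
# All constants below are illustrative assumptions, not values from the paper.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
m, n, k = 200, 400, 5                              # samples, dimension, sparsity

X = rng.standard_normal((m, n)) / np.sqrt(m)       # isotropic Gaussian design (well-conditioned)
theta_star = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
theta_star[support] = rng.standard_normal(k)       # k-sparse ground truth
w = 0.01 * rng.standard_normal(m)                  # small noise
y = X @ theta_star + w

lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10_000)
lasso.fit(X, y)
theta_hat = lasso.coef_

# Mean squared prediction error (1/m) * ||X theta_hat - X theta_star||_2^2
mse = np.linalg.norm(X @ (theta_hat - theta_star)) ** 2 / m
recovered = set(np.flatnonzero(np.abs(theta_hat) > 1e-3)) == set(support)
print(f"prediction MSE: {mse:.2e}, support recovered: {recovered}")
```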
We give evidence of average-case hardness of SLR w.r.t. all efficient
algorithms assuming the worst-case hardness of lattice problems. Specifically,
we give an instance-by-instance reduction from a variant of the bounded
distance decoding (BDD) problem on lattices to SLR, where the condition number
of the lattice basis that defines the BDD instance is directly related to the
restricted eigenvalue condition of the design matrix, which characterizes some
of the classical statistical-computational gaps for sparse linear regression.
Also, by appealing to worst-case to average-case reductions from the world of
lattices, this shows hardness for a distribution of SLR instances; while the
design matrices are ill-conditioned, the resulting SLR instances are in the
identifiable regime.
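The restricted eigenvalue condition mentioned above can be made concrete with a crude brute-force proxy (my illustration, not the paper's reduction): the smallest eigenvalue of $\frac{1}{m}X_S^\top X_S$ over all supports $S$ of a fixed small size. The sketch below, with arbitrary toy dimensions, compares this proxy for an isotropic design and for an ill-conditioned design with nearly collinear columns.

```python
# Hedged sketch: a brute-force proxy for the restricted eigenvalue constant,
# contrasting a well-conditioned and an ill-conditioned design. Tiny sizes
# are chosen (assumed, not from the paper) so enumeration is feasible.
from itertools import combinations
import numpy as np

def restricted_eig_proxy(X, s):
    """Min over size-s supports S of the smallest eigenvalue of (1/m) X_S^T X_S."""
    m, _n = X.shape
    best = np.inf
    for S in combinations(range(_n), s):
        G = X[:, S].T @ X[:, S] / m
        best = min(best, np.linalg.eigvalsh(G)[0])
    return best

rng = np.random.default_rng(1)
m, n, s = 50, 12, 3

X_good = rng.standard_normal((m, n))                       # isotropic design
base = rng.standard_normal((m, 1))
X_bad = 0.95 * base + 0.05 * rng.standard_normal((m, n))   # nearly collinear columns

print("well-conditioned RE proxy:", restricted_eig_proxy(X_good, s))
print("ill-conditioned RE proxy: ", restricted_eig_proxy(X_bad, s))
```

The ill-conditioned design drives the proxy toward zero, which is the regime the reduction exploits.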
Furthermore, for well-conditioned (essentially) isotropic Gaussian design
matrices, where Lasso is known to behave well in the identifiable regime, we
show hardness of outputting any good solution in the unidentifiable regime
where there are many solutions, assuming the worst-case hardness of standard
and well-studied lattice problems.
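The unidentifiable regime referenced above can be illustrated with a toy construction (mine, not taken from the paper): when $m < 2k$, any $2k$ columns of $X$ are linearly dependent, so two different $k$-sparse vectors can produce exactly the same responses, and no algorithm can distinguish them from $(X, y)$ alone.

```python
# Toy demonstration of unidentifiability when m < 2k (illustrative assumption).
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(2)
m, n, k = 7, 20, 4                 # m = 7 < 2k = 8  =>  unidentifiable

X = rng.standard_normal((m, n))
S = np.arange(2 * k)               # any 2k columns
v = null_space(X[:, S])[:, 0]      # nontrivial: X[:, S] has more columns than rows

theta1 = np.zeros(n); theta1[S[:k]] = v[:k]
theta2 = np.zeros(n); theta2[S[k:]] = -v[k:]

# Both vectors are k-sparse and distinct, yet give identical responses.
print(np.allclose(X @ theta1, X @ theta2), np.allclose(theta1, theta2))
```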
Related papers
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Computational-Statistical Gaps for Improper Learning in Sparse Linear Regression [4.396860522241307]
We show that an efficient learning algorithm for sparse linear regression can be used to solve sparse PCA problems with a negative spike.
We complement our reduction with low-degree and statistical query lower bounds for the sparse problems from which we reduce.
arXiv Detail & Related papers (2024-02-21T19:55:01Z) - Statistical-Computational Tradeoffs in Mixed Sparse Linear Regression [20.00109111254507]
We show that the problem suffers from a $\frac{k}{\mathrm{SNR}^2}$-to-$\frac{k^2}{\mathrm{SNR}^2}$ statistical-to-computational gap.
We also analyze a simple thresholding algorithm which, outside of the narrow regime where the problem is hard, solves the associated mixed regression detection problem.
arXiv Detail & Related papers (2023-03-03T18:03:49Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - On the well-spread property and its relation to linear regression [4.619541348328937]
We show that consistent recovery of the parameter vector in a robust linear regression model is information-theoretically impossible.
We show that it is possible to efficiently certify whether a given $n$-by-$d$ matrix is well-spread if the number of observations is quadratic in the ambient dimension.
arXiv Detail & Related papers (2022-06-16T11:17:44Z) - Optimal Online Generalized Linear Regression with Stochastic Noise and
Its Application to Heteroscedastic Bandits [88.6139446295537]
We study the problem of online generalized linear regression in the setting of a generalized linear model with possibly unbounded additive noise.
We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise.
We propose an algorithm based on FTRL to achieve the first variance-aware regret bound.
arXiv Detail & Related papers (2022-02-28T08:25:26Z) - On the Power of Preconditioning in Sparse Linear Regression [24.140675945592704]
We show that a preconditioned Lasso can solve a large class of sparse linear regression problems nearly optimally.
For the first time, we construct random-design instances which are provably hard for an optimally preconditioned Lasso.
arXiv Detail & Related papers (2021-06-17T02:12:01Z) - Hashing embeddings of optimal dimension, with applications to linear
least squares [1.2891210250935143]
We present subspace embedding properties for $s$-hashing sketching matrices, with $s \geq 1$, that are optimal in the projection dimension $m$ of the sketch.
We apply these results to the special case of Linear Least Squares (LLS), and develop Ski-LLS, a generic software package for these problems.
arXiv Detail & Related papers (2021-05-25T10:35:13Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
arXiv Detail & Related papers (2020-06-15T21:55:34Z) - Breaking the Sample Size Barrier in Model-Based Reinforcement Learning
with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z)