Related papers: Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination

Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination

URL: http://arxiv.org/abs/2510.10665v1
Date: Sun, 12 Oct 2025 15:42:44 GMT
Title: Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination
Authors: Ilias Diakonikolas, Chao Gao, Daniel M. Kane, John Lafferty, Ankit Pensia,
Abstract summary: We show that any efficient Statistical Query algorithm for this task requires VSTAT complexity at least $tildeOmega(d1/2/alpha2)$.
Score: 65.37519531362157
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the task of noiseless linear regression under Gaussian covariates in the presence of additive oblivious contamination. Specifically, we are given i.i.d.\ samples from a distribution $(x, y)$ on $\mathbb{R}^d \times \mathbb{R}$ with $x \sim \mathcal{N}(0,\mathbf{I}_d)$ and $y = x^\top \beta + z$, where $z$ is drawn independently of $x$ from an unknown distribution $E$. Moreover, $z$ satisfies $\mathbb{P}_E[z = 0] = \alpha>0$. The goal is to accurately recover the regressor $\beta$ to small $\ell_2$-error. Ignoring computational considerations, this problem is known to be solvable using $O(d/\alpha)$ samples. On the other hand, the best known polynomial-time algorithms require $\Omega(d/\alpha^2)$ samples. Here we provide formal evidence that the quadratic dependence in $1/\alpha$ is inherent for efficient algorithms. Specifically, we show that any efficient Statistical Query algorithm for this task requires VSTAT complexity at least $\tilde{\Omega}(d^{1/2}/\alpha^2)$.

Related papers

Sample and Computationally Efficient Robust Learning of Gaussian Single-Index Models [37.42736399673992]
A single-index model (SIM) is a function of the form $sigma(mathbfwast cdot mathbfx)$, where $sigma: mathbbR to mathbbR$ is a known link function and $mathbfwast$ is a hidden unit vector. We show that a proper learner attains $L2$-error of $O(mathrmOPT)+epsilon$, where $
arXiv Detail & Related papers (2024-11-08T17:10:38Z)
Iterative thresholding for non-linear learning in the strong $\varepsilon$-contamination model [3.309767076331365]
We derive approximation bounds for learning single neuron models using thresholded descent. We also study the linear regression problem, where $sigma(mathbfx) = mathbfx$.
arXiv Detail & Related papers (2024-09-05T16:59:56Z)
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(boldsymbolx) = textstylesigma_*left(langleboldsymbolx,boldsymbolthetarangleright)$<n>We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by information exponents.
arXiv Detail & Related papers (2024-06-03T17:56:58Z)
Sample-Efficient Linear Regression with Self-Selection Bias [7.605563562103568]
We consider the problem of linear regression with self-selection bias in the unknown-index setting. We provide a novel and near optimally sample-efficient (in terms of $k$) algorithm to recover $mathbfw_1,ldots,mathbfw_kin. Our algorithm succeeds under significantly relaxed noise assumptions, and therefore also succeeds in the related setting of max-linear regression.
arXiv Detail & Related papers (2024-02-22T02:20:24Z)
Optimal Estimator for Linear Regression with Shuffled Labels [17.99906229036223]
This paper considers the task of linear regression with shuffled labels. $mathbf Y in mathbb Rntimes m, mathbf Pi in mathbb Rntimes p, mathbf B in mathbb Rptimes m$, and $mathbf Win mathbb Rntimes m$, respectively.
arXiv Detail & Related papers (2023-10-02T16:44:47Z)
Distribution-Independent Regression for Generalized Linear Models with Oblivious Corruptions [49.69852011882769]
We show the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise. We present an algorithm that tackles newthis problem in its most general distribution-independent setting. This is the first newalgorithmic result for GLM regression newwith oblivious noise which can handle more than half the samples being arbitrarily corrupted.
arXiv Detail & Related papers (2023-09-20T21:41:59Z)
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study a function of the form $mathbfxmapstosigma(mathbfwcdotmathbfx)$ for monotone activations. The goal of the learner is to output a hypothesis vector $mathbfw$ that $F(mathbbw)=C, epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z)
Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
We show that the eigenspectrum of $bf K$ is independent of the distribution of the i.i.d. entries of $bf w$. We propose a novel random technique, called Ternary Random Feature (TRF) The computation of the proposed random features requires no multiplication and a factor of $b$ less bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$ We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $gamma in (0,1)$, and an approximation threshold $epsilon > 0$, we provide a model-free algorithm to learn an $epsilon$-optimal policy. For small enough $epsilon$, we show an improved algorithm with sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.