Near-Optimal Statistical Query Lower Bounds for Agnostically Learning
Intersections of Halfspaces with Gaussian Marginals
- URL: http://arxiv.org/abs/2202.05096v1
- Date: Thu, 10 Feb 2022 15:34:10 GMT
- Title: Near-Optimal Statistical Query Lower Bounds for Agnostically Learning
Intersections of Halfspaces with Gaussian Marginals
- Authors: Daniel Hsu, Clayton Sanford, Rocco Servedio, Emmanouil-Vasileios
Vlatakis-Gkaragkounis
- Abstract summary: We consider the well-studied problem of learning intersections of halfspaces under the Gaussian distribution in the challenging agnostic learning model.
We prove two variants of our lower bound, each of which combines ingredients from Diakonikolas et al. (2021) with (an extension of) a different earlier approach for SQ lower bounds for the Boolean setting.
- Score: 10.06689891744466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the well-studied problem of learning intersections of halfspaces
under the Gaussian distribution in the challenging \emph{agnostic learning}
model. Recent work of Diakonikolas et al. (2021) shows that any Statistical
Query (SQ) algorithm for agnostically learning the class of intersections of
$k$ halfspaces over $\mathbb{R}^n$ to constant excess error either must make
queries of tolerance at most $n^{-\tilde{\Omega}(\sqrt{\log k})}$ or must make
$2^{n^{\Omega(1)}}$ queries. We strengthen this result by improving the
tolerance requirement to $n^{-\tilde{\Omega}(\log k)}$. This lower bound is
essentially best possible since an SQ algorithm of Klivans et al. (2008)
agnostically learns this class to any constant excess error using $n^{O(\log
k)}$ queries of tolerance $n^{-O(\log k)}$. We prove two variants of our lower
bound, each of which combines ingredients from Diakonikolas et al. (2021) with
(an extension of) a different earlier approach for agnostic SQ lower bounds for
the Boolean setting due to Dachman-Soled et al. (2014). Our approach also
yields lower bounds for agnostically SQ learning the class of "convex subspace
juntas" (studied by Vempala, 2010) and the class of sets with bounded Gaussian
surface area; all of these lower bounds are nearly optimal since they
essentially match known upper bounds from Klivans et al. (2008).
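To make the statement concrete: in the SQ model the learner never sees labeled examples, only answers to queries of the form E[q(x, f(x))], each correct up to an additive tolerance. Below is a minimal Python sketch (not from the paper; the dimension, the halfspace parameters `W`, `b`, and the query are hypothetical) of an SQ oracle for an intersection of $k$ halfspaces under standard Gaussian marginals, simulated by averaging roughly 1/tolerance^2 samples. The lower bound above says that for this class, any SQ learner must either use tolerance as small as $n^{-\tilde{\Omega}(\log k)}$ or ask $2^{n^{\Omega(1)}}$ such queries.
```python
import numpy as np

rng = np.random.default_rng(0)

def intersection_of_halfspaces(X, W, b):
    """Label +1 iff x satisfies every halfspace constraint w_i . x <= b_i."""
    inside = (X @ W.T <= b).all(axis=1)
    return np.where(inside, 1.0, -1.0)

def sq_oracle(query, tolerance, W, b, n_dim):
    """Answer E[query(x, f(x))] over Gaussian x, up to +/- tolerance.

    Simulated here by averaging ~1/tolerance^2 fresh samples; a real SQ oracle
    may return any value within the tolerance of the true expectation.
    """
    m = int(np.ceil(1.0 / tolerance ** 2))
    X = rng.standard_normal((m, n_dim))
    y = intersection_of_halfspaces(X, W, b)
    return np.mean(query(X, y)) + rng.uniform(-tolerance, tolerance)

# Hypothetical instance: k = 3 halfspaces in n = 10 dimensions.
n, k = 10, 3
W = rng.standard_normal((k, n))
b = rng.standard_normal(k)

# A {-1,+1}-valued correlational query: does sign(x_1) predict the label?
answer = sq_oracle(lambda X, y: np.sign(X[:, 0]) * y, 0.01, W, b, n)
print(f"SQ answer (tolerance 0.01): {answer:.3f}")
```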
Related papers
- Improved Hardness Results for Learning Intersections of Halfspaces [2.1393480341554736]
We show strong (and surprisingly simple) lower bounds for weakly learning intersections of halfspaces in the improper setting.
We significantly narrow this gap by showing that even learning $\omega(\log \log N)$ halfspaces in dimension $N$ takes super-polynomial time.
Specifically, we show that learning an intersection of any $k$ halfspaces in dimension $N$ requires accuracy $N^{-\Omega(k)}$, or exponentially many queries.
arXiv Detail & Related papers (2024-02-25T05:26:35Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Agnostically Learning Multi-index Models with Queries [54.290489524576756]
We study the power of query access for the task of agnostic learning under the Gaussian distribution.
We show that query access gives significant runtime improvements over random examples for agnostically learning MIMs.
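For context, a multi-index model (MIM) is a function that depends on its input only through a low-dimensional linear projection, $f(x) = g(Wx)$. A minimal sketch follows; the projection `W`, the link function `g`, and the dimensions are hypothetical choices for illustration.
```python
import numpy as np

rng = np.random.default_rng(1)

def make_mim(W, g):
    """Multi-index model f(x) = g(W x): the label depends on x only through
    the k-dimensional projection W x, where k = W.shape[0]."""
    return lambda X: g(X @ W.T)

# Hypothetical instance: n = 20 ambient dimensions, k = 2 relevant directions,
# with a link function that intersects two halfspaces of the projection.
n, k = 20, 2
W = rng.standard_normal((k, n))
g = lambda Z: np.where((Z >= 0).all(axis=1), 1.0, -1.0)
f = make_mim(W, g)

X = rng.standard_normal((5, n))   # Gaussian marginals, as in the setting above
print(f(X))                       # labels determined only by the 2-dim projection
```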
arXiv Detail & Related papers (2023-12-27T15:50:47Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
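In the random classification noise (RCN) model referenced here, each clean halfspace label is flipped independently with the same probability $p$. A minimal data-generation sketch (the weight vector `w`, the flip rate `p`, and the sample size are hypothetical):
```python
import numpy as np

rng = np.random.default_rng(2)

def gaussian_halfspace_with_rcn(m, w, p):
    """Draw m Gaussian examples labeled by the halfspace sign(w . x), then
    flip each label independently with the same probability p (RCN)."""
    X = rng.standard_normal((m, len(w)))
    y_clean = np.sign(X @ w)
    flips = rng.random(m) < p
    return X, np.where(flips, -y_clean, y_clean)

w = np.array([1.0, -2.0, 0.5])            # hypothetical target halfspace
X, y = gaussian_halfspace_with_rcn(m=2000, w=w, p=0.1)
print("fraction agreeing with clean labels:", np.mean(y == np.sign(X @ w)))
```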
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces
and ReLU Regression under Gaussian Marginals [43.0867217287089]
We study the task of agnostically learning halfspaces under the Gaussian distribution.
We prove a near-optimal computational hardness result for this task.
arXiv Detail & Related papers (2023-02-13T16:46:23Z) - Order-Optimal Error Bounds for Noisy Kernel-Based Bayesian Quadrature [42.129843613950285]
We consider functions in a Reproducing Kernel Hilbert Space (RKHS) with the Matérn-$\nu$ kernel.
We find that when the black-box queries are subject to Gaussian noise having variance $\sigma^2$, any algorithm making at most $T$ queries must incur a mean absolute error of $\Omega(T^{-\frac{\nu}{d}-1} + \sigma T^{-\frac{1}{2}})$.
arXiv Detail & Related papers (2022-02-22T01:49:41Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
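Massart (bounded) noise generalizes RCN: an adversary may flip each label with an instance-dependent probability $\eta(x)$, constrained only to be at most a bound $\eta < 1/2$. The sketch below contrasts this with the uniform flips of RCN; the particular flip-rate function is a hypothetical choice.
```python
import numpy as np

rng = np.random.default_rng(3)

def gaussian_halfspace_with_massart(m, w, eta):
    """Massart noise: each label of the halfspace sign(w . x) is flipped with
    a point-dependent probability eta(x) <= eta, chosen adversarially.
    Here a hypothetical eta(x) puts more noise near the decision boundary."""
    X = rng.standard_normal((m, len(w)))
    margin = np.abs(X @ w) / np.linalg.norm(w)
    flip_rate = eta * np.exp(-margin)           # always <= eta, varies per point
    y_clean = np.sign(X @ w)
    flips = rng.random(m) < flip_rate
    return X, np.where(flips, -y_clean, y_clean)

w = np.array([1.0, 1.0])                        # hypothetical target halfspace
X, y = gaussian_halfspace_with_massart(m=2000, w=w, eta=0.4)
print("fraction agreeing with clean labels:", np.mean(y == np.sign(X @ w)))
```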
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits
with Linear Payoff Functions [53.77572276969548]
We show that the C$^2$UCB algorithm has the optimal regret bound $\tilde{O}(d\sqrt{kT} + dk)$ for the partition matroid constraints.
For general constraints, we propose an algorithm that modifies the reward estimates of arms in the C$^2$UCB algorithm.
arXiv Detail & Related papers (2021-01-20T04:29:18Z) - Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z) - Q-learning with Uniformly Bounded Variance: Large Discounting is Not a
Barrier to Fast Learning [7.6146285961466]
We introduce a new class of algorithms that have sample complexity uniformly bounded for all $\gamma < 1$.
We show that the covariance of the Q-learning algorithm with an optimized step-size sequence is a quadratic function of $1/(1-\gamma)$; an expected, and essentially known result.
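As a reminder of the setting, tabular Q-learning updates its value estimates with a decaying step-size sequence, and the discount factor enters through the effective horizon $1/(1-\gamma)$. The toy chain MDP, exploration scheme, and step-size schedule below are hypothetical and only meant to show where that quantity appears.
```python
import numpy as np

rng = np.random.default_rng(4)

def q_learning_chain(n_states=5, gamma=0.95, n_steps=50000):
    """Tabular Q-learning on a toy chain MDP (hypothetical example).

    Action 1 moves right, action 0 moves left; taking action 1 in the rightmost
    state yields reward 1.  The optimal value there is 1/(1 - gamma), the
    effective horizon.  Step sizes decay with per-pair visit counts.
    """
    Q = np.zeros((n_states, 2))
    visits = np.zeros((n_states, 2))
    s = 0
    for _ in range(n_steps):
        a = int(rng.integers(2))                         # uniform exploration
        r = 1.0 if (s == n_states - 1 and a == 1) else 0.0
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        visits[s, a] += 1
        alpha = 1.0 / visits[s, a] ** 0.85               # polynomial step size
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
    return Q

Q = q_learning_chain()
# The estimate at the rewarding state approaches 1/(1 - gamma) = 20 as n_steps grows.
print(Q[-1, 1], "vs 1 / (1 - 0.95) =", 1 / (1 - 0.95))
```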
arXiv Detail & Related papers (2020-02-24T15:12:41Z)