Optimal SQ Lower Bounds for Learning Halfspaces with Massart Noise
- URL: http://arxiv.org/abs/2201.09818v1
- Date: Mon, 24 Jan 2022 17:33:19 GMT
- Title: Optimal SQ Lower Bounds for Learning Halfspaces with Massart Noise
- Authors: Rajai Nasser, Stefan Tiegel
- Abstract summary: Tight statistical query (SQ) lower bounds for learning halfspaces in the presence of Massart noise.
We show that for arbitrary $\eta \in [0,1/2]$ every SQ algorithm achieving misclassification error better than $\eta$ requires queries of superpolynomial accuracy.
- Score: 9.378684220920562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We give tight statistical query (SQ) lower bounds for learning halfspaces
in the presence of Massart noise. In particular, suppose that all labels are
corrupted with probability at most $\eta$. We show that for arbitrary $\eta \in
[0,1/2]$ every SQ algorithm achieving misclassification error better than
$\eta$ requires queries of superpolynomial accuracy or at least a
superpolynomial number of queries. Further, this continues to hold even if the
information-theoretically optimal error $\mathrm{OPT}$ is as small as
$\exp\left(-\log^c(d)\right)$, where $d$ is the dimension and $0 < c < 1$ is an
arbitrary absolute constant, and an overwhelming fraction of examples are
noiseless. Our lower bound matches known polynomial time algorithms, which are
also implementable in the SQ framework. Previously, such lower bounds only
ruled out algorithms achieving error $\mathrm{OPT} + \epsilon$ or error better
than $\Omega(\eta)$ or, if $\eta$ is close to $1/2$, error $\eta - o_\eta(1)$,
where the term $o_\eta(1)$ is constant in $d$ but going to 0 for $\eta$
approaching $1/2$.
As a consequence, we also show that achieving misclassification error better
than $1/2$ in the $(A,\alpha)$-Tsybakov model is SQ-hard for $A$ constant and
$\alpha$ bounded away from 1.
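To make the Massart model and the SQ framework above concrete, here is a minimal illustrative sketch (not from the paper): it generates halfspace-labeled Gaussian examples whose labels are flipped with an instance-dependent probability bounded by $\eta$, and simulates a statistical query of tolerance $\tau$ from samples. The function names, the particular flip rule, and the parameters `d`, `eta`, `tau`, `n` are assumptions made for demonstration only.

```python
# Illustrative sketch only: Massart-noisy halfspace data and a toy SQ oracle.
import numpy as np

rng = np.random.default_rng(0)

def sample_massart_halfspace(n, d, w, eta=0.3):
    """Draw x ~ N(0, I_d) with clean label sign(<w, x>); flip each label
    independently with an instance-dependent probability eta(x) <= eta,
    as the Massart model allows (here flips concentrate near the boundary)."""
    X = rng.standard_normal((n, d))
    margin = X @ w / (np.linalg.norm(X, axis=1) * np.linalg.norm(w) + 1e-12)
    clean = np.where(margin >= 0, 1.0, -1.0)
    flip_prob = eta * (1.0 - np.abs(margin))   # one valid choice of eta(x) <= eta
    y = np.where(rng.random(n) < flip_prob, -clean, clean)
    return X, y

def sq_oracle(phi, X, y, tau):
    """Toy SQ oracle of tolerance tau: returns the empirical mean of a bounded
    query phi(x, y). With roughly 1/tau^2 samples this is within tau of the true
    expectation; a genuine SQ oracle may return any value within tau of it."""
    return float(np.mean([phi(x, yi) for x, yi in zip(X, y)]))

# Example: estimate the correlation E[y * sign(x_1)] with tolerance tau = 0.01.
d = 10
w = np.zeros(d)
w[0] = 1.0
X, y = sample_massart_halfspace(n=20000, d=d, w=w, eta=0.3)
print(sq_oracle(lambda x, yi: yi * np.sign(x[0]), X, y, tau=0.01))
```

In this language, the paper's lower bound says that any SQ learner driving the misclassification error below $\eta$ must either use a tolerance $\tau$ smaller than any inverse polynomial in $d$ (superpolynomial accuracy) or issue a superpolynomial number of such queries.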
Related papers
- The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega\left(\frac{1}{k}\cdot n^2\log|\mathbb{F}|\right)$.
As an application, we obtain an $\Omega\left(\frac{1}{k}\cdot n^2\log|\mathbb{F}|\right)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - Cryptographic Hardness of Learning Halfspaces with Massart Noise [59.8587499110224]
We study the complexity of PAC learning halfspaces in the presence of Massart noise.
We show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
arXiv Detail & Related papers (2022-07-28T17:50:53Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Memory-Sample Lower Bounds for Learning Parity with Noise [2.724141845301679]
We show, for the well-studied problem of learning parity under noise, that any learning algorithm requires either a memory of size $\Omega(n^2/\varepsilon)$ or an exponential number of samples.
Our proof is based on adapting the arguments in [Raz'17,GRT'18] to the noisy case.
arXiv Detail & Related papers (2021-07-05T23:34:39Z) - The planted matching problem: Sharp threshold and infinite-order phase
transition [25.41713098167692]
We study the problem of reconstructing a perfect matching $M^*$ hidden in a randomly weighted $n\times n$ bipartite graph.
We show that if $\sqrt{d}\,B(\mathcal{P},\mathcal{Q}) \ge 1+\epsilon$ for an arbitrarily small constant $\epsilon>0$, then the reconstruction error of any estimator is bounded away from $0$.
arXiv Detail & Related papers (2021-03-17T00:59:33Z) - Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z) - $Q$-learning with Logarithmic Regret [60.24952657636464]
We prove that an optimistic $Q$-learning algorithm enjoys a $\mathcal{O}\left(\frac{SA\cdot \mathrm{poly}(H)}{\Delta_{\min}}\log(SAT)\right)$ cumulative regret bound, where $S$ is the number of states, $A$ is the number of actions, $H$ is the planning horizon, $T$ is the total number of steps, and $\Delta_{\min}$ is the minimum sub-optimality gap.
arXiv Detail & Related papers (2020-06-16T13:01:33Z) - Robust testing of low-dimensional functions [8.927163098772589]
A recent work of the authors showed that linear $k$-juntas are testable.
Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result.
We obtain a fully noise tolerant tester with query complexity $k^{O(\mathsf{poly}(\log k/\epsilon))}$ for the class of intersections of $k$ halfspaces.
arXiv Detail & Related papers (2020-04-24T10:23:12Z) - Agnostic Q-learning with Function Approximation in Deterministic
Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
We study the problem of agnostic $Q$-learning with function approximation in deterministic systems.
We show that if $\delta = O\left(\rho/\sqrt{\dim_E}\right)$, then one can find the optimal policy using $O\left(\dim_E\right)$ trajectories.
arXiv Detail & Related papers (2020-02-17T18:41:49Z)