Low-degree learning and the metric entropy of polynomials
- URL: http://arxiv.org/abs/2203.09659v3
- Date: Mon, 27 Nov 2023 17:23:10 GMT
- Title: Low-degree learning and the metric entropy of polynomials
- Authors: Alexandros Eskenazis, Paata Ivanisvili, Lauritz Streck
- Abstract summary: We prove that any (deterministic or randomized) algorithm which learns $\mathscr{F}_{n,d}$ with $L_2$-accuracy $\varepsilon$ requires at least $\Omega((1-\sqrt{\varepsilon})2^d\log n)$ queries for large enough $n$. To do this, we show that the $L_2$-packing numbers of $\mathscr{F}_{n,d}$ satisfy the two-sided estimate $$c(1-\varepsilon)2^d\log n \leq \log \mathsf{M}(\mathscr{F}_{n,d},\|\cdot\|_{L_2},\varepsilon) \leq \frac{2^{Cd}\log n}{\varepsilon^4}.$$
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Let $\mathscr{F}_{n,d}$ be the class of all functions $f:\{-1,1\}^n\to[-1,1]$
on the $n$-dimensional discrete hypercube of degree at most $d$. In the first
part of this paper, we prove that any (deterministic or randomized) algorithm
which learns $\mathscr{F}_{n,d}$ with $L_2$-accuracy $\varepsilon$ requires at
least $\Omega((1-\sqrt{\varepsilon})2^d\log n)$ queries for large enough $n$,
thus establishing the sharpness as $n\to\infty$ of a recent upper bound of
Eskenazis and Ivanisvili (2021). To do this, we show that the $L_2$-packing
numbers $\mathsf{M}(\mathscr{F}_{n,d},\|\cdot\|_{L_2},\varepsilon)$ of the
concept class $\mathscr{F}_{n,d}$ satisfy the two-sided estimate
$$c(1-\varepsilon)2^d\log n \leq \log
\mathsf{M}(\mathscr{F}_{n,d},\|\cdot\|_{L_2},\varepsilon) \leq \frac{2^{Cd}\log
n}{\varepsilon^4}$$ for large enough $n$, where $c, C>0$ are universal
constants. In the second part of the paper, we present a logarithmic upper
bound for the randomized query complexity of classes of bounded approximate
polynomials whose Fourier spectra are concentrated on few subsets. As an
application, we prove new estimates for the number of random queries required
to learn approximate juntas of a given degree, functions with rapidly decaying
Fourier tails and constant depth circuits of given size. Finally, we obtain
bounds for the number of queries required to learn the polynomial class
$\mathscr{F}_{n,d}$ without error in the query and random example models.
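For intuition about the learning model, the classical baseline for this problem is the low-degree (Fourier-truncation) algorithm of Linial, Mansour and Nisan: query $f$ at uniformly random points of the hypercube, estimate every Fourier coefficient of degree at most $d$ empirically, and output the truncated polynomial. The sketch below is a minimal Python illustration of that baseline only, not the query-efficient algorithm of Eskenazis and Ivanisvili (2021); the function name `low_degree_learn` and all parameter choices are illustrative assumptions.

```python
import itertools
import random

def low_degree_learn(f, n, d, num_samples):
    """Learn a hypothesis for f: {-1,1}^n -> [-1,1] of degree <= d by
    estimating its low-degree Fourier coefficients from uniform random
    queries (classical Linial-Mansour-Nisan-style truncation)."""
    # Query f at uniformly random points of the hypercube.
    xs = [tuple(random.choice((-1, 1)) for _ in range(n))
          for _ in range(num_samples)]
    ys = [f(x) for x in xs]

    def chi(S, x):
        # Fourier character chi_S(x) = prod_{i in S} x_i.
        p = 1
        for i in S:
            p *= x[i]
        return p

    # Empirical estimate of hat{f}(S) = E[f(x) * chi_S(x)] for all |S| <= d.
    coeffs = {
        S: sum(y * chi(S, x) for x, y in zip(xs, ys)) / num_samples
        for k in range(d + 1)
        for S in itertools.combinations(range(n), k)
    }

    # Hypothesis: the degree-<=d polynomial with the estimated coefficients.
    return lambda x: sum(c * chi(S, x) for S, c in coeffs.items())

if __name__ == "__main__":
    target = lambda x: x[0] * x[1]  # a degree-2 member of F_{5,2}
    h = low_degree_learn(target, n=5, d=2, num_samples=4000)
    pt = (1, -1, 1, 1, -1)
    print(h(pt), target(pt))        # h(pt) should be close to -1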
Related papers
- Sample-Efficient Linear Regression with Self-Selection Bias [7.605563562103568]
We consider the problem of linear regression with self-selection bias in the unknown-index setting.
We provide a novel and near-optimally sample-efficient (in terms of $k$) algorithm to recover $\mathbf{w}_1,\ldots,\mathbf{w}_k$.
Our algorithm succeeds under significantly relaxed noise assumptions, and therefore also succeeds in the related setting of max-linear regression.
arXiv Detail & Related papers (2024-02-22T02:20:24Z) - Quantum and classical query complexities of functions of matrices [0.0]
We show that for any continuous function $f(x):[-1,1]\rightarrow[-1,1]$, the quantum query complexity of computing $\langle i|\,f(A)\,|j\rangle$ to within $\pm\varepsilon/4$ is lower bounded by $\Omega(\widetilde{\deg}_\varepsilon(f))$.
arXiv Detail & Related papers (2023-11-13T00:45:41Z) - Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix
Factorization [54.29685789885059]
We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem.
The goal is to approximate $\mathbf{A}$ as a product of low-rank factors.
Our techniques generalize to other common variants of the BMF problem.
arXiv Detail & Related papers (2023-06-02T18:55:27Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study the problem of learning a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations $\sigma$.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\cdot\mathrm{opt} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Learning low-degree functions from a logarithmic number of random
queries [77.34726150561087]
We prove that for any integer $n\in\mathbb{N}$, $d\in\{1,\ldots,n\}$ and any $\varepsilon,\delta\in(0,1)$, a bounded function $f:\{-1,1\}^n\to[-1,1]$ of degree at most $d$ can be learned.
arXiv Detail & Related papers (2021-09-21T13:19:04Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Sharper bounds for online learning of smooth functions of a single
variable [0.0]
We show that $\mathrm{opt}_{1+\epsilon}(\mathcal{F}_q) = \Theta(\epsilon^{-\frac{1}{2}})$, where the constants in the bound do not depend on $q$.
arXiv Detail & Related papers (2021-05-30T23:06:21Z) - An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of given order $\ell \geq 1$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}}$, where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z)