Low-degree learning and the metric entropy of polynomials
- URL: http://arxiv.org/abs/2203.09659v4
- Date: Mon, 21 Oct 2024 18:58:20 GMT
- Title: Low-degree learning and the metric entropy of polynomials
- Authors: Alexandros Eskenazis, Paata Ivanisvili, Lauritz Streck
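- Abstract summary: We prove that any (deterministic or randomized) algorithm which learns $\mathscr{F}_{n,d}$ with $L_2$-accuracy $\varepsilon$ requires at least $\Omega((1-\sqrt{\varepsilon})2^d\log n)$ queries for large enough $n$. To do this, we show that the $L_2$-packing numbers $\mathsf{M}(\mathscr{F}_{n,d},\|\cdot\|_{L_2},\varepsilon)$ satisfy the two-sided estimate $$c(1-\varepsilon)2^d\log n \leq \log \mathsf{M}(\mathscr{F}_{n,d},\|\cdot\|_{L_2},\varepsilon) \leq \frac{2^{Cd}\log n}{\varepsilon^4}.$$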
- Score: 44.99833362998488
- Abstract: Let $\mathscr{F}_{n,d}$ be the class of all functions $f:\{-1,1\}^n\to[-1,1]$ on the $n$-dimensional discrete hypercube of degree at most $d$. In the first part of this paper, we prove that any (deterministic or randomized) algorithm which learns $\mathscr{F}_{n,d}$ with $L_2$-accuracy $\varepsilon$ requires at least $\Omega((1-\sqrt{\varepsilon})2^d\log n)$ queries for large enough $n$, thus establishing the sharpness as $n\to\infty$ of a recent upper bound of Eskenazis and Ivanisvili (2021). To do this, we show that the $L_2$-packing numbers $\mathsf{M}(\mathscr{F}_{n,d},\|\cdot\|_{L_2},\varepsilon)$ of the concept class $\mathscr{F}_{n,d}$ satisfy the two-sided estimate $$c(1-\varepsilon)2^d\log n \leq \log \mathsf{M}(\mathscr{F}_{n,d},\|\cdot\|_{L_2},\varepsilon) \leq \frac{2^{Cd}\log n}{\varepsilon^4}$$ for large enough $n$, where $c, C>0$ are universal constants. In the second part of the paper, we present a logarithmic upper bound for the randomized query complexity of classes of bounded approximate polynomials whose Fourier spectra are concentrated on few subsets. As an application, we prove new estimates for the number of random queries required to learn approximate juntas of a given degree, functions with rapidly decaying Fourier tails and constant depth circuits of given size. Finally, we obtain bounds for the number of queries required to learn the polynomial class $\mathscr{F}_{n,d}$ without error in the query and random example models.
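To make the class $\mathscr{F}_{n,d}$ from the abstract concrete, here is a minimal brute-force sketch. It is not taken from the paper. It computes the Fourier–Walsh coefficients $\hat f(S)=\mathbb{E}_x[f(x)\prod_{i\in S}x_i]$ of a function on $\{-1,1\}^n$, reads off its degree, and evaluates an $L_2$ distance under the uniform measure. All names (`fourier_coefficients`, `degree`, `l2_distance`, `example_f`) are hypothetical. The enumeration is exponential in $n$, so it is only meant for tiny examples, not as a learning algorithm.

```python
from itertools import combinations, product
from math import prod, sqrt


def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} for f: {-1,1}^n -> R, where S is a tuple of indices."""
    points = list(product((-1, 1), repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            # f_hat(S) = E_x[f(x) * prod_{i in S} x_i] under the uniform measure.
            total = sum(f(x) * prod(x[i] for i in S) for x in points)
            coeffs[S] = total / len(points)
    return coeffs


def degree(f, n, tol=1e-12):
    """Largest |S| with a nonzero Fourier coefficient (0 for the zero function)."""
    coeffs = fourier_coefficients(f, n)
    return max((len(S) for S, c in coeffs.items() if abs(c) > tol), default=0)


def l2_distance(f, g, n):
    """L_2 distance between f and g under the uniform measure on {-1,1}^n."""
    points = list(product((-1, 1), repeat=n))
    return sqrt(sum((f(x) - g(x)) ** 2 for x in points) / len(points))


def example_f(x):
    # Hypothetical example: takes values in [-1, 1] and has degree 2 on 3 variables.
    return (x[0] * x[1] + x[2]) / 2


if __name__ == "__main__":
    n = 3
    nonzero = {S: c for S, c in fourier_coefficients(example_f, n).items() if c != 0}
    print("nonzero coefficients:", nonzero)
    print("degree:", degree(example_f, n))
    print("L2 distance to zero:", l2_distance(example_f, lambda x: 0.0, n))
```

Here `example_f` belongs to $\mathscr{F}_{3,2}$: it is bounded by 1 in absolute value and its only nonzero coefficients sit on the sets $\{0,1\}$ and $\{2\}$.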
Related papers
- The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$.
As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z)
- Quantum Sabotage Complexity [0.7812210699650152]
We study $\mathsf{Q}(f_{\mathsf{sab}})$, the quantum query complexity of $f_{\mathsf{sab}}$.
We show that when $f$ is the Indexing function, $\mathsf{Q}(f_{\mathsf{sab}})=\Theta(\sqrt{\mathsf{sab}})$, ruling out the possibility that $\mathsf{Q}(f_{\mathsf{sab}})=\Theta(\sqrt{\ldots})$.
arXiv Detail & Related papers (2024-08-22T17:57:58Z)
- Sample-Efficient Linear Regression with Self-Selection Bias [7.605563562103568]
We consider the problem of linear regression with self-selection bias in the unknown-index setting.
We provide a novel and near optimally sample-efficient (in terms of $k$) algorithm to recover $\mathbf{w}_1,\ldots,\mathbf{w}_k$.
Our algorithm succeeds under significantly relaxed noise assumptions, and therefore also succeeds in the related setting of max-linear regression.
arXiv Detail & Related papers (2024-02-22T02:20:24Z)
- Quantum and classical query complexities of functions of matrices [0.0]
We show that for any continuous function $f(x):[-1,1]\rightarrow [-1,1]$, the quantum query complexity of computing $\langle i| f(A) |j\rangle \pm \varepsilon/4$ is lower bounded by $\Omega(\widetilde{\deg}_\varepsilon(f))$.
arXiv Detail & Related papers (2023-11-13T00:45:41Z)
- Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization [54.29685789885059]
We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem.
The goal is to approximate $\mathbf{A}$ as a product of low-rank factors.
Our techniques generalize to other common variants of the BMF problem.
arXiv Detail & Related papers (2023-06-02T18:55:27Z)
- Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study functions of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w})=C\,\epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z)
- Learning low-degree functions from a logarithmic number of random queries [77.34726150561087]
We prove that for any integer $n\in\mathbb{N}$, $d\in\{1,\ldots,n\}$ and any $\varepsilon,\delta\in(0,1)$, a bounded function $f:\{-1,1\}^n\to[-1,1]$ of degree at most $d$ can be learned.
arXiv Detail & Related papers (2021-09-21T13:19:04Z)
- Sharper bounds for online learning of smooth functions of a single variable [0.0]
We show that $\operatorname{opt}_{1+\epsilon}(\mathcal{F}_q) = \Theta(\epsilon^{-\frac{1}{2}})$, where the constants in the bound do not depend on $q$.
arXiv Detail & Related papers (2021-05-30T23:06:21Z)
- An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $\ell\geq1$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}}$, where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z)