Toward $L_\infty$-recovery of Nonlinear Functions: A Polynomial Sample
Complexity Bound for Gaussian Random Fields
- URL: http://arxiv.org/abs/2305.00322v1
- Date: Sat, 29 Apr 2023 18:33:39 GMT
- Title: Toward $L_\infty$-recovery of Nonlinear Functions: A Polynomial Sample
Complexity Bound for Gaussian Random Fields
- Authors: Kefan Dong, Tengyu Ma
- Abstract summary: Many machine learning applications require learning a function with a small worst-case error over the entire input domain, that is, the $L_\infty$-error.
Most existing theoretical works only guarantee recovery in average errors such as the $L_2$-error.
This paper makes some initial steps beyond the impossibility results by leveraging the randomness in the ground-truth functions.
- Score: 35.76184529520015
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Many machine learning applications require learning a function with a small
worst-case error over the entire input domain, that is, the $L_\infty$-error,
whereas most existing theoretical works only guarantee recovery in average
errors such as the $L_2$-error. $L_\infty$-recovery from polynomial samples is
even impossible for seemingly simple function classes such as constant-norm
infinite-width two-layer neural nets. This paper makes some initial steps
beyond the impossibility results by leveraging the randomness in the
ground-truth functions. We prove a polynomial sample complexity bound for
random ground-truth functions drawn from Gaussian random fields. Our key
technical novelty is to prove that the degree-$k$ spherical harmonics
components of a function from Gaussian random field cannot be spiky in that
their $L_\infty$/$L_2$ ratios are upper-bounded by $O(d \sqrt{\ln k})$ with high
probability. In contrast, the worst-case $L_\infty$/$L_2$ ratio for degree-$k$
spherical harmonics is on the order of $\Omega(\min\{d^{k/2},k^{d/2}\})$.
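As a quick numerical sanity check of this contrast (a sketch, not the paper's method), one can approximately sample the degree-$k$ component of a Gaussian random field on the sphere with random zonal features: by the Funk-Hecke/addition theorem, $f(x) = m^{-1/2}\sum_j g_j\, C_k^{(\alpha)}(\langle w_j, x\rangle)$ with i.i.d. $g_j \sim \mathcal{N}(0,1)$ and $w_j$ uniform on $S^{d-1}$ has covariance proportional to the degree-$k$ Gegenbauer kernel as $m \to \infty$. All parameter values below are illustrative assumptions, not values from the paper.

```python
# Monte Carlo sketch: L_inf/L_2 ratio of a random degree-k spherical harmonic.
import numpy as np
from scipy.special import eval_gegenbauer

rng = np.random.default_rng(0)
d, k = 8, 6                # sphere S^{d-1}, harmonic degree k (illustrative)
alpha = (d - 2) / 2        # Gegenbauer index matching S^{d-1}
m, n_test = 500, 20_000    # random zonal features / norm-estimation points

def unit_sphere(n, d, rng):
    """n i.i.d. uniform points on S^{d-1}."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

W = unit_sphere(m, d, rng)        # random zonal centers w_j
g = rng.standard_normal(m)        # i.i.d. Gaussian weights g_j
X = unit_sphere(n_test, d, rng)   # points used to estimate the norms

# f(x) = m^{-1/2} sum_j g_j C_k^{(alpha)}(<w_j, x>): for large m this
# approximates a Gaussian random field supported on degree-k harmonics.
F = eval_gegenbauer(k, alpha, X @ W.T) @ g / np.sqrt(m)

l2 = np.sqrt(np.mean(F ** 2))     # Monte Carlo estimate of the L_2 norm
linf = np.abs(F).max()            # (lower) estimate of the L_inf norm
print(f"observed L_inf/L_2 ratio        : {linf / l2:.2f}")
print(f"paper's scale d*sqrt(ln k)      : {d * np.sqrt(np.log(k)):.2f}")  # up to constants
print(f"worst case min(d^(k/2), k^(d/2)): {min(d ** (k / 2), k ** (d / 2)):.0f}")
```

The observed ratio stays far below the worst-case scale, which is the phenomenon the paper's bound captures.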
Related papers
- Random features and polynomial rules [0.0]
We present a generalization of the performance of random features models for generic supervised learning problems with Gaussian data.
We find good agreement far from the limits where $D \to \infty$ and at least one of $P/D^K$, $N/D^L$ remains finite.
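A minimal random-features experiment in the spirit of this entry: Gaussian data, a polynomial target, and ridge regression on $N$ random ReLU features. The target, scalings, and ridge parameter are illustrative assumptions, not the paper's setup.

```python
# Toy random-features regression with Gaussian data; illustrative settings only.
import numpy as np

rng = np.random.default_rng(1)
D, P, N = 50, 4000, 1500              # input dim, train samples, random features

def features(X, W):
    """ReLU random features phi(x) = max(Wx / sqrt(D), 0)."""
    return np.maximum(X @ W.T / np.sqrt(D), 0.0)

W = rng.standard_normal((N, D))       # fixed random first layer
w_star = rng.standard_normal(D) / np.sqrt(D)

def target(X):
    return (X @ w_star) ** 2 - 1.0    # a simple degree-2 polynomial "rule"

X_tr = rng.standard_normal((P, D))
Phi = features(X_tr, W)
lam = 1e-3                            # ridge regularization (arbitrary)
a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(N), Phi.T @ target(X_tr))

X_te = rng.standard_normal((5000, D))
mse = np.mean((features(X_te, W) @ a - target(X_te)) ** 2)
print(f"test MSE: {mse:.3f}  (target variance: {np.var(target(X_te)):.3f})")
```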
arXiv Detail & Related papers (2024-02-15T18:09:41Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
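For concreteness, a tiny Monte Carlo evaluation of the square loss $\mathbb{E}_{x \sim \mathcal{N}(0, I_d)}[(f(x) - h(x))^2]$ for a $k$-ReLU target; the Schur-polynomial learning algorithm itself is beyond a short sketch, and all settings here are illustrative assumptions.

```python
# Monte Carlo square loss for a linear combination of k ReLUs under N(0, I_d).
import numpy as np

rng = np.random.default_rng(2)
d, k = 20, 3

c = rng.standard_normal(k)                     # combination coefficients
W = rng.standard_normal((k, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)  # unit-norm ReLU directions

def f(X, coef, W):
    """f(x) = sum_i coef_i * ReLU(<w_i, x>)."""
    return np.maximum(X @ W.T, 0.0) @ coef

c_hat = c + 0.1 * rng.standard_normal(k)       # a hypothesis with perturbed coefficients
X = rng.standard_normal((200_000, d))          # Gaussian evaluation points
loss = np.mean((f(X, c, W) - f(X, c_hat, W)) ** 2)
print(f"square loss of the perturbed hypothesis: {loss:.4f}")
```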
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- The $L^\infty$ Learnability of Reproducing Kernel Hilbert Spaces [3.2931415075553576]
We analyze the learnability of reproducing kernel Hilbert spaces (RKHS) under the $L^\infty$ norm.
For dot-product kernels on the sphere, we identify conditions under which $L^\infty$ learning can be achieved with Hilbert samples.
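A quick look at the gap this entry is about: kernel ridge regression with a dot-product kernel on the sphere, comparing the empirical sup-norm error with the root-mean-square error. The kernel, target, and sample sizes are illustrative assumptions.

```python
# KRR on S^{d-1} with a dot-product kernel: L_inf vs L_2 error. Illustrative.
import numpy as np

rng = np.random.default_rng(3)
d, n, n_test = 5, 400, 20_000

def sphere(n, d, rng):
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def kernel(A, B):
    return (A @ B.T + 1.0) ** 3          # a simple positive-definite dot-product kernel

v = sphere(1, d, rng)[0]
f_star = lambda X: np.exp(X @ v)         # smooth ground-truth function (assumed)

X = sphere(n, d, rng)
coef = np.linalg.solve(kernel(X, X) + 1e-6 * np.eye(n), f_star(X))  # KRR fit

X_te = sphere(n_test, d, rng)
resid = kernel(X_te, X) @ coef - f_star(X_te)
print(f"L_2 (rms) error: {np.sqrt(np.mean(resid ** 2)):.2e}")
print(f"L_inf error    : {np.abs(resid).max():.2e}")
```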
arXiv Detail & Related papers (2023-06-05T12:29:13Z)
- A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $\Omega(\sqrt{n/p})$ on the robustness of interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n = \mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n = \exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z)
- Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models [56.98280399449707]
We show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$.
Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables.
arXiv Detail & Related papers (2020-12-14T18:14:08Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression, where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is...
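A toy instance of the generative model $Y = \langle X, w^* \rangle + \epsilon$ with heavy-tailed noise, comparing ordinary least squares with a Huber-loss estimator. This is not the paper's nearly-linear-time algorithm; it only illustrates why robustness matters, and every setting below is assumed.

```python
# OLS vs Huber regression on Y = <X, w*> + eps with heavy-tailed noise.
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(4)
n, d = 2000, 20

X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
eps = rng.standard_t(df=2, size=n) * 5.0   # heavy-tailed noise (illustrative)
y = X @ w_star + eps

ols = LinearRegression(fit_intercept=False).fit(X, y)
hub = HuberRegressor(fit_intercept=False).fit(X, y)

print(f"OLS   error ||w - w*||: {np.linalg.norm(ols.coef_ - w_star):.3f}")
print(f"Huber error ||w - w*||: {np.linalg.norm(hub.coef_ - w_star):.3f}")
```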
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- Reverse Euclidean and Gaussian isoperimetric inequalities for parallel sets with applications [0.0]
We show that the surface area of an $r$-parallel set in $\mathbb{R}^d$ with volume at most $V$ is upper-bounded by $e^{\Theta(d)} V/r$, whereas its Gaussian surface area is upper-bounded by $\max(e^{\Theta(d)}, e^{\Theta(d)}/r)$.
We also derive a reverse form of the Brunn-Minkowski inequality for $r$-parallel sets and, as an aside, a reverse entropy power inequality for Gaussian-smoothed random variables.
arXiv Detail & Related papers (2020-06-16T23:58:54Z)
- Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation [30.137884459159107]
We consider the question of learning the $Q$-function in a sample-efficient manner for reinforcement learning with continuous state and action spaces.
We develop a simple, iterative learning algorithm that finds an $\epsilon$-optimal $Q$-function with sample complexity $\widetilde{O}(\frac{1}{\epsilon^{\max(d_1, d_2)+2}})$ when the optimal $Q$-function has low rank $r$ and the discount factor $\gamma$ is below a certain threshold.
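The matrix-estimation step at the heart of this entry can be caricatured as follows: observe noisy entries of a low-rank $Q$ matrix over a discretized state-action grid and recover it with a truncated SVD. The sizes, rank, and noise level are assumptions for illustration, not the paper's algorithm.

```python
# Recover a low-rank Q-matrix from noisy observations via truncated SVD.
import numpy as np

rng = np.random.default_rng(5)
d1, d2, r = 100, 80, 3          # discretized states x actions, true rank

Q = rng.standard_normal((d1, r)) @ rng.standard_normal((r, d2))  # low-rank Q
noisy = Q + 0.5 * rng.standard_normal((d1, d2))                  # noisy samples

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
Q_hat = (U[:, :r] * s[:r]) @ Vt[:r]                              # rank-r truncation

err = np.linalg.norm(Q_hat - Q) / np.linalg.norm(Q)
print(f"relative recovery error: {err:.3f}")
```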
arXiv Detail & Related papers (2020-06-11T00:55:35Z)
- Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
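A minimal model-free baseline for this setting: synchronous Q-learning on a small random MDP with generative-model access, checked against value iteration. The MDP, step-size schedule, and iteration count are illustrative assumptions, not the paper's algorithm.

```python
# Synchronous Q-learning on a random MDP (generative-model access). Illustrative.
import numpy as np

rng = np.random.default_rng(6)
S, A, gamma = 10, 4, 0.9

P = rng.dirichlet(np.ones(S), size=(S, A))   # transition kernel P[s, a] in the simplex
R = rng.uniform(size=(S, A))                 # rewards in [0, 1]
cdf = P.cumsum(axis=2)                       # per-(s, a) CDFs for fast sampling

Q = np.zeros((S, A))
for t in range(1, 20_001):
    lr = 1.0 / (1.0 + 0.01 * t)              # simple decaying step size
    u = rng.random((S, A, 1))                # one next-state sample per (s, a)
    s_next = np.minimum((u > cdf).sum(axis=2), S - 1)
    Q += lr * (R + gamma * Q[s_next].max(axis=2) - Q)

# Compare against value iteration (the model-based ground truth).
Q_star = np.zeros((S, A))
for _ in range(2000):
    Q_star = R + gamma * P @ Q_star.max(axis=1)
print(f"max |Q - Q*| = {np.abs(Q - Q_star).max():.3f}")
```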
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
- Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
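For context, a sketch of the standard randomized-smoothing certificate (in the style of Cohen et al.) that this entry's curse-of-dimensionality result concerns: a base classifier is smoothed with Gaussian noise, and in the binary case a vote fraction $p > 1/2$ certifies an $\ell_2$ radius of $\sigma \Phi^{-1}(p)$. The base classifier, constants, and uncorrected Monte Carlo estimate of $p$ are illustrative assumptions.

```python
# Randomized smoothing of a base classifier, with the binary-case certified
# l2 radius sigma * Phi^{-1}(p). Illustrative only; p is a raw MC estimate
# (a real certificate would use a confidence lower bound on p).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
d, sigma, n_mc = 100, 0.5, 10_000

w = rng.standard_normal(d)
w /= np.linalg.norm(w)
base = lambda X: (X @ w > 0.0).astype(int)   # a linear base classifier (assumed)

x = 0.3 * w                                  # input to certify (true margin 0.3)
noise = sigma * rng.standard_normal((n_mc, d))
p = base(x + noise).mean()                   # estimated top-class probability

if p > 0.5:
    print(f"smoothed prediction: 1, p ~ {p:.3f}")
    print(f"certified l2 radius: {sigma * norm.ppf(p):.3f}")
else:
    print("abstain")
```

For this linear classifier the certified radius approaches the true margin, which is the best case; the entry's point is that such guarantees degrade for other attack models in high dimension.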
arXiv Detail & Related papers (2020-02-08T22:02:14Z)