On Agnostic PAC Learning using $\mathcal{L}_2$-polynomial Regression and
Fourier-based Algorithms
- URL: http://arxiv.org/abs/2102.06277v1
- Date: Thu, 11 Feb 2021 21:28:55 GMT
- Title: On Agnostic PAC Learning using $\mathcal{L}_2$-polynomial Regression and
Fourier-based Algorithms
- Authors: Mohsen Heidari and Wojciech Szpankowski
- Abstract summary: We develop a framework using Hilbert spaces as a proxy to analyze PAC learning problems with structural properties.
We demonstrate that agnostic PAC learning with 0-1 loss is equivalent to an optimization in the Hilbert space domain.
- Score: 10.66048003460524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a framework using Hilbert spaces as a proxy to analyze PAC
learning problems with structural properties. We consider a joint Hilbert space
incorporating the relation between the true label and the predictor under a
joint distribution $D$. We demonstrate that agnostic PAC learning with 0-1 loss
is equivalent to an optimization in the Hilbert space domain. With our model,
we revisit the PAC learning problem using methods based on least-squares such
as $\mathcal{L}_2$ polynomial regression and Linial's low-degree algorithm. We
study learning with respect to several hypothesis classes such as half-spaces
and polynomial-approximated classes (i.e., functions approximated by a
fixed-degree polynomial). We prove that (under some distributional assumptions)
such methods obtain generalization error up to $2opt$ with $opt$ being the
optimal error of the class. Hence, we show the tightest bound on generalization
error when $opt\leq 0.2$.
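To make the least-squares-to-0-1-loss connection concrete, the following is the standard inequality underlying results of this type (a sketch of the usual argument, not necessarily the paper's exact derivation): with labels in $\{-1,+1\}$, a sign mistake forces a squared gap of at least one.

```latex
% Standard link between squared loss and 0-1 loss (illustrative sketch).
% With Y in {-1,+1} and a real-valued predictor g, sign(g(X)) != Y forces
% |g(X) - Y| >= 1, so the 0-1 error is dominated by the L2 risk:
\[
  \Pr\bigl[\operatorname{sign}(g(X)) \neq Y\bigr]
  = \mathbb{E}\bigl[\mathbf{1}\{\operatorname{sign}(g(X)) \neq Y\}\bigr]
  \le \mathbb{E}\bigl[(g(X) - Y)^2\bigr].
\]
% Minimizing the right-hand side over low-degree polynomials g is exactly
% L2 polynomial regression; guarantees in terms of the best classifier's
% error opt come from relating this minimum squared loss to opt.
```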
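And a minimal, hedged sketch of the $\mathcal{L}_2$ polynomial regression learner itself (the degree, monomial feature map, and toy data below are illustrative assumptions, not the paper's exact setup): fit a low-degree polynomial to $\pm 1$ labels by least squares and classify with its sign.

```python
# Minimal sketch of agnostic learning via L2 polynomial regression
# (illustrative assumptions throughout: the degree, the monomial feature
# map, and the toy data are NOT the paper's exact setup).
import numpy as np
from itertools import combinations_with_replacement

def poly_features(X, degree):
    """All monomials of the coordinates of X up to the given total degree."""
    n, d = X.shape
    cols = [np.ones(n)]  # the constant (degree-0) monomial
    for deg in range(1, degree + 1):
        for idx in combinations_with_replacement(range(d), deg):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)

def fit_l2_poly_classifier(X, y, degree=2):
    """Least-squares fit of a degree-bounded polynomial to +/-1 labels;
    the returned hypothesis is the sign of the fitted polynomial."""
    Phi = poly_features(X, degree)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # argmin_w ||Phi w - y||^2
    return lambda X_new: np.sign(poly_features(X_new, degree) @ w)

# Toy usage: a halfspace with 10% agnostic label noise under Gaussian inputs.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 3))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]))
y[rng.random(2000) < 0.1] *= -1
h = fit_l2_poly_classifier(X, y, degree=3)
print("empirical 0-1 error:", np.mean(h(X) != y))
```

By the inequality above, the squared loss of the fitted polynomial already upper-bounds the 0-1 error of its sign; the $2opt$ guarantees in the abstract sharpen this under distributional assumptions.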
Related papers
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z)
- Agnostically Learning Multi-index Models with Queries [54.290489524576756]
We study the power of query access for the task of agnostic learning under the Gaussian distribution.
We show that query access gives significant runtime improvements over random examples for agnostically learning MIMs.
arXiv Detail & Related papers (2023-12-27T15:50:47Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Prediction, Learning, Uniform Convergence, and Scale-sensitive Dimensions [39.97534972432276]
We present a new general-purpose algorithm for learning classes of $[0,1]$-valued functions.
We prove a general upper bound on the expected absolute error of this algorithm.
We show how to apply both packing bounds to obtain improved general bounds on the sample complexity of learning.
arXiv Detail & Related papers (2023-04-21T15:51:35Z)
- Agnostic PAC Learning of k-juntas Using L2-Polynomial Regression [9.732863739456036]
We present a new PAC learning algorithm based on the Fourier expansion with lower computational complexity.
We prove our results by connecting PAC learning with 0-1 loss to the minimum mean-square estimation (MMSE) problem.
We derive an elegant upper bound on the 0-1 loss in terms of the MMSE error and show that the sign of the MMSE estimator is a PAC learner for any concept class containing it.
arXiv Detail & Related papers (2023-03-08T19:54:07Z)
- On the complexity of PAC learning in Hilbert spaces [0.0]
We study the problem of binary classification from the point of view of learning convex polyhedra in Hilbert spaces.
We propose an algorithm for learning a polyhedron which correctly classifies at least $1-\varepsilon$ of the distribution.
arXiv Detail & Related papers (2023-03-03T16:16:11Z)
- Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models [56.98280399449707]
We show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$.
Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables.
arXiv Detail & Related papers (2020-12-14T18:14:08Z)
- Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks [48.32532049640782]
We study the problem of learning one-hidden-layer ReLU networks with $k$ hidden units on $\mathbb{R}^d$ under Gaussian marginals.
For the case of positive coefficients, we give the first polynomial-time algorithm for this learning problem for $k$ up to $\tilde{O}(\sqrt{\log d})$.
arXiv Detail & Related papers (2020-06-22T17:53:54Z)
- Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)