Prediction, Learning, Uniform Convergence, and Scale-sensitive Dimensions
- URL: http://arxiv.org/abs/2304.11059v2
- Date: Mon, 24 Apr 2023 17:06:45 GMT
- Title: Prediction, Learning, Uniform Convergence, and Scale-sensitive Dimensions
- Authors: Peter L. Bartlett and Philip M. Long
- Abstract summary: We present a new general-purpose algorithm for learning classes of $[0,1]$-valued functions.
We prove a general upper bound on the expected absolute error of this algorithm.
We show how to apply both packing bounds to obtain improved general bounds on the sample complexity of agnostic learning.
- Score: 39.97534972432276
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a new general-purpose algorithm for learning classes of
$[0,1]$-valued functions in a generalization of the prediction model, and prove
a general upper bound on the expected absolute error of this algorithm in terms
of a scale-sensitive generalization of the Vapnik dimension proposed by Alon,
Ben-David, Cesa-Bianchi and Haussler. We give lower bounds implying that our
upper bounds cannot be improved by more than a constant factor in general. We
apply this result, together with techniques due to Haussler and to Benedek and
Itai, to obtain new upper bounds on packing numbers in terms of this
scale-sensitive notion of dimension. Using a different technique, we obtain new
bounds on packing numbers in terms of Kearns and Schapire's fat-shattering
function. We show how to apply both packing bounds to obtain improved general
bounds on the sample complexity of agnostic learning. For each $\epsilon > 0$,
we establish weaker sufficient and stronger necessary conditions for a class of
$[0,1]$-valued functions to be agnostically learnable to within $\epsilon$, and
to be an $\epsilon$-uniform Glivenko-Cantelli class.
This is a manuscript that was accepted by JCSS, together with a correction.
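As a concrete illustration of the scale-sensitive notion of dimension in the abstract, the following is a minimal Python sketch of the $\gamma$-shattering test underlying Kearns and Schapire's fat-shattering function, for a finite class of $[0,1]$-valued functions evaluated on a finite sample. The brute-force search over sign patterns and the representation of the class as a value matrix are illustrative assumptions, not the paper's algorithm.

```python
from itertools import product

def gamma_shattered(values, witnesses, gamma):
    # values[j][i] = f_j(x_i): row j is one function from a finite class,
    # evaluated on the sample x_1, ..., x_d.
    # witnesses[i] = r_i is the witness level attached to x_i.
    # The sample is gamma-shattered with these witnesses if every sign
    # pattern b in {0,1}^d is realized by some function f in the class:
    #   f(x_i) >= r_i + gamma when b_i = 1, and
    #   f(x_i) <= r_i - gamma when b_i = 0.
    d = len(witnesses)
    for b in product((0, 1), repeat=d):
        realized = any(
            all((row[i] >= witnesses[i] + gamma) if b[i]
                else (row[i] <= witnesses[i] - gamma)
                for i in range(d))
            for row in values
        )
        if not realized:
            return False
    return True

# Four functions on a two-point sample: all four sign patterns are
# realized at scale gamma = 0.25 around witnesses (0.5, 0.5), but the
# same sample is not shattered at scale 0.4.
F = [(0.8, 0.8), (0.8, 0.2), (0.2, 0.8), (0.2, 0.2)]
print(gamma_shattered(F, witnesses=(0.5, 0.5), gamma=0.25))  # True
print(gamma_shattered(F, witnesses=(0.5, 0.5), gamma=0.4))   # False
```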
Related papers
- Fast Rates for Bandit PAC Multiclass Classification [73.17969992976501]
We study multiclass PAC learning with bandit feedback, where inputs are classified into one of $K$ possible labels and feedback is limited to whether or not the predicted labels are correct.
Our main contribution is the design of a novel learning algorithm for the agnostic $(\varepsilon,\delta)$-PAC version of the problem.
arXiv Detail & Related papers (2024-06-18T08:54:04Z)
- Local Borsuk-Ulam, Stability, and Replicability [16.6959756920423]
We show that both list-replicable and globally stable learning are impossible in the PAC setting.
Specifically, we establish optimal bounds for list replicability and global stability numbers in finite classes.
arXiv Detail & Related papers (2023-11-02T21:10:16Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Pseudonorm Approachability and Applications to Regret Minimization [73.54127663296906]
We convert high-dimensional $\ell_\infty$-approachability problems to low-dimensional pseudonorm approachability problems.
We develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for $\ell_2$ and other norms.
arXiv Detail & Related papers (2023-02-03T03:19:14Z)
- A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of a general nonconvex conic optimization problem that minimizes a twice differentiable function subject to nonconvex conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z)
- Quantum and classical low-degree learning via a dimension-free Remez inequality [52.12931955662553]
We show a new way to relate functions on the hypergrid to their harmonic extensions over the polytorus.
We show that the supremum of a function $f$ over products of the cyclic group $\{\exp(2\pi i k/K)\}_{k=1}^{K}$ controls its supremum over the entire polytorus.
We extend to new spaces a recent line of work (EI22, CHP, VZ22) that gave similarly efficient methods for learning low-degree polynomials on hypercubes and observables on qubits.
arXiv Detail & Related papers (2023-01-04T04:15:40Z)
- Learning Algorithm Generalization Error Bounds via Auxiliary Distributions [16.44492672878356]
Generalization error bounds are essential for comprehending how well machine learning models work.
In this work, we suggest a novel method, i.e., the Auxiliary Distribution Method, that leads to new upper bounds on expected generalization errors.
arXiv Detail & Related papers (2022-10-02T10:37:04Z)
- Localization, Convexity, and Star Aggregation [0.0]
Offset Rademacher complexities have been shown to imply sharp, linear-dependent upper bounds for the square loss.
We show that in the statistical setting, the offset bound can be generalized to any loss satisfying a certain uniform convexity condition.
arXiv Detail & Related papers (2021-05-19T00:47:59Z)
- On Agnostic PAC Learning using $\mathcal{L}_2$-polynomial Regression and Fourier-based Algorithms [10.66048003460524]
We develop a framework using Hilbert spaces as a proxy to analyze PAC learning problems with structural properties.
We demonstrate that PAC learning with 0-1 loss is equivalent to an optimization in the Hilbert space domain.
arXiv Detail & Related papers (2021-02-11T21:28:55Z)
- Q-learning with Uniformly Bounded Variance: Large Discounting is Not a Barrier to Fast Learning [7.6146285961466]
We introduce a new class of algorithms that have sample complexity uniformly bounded for all discount factors $\gamma < 1$.
We show that the covariance of the Q-learning algorithm with an optimized step-size sequence is a quadratic function of $1/(1-\gamma)$; an expected, and essentially known, result (a generic tabular sketch follows this list).
arXiv Detail & Related papers (2020-02-24T15:12:41Z)
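To make the Q-learning entry above concrete, here is a generic tabular Q-learning sketch. The env_step interface, the uniform exploration policy, and the classical $1/n$ step size are illustrative assumptions; the optimized step-size sequence whose covariance the paper analyzes is not reproduced here.

```python
import numpy as np

def q_learning(env_step, n_states, n_actions, gamma, n_iters, seed=0):
    # Tabular Q-learning with the classical step size alpha_n = 1/n.
    # env_step(s, a, rng) -> (reward, next_state) is an assumed
    # environment interface, not part of the paper.
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    s = 0
    for n in range(1, n_iters + 1):
        a = int(rng.integers(n_actions))     # uniform exploration
        r, s_next = env_step(s, a, rng)
        td_error = r + gamma * Q[s_next].max() - Q[s, a]
        Q[s, a] += (1.0 / n) * td_error      # step size alpha_n = 1/n
        s = s_next
    return Q
```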
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.