Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models
- URL: http://arxiv.org/abs/2103.04554v1
- Date: Mon, 8 Mar 2021 05:20:36 GMT
- Title: Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models
- Authors: Zitong Yang, Yu Bai, Song Mei
- Abstract summary: We show that there could be a large gap between the classical uniform convergence bound and the actual test error of zero-training-error predictors (interpolators).
We derive and prove analytical expressions for three quantities in this model.
We show that, in the setting where the classical uniform convergence bound is vacuous (diverges to $\infty$), uniform convergence over the interpolators still gives a non-trivial bound of the test error of interpolating solutions.
- Score: 12.588676054808044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work showed that there could be a large gap between the classical
uniform convergence bound and the actual test error of zero-training-error
predictors (interpolators) such as deep neural networks. To better understand
this gap, we study the uniform convergence in the nonlinear random feature
model and perform a precise theoretical analysis on how uniform convergence
depends on the sample size and the number of parameters. We derive and prove
analytical expressions for three quantities in this model: 1) classical uniform
convergence over norm balls, 2) uniform convergence over interpolators in the
norm ball (recently proposed by Zhou et al. (2020)), and 3) the risk of minimum
norm interpolator. We show that, in the setting where the classical uniform
convergence bound is vacuous (diverges to $\infty$), uniform convergence over
the interpolators still gives a non-trivial bound of the test error of
interpolating solutions. We also showcase a different setting where classical
uniform convergence bound is non-vacuous, but uniform convergence over
interpolators can give an improved sample complexity guarantee. Our result
provides a first exact comparison between the test errors and uniform
convergence bounds for interpolators beyond simple linear models.
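The third quantity in the abstract, the risk of the minimum-norm interpolator, can be illustrated numerically. Below is a minimal sketch, not the paper's analytical derivation: it draws a nonlinear random feature model, computes the minimum-norm interpolating coefficients via the pseudoinverse, and estimates the test error by Monte Carlo. The activation, target function, noise level, and dimensions are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

# Illustrative sketch of a random feature model and its minimum-norm interpolator.
# All concrete choices (ReLU activation, linear target, sizes) are assumptions.
rng = np.random.default_rng(0)
d, N, n, n_test = 40, 2000, 200, 5000      # input dim, #features, train/test sizes
sigma_noise = 0.1

W = rng.normal(size=(N, d)) / np.sqrt(d)   # frozen random first-layer weights
relu = lambda z: np.maximum(z, 0.0)
features = lambda X: relu(X @ W.T) / np.sqrt(N)   # random feature map phi(x)

beta = rng.normal(size=d) / np.sqrt(d)     # simple linear target function (assumption)
def sample(m):
    X = rng.normal(size=(m, d))
    y = X @ beta + sigma_noise * rng.normal(size=m)
    return X, y

X_train, y_train = sample(n)
Phi = features(X_train)                    # n x N feature matrix

# Minimum-norm interpolator: a_hat = Phi^+ y (interpolates since N >= n).
a_hat = np.linalg.pinv(Phi) @ y_train
train_err = np.mean((Phi @ a_hat - y_train) ** 2)

X_test, y_test = sample(n_test)
test_err = np.mean((features(X_test) @ a_hat - y_test) ** 2)

print(f"norm of interpolator: {np.linalg.norm(a_hat):.3f}")
print(f"train MSE: {train_err:.2e}  test MSE: {test_err:.3f}")
```

The norm of `a_hat` printed here is the kind of quantity that enters the norm-ball uniform convergence bounds the paper compares against this test error; the paper itself characterizes all three quantities analytically rather than by simulation.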
Related papers
- C$^3$DG: Conditional Domain Generalization for Hyperspectral Imagery Classification with Convergence and Constrained-risk Theories [23.21421412818663]
Hyperspectral imagery (HSI) classification may suffer from the challenge of hyperspectral-monospectra.
Joint spatial-spectral feature extraction is a popular solution for the problem.
We propose a Convergence and Error-Constrained Conditional Domain Generalization method.
arXiv Detail & Related papers (2024-07-04T18:03:45Z)
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z)
- Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds, and Benign Overfitting [35.78863301525758]
We prove a generic uniform convergence guarantee on the generalization error of interpolators in an arbitrary hypothesis class.
Applying the generic bound to Euclidean norm balls recovers the consistency result of Bartlett et al. (2020) for minimum-norm interpolators.
We show how norm-based generalization bounds can explain and be used to analyze benign overfitting, at least in some settings.
arXiv Detail & Related papers (2021-06-17T06:58:10Z)
- Spatially relaxed inference on high-dimensional linear models [48.989769153211995]
We study the properties of ensembled clustered inference algorithms which combine spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions.
We show that ensembled clustered inference algorithms control the $\delta$-FWER under standard assumptions for $\delta$ equal to the largest cluster diameter.
arXiv Detail & Related papers (2021-06-04T16:37:19Z)
- Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition [34.235007566913396]
We describe an interpretable, symmetric decomposition of the variance into terms associated with the labels.
We find that the bias decreases monotonically with the network width, but the variance terms exhibit non-monotonic behavior.
We also analyze the strikingly rich phenomenology that arises.
arXiv Detail & Related papers (2020-11-04T21:04:02Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- On Uniform Convergence and Low-Norm Interpolation Learning [25.96459928769678]
We show that uniformly bounding the difference between empirical and population errors cannot establish any learning in the norm ball.
We argue we can explain the consistency of the minimal-norm interpolator with a slightly weaker, yet standard, notion: uniform convergence of zero-error predictors in a norm ball.
arXiv Detail & Related papers (2020-06-10T16:49:28Z)
- The Convergence Indicator: Improved and completely characterized parameter bounds for actual convergence of Particle Swarm Optimization [68.8204255655161]
We introduce a new convergence indicator that can be used to calculate whether the particles will finally converge to a single point or diverge.
Using this convergence indicator we provide the actual bounds completely characterizing parameter regions that lead to a converging swarm.
arXiv Detail & Related papers (2020-06-06T19:08:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.