Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion
- URL: http://arxiv.org/abs/2111.00062v1
- Date: Fri, 29 Oct 2021 19:50:34 GMT
- Title: Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion
- Authors: Jean-Samuel Leboeuf, Frédéric LeBlanc and Mario Marchand
- Abstract summary: We significantly improve the generalization bounds for VC classes by using two main ideas.
First, we consider the hypergeometric tail inversion to obtain a very tight non-uniform distribution-independent risk upper bound for VC classes.
Second, we optimize the ghost sample trick to obtain a further non-negligible gain.
- Score: 3.157455635899373
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We significantly improve the generalization bounds for VC classes by using
two main ideas. First, we consider the hypergeometric tail inversion to obtain
a very tight non-uniform distribution-independent risk upper bound for VC
classes. Second, we optimize the ghost sample trick to obtain a further
non-negligible gain. These improvements are then used to derive a relative
deviation bound, a multiclass margin bound, as well as a lower bound. Numerical
comparisons show that the new bound is nearly never vacuous, and is tighter
than other VC bounds for all reasonable data set sizes.
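As a rough illustration of the core primitive (not the paper's actual bound, constants, or optimized ghost-sample trick), the sketch below uses scipy.stats.hypergeom to invert the hypergeometric tail: it finds the largest number of errors in a finite population that remains consistent, at confidence level 1 - delta, with the error count observed on a subsample drawn without replacement. All parameter values are illustrative.

```python
# A minimal sketch, not the paper's bound: invert the hypergeometric tail to
# upper-bound the number of errors in a population of size M, given k errors
# observed in a subsample of size m drawn without replacement.
from scipy.stats import hypergeom

def hypergeometric_tail_inversion(M, m, k, delta):
    """Largest total error count K such that observing at most k errors in the
    subsample of size m still has probability at least delta."""
    upper = k  # K = k is always consistent with the observation
    for K in range(k, M + 1):
        # P[#errors in subsample <= k] when the population contains K errors;
        # this tail probability is non-increasing in K.
        if hypergeom.cdf(k, M, K, m) >= delta:
            upper = K
        else:
            break
    return upper

# Example: a sample of 200 points plus a ghost sample of 200, 5 observed errors.
M, m, k, delta = 400, 200, 5, 0.05
print(hypergeometric_tail_inversion(M, m, k, delta))
```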
Related papers
- Optimal PAC Bounds Without Uniform Convergence [11.125968799758436]
We provide optimal high probability risk bounds through a framework that surpasses the limitations of uniform convergence arguments.
Our framework converts the leave-one-out error of permutation invariant predictors into high probability risk bounds.
Specifically, our result shows that certain aggregations of one-inclusion graph algorithms are optimal.
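For context on the leave-one-out quantity mentioned above, here is a small generic sketch; it only computes the leave-one-out error of a permutation-invariant predictor (a 1-nearest-neighbour stand-in chosen for illustration) and does not reproduce the paper's one-inclusion-graph aggregation or its high-probability conversion.

```python
# Generic leave-one-out error of a permutation-invariant predictor.
import numpy as np

def leave_one_out_error(train_and_predict, X, y):
    """train_and_predict(X_train, y_train, x_test) -> predicted label."""
    n = len(y)
    mistakes = 0
    for i in range(n):
        mask = np.arange(n) != i
        pred = train_and_predict(X[mask], y[mask], X[i])
        mistakes += int(pred != y[i])
    return mistakes / n

def one_nn(X_train, y_train, x_test):
    # 1-nearest neighbour: a simple permutation-invariant predictor
    return y_train[np.argmin(np.linalg.norm(X_train - x_test, axis=1))]

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] > 0).astype(int)
print(leave_one_out_error(one_nn, X, y))
```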
arXiv Detail & Related papers (2023-04-18T17:57:31Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
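To make "local Lipschitz regularity in the data space" concrete, here is a crude numerical probe under illustrative assumptions; it is not the paper's optimal-transport machinery or its bound, just an estimate of the largest local slope of a learned function f around a point.

```python
# Crude probe of local Lipschitz regularity around x: sample perturbations in
# a small ball and take the largest observed |f(x') - f(x)| / ||x' - x||.
import numpy as np

def local_lipschitz_estimate(f, x, radius=0.1, n_samples=200, seed=0):
    rng = np.random.default_rng(seed)
    fx = f(x)
    best = 0.0
    for _ in range(n_samples):
        direction = rng.normal(size=x.shape)
        direction /= np.linalg.norm(direction)
        step = radius * rng.uniform(1e-3, 1.0)
        best = max(best, abs(f(x + step * direction) - fx) / step)
    return best

f = lambda z: float(np.tanh(z).sum())   # stand-in for a trained predictor
print(local_lipschitz_estimate(f, np.zeros(10)))
```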
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models [33.36787620121057]
We prove a new generalization bound, stated in terms of Moreau envelopes of the loss, for any class of linear predictors in Gaussian space.
We use our finite-sample bound to directly recover the "optimistic rate" of Zhou et al. (2021).
We show that our bound, applied using the localized Gaussian width, will generally be sharp for empirical risk minimizers.
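Since the result is stated through Moreau envelopes, a short numerical sketch of that object may help; it simply evaluates the definition M_lambda f(x) = min_y f(y) + (y - x)^2 / (2*lambda) on a grid for a scalar hinge loss, and implements nothing from the paper itself.

```python
# Numerically evaluate the Moreau envelope of a scalar loss on a grid:
# M_lambda f(x) = min_y f(y) + (y - x)^2 / (2 * lambda). Illustration only.
import numpy as np

def moreau_envelope(f, x, lam=1.0):
    grid = np.linspace(x - 10.0, x + 10.0, 10001)
    return float(np.min(f(grid) + (grid - x) ** 2 / (2.0 * lam)))

hinge = lambda margins: np.maximum(0.0, 1.0 - margins)  # example loss
for x in (-1.0, 0.0, 2.0):
    print(x, moreau_envelope(hinge, x, lam=0.5))
```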
arXiv Detail & Related papers (2022-10-21T16:16:55Z)
- Max-Margin Works while Large Margin Fails: Generalization without Uniform Convergence [55.26547500184405]
Existing tools rely on uniform convergence (UC), which guarantees that the test loss will be close to the training loss, uniformly over a class of candidate models.
Nagarajan and Kolter show that in certain simple linear and neural-network settings, any uniform convergence bound will be vacuous, leaving open the question of how to prove generalization in settings where UC fails.
We prove a new type of margin bound showing that, above a certain signal-to-noise threshold, any near-max-margin classifier will achieve almost no test loss in these two settings.
arXiv Detail & Related papers (2022-06-16T02:46:26Z)
- On the Eigenvalues of Global Covariance Pooling for Fine-grained Visual Recognition [65.67315418971688]
We show that truncating the small eigenvalues of Global Covariance Pooling (GCP) can yield smoother gradients.
On fine-grained datasets, however, truncating the small eigenvalues makes the model fail to converge.
Inspired by this observation, we propose a network branch dedicated to magnifying the importance of small eigenvalues.
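As a toy illustration of the truncation step described above, assuming GCP yields a covariance matrix of local descriptors; the threshold and shapes are arbitrary, and this is not the authors' proposed branch for magnifying small eigenvalues.

```python
# Truncate small eigenvalues of a GCP-style covariance matrix (toy example).
import numpy as np

def truncate_small_eigenvalues(features, threshold=1e-3):
    """features: (n_positions, d) local descriptors pooled from one image."""
    cov = np.cov(features, rowvar=False)            # (d, d) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    eigvals = np.where(eigvals < threshold, 0.0, eigvals)
    return eigvecs @ np.diag(eigvals) @ eigvecs.T   # reconstructed covariance

rng = np.random.default_rng(0)
pooled = truncate_small_eigenvalues(rng.normal(size=(49, 16)))
print(np.linalg.matrix_rank(pooled))
```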
arXiv Detail & Related papers (2022-05-26T11:41:36Z)
- Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers [31.102181563065844]
Quantile (and, more generally, KL) regret bounds relax the goal of competing against the best individual expert to only competing against a majority of experts on adversarial data.
More recently, the semi-adversarial paradigm (Bilodeau, Negrea, and Roy 2020) provides an alternative relaxation of adversarial online learning by considering data that may be neither fully adversarial nor stochastic.
We achieve the minimax optimal regret in both paradigms using FTRL with separate, novel, root-logarithmic regularizers, both of which can be interpreted as yielding variants of NormalHedge.
arXiv Detail & Related papers (2021-10-27T22:38:52Z)
- Regret Bounds without Lipschitz Continuity: Online Learning with Relative-Lipschitz Losses [19.554559253046225]
In online convex optimization (OCO), Lipschitz continuity of the functions is commonly assumed in order to obtain sublinear regret.
In this work, we consider OCO for relative Lipschitz and relative strongly convex functions.
We show regret bounds for follow-the-regularized-leader (FTRL) algorithms and a variant of online mirror descent.
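For concreteness, here is a generic online-mirror-descent loop on the probability simplex with the negative-entropy mirror map (exponentiated gradient); the losses, step size, and horizon are illustrative, and the paper's relative-Lipschitz regret analysis is not reproduced.

```python
# Generic online mirror descent on the simplex with the negative-entropy map.
import numpy as np

def online_mirror_descent(grad_fn, d, eta=0.1, T=100):
    """grad_fn(t, x) -> gradient of the round-t loss at the current point x."""
    x = np.full(d, 1.0 / d)          # start at the uniform distribution
    for t in range(T):
        g = grad_fn(t, x)
        x = x * np.exp(-eta * g)     # mirror (multiplicative) update
        x = x / x.sum()              # Bregman projection back onto the simplex
    return x

rng = np.random.default_rng(0)
print(online_mirror_descent(lambda t, x: rng.uniform(size=x.shape), d=5))
```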
arXiv Detail & Related papers (2020-10-22T20:22:19Z)
- Self-Weighted Robust LDA for Multiclass Classification with Edge Classes [111.5515086563592]
A novel self-weighted robust LDA with an ℓ2,1-norm-based between-class distance criterion, called SWRLDA, is proposed for multi-class classification.
The proposed SWRLDA is easy to implement, and converges fast in practice.
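As a pointer to the notation, the ℓ2,1 norm used in the between-class distance criterion is, under one common convention, the sum of the Euclidean norms of a matrix's rows; the snippet below only evaluates that definition and does not implement SWRLDA.

```python
# The l2,1 norm (row-wise convention): sum of Euclidean norms of the rows.
import numpy as np

def l21_norm(A):
    return float(np.sum(np.linalg.norm(A, axis=1)))

A = np.array([[3.0, 4.0], [0.0, 1.0]])
print(l21_norm(A))   # 5.0 + 1.0 = 6.0
```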
arXiv Detail & Related papers (2020-09-24T12:32:55Z)
- Comparator-adaptive Convex Bandits [77.43350984086119]
We develop convex bandit algorithms with regret bounds that are small whenever the norm of the comparator is small.
We extend the ideas to convex bandits with Lipschitz or smooth loss functions.
arXiv Detail & Related papers (2020-07-16T16:33:35Z)
- Near-Tight Margin-Based Generalization Bounds for Support Vector Machines [15.447966950703952]
Support Vector Machines (SVMs) are among the most fundamental tools for binary classification.
In this paper, we revisit and improve the classic generalization bounds in terms of margins.
We complement our new generalization bound by a nearly matching lower bound, thus almost settling the generalization performance of SVMs in terms of margins.
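To ground the margin quantity these bounds are stated in, here is a small sketch that fits a linear SVM with scikit-learn and computes the geometric margins y_i (w.x_i + b) / ||w||; the data and hyperparameters are made up, and no generalization bound is evaluated.

```python
# Fit a linear SVM and compute geometric margins on synthetic data.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(+2.0, 1.0, size=(50, 2))])
y = np.array([-1] * 50 + [+1] * 50)

clf = LinearSVC(C=1.0).fit(X, y)
w, b = clf.coef_.ravel(), clf.intercept_[0]
margins = y * (X @ w + b) / np.linalg.norm(w)   # geometric margins
print(f"minimum margin: {margins.min():.3f}")
```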
arXiv Detail & Related papers (2020-06-03T11:22:37Z)
- On the Dual Formulation of Boosting Algorithms [92.74617630106559]
We show that the Lagrange dual problems of AdaBoost, LogitBoost, and soft-margin LPBoost with generalized hinge loss are all entropy maximization problems.
By looking at the dual problems of these boosting algorithms, we show that the success of boosting can be understood in terms of maintaining a better margin distribution.
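To illustrate what "margin distribution" refers to here, the sketch below computes normalized ensemble margins y_i * sum_t alpha_t h_t(x_i) / sum_t alpha_t for hypothetical base-learner outputs; the dual derivation discussed above is not reproduced.

```python
# Normalized ensemble margins for hypothetical base learners h_t(x_i) in {-1,+1}.
import numpy as np

def margin_distribution(H, alphas, y):
    """H: (T, n) base-learner predictions in {-1,+1}; alphas: (T,) weights;
    y: (n,) labels in {-1,+1}."""
    return y * (alphas @ H) / alphas.sum()

rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=100)
H = np.where(rng.uniform(size=(5, 100)) < 0.8, y, -y)   # ~80%-accurate learners
alphas = rng.uniform(0.5, 1.5, size=5)
m = margin_distribution(H, alphas, y)
print(f"min margin {m.min():.2f}, mean margin {m.mean():.2f}")
```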
arXiv Detail & Related papers (2009-01-23T02:14:42Z)