In Search of Robust Measures of Generalization
- URL: http://arxiv.org/abs/2010.11924v2
- Date: Wed, 20 Jan 2021 20:08:21 GMT
- Title: In Search of Robust Measures of Generalization
- Authors: Gintare Karolina Dziugaite, Alexandre Drouin, Brady Neal, Nitarshan
Rajkumar, Ethan Caballero, Linbo Wang, Ioannis Mitliagkas, Daniel M. Roy
- Abstract summary: We develop bounds on generalization error, optimization error, and excess risk.
When evaluated empirically, most of these bounds are numerically vacuous.
We argue that generalization measures should instead be evaluated within the framework of distributional robustness.
- Score: 79.75709926309703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the principal scientific challenges in deep learning is explaining
generalization, i.e., why the particular way the community now trains networks
to achieve small training error also leads to small error on held-out data from
the same population. It is widely appreciated that some worst-case theories --
such as those based on the VC dimension of the class of predictors induced by
modern neural network architectures -- are unable to explain empirical
performance. A large volume of work aims to close this gap, primarily by
developing bounds on generalization error, optimization error, and excess risk.
When evaluated empirically, however, most of these bounds are numerically
vacuous. Focusing on generalization bounds, this work addresses the question of
how to evaluate such bounds empirically. Jiang et al. (2020) recently described
a large-scale empirical study aimed at uncovering potential causal
relationships between bounds/measures and generalization. Building on their
study, we highlight where their proposed methods can obscure failures and
successes of generalization measures in explaining generalization. We argue
that generalization measures should instead be evaluated within the framework
of distributional robustness.
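As a rough illustration of the paper's thesis (a hedged sketch, not the authors' exact protocol): an average-case evaluation pools all models together, whereas a distributionally robust evaluation scores a generalization measure by its worst case over environments. All names and data below are hypothetical.

```python
import numpy as np

# Hypothetical data: for each "environment" (e.g., a slice of hyperparameter
# settings), a complexity measure and the observed generalization gap for
# several trained models. Purely illustrative.
rng = np.random.default_rng(0)
environments = {
    f"env_{i}": (rng.normal(size=20), rng.normal(size=20))
    for i in range(3)
}

def sign_agreement(measure, gap):
    """Fraction of model pairs that the measure orders the same way
    as the observed generalization gap."""
    n = len(measure)
    agree, total = 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            dm = measure[i] - measure[j]
            dg = gap[i] - gap[j]
            if dm == 0 or dg == 0:
                continue
            total += 1
            agree += (dm > 0) == (dg > 0)
    return agree / total if total else float("nan")

# Average-case evaluation: mean over environments (can obscure failures).
avg_score = np.mean([sign_agreement(m, g) for m, g in environments.values()])

# Robust evaluation: worst case over environments, in the spirit of
# distributional robustness.
robust_score = min(sign_agreement(m, g) for m, g in environments.values())
print(avg_score, robust_score)
```

A measure that succeeds on average but fails in some environment is penalized by the robust score, which is the failure mode the paper argues average-case studies can hide.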
Related papers
- Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation [59.138470433237615]
We introduce statistical metrics that quantify both the linguistic and visual skew of a dataset for relational learning.
We show that systematically controlled metrics are strongly predictive of generalization performance.
This work points to an important direction: improving data diversity and balance, rather than simply scaling up the absolute dataset size.
arXiv Detail & Related papers (2024-03-25T03:18:39Z)
- Understanding What Affects the Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence [53.51724434972605]
This paper theoretically identifies the key factors that contribute to the generalization gap when the testing environment contains distractors.
Our theory indicates that minimizing the representation distance between training and testing environments, which aligns with human intuition, is the most critical factor in reducing the generalization gap.
arXiv Detail & Related papers (2024-02-05T03:27:52Z)
- Class-wise Generalization Error: an Information-Theoretic Analysis [22.877440350595222]
We study the class-generalization error, which quantifies the generalization performance of each individual class.
We empirically validate our proposed bounds in different neural networks and show that they accurately capture the complex class-generalization error behavior.
arXiv Detail & Related papers (2024-01-05T17:05:14Z)
- Fine-grained analysis of non-parametric estimation for pairwise learning [9.676007573960383]
We are concerned with the generalization performance of non-parametric estimation for pairwise learning.
Our results can be used to handle a wide range of pairwise learning problems, including ranking, AUC maximization, pairwise regression, and metric and similarity learning.
arXiv Detail & Related papers (2023-05-31T08:13:14Z)
- Domain Generalization -- A Causal Perspective [20.630396283221838]
Machine learning models have gained widespread success, from healthcare to personalized recommendations.
A foundational assumption of these models is that the data are independent and identically distributed (i.i.d.).
Because the models rely heavily on this assumption, they generalize poorly when it is violated.
arXiv Detail & Related papers (2022-09-30T01:56:49Z)
- Assaying Out-Of-Distribution Generalization in Transfer Learning [103.57862972967273]
We take a unified view of previous work, highlighting discrepancies in their messages, which we address empirically.
We fine-tune over 31k networks, from nine different architectures, in the many- and few-shot settings.
arXiv Detail & Related papers (2022-07-19T12:52:33Z)
- Generalization Through The Lens Of Leave-One-Out Error [22.188535244056016]
We show that the leave-one-out error provides a tractable way to estimate the generalization ability of deep neural networks in the kernel regime.
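As a minimal sketch of why leave-one-out error is tractable in this setting (an illustration, not necessarily the paper's exact estimator): for kernel ridge regression, a standard closed-form identity gives all leave-one-out residuals without retraining. The kernel, data, and regularization below are assumptions for illustration.

```python
import numpy as np

# Toy 1-D regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=50)

def rbf_kernel(A, B, gamma=5.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

lam = 1e-2
K = rbf_kernel(X, X)
H = K @ np.linalg.inv(K + lam * np.eye(len(X)))  # smoother ("hat") matrix
y_hat = H @ y

# Closed-form leave-one-out residuals: (y_i - y_hat_i) / (1 - H_ii),
# exact for ridge-type linear smoothers -- no n retrainings needed.
loo_residuals = (y - y_hat) / (1.0 - np.diag(H))
loo_mse = float(np.mean(loo_residuals ** 2))
print(loo_mse)
```

The same tractability is what makes leave-one-out error a practical proxy for generalization in the kernel regime.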
arXiv Detail & Related papers (2022-03-07T14:56:00Z)
- Measuring Generalization with Optimal Transport [111.29415509046886]
We develop margin-based generalization bounds, where the margins are normalized with optimal transport costs.
Our bounds robustly predict the generalization error, given training data and network parameters, on large scale datasets.
arXiv Detail & Related papers (2021-06-07T03:04:59Z)
- Gi and Pal Scores: Deep Neural Network Generalization Statistics [58.8755389068888]
We introduce two new measures, the Gi-score and Pal-score, that capture a deep neural network's generalization capabilities.
Inspired by the Gini coefficient and Palma ratio, our statistics are robust measures of a network's invariance to perturbations that accurately predict generalization gaps.
arXiv Detail & Related papers (2021-04-08T01:52:49Z)
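The Gi-score and Pal-score definitions are not given in this summary, so the following only illustrates the underlying statistic they draw on: the Gini coefficient measures how unevenly a quantity is distributed. Here it is applied to a hypothetical vector of accuracy drops under increasing perturbation strengths; a flatter drop profile yields a lower Gini, suggesting greater invariance to perturbations.

```python
import numpy as np

def gini(x):
    """Gini coefficient of a non-negative 1-D array (0 = perfectly equal)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    # Standard formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    return 2 * np.sum(np.arange(1, n + 1) * x) / (n * x.sum()) - (n + 1) / n

# Hypothetical accuracy drops under four perturbation strengths.
flat_drops = np.array([0.05, 0.05, 0.05, 0.05])   # uniform degradation
skewed_drops = np.array([0.0, 0.0, 0.0, 0.20])    # concentrated degradation
print(gini(flat_drops), gini(skewed_drops))       # 0.0 vs 0.75
```

The Palma ratio (top-share divided by bottom-share) captures a related notion of inequality; both are robust summary statistics of a perturbation-response profile.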
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.