Robustness Implies Generalization via Data-Dependent Generalization
Bounds
- URL: http://arxiv.org/abs/2206.13497v1
- Date: Mon, 27 Jun 2022 17:58:06 GMT
- Title: Robustness Implies Generalization via Data-Dependent Generalization
Bounds
- Authors: Kenji Kawaguchi, Zhun Deng, Kyle Luh, Jiaoyang Huang
- Abstract summary: This paper proves that robustness implies generalization via data-dependent generalization bounds.
We present several examples, including ones for lasso and deep learning, in which our bounds are provably preferable.
- Score: 24.413499775513145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proves that robustness implies generalization via data-dependent
generalization bounds. As a result, robustness and generalization are shown to
be connected closely in a data-dependent manner. Our bounds improve previous
bounds in two directions, to solve an open problem that has seen little
development since 2010. The first is to reduce the dependence on the covering
number. The second is to remove the dependence on the hypothesis space. We
present several examples, including ones for lasso and deep learning, in which
our bounds are provably preferable. The experiments on real-world data and
theoretical models demonstrate near-exponential improvements in various
situations. To achieve these improvements, we do not require additional
assumptions on the unknown distribution; instead, we only incorporate an
observable and computable property of the training samples. A key technical
innovation is an improved concentration bound for multinomial random variables
that is of independent interest beyond robustness and generalization.
Related papers
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z) - Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets [25.250314934981233]
We first apply the PAC-Bayesian framework on random sets' in a rigorous way, where the training algorithm is assumed to output a data-dependent hypothesis set.
This approach allows us to prove data-dependent bounds, which can be applicable in numerous contexts.
arXiv Detail & Related papers (2024-04-26T14:28:18Z) - Trade-off Between Dependence and Complexity for Nonparametric Learning
-- an Empirical Process Approach [10.27974860479791]
In many applications where the data exhibit temporal dependencies, the corresponding empirical processes are much less understood.
We present a general bound on the expected supremum of empirical processes under standard $beta/rho$-mixing assumptions.
We show that even under long-range dependence, it is possible to attain the same rates as in the i.i.d. setting.
arXiv Detail & Related papers (2024-01-17T05:08:37Z) - Data-dependent Generalization Bounds via Variable-Size Compressibility [16.2444595840653]
We establish novel data-dependent upper bounds on the generalization error through the lens of a "variable-size compressibility" framework.
In this framework, the generalization error of an algorithm is linked to a variable-size 'compression rate' of its input data.
Our new generalization bounds that we establish are tail bounds, tail bounds on the expectation, and in-expectations bounds.
arXiv Detail & Related papers (2023-03-09T16:17:45Z) - Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z) - Adversarially Robust Models may not Transfer Better: Sufficient
Conditions for Domain Transferability from the View of Regularization [17.825841580342715]
Machine learning robustness and domain generalization are fundamentally correlated.
Recent studies show that more robust (adversarially trained) models are more generalizable.
There is a lack of theoretical understanding of their fundamental connections.
arXiv Detail & Related papers (2022-02-03T20:26:27Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - An Online Learning Approach to Interpolation and Extrapolation in Domain
Generalization [53.592597682854944]
We recast generalization over sub-groups as an online game between a player minimizing risk and an adversary presenting new test.
We show that ERM is provably minimax-optimal for both tasks.
arXiv Detail & Related papers (2021-02-25T19:06:48Z) - Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z) - On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of risk and thereof gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.