Generalization Bounds with Data-dependent Fractal Dimensions
- URL: http://arxiv.org/abs/2302.02766v2
- Date: Mon, 10 Jul 2023 14:08:37 GMT
- Title: Generalization Bounds with Data-dependent Fractal Dimensions
- Authors: Benjamin Dupuis, George Deligiannidis, Umut Şimşekli
- Abstract summary: We prove fractal geometry-based generalization bounds without requiring any Lipschitz assumption.
Despite introducing significant technical complications, this new notion lets us control the generalization error.
- Score: 5.833272638548154
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Providing generalization guarantees for modern neural networks has been a
crucial task in statistical learning. Recently, several studies have attempted
to analyze the generalization error in such settings by using tools from
fractal geometry. While these works have successfully introduced new
mathematical tools to apprehend generalization, they heavily rely on a
Lipschitz continuity assumption, which in general does not hold for neural
networks and might make the bounds vacuous. In this work, we address this issue
and prove fractal geometry-based generalization bounds without requiring any
Lipschitz assumption. To achieve this goal, we build on a classical covering
argument in learning theory and introduce a data-dependent fractal dimension.
Despite introducing significant technical complications, this new
notion lets us control the generalization error (over either fixed or random
hypothesis spaces) along with certain mutual information (MI) terms. To provide
a clearer interpretation to the newly introduced MI terms, as a next step, we
introduce a notion of "geometric stability" and link our bounds to the prior
art. Finally, we make a rigorous connection between the proposed data-dependent
dimension and topological data analysis tools, which then enables us to compute
the dimension in a numerically efficient way. We support our theory with
experiments conducted on various settings.
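For orientation, the block below sketches the persistent-homology (PH) dimension used in the prior art that the abstract's topological-data-analysis connection refers to. The paper's own data-dependent dimension is defined through coverings under a data-dependent (pseudo-)metric built from the sample, so this is an illustrative reference point rather than the paper's definition.

```latex
% alpha-weighted total persistence of the 0-dimensional persistence diagram of a finite set F
\[
  E^{0}_{\alpha}(F) \;=\; \sum_{I \in \mathrm{PH}^{0}(F)} |I|^{\alpha},
\]
% prior-art PH^0 dimension of a bounded set W (Schweinhart-style definition)
\[
  \dim_{\mathrm{PH}^{0}}(W)
  \;=\;
  \inf\Bigl\{ \alpha > 0 \,:\, \sup_{\substack{F \subset W \\ F\ \text{finite}}} E^{0}_{\alpha}(F) < \infty \Bigr\},
\]
% scaling law used for estimation: for n i.i.d. points W_n drawn from a
% sufficiently regular d-dimensional measure, the total persistence grows like
\[
  E^{0}_{\alpha}(W_n) \;\asymp\; n^{\frac{d-\alpha}{d}}, \qquad 0 < \alpha < d .
\]
```

In practice the last relation is what makes such dimensions numerically accessible: one computes $E^{0}_{\alpha}$ on subsamples of increasing size and reads $d$ off the slope of a log-log fit (see the code sketch at the end of this page).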
Related papers
- Slicing Mutual Information Generalization Bounds for Neural Networks [14.48773730230054]
We introduce new, tighter information-theoretic generalization bounds tailored for deep learning algorithms.
Our bounds offer significant computational and statistical advantages over standard MI bounds.
We extend our analysis to algorithms whose parameters do not need to exactly lie on random subspaces.
arXiv Detail & Related papers (2024-06-06T13:15:37Z)
- On the Limitations of Fractal Dimension as a Measure of Generalization [17.38382314570976]
Bounding and predicting the generalization gap of neural networks remains a central open problem in theoretical machine learning.
There is a recent and growing body of literature that proposes the framework of fractals to model optimization trajectories of neural networks, motivating generalization bounds and measures based on the fractal dimension of the trajectory.
This paper performs an empirical evaluation of these persistent homology-based generalization measures, with an in-depth statistical analysis.
arXiv Detail & Related papers (2024-06-04T11:56:19Z)
- Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios where the response variable, denoted by $Y$, resides in a manifold and the covariate, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- Rate-Distortion Theoretic Generalization Bounds for Stochastic Learning Algorithms [12.020634332110147]
We prove novel generalization bounds through the lens of rate-distortion theory.
Our results bring a more unified perspective on generalization and open up several future research directions.
arXiv Detail & Related papers (2022-03-04T18:12:31Z)
- Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks [19.99615698375829]
We show that the generalization error can be equivalently bounded in terms of a notion called the 'persistent homology dimension' (PHD).
We develop an efficient algorithm to estimate PHD at the scale of modern deep neural networks (a generic estimator of this kind is sketched at the end of this page).
Our experiments show that the proposed approach can efficiently compute a network's intrinsic dimension in a variety of settings.
arXiv Detail & Related papers (2021-11-25T17:06:15Z)
- Explaining generalization in deep learning: progress and fundamental limits [8.299945169799795]
In the first part of the thesis, we will empirically study how training deep networks via gradient descent implicitly controls the networks' capacity.
We will then derive data-dependent, uniform-convergence-based generalization bounds with improved dependencies on the parameter count.
In the last part of the thesis, we will introduce an empirical technique to estimate generalization using unlabeled data.
arXiv Detail & Related papers (2021-10-17T21:17:30Z)
- Intrinsic Dimension Estimation [92.87600241234344]
We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guarantees.
We then apply our techniques to get new sample complexity bounds for Generative Adversarial Networks (GANs) depending on the intrinsic dimension of the data.
arXiv Detail & Related papers (2021-06-08T00:05:39Z)
- Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly available for a contest to predict the generalization accuracy of neural network (NN) models.
We identify what amounts to a Simpson's paradox: "scale" metrics perform well overall but poorly on sub-partitions of the data.
We present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
arXiv Detail & Related papers (2021-06-01T19:19:49Z)
- Bayesian Quadrature on Riemannian Data Manifolds [79.71142807798284]
Riemannian manifolds provide a principled way to model nonlinear geometric structure inherent in data.
However, computations on such manifolds are typically demanding.
In particular, we focus on Bayesian quadrature (BQ) to numerically compute integrals over normal laws.
We show that by leveraging both prior knowledge and an active exploration scheme, BQ significantly reduces the number of required evaluations.
arXiv Detail & Related papers (2021-02-12T17:38:04Z)
- Dimension Free Generalization Bounds for Non Linear Metric Learning [61.193693608166114]
We provide uniform generalization bounds for two regimes -- the sparse regime and a non-sparse regime.
We show that by relying on a different, new property of the solutions, it is still possible to provide dimension free generalization guarantees.
arXiv Detail & Related papers (2021-02-07T14:47:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
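The PHD entry above mentions an efficient estimator but the summary does not spell it out. The following is a minimal, illustrative sketch of the standard MST-based estimator of the PH^0 dimension under the scaling law shown earlier; the function names, subsampling schedule, and defaults are assumptions made for this example, not the authors' implementation.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def total_persistence_h0(points: np.ndarray, alpha: float = 1.0) -> float:
    """Alpha-weighted 0-dimensional total persistence of a finite point cloud.

    For H0 of the Vietoris-Rips filtration, the persistence interval lengths
    coincide with the edge lengths of a Euclidean minimum spanning tree, so a
    full persistent-homology computation is not needed.
    """
    dists = squareform(pdist(points))          # dense pairwise-distance matrix
    mst = minimum_spanning_tree(dists)         # sparse matrix of MST edges
    edge_lengths = mst.data                    # nonzero entries = edge lengths
    return float(np.sum(edge_lengths ** alpha))


def ph0_dimension_estimate(points: np.ndarray, alpha: float = 1.0,
                           sample_sizes=None, n_repeats: int = 5,
                           seed: int = 0) -> float:
    """Estimate the PH^0 dimension of a point cloud.

    Uses the scaling law E_alpha^0(W_n) ~ n^((d - alpha) / d): fit the slope of
    log E_alpha^0 against log n over random subsamples and invert it.
    NOTE: the sampling schedule and defaults are illustrative choices only.
    """
    rng = np.random.default_rng(seed)
    n = len(points)
    if sample_sizes is None:
        sample_sizes = np.linspace(max(2, n // 10), n, num=8, dtype=int)

    log_n, log_e = [], []
    for m in sample_sizes:
        vals = [total_persistence_h0(points[rng.choice(n, size=m, replace=False)], alpha)
                for _ in range(n_repeats)]
        log_n.append(np.log(m))
        log_e.append(np.log(np.mean(vals)))

    slope, _ = np.polyfit(log_n, log_e, deg=1)  # slope ~ (d - alpha) / d
    return alpha / (1.0 - slope)


if __name__ == "__main__":
    # Sanity check: points on a 2-D plane embedded in R^10 should give roughly 2.
    rng = np.random.default_rng(1)
    cloud = np.zeros((2000, 10))
    cloud[:, :2] = rng.uniform(size=(2000, 2))
    print(ph0_dimension_estimate(cloud, alpha=1.0))
```

Applied to, say, the flattened weight vectors visited along a training trajectory, this yields the kind of trajectory-level dimension estimate that the papers above correlate with the generalization gap; here it is only exercised on synthetic data.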