The Geometric Occam's Razor Implicit in Deep Learning
- URL: http://arxiv.org/abs/2111.15090v2
- Date: Wed, 1 Dec 2021 04:54:50 GMT
- Title: The Geometric Occam's Razor Implicit in Deep Learning
- Authors: Benoit Dherin, Michael Munn, and David G.T. Barrett
- Abstract summary: We show that neural networks trained with gradient descent are implicitly regularized by a Geometric Occam's Razor.
For one-dimensional regression, the geometric model complexity is simply given by the arc length of the function.
For higher-dimensional settings, the geometric model complexity depends on the Dirichlet energy of the function.
- Score: 7.056824589733872
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In over-parameterized deep neural networks there can be many possible
parameter configurations that fit the training data exactly. However, the
properties of these interpolating solutions are poorly understood. We argue
that over-parameterized neural networks trained with stochastic gradient
descent are subject to a Geometric Occam's Razor; that is, these networks are
implicitly regularized by the geometric model complexity. For one-dimensional
regression, the geometric model complexity is simply given by the arc length of
the function. For higher-dimensional settings, the geometric model complexity
depends on the Dirichlet energy of the function. We explore the relationship
between this Geometric Occam's Razor, the Dirichlet energy and other known
forms of implicit regularization. Finally, for ResNets trained on CIFAR-10, we
observe that Dirichlet energy measurements are consistent with the action of
this implicit Geometric Occam's Razor.
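As a concrete illustration of the two complexity measures named in the abstract (this is not code from the paper; the function names and numerical scheme are our own assumptions), the sketch below numerically estimates the arc length and the one-dimensional Dirichlet energy of a regression function on an interval:

```python
import numpy as np

def _trapz(y, x):
    """Trapezoidal rule, written out to avoid version-specific NumPy names."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def arc_length(f, a, b, n=10_000):
    """Estimate the arc length  integral of sqrt(1 + f'(x)^2)  on [a, b]."""
    x = np.linspace(a, b, n)
    dfdx = np.gradient(f(x), x)          # finite-difference derivative f'(x)
    return _trapz(np.sqrt(1.0 + dfdx**2), x)

def dirichlet_energy_1d(f, a, b, n=10_000):
    """Estimate the Dirichlet energy  (1/2) * integral of f'(x)^2  on [a, b]."""
    x = np.linspace(a, b, n)
    dfdx = np.gradient(f(x), x)
    return _trapz(0.5 * dfdx**2, x)

# Sanity check on the straight line y = x over [0, 1]:
# its arc length is sqrt(2) and its Dirichlet energy is 1/2.
line = lambda x: x
print(round(arc_length(line, 0.0, 1.0), 4))           # 1.4142
print(round(dirichlet_energy_1d(line, 0.0, 1.0), 4))  # 0.5
```

In this picture, two interpolating solutions that fit the data equally well can differ in arc length or Dirichlet energy, and the paper's claim is that SGD implicitly prefers the geometrically simpler one.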
Related papers
- Decoder ensembling for learned latent geometries [15.484595752241122]
We show how to easily compute geodesics on the associated expected manifold.
We find this simple and reliable, thereby coming one step closer to easy-to-use latent geometries.
arXiv Detail & Related papers (2024-08-14T12:35:41Z)
- Transolver: A Fast Transformer Solver for PDEs on General Geometries [66.82060415622871]
We present Transolver, which learns intrinsic physical states hidden behind discretized geometries.
By computing attention over physics-aware tokens encoded from slices, Transolver can effectively capture intricate physical correlations.
Transolver achieves consistent state-of-the-art with 22% relative gain across six standard benchmarks and also excels in large-scale industrial simulations.
arXiv Detail & Related papers (2024-02-04T06:37:38Z)
- Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet) to Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z)
- Automatic Parameterization for Aerodynamic Shape Optimization via Deep Geometric Learning [60.69217130006758]
We propose two deep learning models that fully automate shape parameterization for aerodynamic shape optimization.
Both models are optimized to parameterize via deep geometric learning to embed human prior knowledge into learned geometric patterns.
We perform shape optimization experiments on 2D airfoils and discuss the applicable scenarios for the two models.
arXiv Detail & Related papers (2023-05-03T13:45:40Z)
- Geometric Clifford Algebra Networks [53.456211342585824]
We propose Geometric Clifford Algebra Networks (GCANs) for modeling dynamical systems.
GCANs are based on symmetry group transformations using geometric (Clifford) algebras.
arXiv Detail & Related papers (2023-02-13T18:48:33Z)
- Differential Geometry in Neural Implicits [0.6198237241838558]
We introduce a neural implicit framework that bridges discrete differential geometry of triangle meshes and continuous differential geometry of neural implicit surfaces.
It exploits the differentiable properties of neural networks and the discrete geometry of triangle meshes to approximate the meshes as zero-level sets of neural implicit functions.
arXiv Detail & Related papers (2022-01-23T13:40:45Z)
- Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z)
- Length Learning for Planar Euclidean Curves [0.0]
This work focuses on learning the length of planar curves sampled from a dataset of sine waves.
Robustness to additive noise and discretization errors was tested.
arXiv Detail & Related papers (2021-02-03T06:30:03Z)
- Hyperbolic Neural Networks++ [66.16106727715061]
We generalize the fundamental components of neural networks in a single hyperbolic geometry model, namely, the Poincaré ball model.
Experiments show the superior parameter efficiency of our methods compared to conventional hyperbolic components, as well as greater stability and better performance than their Euclidean counterparts.
arXiv Detail & Related papers (2020-06-15T08:23:20Z)
- Embed Me If You Can: A Geometric Perceptron [14.274582421372308]
We introduce an extension of the multilayer hypersphere perceptron (MLHP).
Our model is superior to the vanilla multilayer perceptron when classifying 3D Tetris shapes.
arXiv Detail & Related papers (2020-06-11T15:25:50Z)
- A Geometric Modeling of Occam's Razor in Deep Learning [8.007631014276896]
Deep neural networks (DNNs) benefit from very high-dimensional parameter spaces.
The contrast between their huge parameter complexity and their stunning performance in practice is all the more intriguing and remains unexplained.
We propose a geometrically flavored information-theoretic approach to study this phenomenon.
arXiv Detail & Related papers (2019-05-27T07:57:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.