Riemannian Residual Neural Networks
- URL: http://arxiv.org/abs/2310.10013v1
- Date: Mon, 16 Oct 2023 02:12:32 GMT
- Title: Riemannian Residual Neural Networks
- Authors: Isay Katsman, Eric Ming Chen, Sidhanth Holalkere, Anna Asch, Aaron Lou, Ser-Nam Lim, and Christopher De Sa
- Abstract summary: We show how to extend the residual neural network (ResNet) to general Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
- Score: 58.925132597945634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent methods in geometric deep learning have introduced various neural
networks to operate over data that lie on Riemannian manifolds. Such networks
are often necessary to learn well over graphs with a hierarchical structure or
to learn over manifold-valued data encountered in the natural sciences. These
networks are often inspired by and directly generalize standard Euclidean
neural networks. However, extending Euclidean networks is difficult and has
only been done for a select few manifolds. In this work, we examine the
residual neural network (ResNet) and show how to extend this construction to
general Riemannian manifolds in a geometrically principled manner. Originally
introduced to help solve the vanishing gradient problem, ResNets have become
ubiquitous in machine learning due to their beneficial learning properties,
excellent empirical results, and easy-to-incorporate nature when building
varied neural networks. We find that our Riemannian ResNets mirror these
desirable properties: when compared to existing manifold neural networks
designed to learn over hyperbolic space and the manifold of symmetric positive
definite matrices, we outperform both kinds of networks in terms of relevant
testing metrics and training dynamics.
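The abstract does not spell out the construction, so the following is only an illustrative sketch of one way a residual update can be kept on a manifold, not the authors' implementation. It assumes the unit sphere as the manifold and replaces the Euclidean update x + f(x) with exp_x(v), where v is a tangent vector at x. The function names, the weight matrix W, and the tanh feature map are all hypothetical choices made for this example.

```python
import numpy as np

def project_to_tangent(x, v):
    """Project an ambient vector v onto the tangent space of the unit sphere at x."""
    return v - np.dot(x, v) * x

def sphere_exp(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x along tangent v."""
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        # A zero tangent vector leaves the point where it is.
        return x
    return np.cos(norm) * x + np.sin(norm) * (v / norm)

def riemannian_residual_step(x, W):
    """Hypothetical residual step: x -> exp_x(tangent vector computed from x)."""
    ambient = W @ np.tanh(x)                  # unconstrained update in ambient space
    tangent = project_to_tangent(x, ambient)  # make it a valid tangent vector at x
    return sphere_exp(x, tangent)             # move along the geodesic, staying on the sphere

rng = np.random.default_rng(0)
x = rng.normal(size=3)
x /= np.linalg.norm(x)                        # start from a point on the unit sphere
W = 0.5 * rng.normal(size=(3, 3))             # stand-in for learned weights (assumption)
y = riemannian_residual_step(x, W)
print(np.linalg.norm(y))                      # should be 1 up to floating-point error
```

In this sketch the output remains on the sphere by construction, which is the property a plain Euclidean residual connection would not preserve.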
Related papers
- A Theoretical Study of Neural Network Expressive Power via Manifold Topology [9.054396245059555]
A prevalent assumption regarding real-world data is that it lies on or close to a low-dimensional manifold.
In this study, we investigate network expressive power in terms of the latent data manifold.
We present an upper bound on the size of ReLU neural networks.
arXiv Detail & Related papers (2024-10-21T22:10:24Z)
- A singular Riemannian Geometry Approach to Deep Neural Networks III. Piecewise Differentiable Layers and Random Walks on $n$-dimensional Classes [49.32130498861987]
We study the case of non-differentiable activation functions, such as ReLU.
Two recent works introduced a geometric framework to study neural networks.
We illustrate our findings with some numerical experiments on classification of images and thermodynamic problems.
arXiv Detail & Related papers (2024-04-09T08:11:46Z)
- When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks.
Meanwhile, the structure of neural networks converged back to simpler representations based on piecewise constant and piecewise linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z)
- Neural networks learn to magnify areas near decision boundaries [32.84188052937496]
We study how training shapes the geometry induced by unconstrained neural network feature maps.
We first show that at infinite width, neural networks with random parameters induce highly symmetric metrics on input space.
This symmetry is broken by feature learning: networks trained to perform classification tasks learn to magnify local areas along decision boundaries.
arXiv Detail & Related papers (2023-01-26T19:43:16Z)
- Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation [55.80128181112308]
We show that the dimensionality and quasi-orthogonality of a neural network's feature space may jointly serve as discriminants of the network's performance.
Our findings suggest important relationships between the networks' final performance and properties of their randomly initialised feature spaces.
arXiv Detail & Related papers (2022-03-30T21:47:32Z)
- What can linearized neural networks actually say about generalization? [67.83999394554621]
In certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization.
We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks.
Our work provides concrete examples of novel deep learning phenomena which can inspire future theoretical research.
arXiv Detail & Related papers (2021-06-12T13:05:11Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to addressing the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)