The geometry of the deep linear network
- URL: http://arxiv.org/abs/2411.09004v1
- Date: Wed, 13 Nov 2024 20:15:50 GMT
- Title: The geometry of the deep linear network
- Authors: Govind Menon
- Abstract summary: Rigorous results by several authors are unified into a thermodynamic framework for deep learning.
Several links between the DLN and other areas of mathematics are discussed, along with some open questions.
- Abstract: This article provides an expository account of training dynamics in the Deep Linear Network (DLN) from the perspective of the geometric theory of dynamical systems. Rigorous results by several authors are unified into a thermodynamic framework for deep learning. The analysis begins with a characterization of the invariant manifolds and Riemannian geometry in the DLN. This is followed by exact formulas for a Boltzmann entropy, as well as stochastic gradient descent of free energy using a Riemannian Langevin Equation. Several links between the DLN and other areas of mathematics are discussed, along with some open questions.
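For orientation, here is the standard DLN setup the abstract refers to, as a sketch in generic notation (the symbols $W_i$, $E$, $N$ are illustrative, not fixed by the paper): the end-to-end matrix is overparametrized as a product of $N$ factors, each of which evolves by gradient flow on a loss that depends only on the product,
\[
W = W_N W_{N-1} \cdots W_1, \qquad \dot{W}_i = -\nabla_{W_i}\, E(W_N \cdots W_1), \quad i = 1, \dots, N.
\]
This flow conserves the differences $W_{i+1}^T W_{i+1} - W_i W_i^T$; fixing their values selects invariant manifolds, on which the induced dynamics of the product $W$ becomes a Riemannian gradient flow. This is the geometric structure the article's analysis builds on.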
Related papers
- A singular Riemannian Geometry Approach to Deep Neural Networks III. Piecewise Differentiable Layers and Random Walks on $n$-dimensional Classes [49.32130498861987]
We study the case of non-differentiable activation functions, such as ReLU.
Two recent works introduced a geometric framework to study neural networks.
We illustrate our findings with some numerical experiments on classification of images and thermodynamic problems.
arXiv Detail & Related papers (2024-04-09T08:11:46Z) - A Survey of Geometric Graph Neural Networks: Data Structures, Models and Applications [67.33002207179923]
This paper presents a survey of data structures, models, and applications related to geometric GNNs.
We provide a unified view of existing models from the geometric message passing perspective.
We also summarize the applications as well as the related datasets to facilitate later research for methodology development and experimental evaluation.
arXiv Detail & Related papers (2024-03-01T12:13:04Z) - A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems [87.30652640973317]
Recent advances in computational modelling of atomic systems represent them as geometric graphs with atoms embedded as nodes in 3D Euclidean space.
Geometric Graph Neural Networks have emerged as the preferred machine learning architecture powering applications ranging from protein structure prediction to molecular simulations and material generation.
This paper provides a comprehensive and self-contained overview of the field of Geometric GNNs for 3D atomic systems.
arXiv Detail & Related papers (2023-12-12T18:44:19Z) - Physics-informed neural networks for transformed geometries and manifolds [0.0]
We propose a novel method for integrating geometric transformations within PINNs to robustly accommodate geometric variations.
We demonstrate the enhanced flexibility over traditional PINNs, especially under geometric variations.
The proposed framework presents an outlook for training deep neural operators over parametrized geometries.
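One plausible reading of this construction, sketched in generic notation (the reference domain $\hat\Omega$ and map $T_\mu$ are illustrative assumptions, not the paper's notation): parametrize each geometry as the image of a fixed reference domain, $x = T_\mu(\hat{x})$ with $\hat{x} \in \hat\Omega$, let the network $u_\theta$ act on reference coordinates, and push PDE derivatives through the chain rule via the Jacobian $J_\mu = \partial T_\mu / \partial \hat{x}$,
\[
\nabla_x u = J_\mu^{-T} \nabla_{\hat{x}} u_\theta(\hat{x}),
\]
so that the same collocation points $\hat{x}_k$ serve the whole family of transformed geometries.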
arXiv Detail & Related papers (2023-11-27T15:47:33Z) - Adaptive Log-Euclidean Metrics for SPD Matrix Learning [73.12655932115881]
We propose Adaptive Log-Euclidean Metrics (ALEMs), which extend the widely used Log-Euclidean Metric (LEM).
The experimental and theoretical results demonstrate the merit of the proposed metrics in improving the performance of SPD neural networks.
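The snippet does not spell out the adaptive construction, but the baseline it extends is standard: the Log-Euclidean Metric flattens the SPD manifold through the matrix logarithm, giving the distance and Fréchet mean
\[
d_{\mathrm{LEM}}(P, Q) = \lVert \log P - \log Q \rVert_F, \qquad \bar{P} = \exp\!\Big(\tfrac{1}{n} \sum_{i=1}^{n} \log P_i\Big)
\]
for SPD matrices $P, Q, P_1, \dots, P_n$; the ALEMs' learned deformation of this metric is the paper's contribution and is not reproduced here.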
arXiv Detail & Related papers (2023-03-26T18:31:52Z) - Deep Linear Networks for Matrix Completion -- An Infinite Depth Limit [10.64241024049424]
The deep linear network (DLN) is a model for implicit regularization in gradient based optimization of overparametrized learning architectures.
We investigate the link between the geometry and the training dynamics for matrix completion with rigorous analysis and numerics.
We propose that implicit regularization is a result of bias towards high state space volume.
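A minimal NumPy sketch of the object under study (depth, step size, rank, and the masked square loss are illustrative choices, not the paper's exact setup): gradient descent on the factors of a deep linear product, fit to partially observed entries of a low-rank matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, depth, lr, steps = 20, 20, 3, 0.05, 10000

# Rank-2 ground truth, with roughly 30% of entries observed.
M = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))
mask = rng.random((m, n)) < 0.3

# Factors W_depth, ..., W_1; W_i has shape (dims[i+1], dims[i]),
# so the product W_depth @ ... @ W_1 maps R^n to R^m.
dims = [n] * depth + [m]
Ws = [0.1 * rng.standard_normal((dims[i + 1], dims[i])) for i in range(depth)]

def product(factors):
    W = factors[0]
    for Wi in factors[1:]:
        W = Wi @ W
    return W

for _ in range(steps):
    W = product(Ws)
    G = mask * (W - M)  # gradient of 0.5 * ||mask*(W - M)||_F^2 w.r.t. W
    # Chain rule: grad of W_i is (factors above i)^T @ G @ (factors below i)^T.
    Ws = [
        Ws[i] - lr
        * (product(Ws[i + 1:]) if i < depth - 1 else np.eye(m)).T
        @ G
        @ (product(Ws[:i]) if i > 0 else np.eye(n)).T
        for i in range(depth)
    ]

W = product(Ws)
print("fit on observed entries:", np.linalg.norm(mask * (W - M)))
print("leading singular values:", np.round(np.linalg.svd(W, compute_uv=False)[:5], 3))
```

With a small initialization the recovered product is typically close to low rank even though nothing in the loss asks for it, which is the implicit-regularization phenomenon the entry describes.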
arXiv Detail & Related papers (2022-10-22T17:03:10Z) - Thermodynamics-informed graph neural networks [0.09332987715848712]
We propose using both geometric and thermodynamic inductive biases to improve accuracy and generalization of the resulting integration scheme.
The first is achieved with Graph Neural Networks, which induce a non-Euclidean geometric prior and permutation-invariant node and edge update functions.
The second bias is enforced by learning the GENERIC structure of the problem, an extension of the Hamiltonian formalism, to model more general non-conservative dynamics.
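The GENERIC structure mentioned here has a compact standard form (generic notation: state $z$, energy $E$, entropy $S$; this is the general formalism, not the paper's specific learned model):
\[
\dot{z} = L(z)\,\frac{\partial E}{\partial z} + M(z)\,\frac{\partial S}{\partial z},
\qquad
L\,\frac{\partial S}{\partial z} = 0, \quad M\,\frac{\partial E}{\partial z} = 0,
\]
with $L$ skew-symmetric (reversible, Hamiltonian part) and $M$ symmetric positive semi-definite (irreversible, dissipative part); the degeneracy conditions enforce $\dot{E} = 0$ and $\dot{S} \geq 0$, so learning $L$ and $M$ bakes the first and second laws into the integrator.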
arXiv Detail & Related papers (2022-03-03T17:30:44Z) - A singular Riemannian geometry approach to Deep Neural Networks I. Theoretical foundations [77.86290991564829]
Deep Neural Networks are widely used for solving complex problems in several scientific areas, such as speech recognition, machine translation, and image analysis.
We study a particular sequence of maps between manifolds, with the last manifold of the sequence equipped with a Riemannian metric.
We investigate the theoretical properties of the maps in such a sequence, eventually focusing on the case of maps implementing neural networks of practical interest.
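The construction underlying this series is the pullback metric, in standard notation (generic, not specific to the paper): for a smooth map $f : M \to (N, h)$ between manifolds,
\[
(f^* h)_x(u, v) = h_{f(x)}\big(df_x(u),\, df_x(v)\big),
\]
which is a genuine Riemannian metric only where $df_x$ is injective; layers that collapse directions (e.g. maps into lower-dimensional spaces) make $f^* h$ degenerate, which is the "singular" geometry of the title.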
arXiv Detail & Related papers (2021-12-17T11:43:30Z) - A Unifying and Canonical Description of Measure-Preserving Diffusions [60.59592461429012]
A complete recipe of measure-preserving diffusions in Euclidean space was recently derived, unifying several MCMC algorithms into a single framework.
We develop a geometric theory that improves and generalises this construction to any manifold.
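The Euclidean recipe being generalised (due to Ma, Chen and Fox, 2015) reads, for a target density proportional to $e^{-H(z)}$, positive semi-definite $D$, and skew-symmetric $Q$:
\[
dz = \big[ -\big(D(z) + Q(z)\big) \nabla H(z) + \Gamma(z) \big]\, dt + \sqrt{2 D(z)}\, dW_t,
\qquad
\Gamma_i(z) = \sum_j \frac{\partial}{\partial z_j} \big( D_{ij}(z) + Q_{ij}(z) \big).
\]
Every such choice of $(D, Q)$ preserves $e^{-H}$, and conversely every diffusion with that stationary density arises this way; the cited paper lifts this construction from Euclidean space to arbitrary manifolds.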
arXiv Detail & Related papers (2021-05-06T17:36:55Z) - Symplectic Geometric Methods for Matrix Differential Equations Arising from Inertial Navigation Problems [3.94183940327189]
This article explores some geometric and algebraic properties of the dynamical system.
It extends the applicable fields of symplectic geometric algorithms from the even dimensional Hamiltonian system to the odd dimensional dynamical system.
arXiv Detail & Related papers (2020-02-11T11:08:52Z)