An entropy formula for the Deep Linear Network
- URL: http://arxiv.org/abs/2509.09088v1
- Date: Thu, 11 Sep 2025 01:40:46 GMT
- Title: An entropy formula for the Deep Linear Network
- Authors: Govind Menon, Tianmin Yu
- Abstract summary: The main tools are the use of group actions to analyze overparametrization. The foliation of the balanced manifold in the parameter space by group orbits is used to define and compute a Boltzmann entropy. The main technical step is an explicit construction of an orthonormal basis for the tangent space of the balanced manifold.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the Riemannian geometry of the Deep Linear Network (DLN) as a foundation for a thermodynamic description of the learning process. The main tools are the use of group actions to analyze overparametrization and the use of Riemannian submersion from the space of parameters to the space of observables. The foliation of the balanced manifold in the parameter space by group orbits is used to define and compute a Boltzmann entropy. We also show that the Riemannian geometry on the space of observables defined in [2] is obtained by Riemannian submersion of the balanced manifold. The main technical step is an explicit construction of an orthonormal basis for the tangent space of the balanced manifold using the theory of Jacobi matrices.
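For orientation, here is a minimal sketch of the objects named in the abstract, using conventions standard in the DLN literature; the notation is illustrative, not quoted from the paper:

```latex
% Deep Linear Network: parameters are layer matrices W_1, ..., W_N;
% the observable is the end-to-end product
\[
  W \;=\; W_N W_{N-1} \cdots W_1 .
\]
% The balanced manifold is the set of parameters satisfying the
% balancedness constraints between consecutive layers:
\[
  W_{p+1}^{\mathsf T} W_{p+1} \;=\; W_p W_p^{\mathsf T},
  \qquad p = 1, \dots, N-1 .
\]
% The group orbits foliating this manifold arise from an orthogonal
% action that preserves both balancedness and the product W:
\[
  (W_1, \dots, W_N) \;\longmapsto\;
  \bigl(Q_1 W_1,\; Q_2 W_2 Q_1^{\mathsf T},\; \dots,\; W_N Q_{N-1}^{\mathsf T}\bigr).
\]
% Per the abstract, quotienting along these orbits gives the Riemannian
% submersion onto the space of observables, and the Boltzmann entropy
% is defined and computed via this foliation.
```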
Related papers
- Riemannian Langevin Dynamics: Strong Convergence of Geometric Euler-Maruyama Scheme [51.56484100374058]
Low-dimensional structure in real-world data plays an important role in the success of generative models. We develop a convergence theory for numerical schemes for manifold-valued differential equations.
arXiv Detail & Related papers (2026-03-04T01:29:35Z)
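For concreteness, here is a minimal sketch of one geometric Euler-Maruyama step of Langevin dynamics on the unit sphere, using tangent-space projection and a normalization retraction; this is a generic textbook construction, not code from the paper:

```python
import numpy as np

def geometric_euler_maruyama_step(x, grad_f, h, rng):
    """One Euler-Maruyama step of Langevin dynamics on the unit sphere S^{n-1}.

    The drift -grad f and the Gaussian noise are projected onto the
    tangent space at x; the update is retracted back to the sphere
    by normalization.
    """
    project = lambda v: v - np.dot(x, v) * x  # projector onto T_x S^{n-1}
    noise = rng.standard_normal(x.shape)
    step = -h * project(grad_f(x)) + np.sqrt(2.0 * h) * project(noise)
    y = x + step
    return y / np.linalg.norm(y)  # retraction: nearest point on the sphere

# Example: approximate sampling from exp(-f) on S^2 with f(x) = x[2]
rng = np.random.default_rng(0)
x = np.array([1.0, 0.0, 0.0])
grad_f = lambda x: np.array([0.0, 0.0, 1.0])  # gradient of f(x) = x[2]
for _ in range(1000):
    x = geometric_euler_maruyama_step(x, grad_f, 1e-2, rng)
print(x)
```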
- Riemann$^2$: Learning Riemannian Submanifolds from Riemannian Data [12.424539896723603]
Latent variable models are powerful tools for learning low-dimensional manifolds from high-dimensional data. This paper generalizes previous work and allows us to handle complex tasks in various domains, including robot motion synthesis and analysis of brain connectomes.
arXiv Detail & Related papers (2025-03-07T16:08:53Z)
- The geometry of the deep linear network [0.0]
Rigorous results by several authors are unified into a thermodynamic framework for deep learning.
Several links between the DLN and other areas of mathematics are discussed, along with some open questions.
arXiv Detail & Related papers (2024-11-13T20:15:50Z)
- RMLR: Extending Multinomial Logistic Regression into General Geometries [64.16104856124029]
Our framework only requires minimal geometric properties, thus exhibiting broad applicability.
We develop five families of SPD MLRs under five types of power-deformed metrics.
On rotation matrices, we propose a Lie MLR based on the popular bi-invariant metric.
arXiv Detail & Related papers (2024-09-28T18:38:21Z)
- Product Geometries on Cholesky Manifolds with Applications to SPD Manifolds [65.04845593770727]
We present two new metrics on the Symmetric Positive Definite (SPD) manifold via the Cholesky manifold.
Our metrics are easy to use, computationally efficient, and numerically stable.
arXiv Detail & Related papers (2024-07-02T18:46:13Z)
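For context, here is a sketch of the earlier Log-Cholesky distance of Lin (2019), which likewise works through the Cholesky factor of an SPD matrix; the paper's two new metrics are not reproduced here:

```python
import numpy as np

def log_cholesky_distance(P1, P2):
    """Log-Cholesky distance between SPD matrices (Lin, 2019).

    Each SPD matrix is mapped to its lower-triangular Cholesky factor;
    the strictly lower parts are compared in a Euclidean way and the
    positive diagonals in a log-Euclidean way.
    """
    L1, L2 = np.linalg.cholesky(P1), np.linalg.cholesky(P2)
    strict = np.tril(L1, -1) - np.tril(L2, -1)
    diag = np.log(np.diag(L1)) - np.log(np.diag(L2))
    return np.sqrt(np.sum(strict**2) + np.sum(diag**2))

# Example on two random SPD matrices
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); P1 = A @ A.T + 4 * np.eye(4)
B = rng.standard_normal((4, 4)); P2 = B @ B.T + 4 * np.eye(4)
print(log_cholesky_distance(P1, P2))
```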
- The Fisher-Rao geometry of CES distributions [50.50897590847961]
The Fisher-Rao information geometry makes tools from differential geometry available for statistical problems.
We present some practical uses of these geometric tools in the framework of elliptical distributions.
arXiv Detail & Related papers (2023-10-02T09:23:32Z)
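As a concrete anchor, the Gaussian member of the elliptical family has the well-known Fisher-Rao line element below; for more general CES distributions the same block structure holds with scalar weights depending on the density generator (stated for orientation, not quoted from the paper):

```latex
\[
  \mathrm{d}s^2
  \;=\;
  \mathrm{d}\mu^{\mathsf T}\,\Sigma^{-1}\,\mathrm{d}\mu
  \;+\;
  \tfrac{1}{2}\,\operatorname{tr}\!\bigl[(\Sigma^{-1}\,\mathrm{d}\Sigma)^{2}\bigr].
\]
```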
- A singular Riemannian geometry approach to Deep Neural Networks I. Theoretical foundations [77.86290991564829]
Deep Neural Networks are widely used for solving complex problems in several scientific areas, such as speech recognition, machine translation, and image analysis.
We study a particular sequence of maps between manifolds, with the last manifold of the sequence equipped with a Riemannian metric.
We investigate the theoretical properties of the maps in such a sequence, focusing on the case of maps implementing neural networks of practical interest.
arXiv Detail & Related papers (2021-12-17T11:43:30Z)
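In symbols, the construction alluded to is a pullback of the final metric through the composed maps; a sketch in our notation, not necessarily the paper's:

```latex
% A network is modeled as smooth maps \Lambda_i : M_{i-1} \to M_i,
% i = 1, ..., N, with only the last manifold M_N carrying a metric g.
% Each earlier manifold inherits the pullback
\[
  g_i \;=\; \Phi_i^{*}\, g,
  \qquad
  \Phi_i \;=\; \Lambda_N \circ \cdots \circ \Lambda_{i+1},
\]
% i.e. g_i(u, v) = g(d\Phi_i(u), d\Phi_i(v)). Where d\Phi_i fails to be
% injective this form is degenerate, hence "singular" rather than
% Riemannian geometry.
```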
- A Unifying and Canonical Description of Measure-Preserving Diffusions [60.59592461429012]
A complete recipe for measure-preserving diffusions in Euclidean space was recently derived, unifying several MCMC algorithms into a single framework.
We develop a geometric theory that improves and generalises this construction to any manifold.
arXiv Detail & Related papers (2021-05-06T17:36:55Z)
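The Euclidean "complete recipe" being generalized (Ma, Chen & Fox, 2015) can be stated as follows, with D(z) positive semidefinite and Q(z) skew-symmetric; reproduced here for reference:

```latex
\[
  \mathrm{d}z
  \;=\;
  \bigl[-\bigl(D(z) + Q(z)\bigr)\nabla H(z) + \Gamma(z)\bigr]\,\mathrm{d}t
  \;+\; \sqrt{2 D(z)}\;\mathrm{d}W_t,
  \qquad
  \Gamma_i(z) \;=\; \sum_j \partial_{z_j}\bigl(D_{ij}(z) + Q_{ij}(z)\bigr).
\]
% Every such diffusion preserves the density proportional to e^{-H(z)},
% and every smooth diffusion preserving it arises in this form.
```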
- Geometric Approach Towards Complete Logarithmic Sobolev Inequalities [15.86478274881752]
In this paper, we use the Carnot-Carathéodory distance from sub-Riemannian geometry to prove entropy decay estimates for all finite dimensional symmetric quantum Markov semigroups.
Our approach relies on the transference principle, the existence of $t$-designs, and the sub-Riemannian diameter of compact Lie groups, and implies estimates for the spectral gap.
arXiv Detail & Related papers (2021-02-08T18:48:15Z)
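The entropy decay estimates in question take the standard modified log-Sobolev form: a constant $\alpha > 0$ such that, writing $E$ for the conditional expectation onto the fixed-point algebra of the semigroup $(T_t)$ and $D$ for quantum relative entropy (standard definitions, stated for orientation):

```latex
\[
  D\bigl(T_t(\rho)\,\big\|\,E(\rho)\bigr)
  \;\le\;
  e^{-2\alpha t}\, D\bigl(\rho\,\big\|\,E(\rho)\bigr),
  \qquad t \ge 0,
\]
% "complete" means the same \alpha persists after tensoring the
% semigroup with the identity on any finite-dimensional system.
```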
- Nested Grassmannians for Dimensionality Reduction with Applications [7.106986689736826]
We propose a novel framework for constructing a nested sequence of homogeneous Riemannian manifolds.
We focus on applying the proposed framework to the Grassmann manifold, giving rise to the nested Grassmannians (NG).
Specifically, each planar (2D) shape can be represented as a point in the complex projective space, which is a complex Grassmann manifold.
With the proposed NG structure, we develop algorithms for the supervised and unsupervised dimensionality reduction problems, respectively.
arXiv Detail & Related papers (2020-10-27T20:09:12Z)
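As an illustration of the shape representation mentioned above, here is a sketch of the standard Kendall construction (not code from the paper): k planar landmarks, encoded as a complex vector, are centered and scaled to a pre-shape, and two pre-shapes are compared modulo a global rotation $e^{i\theta}$, yielding a distance in the complex projective space:

```python
import numpy as np

def preshape(z):
    """Map k complex landmarks to a pre-shape: remove translation and scale."""
    z = z - z.mean()              # quotient out translation
    return z / np.linalg.norm(z)  # quotient out scale

def shape_distance(z1, z2):
    """Geodesic distance in CP^{k-2}: pre-shapes modulo rotation e^{i theta}."""
    w1, w2 = preshape(z1), preshape(z2)
    c = np.abs(np.vdot(w1, w2))   # |<w1, w2>| is invariant under rotation
    return np.arccos(np.clip(c, -1.0, 1.0))

# Example: a square and its rotated, translated, rescaled copy have distance ~0
square = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j])
moved = 2.5 * np.exp(1j * 0.7) * square + (3 - 4j)
print(shape_distance(square, moved))  # ~0.0
```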