Geometry of fibers of the multiplication map of deep linear neural networks
- URL: http://arxiv.org/abs/2411.19920v2
- Date: Wed, 11 Dec 2024 10:56:00 GMT
- Title: Geometry of fibers of the multiplication map of deep linear neural networks
- Authors: Simon Pepin Lehalleur, Richárd Rimányi
- Abstract summary: We study the geometry of the algebraic set of tuples of composable matrices which multiply to a fixed matrix.
Our solution is presented in three forms: a Poincar\'e series in equivariant cohomology, a quadratic integer program, and an explicit formula.
- Score: 0.0
- Abstract: We study the geometry of the algebraic set of tuples of composable matrices which multiply to a fixed matrix, using tools from the theory of quiver representations. In particular, we determine its codimension $C$ and the number $\theta$ of its top-dimensional irreducible components. Our solution is presented in three forms: a Poincar\'e series in equivariant cohomology, a quadratic integer program, and an explicit formula. In the course of the proof, we establish a surprising property: $C$ and $\theta$ are invariant under arbitrary permutations of the dimension vector. We also show that the real log-canonical threshold of the function taking a tuple to the squared Frobenius norm of its product is $C/2$. These results are motivated by the study of deep linear neural networks in machine learning and Bayesian statistics (singular learning theory) and show that deep linear networks are in a certain sense ``mildly singular''.
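For orientation, the setup from the abstract can be written out as follows; the symbols $\mu$, $W_i$, $d_i$ and $A$ are notational choices made for this summary rather than taken from the paper.
```latex
% Fix a dimension vector (d_0, d_1, ..., d_L) and a matrix A of size d_L x d_0.
% The multiplication map of the corresponding deep linear network is
\[
  \mu \colon \prod_{i=1}^{L} \mathrm{Mat}(d_i \times d_{i-1})
     \longrightarrow \mathrm{Mat}(d_L \times d_0),
  \qquad
  \mu(W_1, \dots, W_L) = W_L \cdots W_1 .
\]
% The paper studies the fiber mu^{-1}(A): C is its codimension and theta the
% number of its top-dimensional irreducible components; both are invariant
% under arbitrary permutations of (d_0, ..., d_L). The abstract further states
% that the real log-canonical threshold of the squared Frobenius norm of the
% product,
\[
  (W_1, \dots, W_L) \;\longmapsto\; \bigl\| W_L \cdots W_1 \bigr\|_F^2 ,
\]
% equals C/2.
```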
Related papers
- Neural Networks and (Virtual) Extended Formulations [5.762677915745415]
We prove lower bounds on the size of neural networks that optimize over a polytope $P$.
We show that $\mathrm{xc}(P)$ is a lower bound on the size of any monotone or input neural network that solves the linear optimization problem over $P$.
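Schematically, using only the notation of the summary above (with $\mathrm{xc}(P)$ the extension complexity of $P$), the stated bound has the following shape; the precise network classes and the meaning of ``size'' are as in the paper.
```latex
% If N is a neural network of one of the restricted classes named above and
% N solves max_{x in P} c^T x for every objective c, then its size is
% bounded below by the extension complexity of P:
\[
  \operatorname{size}(N) \;\ge\; \mathrm{xc}(P).
\]
```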
arXiv Detail & Related papers (2024-11-05T11:12:11Z)
- Butson Hadamard matrices, bent sequences, and spherical codes [15.98720468046758]
We explore a notion of bent sequence attached to the data consisting of an Hadamard matrix of order $n$ defined over the complex $q$-th roots of unity.
In particular, we construct self-dual bent sequences for various $q \le 60$ and lengths $n \le 21$. The construction methods comprise the resolution of systems by Gröbner bases and eigenspace computations.
arXiv Detail & Related papers (2023-11-01T08:03:11Z)
- Lie Neurons: Adjoint-Equivariant Neural Networks for Semisimple Lie Algebras [5.596048634951087]
This paper proposes an equivariant neural network that takes data in any semi-simple Lie algebra as input.
The corresponding group acts on the Lie algebra as adjoint operations, making our proposed network adjoint-equivariant.
Our framework generalizes Vector Neurons, a simple $\mathrm{SO}(3)$-equivariant network, from 3-D Euclidean space to Lie algebra spaces.
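A minimal numerical sketch of adjoint equivariance (not the Lie Neurons architecture itself): a layer on $\mathfrak{so}(3)$ that rescales its input by a conjugation-invariant quantity commutes with the adjoint action. The layer and helper functions below are illustrative choices for this sketch.
```python
import numpy as np

def hat(w):
    # so(3) element (skew-symmetric matrix) built from a 3-vector.
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def random_rotation(rng):
    # QR-based random element of SO(3).
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q

def layer(X, a=0.7, b=1.3):
    # Toy adjoint-equivariant map: rescale X by a function of trace(X @ X),
    # which is invariant under conjugation X -> R X R^T.
    k = np.trace(X @ X)
    return (a + b * np.tanh(k)) * X

rng = np.random.default_rng(0)
X = hat(rng.normal(size=3))
R = random_rotation(rng)

lhs = layer(R @ X @ R.T)       # act by Ad_R first, then apply the layer
rhs = R @ layer(X) @ R.T       # apply the layer first, then act by Ad_R
print(np.allclose(lhs, rhs))   # True: the layer is adjoint-equivariant
```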
arXiv Detail & Related papers (2023-10-06T18:34:27Z)
- LU-Net: Invertible Neural Networks Based on Matrix Factorization [1.2891210250935146]
LU-Net is a simple and fast architecture for invertible neural networks (INNs).
Invertible neural networks can be trained according to the maximum likelihood principle.
In our numerical experiments, we test the LU-Net architecture as a generative model on several academic datasets.
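A minimal sketch of the LU idea, assuming a single layer whose weight matrix is parametrized as a product of a unit lower-triangular and an upper-triangular matrix; the activation and names are illustrative, not the paper's exact architecture.
```python
import numpy as np

rng = np.random.default_rng(0)
d = 4

# Invertible weight W = L @ U: L unit lower triangular, U upper triangular
# with nonzero diagonal, so det(W) = prod(diag(U)) is easy to read off.
L = np.tril(rng.normal(size=(d, d)), k=-1) + np.eye(d)
U = np.triu(rng.normal(size=(d, d)), k=1) + np.diag(rng.uniform(0.5, 1.5, d))
b = rng.normal(size=d)

def leaky_relu(x, s=0.1):
    return np.where(x > 0, x, s * x)

def leaky_relu_inv(y, s=0.1):
    return np.where(y > 0, y, y / s)

def forward(x):
    return leaky_relu(L @ (U @ x) + b)

def inverse(y):
    z = leaky_relu_inv(y) - b
    # Two triangular solves instead of explicitly inverting W.
    return np.linalg.solve(U, np.linalg.solve(L, z))

x = rng.normal(size=d)
print(np.allclose(inverse(forward(x)), x))   # True: the layer is invertible
print(np.log(np.abs(np.diag(U))).sum())      # log|det W| from U's diagonal
```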
arXiv Detail & Related papers (2023-02-21T08:52:36Z)
- Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles [55.41644538483948]
We design a deep-learning algorithm for the discovery and identification of the continuous group of symmetries present in a labeled dataset.
We use fully connected neural networks to model the symmetry transformations and the corresponding generators.
Our study also opens the door for using a machine learning approach in the mathematical study of Lie groups and their properties.
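As a toy illustration of the kind of invariance condition such a symmetry-discovery loss can enforce (not the authors' pipeline), the snippet below checks, to first order, whether a candidate generator leaves a labelled function unchanged; the function and generators are hypothetical choices for this sketch.
```python
import numpy as np

# Condition to enforce: f((I + eps*G) x) == f(x) to first order in eps.
# Here f(x) = ||x||^2 is rotation-invariant, so the rotation generator
# passes the check while a generic (symmetric) matrix does not.
def f(x):
    return np.dot(x, x)

G_rot = np.array([[0.0, -1.0],
                  [1.0,  0.0]])       # antisymmetric: 2-D rotation generator
G_bad = np.array([[1.0,  0.0],
                  [0.0, -1.0]])       # symmetric: not a symmetry of f

rng = np.random.default_rng(0)
x = rng.normal(size=2)
eps = 1e-4

for name, G in [("rotation", G_rot), ("non-symmetry", G_bad)]:
    drift = f(x + eps * G @ x) - f(x)
    print(name, drift / eps)          # ~0 for the rotation generator only
```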
arXiv Detail & Related papers (2023-01-13T16:25:25Z)
- Sign and Basis Invariant Networks for Spectral Graph Representation Learning [75.18802152811539]
We introduce SignNet and BasisNet -- new neural architectures that are invariant to all requisite symmetries and hence process collections of eigenspaces in a principled manner.
Our networks are theoretically strong for graph representation learning -- they can approximate any spectral graph convolution.
Experiments show the strength of our networks for learning spectral graph filters and learning graph positional encodings.
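A minimal sketch of sign invariance in the spirit of SignNet, assuming the standard construction $f(v) = \rho(\phi(v) + \phi(-v))$; the tiny MLPs below are placeholders for the learned maps.
```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, W2):
    # Tiny two-layer perceptron used for both the inner map phi and outer map rho.
    return W2 @ np.tanh(W1 @ x)

d, h, out = 8, 16, 4
W1_phi, W2_phi = rng.normal(size=(h, d)), rng.normal(size=(h, h))
W1_rho, W2_rho = rng.normal(size=(h, h)), rng.normal(size=(out, h))

def sign_invariant(v):
    # Sign invariance by construction: phi(v) + phi(-v) is unchanged under v -> -v.
    z = mlp(v, W1_phi, W2_phi) + mlp(-v, W1_phi, W2_phi)
    return mlp(z, W1_rho, W2_rho)

v = rng.normal(size=d)
print(np.allclose(sign_invariant(v), sign_invariant(-v)))  # True
```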
arXiv Detail & Related papers (2022-02-25T23:11:59Z)
- A singular Riemannian geometry approach to Deep Neural Networks II. Reconstruction of 1-D equivalence classes [78.120734120667]
We construct, in the input space, the preimage of a point of the output manifold.
For simplicity, we focus on the case of neural network maps from $n$-dimensional real spaces to $(n-1)$-dimensional real spaces.
arXiv Detail & Related papers (2021-12-17T11:47:45Z)
- Geometric Deep Learning and Equivariant Neural Networks [0.9381376621526817]
We survey the mathematical foundations of geometric deep learning, focusing on group equivariant and gauge equivariant neural networks.
We develop gauge equivariant convolutional neural networks on an arbitrary manifold $\mathcal{M}$ using principal bundles with structure group $K$ and equivariant maps between sections of associated vector bundles.
We analyze several applications of this formalism, including semantic segmentation and object detection networks.
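For reference, the basic $K$-equivariance condition that such constructions enforce, written here in simplified flat-space notation rather than the full bundle formalism of the survey:
```latex
% A map Phi between feature spaces carrying K-representations rho_in and
% rho_out is equivariant when acting before or after Phi gives the same result:
\[
  \Phi\bigl(\rho_{\mathrm{in}}(k)\, x\bigr)
    \;=\; \rho_{\mathrm{out}}(k)\,\Phi(x)
  \qquad \text{for all } k \in K .
\]
```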
arXiv Detail & Related papers (2021-05-28T15:41:52Z)
- A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups [115.58550697886987]
We provide a completely general algorithm for solving for the equivariant layers of matrix groups.
In addition to recovering solutions from other works as special cases, we construct multilayer perceptrons equivariant to multiple groups that have never been tackled before.
Our approach outperforms non-equivariant baselines, with applications to particle physics and dynamical systems.
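A small sketch of the general recipe of solving for equivariant linear layers as a nullspace problem, shown here for the 90-degree rotation generating a cyclic group acting on $\mathbb{R}^2$; the representation and helper names are chosen for this example and are not the paper's API.
```python
import numpy as np

def nullspace(A, tol=1e-10):
    # Orthonormal basis for the nullspace of A via SVD.
    _, s, vt = np.linalg.svd(A)
    rank = (s > tol).sum()
    return vt[rank:].T

# Representation of the cyclic group C_4 on R^2, generated by a 90-degree rotation.
g = np.array([[0.0, -1.0],
              [1.0,  0.0]])
rho_in, rho_out = g, g                # same representation on input and output
n_out, n_in = rho_out.shape[0], rho_in.shape[0]

# Equivariance constraint rho_out @ W = W @ rho_in, rewritten as a linear
# system on the row-major flattening vec(W):
#   (kron(rho_out, I) - kron(I, rho_in^T)) vec(W) = 0.
C = np.kron(rho_out, np.eye(n_in)) - np.kron(np.eye(n_out), rho_in.T)

basis = nullspace(C)                  # each column is a flattened equivariant W
print(basis.shape[1])                 # 2: spanned by the identity and the rotation
for k in range(basis.shape[1]):
    W = basis[:, k].reshape(n_out, n_in)
    assert np.allclose(rho_out @ W, W @ rho_in)
```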
arXiv Detail & Related papers (2021-04-19T17:21:54Z)
- Eigendecomposition-Free Training of Deep Networks for Linear Least-Square Problems [107.3868459697569]
We introduce an eigendecomposition-free approach to training a deep network.
We show that our approach is much more robust than explicit differentiation of the eigendecomposition.
Our method has better convergence properties and yields state-of-the-art results.
arXiv Detail & Related papers (2020-04-15T04:29:34Z)
- Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks [70.15611146583068]
We develop exact representations of training two-layer neural networks with rectified linear units (ReLUs) as convex optimization problems.
Our theory utilizes semi-infinite duality and minimum norm regularization.
arXiv Detail & Related papers (2020-02-24T21:32:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.