Unification of Symmetries Inside Neural Networks: Transformer,
Feedforward and Neural ODE
- URL: http://arxiv.org/abs/2402.02362v1
- Date: Sun, 4 Feb 2024 06:11:54 GMT
- Title: Unification of Symmetries Inside Neural Networks: Transformer,
Feedforward and Neural ODE
- Authors: Koji Hashimoto, Yuji Hirono, Akiyoshi Sannai
- Abstract summary: This study introduces a novel approach by applying the principles of gauge symmetries, a key concept in physics, to neural network architectures.
We mathematically formulate the parametric redundancies in neural ODEs, and find that their gauge symmetries are given by spacetime diffeomorphisms.
Viewing neural ODEs as a continuum version of feedforward neural networks, we show that the parametric redundancies in feedforward neural networks are indeed lifted to diffeomorphisms in neural ODEs.
- Score: 2.002741592555996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the inner workings of neural networks, including transformers,
remains one of the most challenging puzzles in machine learning. This study
introduces a novel approach by applying the principles of gauge symmetries, a
key concept in physics, to neural network architectures. By regarding model
functions as physical observables, we find that parametric redundancies of
various machine learning models can be interpreted as gauge symmetries. We
mathematically formulate the parametric redundancies in neural ODEs, and find
that their gauge symmetries are given by spacetime diffeomorphisms, which play
a fundamental role in Einstein's theory of gravity. Viewing neural ODEs as a
continuum version of feedforward neural networks, we show that the parametric
redundancies in feedforward neural networks are indeed lifted to
diffeomorphisms in neural ODEs. We further extend our analysis to transformer
models, finding natural correspondences with neural ODEs and their gauge
symmetries. The concept of gauge symmetries sheds light on the complex behavior
of deep learning models through physics and provides us with a unifying
perspective for analyzing various machine learning architectures.
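As a concrete example of such a parametric redundancy, a two-layer ReLU network is left unchanged when one hidden unit's incoming weights are rescaled by a positive factor and its outgoing weights by the inverse. The following numpy sketch is illustrative only and is not code from the paper:

```python
import numpy as np

# Illustrative sketch (not from the paper): a two-layer ReLU network
# f(x) = W2 @ relu(W1 @ x). Rescaling one hidden unit's incoming weights
# by lambda > 0 and its outgoing weights by 1/lambda is a parametric
# redundancy: the model function, the "observable", is unchanged.

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))   # input dim 4, hidden dim 8
W2 = rng.normal(size=(3, 8))   # output dim 3

def f(x, W1, W2):
    return W2 @ np.maximum(W1 @ x, 0.0)

lam, unit = 2.5, 3             # positive "gauge" parameter, unit to transform
W1g, W2g = W1.copy(), W2.copy()
W1g[unit, :] *= lam            # rescale incoming weights
W2g[:, unit] /= lam            # compensate on the outgoing weights

x = rng.normal(size=4)
print(np.allclose(f(x, W1, W2), f(x, W1g, W2g)))  # True: same observable
```

In the paper's continuum picture, feedforward networks become neural ODEs, and such layer-wise redundancies are lifted to spacetime diffeomorphisms.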
Related papers
- Optimal Equivariant Architectures from the Symmetries of Matrix-Element Likelihoods [0.0]
The Matrix-Element Method (MEM) has long been a cornerstone of data analysis in high-energy physics.
Geometric deep learning has enabled neural network architectures that incorporate known symmetries directly into their design.
This paper presents a novel approach that combines MEM-inspired symmetry considerations with equivariant neural network design for particle physics analysis.
arXiv Detail & Related papers (2024-10-24T08:56:37Z)
- The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof [50.49582712378289]
We investigate the impact of neural parameter symmetries by introducing new neural network architectures.
We develop two methods, with some provable guarantees, of modifying standard neural networks to reduce parameter space symmetries.
Our experiments reveal several interesting observations on the empirical impact of parameter symmetries.
arXiv Detail & Related papers (2024-05-30T16:32:31Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach builds on a recently introduced framework for learning neural-network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards the practical use of machine-learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
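As an illustration of the computational-graph viewpoint (a hypothetical helper, not the paper's released code), a small MLP can be encoded as a graph whose nodes are neurons and whose edge features are the connecting weights; permuting hidden neurons then merely relabels graph nodes:

```python
import numpy as np

# Hypothetical helper (not the paper's code): encode a feedforward network
# as a graph, with one node per neuron and one weighted edge per parameter.

def mlp_to_graph(weight_matrices):
    """Return (edge_index, edge_weight) arrays for a feedforward net."""
    edges, feats, offset = [], [], 0
    for W in weight_matrices:              # W has shape (out_dim, in_dim)
        out_dim, in_dim = W.shape
        for i in range(out_dim):
            for j in range(in_dim):
                edges.append((offset + j, offset + in_dim + i))
                feats.append(W[i, j])
        offset += in_dim
    return np.array(edges), np.array(feats)

rng = np.random.default_rng(1)
layers = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]  # a 3 -> 4 -> 2 MLP
edge_index, edge_weight = mlp_to_graph(layers)
print(edge_index.shape, edge_weight.shape)  # (20, 2) (20,)
```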
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which can then be applied in real time to multi-dimensional scattering data.
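A hedged sketch of that workflow follows, with a toy analytic function standing in for the trained surrogate (all names and values here are illustrative assumptions): the unknown parameter is recovered by gradient descent on the misfit between surrogate predictions and the measured data.

```python
import numpy as np

# Illustrative sketch of the surrogate-plus-differentiation workflow: a
# differentiable surrogate replaces the expensive simulation, and an unknown
# parameter is fitted to "experimental" data. The toy surrogate here is an
# analytic function, so its gradient is written out by hand.

def surrogate(theta, x):
    return np.exp(-theta * x ** 2)         # stand-in for a trained network

def grad_theta(theta, x, y_obs):
    r = surrogate(theta, x) - y_obs        # residual against measurements
    return np.sum(r * surrogate(theta, x) * (-x ** 2))

x = np.linspace(-2.0, 2.0, 50)
y_obs = surrogate(1.7, x)                  # synthetic data with theta = 1.7
theta = 0.5
for _ in range(2000):                      # plain gradient descent
    theta -= 0.02 * grad_theta(theta, x, y_obs)
print(round(theta, 2))                     # converges toward 1.7
```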
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
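That permutation symmetry is easy to verify directly: permuting the hidden units of a feedforward network, together with the corresponding rows, columns, and biases, leaves its input-output map unchanged. A purely illustrative numpy check:

```python
import numpy as np

# Illustrative check (not from the paper): permuting hidden units, i.e. rows
# of W1, entries of b1, and columns of W2, leaves the network's input-output
# map unchanged. This is the symmetry that permutation equivariant neural
# functionals are built to respect.

rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2 = rng.normal(size=(2, 5))

def mlp(x, W1, b1, W2):
    return W2 @ np.tanh(W1 @ x + b1)

perm = rng.permutation(5)                  # relabel the 5 hidden neurons
x = rng.normal(size=3)
print(np.allclose(mlp(x, W1, b1, W2),
                  mlp(x, W1[perm], b1[perm], W2[:, perm])))  # True
```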
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
- Hamiltonian Neural Networks with Automatic Symmetry Detection [0.0]
Hamiltonian neural networks (HNNs) have been introduced to incorporate prior physical knowledge.
We enhance HNNs with a Lie algebra framework to detect and embed symmetries in the neural network.
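For background, the basic HNN construction parameterizes a scalar H(q, p) and derives the dynamics from Hamilton's equations; the sketch below uses a hand-coded quadratic Hamiltonian with analytic gradients rather than a trained network (an assumption for illustration, not the cited implementation):

```python
import numpy as np

# Background sketch: a Hamiltonian neural network models a scalar H(q, p)
# and obtains the dynamics from dq/dt = dH/dp, dp/dt = -dH/dq. Here H is a
# fixed quadratic form so the gradients are analytic; in an HNN they would
# come from automatic differentiation of a learned H.

A = np.array([[1.0, 0.2],
              [0.2, 2.0]])                 # would be learned parameters

def hamiltonian(q, p):
    z = np.array([q, p])
    return 0.5 * z @ A @ z

def vector_field(q, p):
    dH_dq = A[0, 0] * q + A[0, 1] * p
    dH_dp = A[1, 0] * q + A[1, 1] * p
    return dH_dp, -dH_dq                   # Hamilton's equations

# One small explicit Euler step; H is approximately conserved for small dt.
q, p, dt = 1.0, 0.0, 1e-3
dq, dp = vector_field(q, p)
print(hamiltonian(q, p), hamiltonian(q + dt * dq, p + dt * dp))  # ~equal
```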
arXiv Detail & Related papers (2023-01-19T07:34:57Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Lagrangian Neural Network with Differential Symmetries and Relational Inductive Bias [5.017136256232997]
We present a momentum-conserving Lagrangian neural network (MCLNN) that learns the Lagrangian of a system.
We also show that the developed model can generalize to systems of arbitrary size.
arXiv Detail & Related papers (2021-10-07T08:49:57Z)
- Incorporating Symmetry into Deep Dynamics Models for Improved Generalization [24.363954435050264]
We propose to improve accuracy and generalization by incorporating symmetries into convolutional neural networks.
Our models are theoretically and experimentally robust to distributional shift by symmetry group transformations.
Compared with image or text applications, our work is a significant step towards applying equivariant neural networks to high-dimensional systems.
arXiv Detail & Related papers (2020-02-08T01:28:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.