Symmetry Induces Structure and Constraint of Learning
- URL: http://arxiv.org/abs/2309.16932v2
- Date: Sat, 1 Jun 2024 22:07:42 GMT
- Title: Symmetry Induces Structure and Constraint of Learning
- Authors: Liu Ziyin
- Abstract summary: We unveil the importance of the loss function symmetries in affecting, if not deciding, the learning behavior of machine learning models.
Common instances of mirror symmetries in deep learning include rescaling, rotation, and permutation symmetry.
We show that the theoretical framework can explain intriguing phenomena, such as the loss of plasticity and various collapse phenomena in neural networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to common architecture designs, symmetries exist extensively in contemporary neural networks. In this work, we unveil the importance of the loss function symmetries in affecting, if not deciding, the learning behavior of machine learning models. We prove that every mirror-reflection symmetry, with reflection surface $O$, in the loss function leads to the emergence of a constraint on the model parameters $\theta$: $O^T\theta = 0$. This constraint is satisfied when either the weight decay or the gradient noise is large. Common instances of mirror symmetries in deep learning include rescaling, rotation, and permutation symmetry. As direct corollaries, we show that rescaling symmetry leads to sparsity, rotation symmetry leads to low rankness, and permutation symmetry leads to homogeneous ensembling. Then, we show that the theoretical framework can explain intriguing phenomena, such as the loss of plasticity and various collapse phenomena in neural networks, and suggest how symmetries can be used to design an elegant algorithm to enforce hard constraints in a differentiable way.
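As a concrete illustration of the rescaling-symmetry corollary: if a parameter pair $(u, w)$ enters the loss only through the elementwise product $u \odot w$, the loss is invariant under $(u, w) \to (\lambda u, \lambda^{-1} w)$, and the sign flip $\lambda = -1$ acts as a mirror whose invariant set is $u = w = 0$, i.e. the sparse solution. The sketch below (not the author's code; the synthetic data, hyperparameters, and near-zero threshold are illustrative assumptions) trains such a diagonal linear network with plain gradient descent and checks how the fraction of effective weights collapsing to zero grows with the weight decay.
```python
# Minimal sketch (not from the paper): a diagonal linear network f(x) = x @ (u * w),
# whose loss is invariant under the rescaling (u_i, w_i) -> (c * u_i, w_i / c).
# With weight decay, redundant coordinates are pushed toward the constrained point
# u_i = w_i = 0, so the effective weights u * w become sparse.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data with a sparse ground truth (illustrative assumption).
n, d = 200, 50
X = rng.normal(size=(n, d))
beta_true = np.zeros(d)
beta_true[:5] = rng.normal(scale=1.0, size=5)   # only 5 informative features
y = X @ beta_true + 0.1 * rng.normal(size=n)

def train(weight_decay, lr=0.01, steps=20_000):
    """Plain gradient descent on MSE + (weight_decay / 2) * (||u||^2 + ||w||^2)."""
    u = rng.normal(scale=0.5, size=d)
    w = rng.normal(scale=0.5, size=d)
    for _ in range(steps):
        beta = u * w                                  # effective linear weights
        grad_beta = (2.0 / n) * X.T @ (X @ beta - y)  # d(MSE)/d(beta)
        grad_u = grad_beta * w + weight_decay * u     # chain rule through beta = u * w
        grad_w = grad_beta * u + weight_decay * w
        u -= lr * grad_u
        w -= lr * grad_w
    return u * w

for wd in (0.0, 0.1, 1.0):
    beta_hat = train(weight_decay=wd)
    frac_zero = np.mean(np.abs(beta_hat) < 1e-3)
    print(f"weight_decay={wd:4.1f}  fraction of near-zero effective weights: {frac_zero:.2f}")
```
With the weight decay set to zero the collapse disappears, and, per the abstract, a comparable constraint-seeking effect is expected from large gradient noise; the rotation- and permutation-symmetry corollaries would analogously be probed by tracking the rank of weight matrices or the similarity of permutable branches.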
Related papers
- The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof [50.49582712378289]
We investigate the impact of neural parameter symmetries by introducing new neural network architectures.
We develop two methods, with some provable guarantees, of modifying standard neural networks to reduce parameter space symmetries.
Our experiments reveal several interesting observations on the empirical impact of parameter symmetries.
arXiv Detail & Related papers (2024-05-30T16:32:31Z) - Parameter Symmetry and Noise Equilibrium of Stochastic Gradient Descent [8.347295051171525]
We show that gradient noise creates a systematic interplay of parameters $\theta$ along the degenerate direction to a unique, initialization-independent fixed point $\theta^*$.
These points are referred to as noise equilibria because, at these points, noise contributions from different directions are balanced and aligned.
We show that the balance and alignment of gradient noise can serve as a novel alternative mechanism for explaining important phenomena such as progressive sharpening/flattening and representation formation within neural networks.
arXiv Detail & Related papers (2024-02-11T13:00:04Z) - Lie Point Symmetry and Physics Informed Networks [59.56218517113066]
We propose a loss function that informs the network about Lie point symmetries in the same way that PINN models try to enforce the underlying PDE through a loss function.
Our symmetry loss ensures that the infinitesimal generators of the Lie group conserve the PDE solutions.
Empirical evaluations indicate that the inductive bias introduced by the Lie point symmetries of the PDEs greatly boosts the sample efficiency of PINNs.
arXiv Detail & Related papers (2023-11-07T19:07:16Z) - Learning Layer-wise Equivariances Automatically using Gradients [66.81218780702125]
Convolutions encode equivariance symmetries into neural networks, leading to better generalisation performance.
However, such symmetries provide fixed hard constraints on the functions a network can represent; they need to be specified in advance and cannot be adapted.
Our goal is to allow flexible symmetry constraints that can automatically be learned from data using gradients.
arXiv Detail & Related papers (2023-10-09T20:22:43Z) - Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles [55.41644538483948]
We design a deep-learning algorithm for the discovery and identification of the continuous group of symmetries present in a labeled dataset.
We use fully connected neural networks to model the symmetry transformations and the corresponding generators.
Our study also opens the door for using a machine learning approach in the mathematical study of Lie groups and their properties.
arXiv Detail & Related papers (2023-01-13T16:25:25Z) - On the Importance of Asymmetry for Siamese Representation Learning [53.86929387179092]
Siamese networks are conceptually symmetric with two parallel encoders.
We study the importance of asymmetry by explicitly distinguishing the two encoders within the network.
We find the improvements from asymmetric designs generalize well to longer training schedules, multiple other frameworks and newer backbones.
arXiv Detail & Related papers (2022-04-01T17:57:24Z) - Exact solutions of interacting dissipative systems via weak symmetries [77.34726150561087]
We analytically diagonalize the Liouvillian of a class of Markovian dissipative systems with arbitrarily strong interactions or nonlinearity.
This enables an exact description of the full dynamics and dissipative spectrum.
Our method is applicable to a variety of other systems, and could provide a powerful new tool for the study of complex driven-dissipative quantum systems.
arXiv Detail & Related papers (2021-09-27T17:45:42Z) - Machine-learning hidden symmetries [0.0]
We present an automated method for finding hidden symmetries, defined as symmetries that become manifest only in a new coordinate system that must be discovered.
Its core idea is to quantify asymmetry as violation of certain partial differential equations, and to numerically minimize such violation over the space of all invertible transformations, parametrized as invertible neural networks.
arXiv Detail & Related papers (2021-09-20T17:55:02Z) - Noether's Learning Dynamics: The Role of Kinetic Symmetry Breaking in Deep Learning [7.310043452300738]
In nature, symmetry governs regularities, while symmetry breaking brings texture.
Recent experiments suggest that the symmetry of the loss function is closely related to the learning performance.
We pose symmetry breaking as a new design principle by considering the symmetry of the learning rule in addition to the loss function.
arXiv Detail & Related papers (2021-05-06T14:36:10Z) - Noether: The More Things Change, the More Stay the Same [1.14219428942199]
Noether's celebrated theorem states that symmetry leads to conserved quantities.
In the realm of neural networks under gradient descent, model symmetries imply restrictions on the gradient path.
Symmetry can be thought of as one further important tool in understanding the performance of neural networks under gradient descent.
arXiv Detail & Related papers (2021-04-12T14:41:05Z) - Finding Symmetry Breaking Order Parameters with Euclidean Neural Networks [2.735801286587347]
We demonstrate that symmetry-equivariant neural networks uphold Curie's principle and can be used to articulate many symmetry-relevant scientific questions into simple optimization problems.
We prove these properties mathematically and demonstrate them numerically by training a Euclidean symmetry-equivariant neural network to learn symmetry-breaking input to deform a square into a rectangle and to generate octahedra tilting patterns in perovskites.
arXiv Detail & Related papers (2020-07-04T17:24:21Z)