Related papers: Ubiquitous Symmetry at Critical Points Across Diverse Optimization Landscapes

URL: http://arxiv.org/abs/2506.01959v1
Date: Sun, 04 May 2025 12:32:38 GMT
Title: Ubiquitous Symmetry at Critical Points Across Diverse Optimization Landscapes
Authors: Irmi Schneider
Abstract summary: We investigate symmetry phenomena in real-valued loss functions defined on a broader class of spaces. We show that as in the neural network case, all the critical points observed have non-trivial symmetry. We introduce a new measure of symmetry in the system and show that it reveals additional symmetry structures not captured by the previous measure.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Symmetry plays a crucial role in understanding the properties of mathematical structures and optimization problems. Recent work has explored this phenomenon in the context of neural networks, where the loss function is invariant under column and row permutations of the network weights. It has been observed that local minima exhibit significant symmetry with respect to the network weights (invariance to row and column permutations). Moreover, no critical point was found that lacked symmetry. We extend this line of inquiry by investigating symmetry phenomena in real-valued loss functions defined on a broader class of spaces. We introduce four more cases: the projective case over a finite field, the octahedral graph case, the perfect matching case, and the particle attraction case. We show that, as in the neural network case, all the critical points observed have non-trivial symmetry. Finally, we introduce a new measure of symmetry in the system and show that it reveals additional symmetry structures not captured by the previous measure.
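Illustration: the permutation invariance mentioned in the abstract can be sketched numerically. The snippet below is a minimal sketch under stated assumptions, not the paper's setup: it assumes a two-layer ReLU student with a k-by-k weight matrix W, an identity-weight teacher, and a finite sample closed under coordinate permutations so that the invariance holds exactly. The names make_symmetric_sample, loss, and isotropy_size are hypothetical, and counting the permutation pairs that fix W is only one simple way to quantify the symmetry of a point; it may differ from the measures used in the paper.

```python
import itertools

import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def make_symmetric_sample(k, n, seed=0):
    """Draw n Gaussian points in R^k, then add every coordinate permutation of each.

    Assumption for this sketch: closing the sample under coordinate
    permutations makes column-permutation invariance of the loss exact.
    """
    rng = np.random.default_rng(seed)
    base = rng.standard_normal((n, k))
    perms = itertools.permutations(range(k))
    return np.concatenate([base[:, list(p)] for p in perms], axis=0)


def loss(W, X):
    """Squared loss between a ReLU student with weights W and an identity teacher."""
    student = relu(X @ W.T).sum(axis=1)
    teacher = relu(X).sum(axis=1)  # teacher weight matrix is the identity
    return 0.5 * np.mean((student - teacher) ** 2)


def isotropy_size(W, tol=1e-9):
    """Count pairs (row permutation, column permutation) that leave W unchanged."""
    k = W.shape[0]
    perms = [list(p) for p in itertools.permutations(range(k))]
    count = 0
    for r in perms:
        for c in perms:
            if np.allclose(W[np.ix_(r, c)], W, rtol=0.0, atol=tol):
                count += 1
    return count
```

With these definitions, loss(W[np.ix_(r, c)], X) should agree with loss(W, X) up to floating-point rounding for every pair of permutations r and c, while isotropy_size separates a generic W (fixed only by the identity pair) from a structured one such as the identity matrix (fixed by every pair with r equal to c).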

Related papers

Generalized Linear Mode Connectivity for Transformers [87.32299363530996]
A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-loss paths. Prior work has predominantly focused on neuron re-ordering through permutations, but such approaches are limited in scope. We introduce a unified framework that captures four symmetry classes: permutations, semi-permutations, transformations, and general invertible maps. This generalization enables, for the first time, the discovery of low- and zero-barrier linear paths between independently trained Vision Transformers and GPT-2 models.
arXiv Detail & Related papers (2025-06-28T01:46:36Z)
Translation symmetry restoration in integrable systems: the noninteracting case [0.16385815610837165]
We study translation symmetry restoration in integrable systems. In particular, we consider non-interacting spinless fermions on the lattice prepared in non-equilibrium states invariant under $\nu>1$ lattice shifts. We show that, differently from random unitary circuits where symmetry restoration occurs abruptly for times proportional to the subsystem size, here symmetry is restored smoothly and over timescales of the order of the subsystem size squared.
arXiv Detail & Related papers (2025-06-17T14:11:31Z)
Exceptional Points and Stability in Nonlinear Models of Population Dynamics having $\mathcal{PT}$ symmetry [49.1574468325115]
We analyze models governed by the replicator equation of evolutionary game theory and related Lotka-Volterra systems of population dynamics. We study the emergence of exceptional points in two cases: (a) when the governing symmetry properties are tied to global properties of the models, and (b) when these symmetries emerge locally around stationary states.
arXiv Detail & Related papers (2024-11-19T02:15:59Z)
The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof [50.49582712378289]
We investigate the impact of neural parameter symmetries by introducing new neural network architectures. We develop two methods, with some provable guarantees, of modifying standard neural networks to reduce parameter space symmetries. Our experiments reveal several interesting observations on the empirical impact of parameter symmetries.
arXiv Detail & Related papers (2024-05-30T16:32:31Z)
Entanglement asymmetry in CFT and its relation to non-topological defects [0.0]
The entanglement asymmetry is an information based observable that quantifies the degree of symmetry breaking in a region of an extended quantum system. We investigate this measure in the ground state of one dimensional critical systems described by a CFT.
arXiv Detail & Related papers (2024-02-05T19:01:09Z)
Lie Point Symmetry and Physics Informed Networks [59.56218517113066]
We propose a loss function that informs the network about Lie point symmetries in the same way that PINN models try to enforce the underlying PDE through a loss function. Our symmetry loss ensures that the infinitesimal generators of the Lie group conserve the PDE solutions. Empirical evaluations indicate that the inductive bias introduced by the Lie point symmetries of the PDEs greatly boosts the sample efficiency of PINNs.
arXiv Detail & Related papers (2023-11-07T19:07:16Z)
Symmetry Induces Structure and Constraint of Learning [0.0]
We unveil the importance of the loss function symmetries in affecting, if not deciding, the learning behavior of machine learning models. Common instances of mirror symmetries in deep learning include rescaling, rotation, and permutation symmetry. We show that the theoretical framework can explain intriguing phenomena, such as the loss of plasticity and various collapse phenomena in neural networks.
arXiv Detail & Related papers (2023-09-29T02:21:31Z)
Emergence of non-Abelian SU(2) invariance in Abelian frustrated fermionic ladders [37.69303106863453]
We consider a system of interacting spinless fermions on a two-leg triangular ladder with $\pi/2$ magnetic flux per triangular plaquette. Microscopically, the system exhibits a U(1) symmetry corresponding to the conservation of total fermionic charge, and a discrete $\mathbb{Z}$ symmetry. At the intersection of the three phases, the system features a critical point with an emergent SU(2) symmetry.
arXiv Detail & Related papers (2023-05-11T15:57:27Z)
Annihilation of Spurious Minima in Two-Layer ReLU Networks [9.695960412426672]
We study the optimization problem associated with fitting two-layer ReLU neural networks with respect to the squared loss. We show that adding neurons can turn symmetric spurious minima into saddles. We also prove the existence of descent directions in certain subspaces arising from the symmetry structure of the loss function.
arXiv Detail & Related papers (2022-10-12T11:04:21Z)
On Convergence of Training Loss Without Reaching Stationary Points [62.41370821014218]
We show that neural network weight variables do not converge to stationary points where the gradient of the loss function vanishes. We propose a new perspective based on the ergodic theory of dynamical systems.
arXiv Detail & Related papers (2021-10-12T18:12:23Z)
Noether: The More Things Change, the More Stay the Same [1.14219428942199]
Noether's celebrated theorem states that symmetry leads to conserved quantities. In the realm of neural networks under gradient descent, model symmetries imply restrictions on the gradient path. Symmetry can be thought of as one further important tool in understanding the performance of neural networks under gradient descent.
arXiv Detail & Related papers (2021-04-12T14:41:05Z)
Symmetry Breaking in Symmetric Tensor Decomposition [44.181747424363245]
We consider the nonsymmetry problem associated with computing the rank decomposition of symmetric tensors. We show that critical points of the loss function are detected by standard methods.
arXiv Detail & Related papers (2021-03-10T18:11:22Z)
From Symmetry to Geometry: Tractable Nonconvex Problems [20.051126124841076]
We discuss the role of curvature in the landscape and the different roles of symmetries. This area is rich with observed phenomena and open problems; we close with directions for future research.
arXiv Detail & Related papers (2020-07-14T01:19:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.