Loss Landscape Geometry and the Learning of Symmetries: Or, What Influence Functions Reveal About Robust Generalization
- URL: http://arxiv.org/abs/2601.20172v1
- Date: Wed, 28 Jan 2026 02:14:01 GMT
- Title: Loss Landscape Geometry and the Learning of Symmetries: Or, What Influence Functions Reveal About Robust Generalization
- Authors: James Amarel, Robyn Miller, Nicolas Hengartner, Benjamin Migliori, Emily Casleton, Alexei Skurikhin, Earl Lawrence, Gerd J. Kunde
- Abstract summary: We study how neural emulators internalize physical symmetries by introducing an influence-based diagnostic. This quantity probes the local geometry of the learned loss landscape. We show that orbit-wise gradient coherence provides the mechanism for learning to generalize over symmetry transformations.
- Score: 0.14201057456467273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study how neural emulators of partial differential equation solution operators internalize physical symmetries by introducing an influence-based diagnostic that measures the propagation of parameter updates between symmetry-related states, defined as the metric-weighted overlap of loss gradients evaluated along group orbits. This quantity probes the local geometry of the learned loss landscape and goes beyond forward-pass equivariance tests by directly assessing whether learning dynamics couple physically equivalent configurations. Applying our diagnostic to autoregressive fluid flow emulators, we show that orbit-wise gradient coherence provides the mechanism for learning to generalize over symmetry transformations and indicates when training selects a symmetry-compatible basin. The result is a novel technique for evaluating whether surrogate models have internalized symmetry properties of the known solution operator.
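In the simplest (Euclidean-metric) case, the diagnostic reduces to a normalized overlap of parameter gradients evaluated at a state and at its image under the group action. The following is a minimal sketch, not the authors' implementation, assuming a PyTorch model and a user-supplied group action `apply_symmetry` (a hypothetical placeholder):

```python
# Minimal sketch of an orbit-wise gradient coherence diagnostic.
# Assumptions: `model` is a torch.nn.Module, `loss_fn` a differentiable
# loss, and `apply_symmetry` a hypothetical callable implementing the
# group action on an (input, target) pair. The paper's metric-weighted
# overlap would insert a positive-definite metric; plain Euclidean
# weighting is assumed here.
import torch

def flat_grad(model, loss_fn, x, y):
    """Loss gradient w.r.t. all parameters, flattened into one vector."""
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def orbit_gradient_coherence(model, loss_fn, x, y, apply_symmetry):
    """Normalized overlap of gradients at a state and its symmetry image.

    Values near 1 mean updates at x and g.x reinforce each other
    (a symmetry-compatible basin); values near 0 mean decoupled learning.
    """
    g_x = flat_grad(model, loss_fn, x, y)
    x_g, y_g = apply_symmetry(x, y)   # symmetry-related configuration
    g_gx = flat_grad(model, loss_fn, x_g, y_g)
    return torch.dot(g_x, g_gx) / (g_x.norm() * g_gx.norm() + 1e-12)
```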
Related papers
- Implicit bias as a Gauge correction: Theory and Inverse Design [2.9379512315137117]
A central problem in machine learning theory is to characterize how learning dynamics select particular solutions compatible with the training objective.
We identify a general mechanism, in terms of an explicit correction of the learning dynamics, for the emergence of implicit biases.
We compute the resulting induced bias for a range of dynamics, showing how several well-known results fit into a single unified framework.
arXiv Detail & Related papers (2026-01-10T15:33:09Z)
- Generalized Linear Mode Connectivity for Transformers [87.32299363530996]
A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-loss paths.
Prior work has predominantly focused on neuron re-ordering through permutations, but such approaches are limited in scope.
We introduce a unified framework that captures four symmetry classes: permutations, semi-permutations, transformations, and general invertible maps.
This generalization enables, for the first time, the discovery of low- and zero-barrier linear paths between independently trained Vision Transformers and GPT-2 models (a toy barrier computation is sketched after the link below).
arXiv Detail & Related papers (2025-06-28T01:46:36Z)
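For concreteness, here is a hedged sketch, not the paper's code, of the basic LMC quantity: the loss barrier along the straight line between two weight vectors. It assumes the symmetry-alignment step (matching the models via permutations or more general invertible maps) has already been applied, and that both models share an architecture:

```python
# Toy loss-barrier computation for linear mode connectivity (LMC).
# Assumes model_a and model_b share an architecture and have already
# been aligned; batch-norm buffers are ignored in this sketch.
import copy
import torch

def linear_path_barrier(model_a, model_b, loss_fn, loader, steps=11):
    """Peak loss on the linear path minus the mean of the endpoint losses."""
    interp = copy.deepcopy(model_a)
    losses = []
    for t in torch.linspace(0.0, 1.0, steps):
        with torch.no_grad():
            for p, pa, pb in zip(interp.parameters(),
                                 model_a.parameters(),
                                 model_b.parameters()):
                p.copy_((1 - t) * pa + t * pb)
            total, count = 0.0, 0
            for x, y in loader:
                total += loss_fn(interp(x), y).item() * len(x)
                count += len(x)
        losses.append(total / count)
    # A barrier near zero indicates a low-loss linear path between models.
    return max(losses) - 0.5 * (losses[0] + losses[-1])
```
- SymmetryLens: Unsupervised Symmetry Learning via Locality and Density Preservation [0.0]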
We develop a new unsupervised symmetry learning method that starts with raw data and provides the minimal generator of an underlying Lie group of symmetries.
The method is able to learn the pixel translation operator from a dataset with only an approximate translation symmetry.
We demonstrate that this coupling between symmetry and locality, together with an optimization technique developed for entropy estimation, results in a stable system.
arXiv Detail & Related papers (2024-10-07T17:40:51Z)
- The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof [50.49582712378289]
We investigate the impact of neural parameter symmetries by introducing new neural network architectures.
We develop two methods, with some provable guarantees, for modifying standard neural networks to reduce parameter-space symmetries.
Our experiments reveal several interesting observations on the empirical impact of parameter symmetries.
arXiv Detail & Related papers (2024-05-30T16:32:31Z)
- Lie Point Symmetry and Physics Informed Networks [59.56218517113066]
We propose a loss function that informs the network about Lie point symmetries in the same way that PINN models try to enforce the underlying PDE through a loss function.
Our symmetry loss ensures that the infinitesimal generators of the Lie group conserve the PDE solutions.
Empirical evaluations indicate that the inductive bias introduced by the Lie point symmetries of the PDEs greatly boosts the sample efficiency of PINNs (a simplified sketch of such a symmetry term follows the link below).
arXiv Detail & Related papers (2023-11-07T19:07:16Z)
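As an illustration of the idea (a simplified finite-transformation variant, not the paper's infinitesimal-generator formulation), consider a PINN for the 1D heat equation u_t = u_xx, whose space-translation symmetry maps any solution u(x, t) to another solution u(x - c, t):

```python
# Hedged sketch: augmenting a PINN loss with a symmetry term for the
# 1D heat equation u_t = u_xx. If u solves the PDE, so does the shifted
# field u(x - c, t); the symmetry term penalizes its residual too.
# This finite-shift version stands in for the paper's treatment via
# infinitesimal Lie generators.
import torch

def heat_residual(u_net, x, t):
    """PDE residual u_t - u_xx at collocation points (x, t)."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = u_net(torch.stack([x, t], dim=-1)).squeeze(-1)
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

def pinn_loss(u_net, x, t, shift=0.1, lam=1.0):
    # Residual of the network itself plus residual of its translate.
    shifted = lambda z: u_net(
        torch.stack([z[..., 0] - shift, z[..., 1]], dim=-1))
    data_term = heat_residual(u_net, x, t).pow(2).mean()
    sym_term = heat_residual(shifted, x, t).pow(2).mean()
    return data_term + lam * sym_term
```
- Learning Layer-wise Equivariances Automatically using Gradients [66.81218780702125]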
Convolutions encode equivariance symmetries into neural networks, leading to better generalisation performance.
However, such symmetries provide fixed hard constraints on the functions a network can represent; they need to be specified in advance and cannot be adapted.
Our goal is to allow flexible symmetry constraints that can automatically be learned from data using gradients.
arXiv Detail & Related papers (2023-10-09T20:22:43Z)
- Symmetry Induces Structure and Constraint of Learning [0.0]
We unveil the importance of loss function symmetries in affecting, if not deciding, the learning behavior of machine learning models.
Common instances of mirror symmetries in deep learning include rescaling, rotation, and permutation symmetry.
We show that the theoretical framework can explain intriguing phenomena, such as the loss of plasticity and various collapse phenomena in neural networks.
arXiv Detail & Related papers (2023-09-29T02:21:31Z)
- On discrete symmetries of robotics systems: A group-theoretic and data-driven analysis [38.92081817503126]
We study discrete morphological symmetries of dynamical systems.
These symmetries arise from the presence of one or more planes/axes of symmetry in the system's morphology.
We exploit these symmetries using data augmentation and $G$-equivariant neural networks (a minimal augmentation sketch follows the link below).
arXiv Detail & Related papers (2023-02-21T04:10:16Z)
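A minimal sketch of the data-augmentation route, with entirely hypothetical joint orderings and sign conventions standing in for a real robot's reflection symmetry:

```python
# Hedged sketch: exploiting a discrete morphological (reflection) symmetry
# by data augmentation. JOINT_PERM and SIGN_FLIP are hypothetical: they
# stand in for the left/right leg swap and lateral-angle sign flips of a
# robot with a sagittal plane of symmetry.
import numpy as np

JOINT_PERM = np.array([3, 4, 5, 0, 1, 2])    # swap left and right legs
SIGN_FLIP = np.array([1, -1, 1, 1, -1, 1])   # flip lateral joint angles

def reflect(q):
    """Apply the discrete reflection g to a joint-space vector q."""
    return SIGN_FLIP * q[JOINT_PERM]

def augment(states, actions):
    """Return the dataset together with its symmetry-transformed copy."""
    states_g = np.apply_along_axis(reflect, -1, states)
    actions_g = np.apply_along_axis(reflect, -1, actions)
    return (np.concatenate([states, states_g]),
            np.concatenate([actions, actions_g]))
```
- Oracle-Preserving Latent Flows [58.720142291102135]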
We develop a methodology for the simultaneous discovery of multiple nontrivial continuous symmetries across an entire labelled dataset.
The symmetry transformations and the corresponding generators are modeled with fully connected neural networks trained with a specially constructed loss function.
The two new elements in this work are the use of a reduced-dimensionality latent space and the generalization to transformations that leave high-dimensional oracles invariant (a toy generator-learning sketch follows the link below).
arXiv Detail & Related papers (2023-02-02T00:13:32Z)
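To make the idea concrete, here is a hedged toy sketch (a linear generator in data space rather than the paper's fully connected networks and latent space) that learns a matrix G whose infinitesimal action preserves a scalar oracle phi:

```python
# Toy sketch of oracle-preserving symmetry discovery: learn a linear
# generator G such that x -> x + eps * G x leaves the oracle phi(x)
# unchanged. A linear generator is assumed here for brevity.
import torch

def generator_loss(G, phi, x, eps=1e-2):
    x_t = x + eps * x @ G.T                   # infinitesimal action of G
    invariance = (phi(x_t) - phi(x)).pow(2).mean()
    nontrivial = (G.norm() - 1.0).pow(2)      # exclude the trivial G = 0
    return invariance + nontrivial

# Usage: recover the rotation generator for the oracle phi(x) = |x|^2.
G = (0.1 * torch.randn(2, 2)).requires_grad_(True)
phi = lambda x: (x ** 2).sum(dim=-1)
opt = torch.optim.Adam([G], lr=1e-2)
for _ in range(2000):
    loss = generator_loss(G, phi, torch.randn(256, 2))
    opt.zero_grad(); loss.backward(); opt.step()
# Up to scale and sign, G approaches the antisymmetric rotation
# generator [[0, -1], [1, 0]].
```
- Symmetry Breaking in Symmetric Tensor Decomposition [44.181747424363245]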
We consider the nonconvex optimization problem associated with computing the rank decomposition of symmetric tensors.
We show how critical points of the loss function are detected by standard optimization methods.
arXiv Detail & Related papers (2021-03-10T18:11:22Z)
- Inverse Learning of Symmetries [71.62109774068064]
We learn the symmetry transformation with a model consisting of two latent subspaces.
Our approach is based on the deep information bottleneck in combination with a continuous mutual information regulariser.
Our model outperforms state-of-the-art methods on artificial and molecular datasets.
arXiv Detail & Related papers (2020-02-07T13:48:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.