On the Symmetries of Deep Learning Models and their Internal
Representations
- URL: http://arxiv.org/abs/2205.14258v5
- Date: Fri, 24 Mar 2023 17:25:53 GMT
- Title: On the Symmetries of Deep Learning Models and their Internal
Representations
- Authors: Charles Godfrey, Davis Brown, Tegan Emerson, Henry Kvinge
- Abstract summary: We seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family's internal representation of data.
Our work suggests that the symmetries of a network are propagated into the symmetries of that network's representation of data.
- Score: 1.418465438044804
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Symmetry is a fundamental tool in the exploration of a broad range of complex
systems. In machine learning, symmetry has been explored in both models and
data. In this paper we seek to connect the symmetries arising from the
architecture of a family of models with the symmetries of that family's
internal representation of data. We do this by calculating a set of fundamental
symmetry groups, which we call the intertwiner groups of the model. We connect
intertwiner groups to a model's internal representations of data through a
range of experiments that probe similarities between hidden states across
models with the same architecture. Our work suggests that the symmetries of a
network are propagated into the symmetries of that network's representation of
data, providing us with a better understanding of how architecture affects the
learning and prediction process. Finally, we speculate that for ReLU networks,
the intertwiner groups may provide a justification for the common practice of
concentrating model interpretability exploration on the activation basis in
hidden layers rather than arbitrary linear combinations thereof.
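A minimal NumPy sketch (not the authors' code) of the kind of symmetry the abstract refers to: for a ReLU network, composing a permutation of hidden units with positive per-unit rescaling, and undoing it in the next layer, commutes with the activation and therefore leaves the network function unchanged. The toy two-layer MLP, its dimensions, and the random weights below are illustrative assumptions, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Toy two-layer MLP: f(x) = W2 @ relu(W1 @ x + b1) + b2  (illustrative sizes).
d_in, d_hidden, d_out = 4, 8, 3
W1, b1 = rng.normal(size=(d_hidden, d_in)), rng.normal(size=d_hidden)
W2, b2 = rng.normal(size=(d_out, d_hidden)), rng.normal(size=d_out)

def f(x, W1, b1, W2, b2):
    return W2 @ relu(W1 @ x + b1) + b2

# An intertwiner-type transformation for ReLU: permutation P composed with a
# positive diagonal scaling D.  For A = D @ P we have relu(A @ z) = A @ relu(z).
perm = rng.permutation(d_hidden)
P = np.eye(d_hidden)[perm]                          # permutation matrix
D = np.diag(rng.uniform(0.5, 2.0, size=d_hidden))   # positive scalings
A = D @ P

# Push A into the first layer and its inverse into the second layer.
W1_t, b1_t = A @ W1, A @ b1
W2_t = W2 @ np.linalg.inv(A)

x = rng.normal(size=d_in)
out_original = f(x, W1, b1, W2, b2)
out_transformed = f(x, W1_t, b1_t, W2_t, b2)
print(np.allclose(out_original, out_transformed))   # True: same function

# The hidden representations of the two (functionally identical) networks differ
# exactly by A: the architectural symmetry shows up in the representation of data.
h = relu(W1 @ x + b1)
h_t = relu(W1_t @ x + b1_t)
print(np.allclose(h_t, A @ h))                      # True
```

Because such transformations only permute and positively rescale individual hidden units, they preserve the coordinate axes of the hidden layer, which is one way to read the abstract's remark about concentrating interpretability work on the activation basis rather than arbitrary linear combinations.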
Related papers
- Group Crosscoders for Mechanistic Analysis of Symmetry [0.0]
Group crosscoders systematically discover and analyse symmetrical features in neural networks.
We show that group crosscoders can provide systematic insights into how neural networks represent symmetry.
arXiv Detail & Related papers (2024-10-31T17:47:01Z)
- Symmetry From Scratch: Group Equivariance as a Supervised Learning Task [1.8570740863168362]
In machine learning datasets with symmetries, the paradigm for backward compatibility with symmetry-breaking has been to relax equivariant architectural constraints.
We introduce symmetry-cloning, a method for inducing equivariance in machine learning models.
arXiv Detail & Related papers (2024-10-05T00:44:09Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs)
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards practical use of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- A Generative Model of Symmetry Transformations [44.87295754993983]
We build a generative model that explicitly aims to capture the data's approximate symmetries.
We empirically demonstrate its ability to capture symmetries under affine and color transformations.
arXiv Detail & Related papers (2024-03-04T11:32:18Z)
- A Unified Framework to Enforce, Discover, and Promote Symmetry in Machine Learning [5.1105250336911405]
We provide a unifying theoretical and methodological framework for incorporating symmetry into machine learning models.
We show that enforcing and discovering symmetry are linear-algebraic tasks that are dual with respect to the bilinear structure of the Lie derivative.
We propose a novel way to promote symmetry by introducing a class of convex regularization functions based on the Lie derivative and nuclear norm relaxation.
arXiv Detail & Related papers (2023-11-01T01:19:54Z)
- Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles [55.41644538483948]
We design a deep-learning algorithm for the discovery and identification of the continuous group of symmetries present in a labeled dataset.
We use fully connected neural networks to model the symmetry transformations and the corresponding generators.
Our study also opens the door for using a machine learning approach in the mathematical study of Lie groups and their properties.
arXiv Detail & Related papers (2023-01-13T16:25:25Z)
- Towards a mathematical understanding of learning from few examples with nonlinear feature maps [68.8204255655161]
We consider the problem of data classification where the training set consists of just a few data points.
We reveal key relationships between the geometry of an AI model's feature space, the structure of the underlying data distributions, and the model's generalisation capabilities.
arXiv Detail & Related papers (2022-11-07T14:52:58Z)
- The Geometry of Self-supervised Learning Models and its Impact on Transfer Learning [62.601681746034956]
Self-supervised learning (SSL) has emerged as a desirable paradigm in computer vision.
We propose a data-driven geometric strategy to analyze different SSL models using local neighborhoods in the feature space induced by each.
arXiv Detail & Related papers (2022-09-18T18:15:38Z)
- Equivariant Representation Learning via Class-Pose Decomposition [17.032782230538388]
We introduce a general method for learning representations that are equivariant to symmetries of data.
The representation's components semantically correspond to intrinsic data classes and poses, respectively.
Results show that our representations capture the geometry of data and outperform other equivariant representation learning frameworks.
arXiv Detail & Related papers (2022-07-07T06:55:52Z)
- Linear Connectivity Reveals Generalization Strategies [54.947772002394736]
Some pairs of finetuned models have large barriers of increasing loss on the linear paths between them.
We find distinct clusters of models which are linearly connected on the test loss surface, but are disconnected from models outside the cluster.
Our work demonstrates how the geometry of the loss surface can guide models towards different functions.
arXiv Detail & Related papers (2022-05-24T23:43:02Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.