On the Symmetries of Deep Learning Models and their Internal
Representations
- URL: http://arxiv.org/abs/2205.14258v5
- Date: Fri, 24 Mar 2023 17:25:53 GMT
- Title: On the Symmetries of Deep Learning Models and their Internal
Representations
- Authors: Charles Godfrey, Davis Brown, Tegan Emerson, Henry Kvinge
- Abstract summary: We seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family's internal representation of data.
Our work suggests that the symmetries of a network are propagated into the symmetries of that network's representation of data.
- Score: 1.418465438044804
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Symmetry is a fundamental tool in the exploration of a broad range of complex
systems. In machine learning, symmetry has been explored in both models and
data. In this paper we seek to connect the symmetries arising from the
architecture of a family of models with the symmetries of that family's
internal representation of data. We do this by calculating a set of fundamental
symmetry groups, which we call the intertwiner groups of the model. We connect
intertwiner groups to a model's internal representations of data through a
range of experiments that probe similarities between hidden states across
models with the same architecture. Our work suggests that the symmetries of a
network are propagated into the symmetries of that network's representation of
data, providing us with a better understanding of how architecture affects the
learning and prediction process. Finally, we speculate that for ReLU networks,
the intertwiner groups may provide a justification for the common practice of
concentrating model interpretability exploration on the activation basis in
hidden layers rather than arbitrary linear combinations thereof.
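A minimal NumPy sketch (not the authors' code) of the kind of symmetry the abstract refers to: for a ReLU network, composing a permutation of hidden units with positive per-unit rescaling, and undoing it in the next layer, commutes with the activation and therefore leaves the network function unchanged. The toy two-layer MLP, its dimensions, and the random weights below are illustrative assumptions, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Toy two-layer MLP: f(x) = W2 @ relu(W1 @ x + b1) + b2  (illustrative sizes).
d_in, d_hidden, d_out = 4, 8, 3
W1, b1 = rng.normal(size=(d_hidden, d_in)), rng.normal(size=d_hidden)
W2, b2 = rng.normal(size=(d_out, d_hidden)), rng.normal(size=d_out)

def f(x, W1, b1, W2, b2):
    return W2 @ relu(W1 @ x + b1) + b2

# An intertwiner-type transformation for ReLU: permutation P composed with a
# positive diagonal scaling D.  For A = D @ P we have relu(A @ z) = A @ relu(z).
perm = rng.permutation(d_hidden)
P = np.eye(d_hidden)[perm]                          # permutation matrix
D = np.diag(rng.uniform(0.5, 2.0, size=d_hidden))   # positive scalings
A = D @ P

# Push A into the first layer and its inverse into the second layer.
W1_t, b1_t = A @ W1, A @ b1
W2_t = W2 @ np.linalg.inv(A)

x = rng.normal(size=d_in)
out_original = f(x, W1, b1, W2, b2)
out_transformed = f(x, W1_t, b1_t, W2_t, b2)
print(np.allclose(out_original, out_transformed))   # True: same function

# The hidden representations of the two (functionally identical) networks differ
# exactly by A: the architectural symmetry shows up in the representation of data.
h = relu(W1 @ x + b1)
h_t = relu(W1_t @ x + b1_t)
print(np.allclose(h_t, A @ h))                      # True
```

Because such transformations only permute and positively rescale individual hidden units, they preserve the coordinate axes of the hidden layer, which is one way to read the abstract's remark about concentrating interpretability work on the activation basis rather than arbitrary linear combinations.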
Related papers
- Group Crosscoders for Mechanistic Analysis of Symmetry [0.0]
Group crosscoders systematically discover and analyse symmetrical features in neural networks.
We show that group crosscoders can provide systematic insights into how neural networks represent symmetry.
arXiv Detail & Related papers (2024-10-31T17:47:01Z)
- Symmetry From Scratch: Group Equivariance as a Supervised Learning Task [1.8570740863168362]
In machine learning datasets with symmetries, the paradigm for backward compatibility with symmetry-breaking has been to relax equivariant architectural constraints.
We introduce symmetry-cloning, a method for inducing equivariance in machine learning models.
arXiv Detail & Related papers (2024-10-05T00:44:09Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs)
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards practical use of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- A Generative Model of Symmetry Transformations [44.87295754993983]
We build a generative model that explicitly aims to capture the data's approximate symmetries.
We empirically demonstrate its ability to capture symmetries under affine and color transformations.
arXiv Detail & Related papers (2024-03-04T11:32:18Z)
- A Unified Framework to Enforce, Discover, and Promote Symmetry in Machine Learning [5.1105250336911405]
We provide a unifying theoretical and methodological framework for incorporating symmetry into machine learning models.
We show that enforcing and discovering symmetry are linear-algebraic tasks that are dual with respect to the bilinear structure of the Lie derivative.
We propose a novel way to promote symmetry by introducing a class of convex regularization functions based on the Lie derivative and nuclear norm relaxation.
arXiv Detail & Related papers (2023-11-01T01:19:54Z)
- Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles [55.41644538483948]
We design a deep-learning algorithm for the discovery and identification of the continuous group of symmetries present in a labeled dataset.
We use fully connected neural networks to model the symmetry transformations and the corresponding generators.
Our study also opens the door for using a machine learning approach in the mathematical study of Lie groups and their properties.
arXiv Detail & Related papers (2023-01-13T16:25:25Z)
- Towards a mathematical understanding of learning from few examples with nonlinear feature maps [68.8204255655161]
We consider the problem of data classification where the training set consists of just a few data points.
We reveal key relationships between the geometry of an AI model's feature space, the structure of the underlying data distributions, and the model's generalisation capabilities.
arXiv Detail & Related papers (2022-11-07T14:52:58Z)
- The Geometry of Self-supervised Learning Models and its Impact on Transfer Learning [62.601681746034956]
Self-supervised learning (SSL) has emerged as a desirable paradigm in computer vision.
We propose a data-driven geometric strategy to analyze different SSL models using local neighborhoods in the feature space induced by each.
arXiv Detail & Related papers (2022-09-18T18:15:38Z)
- Equivariant Representation Learning via Class-Pose Decomposition [17.032782230538388]
We introduce a general method for learning representations that are equivariant to symmetries of data.
The representation's components semantically correspond to intrinsic data classes and poses, respectively.
Results show that our representations capture the geometry of data and outperform other equivariant representation learning frameworks.
arXiv Detail & Related papers (2022-07-07T06:55:52Z)
- Linear Connectivity Reveals Generalization Strategies [54.947772002394736]
Some pairs of finetuned models have large barriers of increasing loss on the linear paths between them.
We find distinct clusters of models which are linearly connected on the test loss surface, but are disconnected from models outside the cluster.
Our work demonstrates how the geometry of the loss surface can guide models towards different functions.
arXiv Detail & Related papers (2022-05-24T23:43:02Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.