Symmetry-Based Structured Matrices for Efficient Approximately Equivariant Networks
- URL: http://arxiv.org/abs/2409.11772v2
- Date: Fri, 18 Apr 2025 05:56:28 GMT
- Title: Symmetry-Based Structured Matrices for Efficient Approximately Equivariant Networks
- Authors: Ashwin Samudre, Mircea Petrache, Brian D. Nord, Shubhendu Trivedi
- Abstract summary: Group Matrices (GMs) are a forgotten precursor to the modern notion of regular representations of finite groups. We show that GMs can generalize classical LDR theory to general discrete groups. Our framework performs competitively with approximately equivariant NNs and other structured matrix-based methods.
- Score: 5.187307904567701
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been much recent interest in designing neural networks (NNs) with relaxed equivariance, which interpolate between exact equivariance and full flexibility for consistent performance gains. In a separate line of work, structured parameter matrices with low displacement rank (LDR) -- which permit fast function and gradient evaluation -- have been used to create compact NNs, though primarily benefiting classical convolutional neural networks (CNNs). In this work, we propose a framework based on symmetry-based structured matrices to build approximately equivariant NNs with fewer parameters. Our approach unifies the aforementioned areas using Group Matrices (GMs), a forgotten precursor to the modern notion of regular representations of finite groups. GMs allow the design of structured matrices similar to LDR matrices, which can generalize all the elementary operations of a CNN from cyclic groups to arbitrary finite groups. We show GMs can also generalize classical LDR theory to general discrete groups, enabling a natural formalism for approximate equivariance. We test GM-based architectures on various tasks with relaxed symmetry and find that our framework performs competitively with approximately equivariant NNs and other structured matrix-based methods, often with one to two orders of magnitude fewer parameters.
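To make the group-matrix construction concrete, here is a minimal numpy sketch (an illustration of the idea, not the authors' implementation; function and variable names are mine). It builds M[i, j] = w[g_i^{-1} g_j] from a group's Cayley table; for the cyclic group Z_n this yields a circulant matrix, i.e., ordinary cyclic convolution, consistent with the abstract's claim that GMs generalize CNN operations from cyclic to arbitrary finite groups.

```python
# Minimal sketch: a group matrix M[i, j] = w[g_i^{-1} g_j] for a finite group
# specified by its Cayley table (cayley[i, j] = index of g_i * g_j).
import numpy as np

def group_matrix(weights, inverse, cayley):
    """weights: length-|G| parameter vector; inverse[i] = index of g_i^{-1}."""
    n = len(weights)
    M = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            M[i, j] = weights[cayley[inverse[i], j]]
    return M

# Cyclic group Z_4: g_i * g_j = g_{(i+j) mod 4}, g_i^{-1} = g_{(-i) mod 4}.
n = 4
cayley = (np.arange(n)[:, None] + np.arange(n)[None, :]) % n
inverse = (-np.arange(n)) % n
w = np.array([1.0, 2.0, 3.0, 4.0])
M = group_matrix(w, inverse, cayley)
print(M)  # M[i, j] = w[(j - i) mod 4]: a circulant, i.e., cyclic convolution
```

One natural route to approximate equivariance, in the spirit of the abstract, is to relax the exact weight tying across entries of M, e.g., by adding a small unconstrained perturbation; the paper's precise relaxation mechanism is not spelled out in the abstract.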
Related papers
- Learning Symmetries via Weight-Sharing with Doubly Stochastic Tensors [46.59269589647962]
Group equivariance has emerged as a valuable inductive bias in deep learning.
However, group equivariant methods require the groups of interest to be known beforehand.
We show that when the dataset exhibits strong symmetries, the learned permutation matrices converge to regular group representations (see the sketch below).
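Doubly stochastic matrices of this kind are commonly produced by Sinkhorn normalization; the numpy sketch below illustrates that ingredient (an assumption made for illustration; the paper's exact parameterization may differ).

```python
# Sinkhorn normalization: turn an arbitrary real matrix into an approximately
# doubly stochastic one (a "soft" permutation candidate).
import numpy as np

def sinkhorn(logits, n_iters=50):
    P = np.exp(logits - logits.max())  # positive matrix, numerically stable
    for _ in range(n_iters):
        P /= P.sum(axis=1, keepdims=True)  # normalize rows
        P /= P.sum(axis=0, keepdims=True)  # normalize columns
    return P

rng = np.random.default_rng(0)
P = sinkhorn(rng.normal(size=(4, 4)))
print(P.sum(axis=0), P.sum(axis=1))  # both close to all-ones
```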
arXiv Detail & Related papers (2024-12-05T20:15:34Z)
- Incorporating Arbitrary Matrix Group Equivariance into KANs [69.30866522377694]
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains.
However, spline functions may not respect the symmetries of a task, which are crucial prior knowledge in machine learning.
We propose Equivariant Kolmogorov-Arnold Networks (EKAN) to broaden their applicability to more fields.
arXiv Detail & Related papers (2024-10-01T06:34:58Z)
- Monomial Matrix Group Equivariant Neural Functional Networks [1.797555376258229]
We extend the study of group actions on network weights by incorporating scaling and sign-flipping symmetries.
We name our new family of neural functional networks (NFNs) the Monomial Matrix Group Equivariant Neural Functional Networks (Monomial-NFN).
arXiv Detail & Related papers (2024-09-18T04:36:05Z)
- Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry [63.694184882697435]
Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations.
arXiv Detail & Related papers (2024-07-15T07:11:44Z)
- Group and Shuffle: Efficient Structured Orthogonal Parametrization [3.540195249269228]
We introduce a new class of structured matrices, which unifies and generalizes structured classes from previous works.
We empirically validate our method on different domains, including the adaptation of text-to-image diffusion models and downstream fine-tuning in language modeling.
arXiv Detail & Related papers (2024-06-14T13:29:36Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards practical use of machine-learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Differentiable Learning of Generalized Structured Matrices for Efficient Deep Neural Networks [16.546708806547137]
This paper investigates efficient deep neural networks (DNNs) that replace dense unstructured weight matrices with structured ones possessing desired properties.
The challenge is that the optimal weight-matrix structure in popular neural network models is unclear in most cases and may vary from layer to layer even within the same network.
We propose a generalized and differentiable framework to learn efficient structures of weight matrices by gradient descent.
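As one simple member of such a structured family (illustrative only; the paper's generalized, differentiable framework is broader than this single class), a diagonal-plus-low-rank parameterization already gives fast matrix-vector products with end-to-end trainable parameters.

```python
# Illustrative structured parametrization: W = diag(d) + U V^T, stored and
# applied in O(n*r) instead of O(n^2); d, U, V are all trainable by gradients.
import numpy as np

n, r = 64, 4
rng = np.random.default_rng(1)
d = rng.normal(size=n)        # diagonal parameters
U = rng.normal(size=(n, r))   # low-rank factors
V = rng.normal(size=(n, r))

def matvec(x):
    # Fast product W @ x without ever materializing the dense n x n matrix.
    return d * x + U @ (V.T @ x)

x = rng.normal(size=n)
W = np.diag(d) + U @ V.T      # dense reference, for verification only
assert np.allclose(matvec(x), W @ x)
```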
arXiv Detail & Related papers (2023-10-29T03:07:30Z)
- Architectural Optimization over Subgroups for Equivariant Neural Networks [0.0]
We propose an equivariance relaxation morphism and a $[G]$-mixed equivariant layer that operate with equivariance constraints on a subgroup.
We present evolutionary and differentiable neural architecture search (NAS) algorithms that utilize these mechanisms respectively for equivariance-aware architectural optimization.
arXiv Detail & Related papers (2022-10-11T14:37:29Z)
- Implicit Bias of Linear Equivariant Networks [2.580765958706854]
Group equivariant convolutional neural networks (G-CNNs) are generalizations of convolutional neural networks (CNNs).
We show that $L$-layer full-width linear G-CNNs trained via gradient descent converge to solutions with low-rank Fourier matrix coefficients.
arXiv Detail & Related papers (2021-10-12T15:34:25Z)
- Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
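The frame-averaging recipe itself is compact enough to sketch. Below is a minimal numpy illustration (a toy example of mine, not the paper's code) that averages an arbitrary backbone over a sign-flip group, using the whole group as the frame; frame averaging proper picks an input-dependent subset F(x) of the group to keep the sum small, and F(x) = G is the simplest special case.

```python
# Averaging a backbone f over a group orbit makes it exactly invariant.
import numpy as np

def f(x):
    # Arbitrary, deliberately non-invariant backbone.
    return np.tanh(x @ np.arange(1, x.size + 1.0))

def frame_average(x, group):
    # Invariant wrapper: mean of f over the group orbit of x
    # (the special case where the frame F(x) is the whole group).
    return np.mean([f(g(x)) for g in group])

group = [lambda x: x, lambda x: -x]   # Z_2 acting by sign flip
x = np.random.default_rng(2).normal(size=5)
print(frame_average(x, group), frame_average(-x, group))  # equal: invariant
```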
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations (see the sketch below).
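A minimal numpy sketch of the atom-coefficient idea (shapes and names are my assumptions, not the paper's code): each kernel is a coefficient-weighted combination of a small shared dictionary of atoms, so the parameter count scales with the dictionary size rather than with the full kernel volume.

```python
# Kernels as linear combinations of shared "atoms":
# kernels[c] = sum_m coeffs[c, m] * atoms[m]
import numpy as np

num_atoms, k = 6, 3
num_kernels = 64
rng = np.random.default_rng(3)
atoms = rng.normal(size=(num_atoms, k, k))           # shared by all kernels
coeffs = rng.normal(size=(num_kernels, num_atoms))   # per-kernel coefficients

kernels = np.einsum('cm,mij->cij', coeffs, atoms)
print(kernels.shape)  # (64, 3, 3) from 6*9 + 64*6 = 438 params vs 64*9 = 576
```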
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)