Symmetry-Based Structured Matrices for Efficient Approximately Equivariant Networks
- URL: http://arxiv.org/abs/2409.11772v1
- Date: Wed, 18 Sep 2024 07:52:33 GMT
- Title: Symmetry-Based Structured Matrices for Efficient Approximately Equivariant Networks
- Authors: Ashwin Samudre, Mircea Petrache, Brian D. Nord, Shubhendu Trivedi
- Abstract summary: Group Matrices (GMs) are a forgotten precursor to the modern notion of regular representations of finite groups.
We show that GMs can be employed to extend all the elementary operations of CNNs to general discrete groups.
- Score: 5.187307904567701
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been much recent interest in designing symmetry-aware neural networks (NNs) exhibiting relaxed equivariance. Such NNs aim to interpolate between being exactly equivariant and being fully flexible, affording consistent performance benefits. In a separate line of work, certain structured parameter matrices -- those with displacement structure, characterized by low displacement rank (LDR) -- have been used to design small-footprint NNs. Displacement structure enables fast function and gradient evaluation, but permits accurate approximations via compression primarily of classical convolutional neural networks (CNNs). In this work, we propose a general framework -- based on a novel construction of symmetry-based structured matrices -- to build approximately equivariant NNs with significantly reduced parameter counts. Our framework integrates the two aforementioned lines of work via the use of so-called Group Matrices (GMs), a forgotten precursor to the modern notion of regular representations of finite groups. GMs allow the design of structured matrices -- resembling LDR matrices -- which generalize the linear operations of a classical CNN from cyclic groups to general finite groups and their homogeneous spaces. We show that GMs can be employed to extend all the elementary operations of CNNs to general discrete groups. Further, the theory of structured matrices based on GMs generalizes LDR theory, which has focused on matrices with cyclic structure, and provides a tool for implementing approximate equivariance for discrete groups. We test GM-based architectures on a variety of tasks in the presence of relaxed symmetry. We report that our framework consistently performs competitively with approximately equivariant NNs and other structured matrix-based compression frameworks, sometimes with one to two orders of magnitude fewer parameters.
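For intuition, here is a minimal NumPy sketch of the group-matrix pattern the abstract describes (the classical definition, not code from the paper; `group_matrix`, `compose`, `inverse`, and `weights` are illustrative names): an n x n matrix whose (i, j) entry depends only on g_i^{-1} g_j, so it is described by n free parameters, and a cyclic group recovers a circulant matrix, i.e. the weight matrix of a 1-D circular convolution.

```python
import numpy as np

def group_matrix(elements, compose, inverse, weights):
    """Build the group matrix M[i, j] = w(g_i^{-1} * g_j) for a finite group.

    `weights` maps each group element (by index) to a scalar parameter,
    so the n x n matrix carries only n free parameters.
    """
    n = len(elements)
    index = {g: k for k, g in enumerate(elements)}
    M = np.empty((n, n))
    for i, gi in enumerate(elements):
        for j, gj in enumerate(elements):
            M[i, j] = weights[index[compose(inverse(gi), gj)]]
    return M

# Cyclic group Z_4: the group matrix is exactly a circulant matrix,
# i.e. the weight matrix of a 1-D circular convolution.
n = 4
Zn = list(range(n))
w = np.random.randn(n)
M = group_matrix(Zn, compose=lambda a, b: (a + b) % n,
                 inverse=lambda a: (-a) % n, weights=w)
print(M)  # each row is a cyclic shift of the first row
```

Swapping in the multiplication table of any other finite group yields the structured, convolution-like matrices that the framework builds on.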
Related papers
- EKAN: Equivariant Kolmogorov-Arnold Networks [69.30866522377694]
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains.
However, spline functions may not respect task symmetries, which are crucial prior knowledge in machine learning.
We propose Equivariant Kolmogorov-Arnold Networks (EKAN) to broaden their applicability to more fields.
arXiv Detail & Related papers (2024-10-01T06:34:58Z)
- Monomial Matrix Group Equivariant Neural Functional Networks [1.797555376258229]
We extend the study of the group action on the network weights by incorporating scaling/sign-flipping symmetries.
We name our new family of NFNs the Monomial Matrix Group Equivariant Neural Functional Networks (Monomial-NFN).
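For intuition about the symmetry group in question: a monomial matrix is a permutation matrix with nonzero scalings on its entries, and for ReLU networks the action of positive-scaling monomial matrices on hidden units leaves the network function unchanged. A minimal sketch of this standard fact (not the paper's code; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((8, 3)), rng.standard_normal((2, 8))

def mlp(x, W1, W2):
    return W2 @ np.maximum(W1 @ x, 0.0)  # ReLU MLP, no biases for brevity

# Monomial matrix M = D P: a permutation P of hidden units combined with
# positive per-unit scales D. ReLU is positively homogeneous, so acting
# on the weights by (M W1, W2 M^{-1}) preserves the network function.
P = np.eye(8)[rng.permutation(8)]
D = np.diag(rng.uniform(0.5, 2.0, size=8))
M = D @ P

x = rng.standard_normal(3)
assert np.allclose(mlp(x, W1, W2), mlp(x, M @ W1, W2 @ np.linalg.inv(M)))
```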
arXiv Detail & Related papers (2024-09-18T04:36:05Z)
- Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry [63.694184882697435]
Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations.
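For context, GCP replaces global average pooling with a covariance descriptor of the features, normalized by a matrix function such as the matrix logarithm or square root. A minimal sketch assuming a flattened (positions, channels) feature matrix (illustrative, not the paper's implementation):

```python
import numpy as np

def gcp_log(features, eps=1e-5):
    """Global covariance pooling of an (N, d) feature matrix,
    followed by matrix-logarithm normalization."""
    X = features - features.mean(axis=0, keepdims=True)
    cov = (X.T @ X) / X.shape[0] + eps * np.eye(X.shape[1])  # SPD covariance
    vals, vecs = np.linalg.eigh(cov)              # eigendecomposition of an SPD matrix
    return vecs @ np.diag(np.log(vals)) @ vecs.T  # log-Euclidean descriptor

feats = np.random.randn(49, 16)   # e.g. 7x7 spatial positions, 16 channels
print(gcp_log(feats).shape)       # (16, 16) pooled second-order descriptor
```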
arXiv Detail & Related papers (2024-07-15T07:11:44Z)
- Group and Shuffle: Efficient Structured Orthogonal Parametrization [3.540195249269228]
We introduce a new class of structured matrices that unifies and generalizes the structured classes from previous works.
We empirically validate our method on different domains, including the adaptation of text-to-image diffusion models and downstream task fine-tuning in language modeling.
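The construction details live in the paper; as a loose illustration of how block-diagonal orthogonal factors interleaved with a permutation ("shuffle") yield a parameter-efficient orthogonal matrix, consider the following toy sketch (entirely an assumption-based illustration, not the paper's parametrization):

```python
import numpy as np

def block_diag_rotations(thetas):
    """Orthogonal block-diagonal matrix made of 2x2 rotation blocks."""
    n = 2 * len(thetas)
    B = np.zeros((n, n))
    for k, t in enumerate(thetas):
        c, s = np.cos(t), np.sin(t)
        B[2*k:2*k+2, 2*k:2*k+2] = [[c, -s], [s, c]]
    return B

def group_and_shuffle(thetas1, thetas2, perm):
    """Compose block rotations with a shuffle permutation: still orthogonal,
    but it mixes coordinates across blocks, with O(n) parameters
    instead of the O(n^2) of a dense orthogonal matrix."""
    P = np.eye(2 * len(thetas1))[perm]
    return block_diag_rotations(thetas1) @ P @ block_diag_rotations(thetas2)

rng = np.random.default_rng(1)
n = 8
Q = group_and_shuffle(rng.uniform(0, 2*np.pi, n // 2),
                      rng.uniform(0, 2*np.pi, n // 2),
                      rng.permutation(n))
print(np.allclose(Q @ Q.T, np.eye(n)))  # True: Q is orthogonal
```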
arXiv Detail & Related papers (2024-06-14T13:29:36Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Differentiable Learning of Generalized Structured Matrices for Efficient Deep Neural Networks [16.546708806547137]
This paper investigates efficient deep neural networks (DNNs) that replace dense unstructured weight matrices with structured ones possessing desired properties.
The challenge arises because the optimal weight matrix structure in popular neural network models is obscure in most cases and may vary from layer to layer even in the same network.
We propose a generalized and differentiable framework to learn efficient structures of weight matrices by gradient descent.
arXiv Detail & Related papers (2023-10-29T03:07:30Z)
- Implicit Bias of Linear Equivariant Networks [2.580765958706854]
Group equivariant convolutional neural networks (G-CNNs) are generalizations of convolutional neural networks (CNNs).
We show that $L$-layer full-width linear G-CNNs trained via gradient descent converge to solutions with low-rank Fourier matrix coefficients.
arXiv Detail & Related papers (2021-10-12T15:34:25Z)
- Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
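To make the averaging pattern concrete, here is a minimal sketch using the two-element sign-flip group (illustrative only: in FA proper the frame F(x) can be input-dependent and much smaller than the group, which is what makes the method efficient; `backbone` and `frame_average` are hypothetical names):

```python
import numpy as np

def backbone(x, W):
    return np.tanh(W @ x).sum()  # arbitrary, non-invariant backbone

def frame_average(x, W, frame):
    """Symmetrize the backbone by averaging its outputs over a frame of
    group actions. When the frame is the whole (small) group, this is
    plain group averaging; FA's point is that a well-chosen frame can be
    far smaller than the group while still giving exact invariance."""
    return np.mean([backbone(g(x), W) for g in frame])

rng = np.random.default_rng(2)
W, x = rng.standard_normal((4, 3)), rng.standard_normal(3)
frame = [lambda v: v, lambda v: -v]          # the sign-flip group {+1, -1}
f = frame_average(x, W, frame)
assert np.isclose(f, frame_average(-x, W, frame))  # exactly invariant
```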
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
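As the title indicates, ACDC decomposes convolutional kernels into shared atoms mixed by per-kernel coefficients; the sketch below illustrates the parameter-sharing arithmetic (shapes and names are illustrative assumptions, not the paper's code):

```python
import numpy as np

# Decompose C_out * C_in kernels of size k x k into `m` shared atoms:
# kernel[o, i] = sum_a coeff[o, i, a] * atom[a]. Parameter count drops
# from C_out*C_in*k*k to m*k*k + C_out*C_in*m when m is small.
c_out, c_in, k, m = 64, 32, 3, 6
rng = np.random.default_rng(3)
atoms = rng.standard_normal((m, k, k))            # shared dictionary of atoms
coeffs = rng.standard_normal((c_out, c_in, m))    # per-kernel mixing weights

kernels = np.einsum('oim,mkl->oikl', coeffs, atoms)
print(kernels.shape)  # (64, 32, 3, 3): full kernel bank from few parameters
```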
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
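To make the min-max training loop concrete, here is a toy sketch of simultaneous gradient descent-ascent on a simple saddle objective (purely illustrative: in the paper both players are neural networks and the objective arises from the SEM's linear operator equation):

```python
# Toy saddle problem min_theta max_phi f = theta*phi + 0.1*theta**2 - 0.1*phi**2,
# solved by simultaneous gradient descent (theta) / ascent (phi).
theta, phi, lr = 1.0, -1.0, 0.05
for _ in range(500):
    g_theta = phi + 0.2 * theta     # df/dtheta
    g_phi = theta - 0.2 * phi       # df/dphi
    theta -= lr * g_theta           # descent step for the min player
    phi += lr * g_phi               # ascent step for the max player
print(theta, phi)  # both converge toward the saddle point at (0, 0)
```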
arXiv Detail & Related papers (2020-07-02T17:55:47Z)