A Classification of $G$-invariant Shallow Neural Networks
- URL: http://arxiv.org/abs/2205.09219v1
- Date: Wed, 18 May 2022 21:18:16 GMT
- Title: A Classification of $G$-invariant Shallow Neural Networks
- Authors: Devanshu Agrawal, James Ostrowski
- Abstract summary: We prove a theorem that gives a classification of all $G$-invariant single-hidden-layer or "shallow" neural network ($G$-SNN) architectures with ReLU activation.
We enumerate the $G$-SNN architectures for some example groups $G$ and visualize their structure.
- Score: 1.4213973379473654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When trying to fit a deep neural network (DNN) to a $G$-invariant target
function with respect to a group $G$, it only makes sense to constrain the DNN
to be $G$-invariant as well. However, there can be many different ways to do
this, thus raising the problem of "$G$-invariant neural architecture design":
What is the optimal $G$-invariant architecture for a given problem? Before we
can consider the optimization problem itself, we must understand the search
space, the architectures in it, and how they relate to one another. In this
paper, we take a first step towards this goal; we prove a theorem that gives a
classification of all $G$-invariant single-hidden-layer or "shallow" neural
network ($G$-SNN) architectures with ReLU activation for any finite orthogonal
group $G$. The proof is based on a correspondence of every $G$-SNN to a signed
permutation representation of $G$ acting on the hidden neurons. The
classification is equivalently given in terms of the first cohomology classes
of $G$, thus admitting a topological interpretation. Based on a code
implementation, we enumerate the $G$-SNN architectures for some example groups
$G$ and visualize their structure. We draw the network morphisms between the
enumerated architectures that can be leveraged during neural architecture
search (NAS). Finally, we prove that architectures corresponding to
inequivalent cohomology classes in a given cohomology ring coincide in function
space only when their weight matrices are zero, and we discuss the implications
of this in the context of NAS.
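To make the classification concrete: the abstract above associates to each $G$-SNN, i.e. a shallow ReLU network written generically as $f(x) = \sum_i a_i \,\mathrm{ReLU}(w_i^\top x + b_i) + c$ with $f(gx) = f(x)$ for all $g \in G$, a signed permutation representation of $G$ acting on the hidden neurons. The Python sketch below is only an illustration of that notion (it is not the authors' code implementation, and the function names are hypothetical): it brute-force enumerates the signed permutation matrices $P$ with $P^2 = I$, i.e. the possible images of the generator under a signed perm-rep of the two-element group $\mathbb{Z}/2$ on $n$ hidden neurons.

```python
import itertools

import numpy as np


def signed_permutation_matrices(n):
    """Yield every n x n signed permutation matrix: exactly one nonzero
    entry per row and column, and that entry is +1 or -1."""
    for perm in itertools.permutations(range(n)):
        for signs in itertools.product((1, -1), repeat=n):
            P = np.zeros((n, n), dtype=int)
            for row, (col, sign) in enumerate(zip(perm, signs)):
                P[row, col] = sign
            yield P


def z2_signed_perm_reps(n):
    """Signed perm-reps of Z/2 on n hidden neurons: the generator must map
    to a signed permutation matrix P that is an involution, P @ P = I."""
    identity = np.eye(n, dtype=int)
    return [P for P in signed_permutation_matrices(n)
            if np.array_equal(P @ P, identity)]


if __name__ == "__main__":
    reps = z2_signed_perm_reps(2)
    print(len(reps), "signed perm-reps of Z/2 on 2 hidden neurons")  # prints 6
    for P in reps:
        print(P)
```

This naive enumeration grows factorially in the number of neurons and does not identify equivalent representations; the paper's classification via first cohomology classes of $G$ is what organizes the architectures up to the relevant equivalence.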
Related papers
- Deep Neural Networks: Multi-Classification and Universal Approximation [0.0]
We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements drawn from $M$ classes.
We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega;\mathbb{R}^m)$ for $m \geq 1$.
arXiv Detail & Related papers (2024-09-10T14:31:21Z) - Learning Hierarchical Polynomials with Three-Layer Neural Networks [56.71223169861528]
We study the problem of learning hierarchical functions over the standard Gaussian distribution with three-layer neural networks.
For a large subclass of degree-$k$ polynomials $p$, a three-layer neural network trained via layerwise gradient descent on the square loss learns the target $h = g \circ p$ up to vanishing test error.
This work demonstrates the ability of three-layer neural networks to learn complex features and as a result, learn a broad class of hierarchical functions.
arXiv Detail & Related papers (2023-11-23T02:19:32Z) - Extending the Design Space of Graph Neural Networks by Rethinking
Folklore Weisfeiler-Lehman [66.23316415757456]
Message passing neural networks (MPNNs) have emerged as the most popular framework of graph neural networks (GNNs) in recent years.
However, their expressive power is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test.
We propose an extension, $(k,t)$-FWL, which considers any equivariant set as neighbors instead of all nodes.
Our instantiation, N$^2$-GNN, achieves record-breaking results on ZINC-Subset (0.059), outperforming previous SOTA results by 10.6%.
arXiv Detail & Related papers (2023-06-05T21:35:32Z) - Densely Connected $G$-invariant Deep Neural Networks with Signed
Permutation Representations [6.200483285433661]
We introduce and investigate, for finite groups $G$, $G$-invariant deep neural network ($G$-DNN) architectures with ReLU activation.
The preactivations of the $G$-DNNs are able to transform by signed permutation representations (signed perm-reps) of $G$.
We show that there are far more admissible $G$-DNN architectures than those accessible with the "concatenated ReLU" activation function from the literature.
arXiv Detail & Related papers (2023-03-08T14:35:03Z) - Graph Convolutional Neural Networks as Parametric CoKleisli morphisms [0.0]
We define the bicategory of Graph Convolutional Neural Networks $\mathbf{GCNN}_n$ for an arbitrary graph with $n$ nodes.
We show it can be factored through the already existing categorical constructions for deep learning called $\mathbf{Para}$ and $\mathbf{Lens}$ with the base category set to the CoKleisli category of the product comonad.
arXiv Detail & Related papers (2022-12-01T14:49:58Z) - Algebraic Aspects of Boundaries in the Kitaev Quantum Double Model [77.34726150561087]
We provide a systematic treatment of boundaries based on subgroups $K \subseteq G$ with the Kitaev quantum double $D(G)$ model in the bulk.
The boundary sites are representations of a $*$-subalgebra $\Xi \subseteq D(G)$ and we explicate its structure as a strong $*$-quasi-Hopf algebra.
As an application of our treatment, we study patches with boundaries based on $K=G$ horizontally and $K=\{e\}$ vertically and show how these could be used in a quantum computer.
arXiv Detail & Related papers (2022-08-12T15:05:07Z) - Neural Networks can Learn Representations with Gradient Descent [68.95262816363288]
In specific regimes, neural networks trained by gradient descent behave like kernel methods.
In practice, it is known that neural networks strongly outperform their associated kernels.
arXiv Detail & Related papers (2022-06-30T09:24:02Z) - Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z) - Homogeneous vector bundles and $G$-equivariant convolutional neural
networks [0.0]
$G$-equivariant convolutional neural networks (GCNNs) are a geometric deep learning model for data defined on a homogeneous $G$-space $\mathcal{M}$.
In this paper, we analyze GCNNs on homogeneous spaces $\mathcal{M} = G/K$ in the case of unimodular Lie groups $G$ and compact subgroups $K \leq G$.
arXiv Detail & Related papers (2021-05-12T02:06:04Z) - A deep network construction that adapts to intrinsic dimensionality
beyond the domain [79.23797234241471]
We study the approximation of two-layer compositions $f(x) = g(\phi(x))$ via deep networks with ReLU activation.
We focus on two intuitive and practically relevant choices for $\phi$: the projection onto a low-dimensional embedded submanifold and a distance to a collection of low-dimensional sets.
arXiv Detail & Related papers (2020-08-06T09:50:29Z) - Topology of deep neural networks [8.946655323517092]
We study how the topology of a data set $M = M_a \cup M_b \subseteq \mathbb{R}^d$ changes as it passes through the layers of a well-trained neural network.
arXiv Detail & Related papers (2020-04-13T17:53:36Z)