Densely Connected $G$-invariant Deep Neural Networks with Signed
Permutation Representations
- URL: http://arxiv.org/abs/2303.04614v2
- Date: Tue, 17 Oct 2023 17:06:04 GMT
- Title: Densely Connected $G$-invariant Deep Neural Networks with Signed
Permutation Representations
- Authors: Devanshu Agrawal and James Ostrowski
- Abstract summary: We introduce and investigate, for finite groups $G$, $G$-invariant deep neural network ($G$-DNN) architectures with ReLU activation.
The preactivations of the $G$-DNNs are able to transform by \emph{signed} permutation representations (signed perm-reps) of $G$.
We show that there are far more admissible $G$-DNN architectures than those accessible with the ``concatenated ReLU'' activation function from the literature.
- Score: 6.200483285433661
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce and investigate, for finite groups $G$, $G$-invariant deep
neural network ($G$-DNN) architectures with ReLU activation that are densely
connected -- i.e., include all possible skip connections. In contrast to other
$G$-invariant architectures in the literature, the preactivations of
the $G$-DNNs presented here are able to transform by \emph{signed} permutation
representations (signed perm-reps) of $G$. Moreover, the individual layers of
the $G$-DNNs are not required to be $G$-equivariant; instead, the
preactivations are constrained to be $G$-equivariant functions of the network
input in a way that couples weights across all layers. The result is a richer
family of $G$-invariant architectures never seen previously. We derive an
efficient implementation of $G$-DNNs after a reparameterization of weights, as
well as necessary and sufficient conditions for an architecture to be
``admissible'' -- i.e., nondegenerate and inequivalent to smaller architectures.
We include code that allows a user to build a $G$-DNN interactively
layer-by-layer, with the final architecture guaranteed to be admissible. We
show that there are far more admissible $G$-DNN architectures than those
accessible with the ``concatenated ReLU'' activation function from the
literature. Finally, we apply $G$-DNNs to two example problems -- (1)
multiplication in $\{-1, 1\}$ (with theoretical guarantees) and (2) 3D object
classification -- finding that the inclusion of signed perm-reps
significantly boosts predictive performance compared to baselines with only
ordinary (i.e., unsigned) perm-reps.
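A minimal illustrative sketch of the key constraint (not the authors' released code): for $G = \mathbb{Z}_2$ acting on $\mathbb{R}^2$ by swapping coordinates, the preactivation $z(x) = Wx + b$ is made to transform by a signed perm-rep $\rho$, i.e. $z(\pi(g)x) = \rho(g)z(x)$, by projecting arbitrary weights onto the equivariant subspace. The full $G$-DNN machinery (cross-layer weight coupling, admissibility checks, the efficient reparameterization) is not reproduced here.

```python
# Illustrative sketch only (not the paper's released code).
# G = Z_2 acts on the input x in R^2 by swapping coordinates (perm-rep pi);
# the preactivation z(x) = W x + b is constrained to transform by a *signed*
# perm-rep rho, i.e. z(pi(g) x) = rho(g) z(x) for all g in G.
import numpy as np

pi  = {0: np.eye(2), 1: np.array([[0.,  1.], [ 1., 0.]])}   # input perm-rep
rho = {0: np.eye(2), 1: np.array([[0., -1.], [-1., 0.]])}   # a signed perm-rep

rng = np.random.default_rng(0)
W0, b0 = rng.standard_normal((2, 2)), rng.standard_normal(2)

# Group-averaging (Reynolds) projection onto weights satisfying
# W pi(g) = rho(g) W  and  rho(g) b = b  for all g.
W = sum(np.linalg.inv(rho[g]) @ W0 @ pi[g] for g in (0, 1)) / 2
b = sum(rho[g] @ b0 for g in (0, 1)) / 2

x = rng.standard_normal(2)
for g in (0, 1):
    # the preactivation transforms by the signed perm-rep
    assert np.allclose(W @ (pi[g] @ x) + b, rho[g] @ (W @ x + b))
```

The projection used above is the standard group-averaging operator; the paper's weight reparameterization is more efficient, but the resulting constraint on the preactivations is the one illustrated.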
Related papers
- Deep Neural Networks: Multi-Classification and Universal Approximation [0.0]
We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements.
We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega;\mathbb{R}^m)$ for $m\geq 1$.
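A hedged sketch of what the quoted counts amount to, assuming $M$ denotes the number of classes (the summary leaves it undefined) and a scalar read-out; it only instantiates the architecture's shape, not the memorization construction.

```python
# Hedged sketch: instantiate the quoted layer counts (hidden width 2,
# depth 2N + 4M - 1) for illustrative N and M.  M is assumed to be the number
# of classes and the read-out is assumed scalar; only the shape is shown.
import torch.nn as nn

def narrow_relu_net(d_in: int, N: int, M: int) -> nn.Sequential:
    depth = 2 * N + 4 * M - 1                 # hidden-layer count from the summary
    layers = []
    for i in range(depth):
        layers += [nn.Linear(d_in if i == 0 else 2, 2), nn.ReLU()]
    layers.append(nn.Linear(2, 1))            # scalar read-out (assumption)
    return nn.Sequential(*layers)

net = narrow_relu_net(d_in=4, N=10, M=3)      # 2*10 + 4*3 - 1 = 31 hidden layers
```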
arXiv Detail & Related papers (2024-09-10T14:31:21Z) - Learning General Policies for Classical Planning Domains: Getting Beyond C$_2$ [15.574717738100727]
GNN-based approaches for learning general policies across planning domains are limited by the expressive power of $C_2$, i.e., first-order logic with two variables and counting.
We introduce a parameterized version of relational GNNs, R-GNN[$t$]; when $t$ is infinity, R-GNN[$t$] approximates $3$-GNNs using only quadratic space for embeddings.
For lower values of $t$, such as $t=1$ and $t=2$, R-GNN[$t$] achieves a weaker approximation by exchanging fewer messages.
arXiv Detail & Related papers (2024-03-18T12:42:53Z) - Learning Hierarchical Polynomials with Three-Layer Neural Networks [56.71223169861528]
We study the problem of learning hierarchical functions of the form $h = g \circ p$, where $p$ is a degree-$k$ polynomial, over the standard Gaussian distribution with three-layer neural networks.
For a large subclass of degree-$k$ polynomials $p$, a three-layer neural network trained via layerwise gradient descent on the square loss learns the target $h$ up to vanishing test error.
This work demonstrates the ability of three-layer neural networks to learn complex features and as a result, learn a broad class of hierarchical functions.
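A rough, hedged sketch of the layerwise training scheme the summary refers to, with an illustrative target $h(x) = g(p(x))$; the choices of $p$, $g$, widths, optimizer, and step counts below are arbitrary and not taken from the paper.

```python
# Rough, hedged sketch of layerwise training on a hierarchical target
# h(x) = g(p(x)); p, g, widths, and step counts are illustrative only.
import torch
import torch.nn as nn

torch.manual_seed(0)
d, n = 8, 4096
X = torch.randn(n, d)                                   # x ~ N(0, I_d)
p = X[:, 0] * X[:, 1] + X[:, 2] ** 2                    # a degree-2 polynomial p(x)
y = torch.tanh(p).unsqueeze(1)                          # target h(x) = g(p(x)), g = tanh

features = nn.Sequential(nn.Linear(d, 64), nn.ReLU(),   # first two layers
                         nn.Linear(64, 64), nn.ReLU())
head = nn.Linear(64, 1)                                 # final layer
mse = nn.MSELoss()

# Stage 1: train the lower layers with the head fixed at initialization.
opt = torch.optim.Adam(features.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad(); mse(head(features(X)), y).backward(); opt.step()

# Stage 2: freeze the learned features and fit only the final layer.
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad(); mse(head(features(X)), y).backward(); opt.step()
```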
arXiv Detail & Related papers (2023-11-23T02:19:32Z) - A General Framework for Robust G-Invariance in G-Equivariant Networks [5.227502964814928]
We introduce a general method for achieving robust group-invariance in group-equivariant convolutional neural networks ($G$-CNNs), which we call the $G$-triple-correlation ($G$-TC) layer.
The completeness of the triple correlation endows the $G$-TC layer with strong robustness.
We demonstrate the benefits of this method on both commutative and non-commutative groups.
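A hedged illustration of the underlying statistic on a commutative example: the triple correlation of a signal $f$ on the cyclic group $\mathbb{Z}_n$, $T_f(g_1,g_2)=\sum_g f(g)\,f(g+g_1)\,f(g+g_2)$, is invariant to group translations of $f$. The $G$-TC layer builds on this statistic; the layer itself and the non-commutative case are not reproduced here.

```python
# Hedged illustration: the triple correlation of f on the cyclic group Z_n,
#   T_f(g1, g2) = sum_g f(g) f(g + g1) f(g + g2)   (indices mod n),
# is invariant to group translations (cyclic shifts) of f.
import numpy as np

def triple_correlation(f: np.ndarray) -> np.ndarray:
    n = len(f)
    T = np.zeros((n, n))
    for g1 in range(n):
        for g2 in range(n):
            T[g1, g2] = sum(f[g] * f[(g + g1) % n] * f[(g + g2) % n]
                            for g in range(n))
    return T

rng = np.random.default_rng(0)
f = rng.standard_normal(12)
assert np.allclose(triple_correlation(f), triple_correlation(np.roll(f, 5)))
```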
arXiv Detail & Related papers (2023-10-28T02:27:34Z) - Extending the Design Space of Graph Neural Networks by Rethinking
Folklore Weisfeiler-Lehman [66.23316415757456]
Message passing neural networks (MPNNs) have emerged as the most popular framework of graph neural networks (GNNs) in recent years.
However, their expressive power is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test.
We propose an extension, $(k,t)$-FWL, which considers any equivariant set as neighbors instead of all nodes.
N$^2$-GNN achieves record-breaking results on ZINC-Subset (0.059), outperforming previous SOTA results by 10.6%.
arXiv Detail & Related papers (2023-06-05T21:35:32Z) - Shallow neural network representation of polynomials [91.3755431537592]
We show that $d$-variate polynomials of degree $R$ can be represented on $[0,1]^d$ as shallow neural networks of width $d+1+\sum_{r=2}^{R}\binom{r+d-1}{d-1}$.
arXiv Detail & Related papers (2022-08-17T08:14:52Z) - Exponential Separations in Symmetric Neural Networks [48.80300074254758]
We consider the symmetric Relational Network architecture (Santoro et al., 2017) as a natural generalization of the DeepSets architecture (Zaheer et al., 2017).
Under the restriction to analytic activation functions, we construct a symmetric function acting on sets of size $N$ with elements in dimension $D$ that can be efficiently approximated by the former architecture but requires width exponential in $N$ and $D$ for the latter.
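For reference, hedged sketches of the two architectures in their commonly cited forms (toy random weights, ReLU features, sum pooling); the separation itself concerns approximation widths and is not illustrated here.

```python
# Hedged sketches: DeepSets pools per-element features, the Relational Network
# pools features of element *pairs*; both are permutation-invariant in the set.
import numpy as np

rng = np.random.default_rng(0)
N, D, H = 5, 3, 16
W_phi = rng.standard_normal((D, H))          # per-element encoder (DeepSets)
W_psi = rng.standard_normal((2 * D, H))      # pairwise encoder (Relational Network)

def deepsets(X):
    # rho( sum_i phi(x_i) ); here rho is a plain sum over features
    return np.maximum(X @ W_phi, 0.0).sum()

def relational_network(X):
    # rho( sum_{i,j} psi(x_i, x_j) ) over all ordered pairs
    n = len(X)
    pairs = np.concatenate([np.repeat(X, n, axis=0), np.tile(X, (n, 1))], axis=1)
    return np.maximum(pairs @ W_psi, 0.0).sum()

X = rng.standard_normal((N, D))
perm = rng.permutation(N)
assert np.isclose(deepsets(X), deepsets(X[perm]))
assert np.isclose(relational_network(X), relational_network(X[perm]))
```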
arXiv Detail & Related papers (2022-06-02T19:45:10Z) - Neural Network Architecture Beyond Width and Depth [4.468952886990851]
This paper proposes a new neural network architecture by introducing an additional dimension called height beyond width and depth.
It is shown that neural networks with three-dimensional architectures are significantly more expressive than the ones with two-dimensional architectures.
arXiv Detail & Related papers (2022-05-19T10:29:11Z) - A Classification of $G$-invariant Shallow Neural Networks [1.4213973379473654]
We prove a theorem that gives a classification of all $G$-invariant single-hidden-layer or "shallow" neural network ($G$-SNN) architectures with ReLU activation.
We enumerate the $G$-SNN architectures for some example groups $G$ and visualize their structure.
arXiv Detail & Related papers (2022-05-18T21:18:16Z) - RNNs can generate bounded hierarchical languages with optimal memory [113.73133308478612]
We show that RNNs can efficiently generate bounded hierarchical languages that reflect the scaffolding of natural language syntax.
We introduce Dyck-($k$,$m$), the language of well-nested brackets (of $k$ types) and $m$-bounded nesting depth.
We prove that an RNN with $O(m \log k)$ hidden units suffices, an exponential reduction in memory, by an explicit construction.
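A hedged helper, not taken from the paper, that samples strings from Dyck-($k$,$m$) as defined above; the bracket symbols and the open/close policy are arbitrary illustrative choices.

```python
# Hedged helper (not from the paper): sample a string from Dyck-(k, m),
# i.e. well-nested brackets of k types with nesting depth at most m.
import random

def sample_dyck(k: int, m: int, n_pairs: int, seed: int = 0) -> str:
    rng = random.Random(seed)
    opens = [chr(ord('a') + t) for t in range(k)]    # 'a', 'b', ... open brackets
    closes = [c.upper() for c in opens]              # 'A', 'B', ... matching closers
    out, stack, remaining = [], [], n_pairs
    while remaining > 0 or stack:
        can_open = remaining > 0 and len(stack) < m  # enforce depth bound m
        if can_open and (not stack or rng.random() < 0.5):
            t = rng.randrange(k)
            out.append(opens[t])
            stack.append(t)
            remaining -= 1
        else:
            out.append(closes[stack.pop()])          # close the innermost bracket
    return ''.join(out)

print(sample_dyck(k=2, m=3, n_pairs=6))  # a balanced string over a/b with depth <= 3
```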
arXiv Detail & Related papers (2020-10-15T04:42:29Z) - Improving Robustness and Generality of NLP Models Using Disentangled
Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$.
We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning.
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z)