Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines
- URL: http://arxiv.org/abs/2405.13682v2
- Date: Thu, 03 Oct 2024 01:12:35 GMT
- Title: Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines
- Authors: Sho Sonoda, Yuka Hashimoto, Isao Ishikawa, Masahiro Ikeda
- Abstract summary: We present a constructive universal approximation theorem for learning machines equipped with joint-group-equivariant feature maps.
Our main theorem also unifies the universal approximation theorems for both shallow and deep networks.
- Score: 15.67299102925013
- Abstract: We present a constructive universal approximation theorem for learning machines equipped with joint-group-equivariant feature maps, based on group representation theory. "Constructive" here indicates that the distribution of parameters is given by a closed-form expression known as the ridgelet transform. Joint-group-equivariance encompasses a broad class of feature maps that generalize classical group-equivariance. Notably, this class includes fully-connected networks, which are not group-equivariant but are joint-group-equivariant. Moreover, our main theorem also unifies the universal approximation theorems for both shallow and deep networks. While the universality of shallow networks has been investigated in a unified manner by the ridgelet transform, the universality of deep networks has been investigated in a case-by-case manner.
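The claim that a fully-connected unit is joint-group-equivariant without being group-equivariant can be checked numerically. The following is a minimal sketch, not the paper's construction: it assumes a single neuron `tanh(a . x - b)` in two dimensions, an affine action `x -> L x + t` on the data, the matching dual action `(a, b) -> (L^{-T} a, b + t . L^{-T} a)` on the parameters, and a trivial action on the output (so equivariance reduces to invariance). All function names are hypothetical.

```python
import math
import random

def feature(x, a, b):
    """Fully-connected neuron phi(x, (a, b)) = tanh(a . x - b)."""
    return math.tanh(sum(ai * xi for ai, xi in zip(a, x)) - b)

def act_on_input(L, t, x):
    """Assumed affine action on data: x -> L x + t (L is a 2x2 nested list)."""
    return [sum(L[i][j] * x[j] for j in range(2)) + t[i] for i in range(2)]

def act_on_params(L, t, a, b):
    """Assumed dual action on parameters: a -> L^{-T} a, b -> b + t . (L^{-T} a)."""
    det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
    inv = [[L[1][1] / det, -L[0][1] / det],
           [-L[1][0] / det, L[0][0] / det]]
    # (L^{-T} a)_i = sum_j inv[j][i] * a[j]
    a_new = [sum(inv[j][i] * a[j] for j in range(2)) for i in range(2)]
    b_new = b + sum(ti * ai for ti, ai in zip(t, a_new))
    return a_new, b_new

def max_gap(trials=100, seed=0):
    """Largest |phi(g.x, g.(a, b)) - phi(x, (a, b))| over random draws."""
    rng = random.Random(seed)
    u = lambda s: rng.uniform(-s, s)
    gap = 0.0
    for _ in range(trials):
        # Small perturbation of the identity, so L is always invertible.
        L = [[1 + u(0.3), u(0.3)], [u(0.3), 1 + u(0.3)]]
        t, x, a, b = [u(1), u(1)], [u(1), u(1)], [u(1), u(1)], u(1)
        a2, b2 = act_on_params(L, t, a, b)
        moved = feature(act_on_input(L, t, x), a2, b2)
        gap = max(gap, abs(moved - feature(x, a, b)))
    return gap

if __name__ == "__main__":
    print("joint action gap:", max_gap())
```

Under these assumptions the gap stays at floating-point noise, because `(L^{-T} a) . (L x + t) - (b + t . L^{-T} a)` simplifies to `a . x - b`. Acting on the data alone, with the parameters held fixed, generally changes the output, which is why the neuron is not equivariant in the classical sense.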
Related papers
- Decomposition of Equivariant Maps via Invariant Maps: Application to Universal Approximation under Symmetry [3.0518581575184225]
We develop a theory about the relationship between invariant and equivariant maps with regard to a group $G$.
We leverage this theory in the context of deep neural networks with group symmetries in order to obtain novel insight into their mechanisms.
arXiv Detail & Related papers (2024-09-25T13:27:41Z)
- Joint Group Invariant Functions on Data-Parameter Domain Induce Universal Neural Networks [14.45619075342763]
We present a systematic method to induce a generalized neural network and its right inverse operator, called the ridgelet transform.
Since the ridgelet transform is an inverse, it can describe the arrangement of parameters for the network to represent a target function.
We present a new simple proof of the universality by using Schur's lemma in a unified manner covering a wide class of networks.
arXiv Detail & Related papers (2023-10-05T13:30:37Z)
- A PAC-Bayesian Generalization Bound for Equivariant Networks [15.27608414735815]
We derive norm-based PAC-Bayesian generalization bounds for equivariant networks.
The bound characterizes the impact of group size, and multiplicity and degree of irreducible representations on the generalization error.
In general, the bound indicates that using a larger group size in the model improves the generalization error, as substantiated by extensive numerical experiments.
arXiv Detail & Related papers (2022-10-24T12:07:03Z)
- Equivariant Transduction through Invariant Alignment [71.45263447328374]
We introduce a novel group-equivariant architecture that incorporates a group-invariant hard alignment mechanism.
We find that our network's structure allows it to develop stronger equivariant properties than existing group-equivariant approaches.
We additionally find that it outperforms previous group-equivariant networks empirically on the SCAN task.
arXiv Detail & Related papers (2022-09-22T11:19:45Z)
- The Sample Complexity of One-Hidden-Layer Neural Networks [57.6421258363243]
We study a class of scalar-valued one-hidden-layer networks with inputs bounded in Euclidean norm.
We prove that controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees.
We analyze two important settings where a mere spectral norm control turns out to be sufficient.
arXiv Detail & Related papers (2022-02-13T07:12:02Z)
- Coordinate Independent Convolutional Networks -- Isometry and Gauge Equivariant Convolutions on Riemannian Manifolds [70.32518963244466]
A major complication in comparison to flat spaces is that it is unclear in which alignment a convolution kernel should be applied on a manifold.
We argue that the particular choice of coordinatization should not affect a network's inference -- it should be coordinate independent.
A simultaneous demand for coordinate independence and weight sharing is shown to result in a requirement on the network to be equivariant.
arXiv Detail & Related papers (2021-06-10T19:54:19Z)
- Universal Approximation Theorem for Equivariant Maps by Group CNNs [14.810452619505137]
This paper provides a unified method to obtain universal approximation theorems for equivariant maps by CNNs.
As its significant advantage, we can handle non-linear equivariant maps between infinite-dimensional spaces for non-compact groups.
arXiv Detail & Related papers (2020-12-27T07:09:06Z)
- LieTransformer: Equivariant self-attention for Lie Groups [49.9625160479096]
Group equivariant neural networks are used as building blocks of group invariant neural networks.
We extend the scope of the literature to self-attention, which is emerging as a prominent building block of deep learning models.
We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups.
arXiv Detail & Related papers (2020-12-20T11:02:49Z)
- MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning [90.20563679417567]
This paper introduces MDP homomorphic networks for deep reinforcement learning.
MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP.
We show that such networks converge faster than unstructured networks on CartPole, a grid world and Pong.
arXiv Detail & Related papers (2020-06-30T15:38:37Z)
- Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators [72.62940905965267]
Invertible neural networks based on coupling flows (CF-INNs) have various machine learning applications such as image synthesis and representation learning.
Are CF-INNs universal approximators for invertible functions?
We prove a general theorem to show the equivalence of the universality for certain diffeomorphism classes.
arXiv Detail & Related papers (2020-06-20T02:07:37Z)
- Complex networks with tuneable dimensions as a universality playground [0.0]
We discuss the role of a fundamental network parameter for universality, the spectral dimension.
By explicit computation we prove that the spectral dimension for this model can be tuned continuously from $1$ to infinity.
We propose our model as a tool to probe universal behaviour on inhomogeneous structures and comment on the possibility that the universal behaviour of correlated models on such networks mimics the one of continuous field theories in fractional Euclidean dimensions.
arXiv Detail & Related papers (2020-06-18T10:56:41Z)