Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines
- URL: http://arxiv.org/abs/2405.13682v4
- Date: Fri, 31 Jan 2025 16:05:32 GMT
- Title: Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines
- Authors: Sho Sonoda, Yuka Hashimoto, Isao Ishikawa, Masahiro Ikeda
- Abstract summary: "Constructive" here indicates that the distribution of parameters is given in a closed-form expression known as the ridgelet transform.
Our main theorem also unifies the universal approximation theorems for both shallow and deep networks.
- Score: 15.67299102925013
- Abstract: We present a constructive universal approximation theorem for learning machines equipped with joint-group-equivariant feature maps, called the joint-equivariant machines, based on the group representation theory. "Constructive" here indicates that the distribution of parameters is given in a closed-form expression known as the ridgelet transform. Joint-group-equivariance encompasses a broad class of feature maps that generalize classical group-equivariance. Particularly, fully-connected networks are not group-equivariant but are joint-group-equivariant. Our main theorem also unifies the universal approximation theorems for both shallow and deep networks. Until this study, the universality of deep networks has been shown in a different manner from the universality of shallow networks, but our results discuss them on common ground. Now we can understand the approximation schemes of various learning machines in a unified manner. As applications, we show the constructive universal approximation properties of four examples: depth-$n$ joint-equivariant machine, depth-$n$ fully-connected network, depth-$n$ group-convolutional network, and a new depth-$2$ network with quadratic forms whose universality has not been known.
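The claim that fully-connected networks are joint-group-equivariant without being group-equivariant can be checked numerically on a single unit. The sketch below is an illustration and is not taken from the abstract: it assumes a feature map $\phi(x, \xi)$ is called joint-equivariant when $\phi(g \cdot x, g \cdot \xi) = g \cdot \phi(x, \xi)$, takes $\phi(x, (a, b)) = \tanh(a \cdot x - b)$, lets the affine group act on data by $x \mapsto Lx + t$ and on parameters by $(a, b) \mapsto (L^{-\top} a,\; b + t \cdot L^{-\top} a)$, and uses the trivial action on the output. The function names and the numeric values are hypothetical.

```python
import numpy as np

def neuron(x, a, b):
    """Single fully-connected unit tanh(a . x - b)."""
    return np.tanh(a @ x - b)

def act_on_data(L, t, x):
    """Affine action on the input: x -> L x + t."""
    return L @ x + t

def act_on_params(L, t, a, b):
    """Assumed dual action on parameters: (a, b) -> (L^{-T} a, b + t . L^{-T} a)."""
    a_new = np.linalg.solve(L.T, a)  # computes L^{-T} a without forming the inverse
    b_new = b + t @ a_new
    return a_new, b_new

# Hypothetical data point, parameters, and group element (det L = 7, so L is invertible).
x = np.array([1.0, -2.0, 0.5])
a = np.array([0.3, -0.1, 0.2])
b = 0.25
L = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
t = np.array([1.0, 0.0, -1.0])

base = neuron(x, a, b)

# Transform data and parameters together: the output is unchanged.
gx = act_on_data(L, t, x)
ga, gb = act_on_params(L, t, a, b)
joint = neuron(gx, ga, gb)

# Transform the data only: the output changes, so the unit is not invariant in x alone.
data_only = neuron(gx, a, b)

print("joint action preserves output:", np.isclose(base, joint))
print("data-only action preserves output:", np.isclose(base, data_only))
```

The pre-activation satisfies $(L^{-\top} a) \cdot (Lx + t) - (b + t \cdot L^{-\top} a) = a \cdot x - b$, which is why the jointly transformed unit returns the same value.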
Related papers
- Decomposition of Equivariant Maps via Invariant Maps: Application to Universal Approximation under Symmetry [3.0518581575184225]
We develop a theory about the relationship between invariant and equivariant maps with regard to a group $G$.
We leverage this theory in the context of deep neural networks with group symmetries in order to obtain novel insight into their mechanisms.
arXiv Detail & Related papers (2024-09-25T13:27:41Z)
- Joint Group Invariant Functions on Data-Parameter Domain Induce Universal Neural Networks [14.45619075342763]
We present a systematic method to induce a generalized neural network and its right inverse operator, called the ridgelet transform.
Since the ridgelet transform is an inverse, it can describe the arrangement of parameters for the network to represent a target function.
We present a new simple proof of the universality by using Schur's lemma in a unified manner covering a wide class of networks.
arXiv Detail & Related papers (2023-10-05T13:30:37Z)
- A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations [0.0]
We study the universality hypothesis by examining how small neural networks learn to implement group composition.
We present a novel algorithm by which neural networks may implement composition for any finite group via mathematical representation theory.
arXiv Detail & Related papers (2023-02-06T18:59:20Z)
- Extending the Universal Approximation Theorem for a Broad Class of Hypercomplex-Valued Neural Networks [1.0323063834827413]
The universal approximation theorem asserts that a single hidden layer neural network approximates continuous functions with any desired precision on compact sets.
This paper extends the universal approximation theorem for a broad class of hypercomplex-valued neural networks.
arXiv Detail & Related papers (2022-09-06T12:45:15Z)
- Full network nonlocality [68.8204255655161]
We introduce the concept of full network nonlocality, which describes correlations that necessitate all links in a network to distribute nonlocal resources.
We show that the most well-known network Bell test does not witness full network nonlocality.
More generally, we point out that established methods for analysing local and theory-independent correlations in networks can be combined in order to deduce sufficient conditions for full network nonlocality.
arXiv Detail & Related papers (2021-05-19T18:00:02Z)
- Recursive Multi-model Complementary Deep Fusion for Robust Salient Object Detection via Parallel Sub Networks [62.26677215668959]
Fully convolutional networks have shown outstanding performance in the salient object detection (SOD) field.
This paper proposes a "wider" network architecture that consists of parallel sub-networks with entirely different architectures.
Experiments on several famous benchmarks clearly demonstrate the superior performance, good generalization, and powerful learning ability of the proposed wider framework.
arXiv Detail & Related papers (2020-08-07T10:39:11Z)
- MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning [90.20563679417567]
This paper introduces MDP homomorphic networks for deep reinforcement learning.
MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP.
We show that such networks converge faster than unstructured networks on CartPole, a grid world and Pong.
arXiv Detail & Related papers (2020-06-30T15:38:37Z)
- Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators [72.62940905965267]
Invertible neural networks based on coupling flows (CF-INNs) have various machine learning applications such as image synthesis and representation learning.
Are CF-INNs universal approximators for invertible functions?
We prove a general theorem to show the equivalence of the universality for certain diffeomorphism classes.
arXiv Detail & Related papers (2020-06-20T02:07:37Z)
- Complex networks with tuneable dimensions as a universality playground [0.0]
We discuss the role of a fundamental network parameter for universality, the spectral dimension.
By explicit computation we prove that the spectral dimension for this model can be tuned continuously from $1$ to infinity.
We propose our model as a tool to probe universal behaviour on inhomogeneous structures and comment on the possibility that the universal behaviour of correlated models on such networks mimics the one of continuous field theories in fractional Euclidean dimensions.
arXiv Detail & Related papers (2020-06-18T10:56:41Z)
- On Infinite-Width Hypernetworks [101.03630454105621]
We show that hypernetworks do not guarantee convergence to a global minimum under gradient descent.
We identify the functional priors of these architectures by deriving their corresponding GP and NTK kernels.
As part of this study, we make a mathematical contribution by deriving tight bounds on high order Taylor terms of standard fully connected ReLU networks.
arXiv Detail & Related papers (2020-03-27T00:50:29Z)
- Neural Operator: Graph Kernel Network for Partial Differential Equations [57.90284928158383]
This work generalizes neural networks so that they can learn mappings between infinite-dimensional spaces (operators).
We formulate approximation of the infinite-dimensional mapping by composing nonlinear activation functions and a class of integral operators.
Experiments confirm that the proposed graph kernel network does have the desired properties and show competitive performance compared to the state of the art solvers.
arXiv Detail & Related papers (2020-03-07T01:56:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.