Neural Injective Functions for Multisets, Measures and Graphs via a
Finite Witness Theorem
- URL: http://arxiv.org/abs/2306.06529v2
- Date: Sun, 29 Oct 2023 09:30:59 GMT
- Title: Neural Injective Functions for Multisets, Measures and Graphs via a
Finite Witness Theorem
- Authors: Tal Amir, Steven J. Gortler, Ilai Avni, Ravina Ravina, Nadav Dym
- Abstract summary: Injective multiset functions have a key role in the theoretical study of machine learning on multisets and graphs.
Yet, there remains a gap between the provably injective multiset functions considered in theory and those considered in practice.
We show that moments of neural networks do define injective multiset functions, provided that an analytic non-polynomial activation is used.
- Score: 4.416503115535553
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Injective multiset functions have a key role in the theoretical study of
machine learning on multisets and graphs. Yet, there remains a gap between the
provably injective multiset functions considered in theory, which typically
rely on polynomial moments, and the multiset functions used in practice, which
rely on $\textit{neural moments}$, whose injectivity on
multisets has not been studied to date.
In this paper, we bridge this gap by showing that moments of neural networks
do define injective multiset functions, provided that an analytic
non-polynomial activation is used. The number of moments required by our theory
is optimal essentially up to a multiplicative factor of two. To prove this
result, we state and prove a $\textit{finite witness theorem}$, which is of
independent interest.
As a corollary to our main theorem, we derive new approximation results for
functions on multisets and measures, and new separation results for graph
neural networks. We also provide two negative results: (1) moments of
piecewise-linear neural networks cannot be injective multiset functions; and
(2) even when moment-based multiset functions are injective, they can never be
bi-Lipschitz.
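As a rough illustration of the object studied here: a neural moment embedding maps a multiset of points in $\mathbb{R}^d$ to the sum (for measures, the expectation) of a neural feature map over its elements. The sketch below uses hypothetical layer sizes and random weights; per the main theorem, injectivity additionally requires an analytic non-polynomial activation such as tanh and sufficiently many output moments, while negative result (1) rules out piecewise-linear activations such as ReLU.

```python
import numpy as np

rng = np.random.default_rng(0)

def neural_moments(X, W1, b1, W2, activation=np.tanh):
    """Embed a multiset X (an n x d array of its elements) as the sum of neural
    features of its elements: F(X) = sum_x W2 @ activation(W1 @ x + b1).

    Sketch only: per the paper, injectivity additionally needs an analytic
    non-polynomial activation (e.g. tanh) and enough output moments, and is
    impossible with piecewise-linear activations such as ReLU.
    """
    hidden = activation(X @ W1.T + b1)   # per-element hidden features, shape (n, h)
    feats = hidden @ W2.T                # per-element neural moments, shape (n, m)
    return feats.sum(axis=0)             # order-independent multiset embedding, shape (m,)

# hypothetical sizes: multisets of n points in R^d, embedded into R^m
d, h, m, n = 3, 16, 8, 5
W1 = rng.standard_normal((h, d))
b1 = rng.standard_normal(h)
W2 = rng.standard_normal((m, h))

X = rng.standard_normal((n, d))          # a multiset of n elements of R^d
X_perm = X[rng.permutation(n)]           # the same multiset, listed in a different order

# permutation invariance: both orderings produce the same embedding
assert np.allclose(neural_moments(X, W1, b1, W2),
                   neural_moments(X_perm, W1, b1, W2))
```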
Related papers
- On the (Non) Injectivity of Piecewise Linear Janossy Pooling [3.396731589928944]
We consider the family of k-ary Janossy pooling, which includes many of the most popular multiset models, and prove that no piecewise-linear Janossy pooling function can be injective. On the positive side, we show that when restricted to multisets without multiplicities, even simple deep-sets models suffice for injectivity and bi-Lipschitzness.
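For context, here is a hedged sketch of one common formulation of k-ary Janossy pooling: a (possibly order-sensitive) function is averaged over the ordered k-tuples of elements of the multiset, with k = 1 recovering DeepSets-style pooling. The feature map, weights, and sizes below are hypothetical, and this is a generic illustration rather than the exact definition analyzed in that paper.

```python
import itertools
import numpy as np

def janossy_pool(X, phi, k=2):
    """Sketch of k-ary Janossy pooling: average a (possibly order-sensitive)
    function phi over all ordered k-tuples of elements of the multiset X.
    k = 1 recovers DeepSets-style pooling.
    """
    X = np.asarray(X)
    tuples = list(itertools.permutations(range(len(X)), k))  # ordered k-tuples of distinct indices
    return np.mean([phi(X[list(idx)]) for idx in tuples], axis=0)

# hypothetical example: 2-ary pooling with a ReLU feature map on concatenated pairs
rng = np.random.default_rng(1)
W = rng.standard_normal((4, 2 * 3))  # maps a concatenated pair in R^6 to R^4

def phi(pair):
    return np.maximum(W @ pair.reshape(-1), 0.0)  # piecewise-linear (ReLU) features

X = rng.standard_normal((5, 3))  # a multiset of 5 points in R^3
print(janossy_pool(X, phi, k=2).shape)  # (4,)
```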
arXiv Detail & Related papers (2025-05-26T15:53:09Z) - Separation Power of Equivariant Neural Networks [11.906285279109477]
We propose a theoretical framework to investigate the separation power of equivariant neural networks with point-wise activations.
We show that all non-polynomial activations, such as ReLU and sigmoid, are equivalent in terms of expressivity.
arXiv Detail & Related papers (2024-06-13T09:52:44Z) - Multilinear Operator Networks [60.7432588386185]
Polynomial Networks are a class of models that do not require activation functions.
We propose MONet, which relies solely on multilinear operators.
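The following is a generic sketch of the underlying idea of such activation-free polynomial models, where non-linearity comes from multiplicative (Hadamard) interactions between linear maps of the input; all names and sizes are hypothetical, and this is not the MONet architecture itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical sizes: inputs in R^d, hidden width h, scalar output
d, h = 6, 12
U1 = rng.standard_normal((h, d))
U2 = rng.standard_normal((h, d))
V = rng.standard_normal((h, d))
w_out = rng.standard_normal(h)

def multiplicative_block(x):
    """Degree-2 multiplicative-interaction block with no activation function:
    the output is a polynomial in x, built only from linear maps and an
    element-wise (Hadamard) product. Generic illustration, not MONet itself.
    """
    z = (U1 @ x) * (U2 @ x) + V @ x   # Hadamard product of two linear maps plus a linear term
    return w_out @ z                   # scalar output: a degree-2 polynomial in x

print(multiplicative_block(rng.standard_normal(d)))
```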
arXiv Detail & Related papers (2024-01-31T16:52:19Z) - Structure of universal formulas [13.794391803767617]
We introduce a hierarchy of classes connecting the global approximability property to the weaker property of infinite VC dimension.
We show that fixed-size neural networks with at most one layer of neurons cannot approximate functions on arbitrary finite sets.
We give examples of functional families, including two-hidden-layer neural networks, that approximate functions on arbitrary finite sets, but fail to do that on the whole domain of definition.
arXiv Detail & Related papers (2023-11-07T11:50:25Z) - Going Beyond Neural Network Feature Similarity: The Network Feature
Complexity and Its Interpretation Using Category Theory [64.06519549649495]
We provide the definition of what we call functionally equivalent features.
These features produce equivalent output under certain transformations.
We propose an efficient algorithm named Iterative Feature Merging.
arXiv Detail & Related papers (2023-10-10T16:27:12Z) - Curvature-informed multi-task learning for graph networks [56.155331323304]
State-of-the-art graph neural networks attempt to predict multiple properties simultaneously.
We investigate a potential explanation for this phenomenon: the curvature of each property's loss surface significantly varies, leading to inefficient learning.
arXiv Detail & Related papers (2022-08-02T18:18:41Z) - Benefits of Overparameterized Convolutional Residual Networks: Function
Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z) - Provable General Function Class Representation Learning in Multitask
Bandits and MDPs [58.624124220900306]
Multitask representation learning is a popular approach in reinforcement learning to boost sample efficiency.
In this work, we extend the analysis to general function class representations.
We theoretically validate the benefit of multitask representation learning within general function classes for bandits and linear MDPs.
arXiv Detail & Related papers (2022-05-31T11:36:42Z) - Constrained Monotonic Neural Networks [0.685316573653194]
Wider adoption of neural networks in many critical domains such as finance and healthcare is being hindered by the need to explain their predictions.
Monotonicity constraint is one of the most requested properties in real-world scenarios.
We show it can approximate any continuous monotone function on a compact subset of $\mathbb{R}^n$.
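For reference, the sketch below shows one simple, generic way to hard-wire such a monotonicity constraint: restrict all weights to be non-negative and use a non-decreasing activation, so the network is non-decreasing in every input by construction. The sizes and weights are hypothetical, and this is not necessarily the construction proposed in that paper.

```python
import numpy as np

def monotone_mlp(x, weights, biases):
    """Evaluate a small MLP that is non-decreasing in its (scalar) input by
    construction: every weight matrix is used through np.abs (non-negative
    weights) and tanh is a non-decreasing activation, so each layer preserves
    monotonicity. Generic sketch, not the cited paper's architecture.
    """
    h = np.atleast_1d(x)
    for W, b in zip(weights, biases):
        h = np.tanh(np.abs(W) @ h + b)   # non-negative weights + monotone activation
    return h.sum()                        # a sum of non-decreasing coordinates is non-decreasing

rng = np.random.default_rng(2)
weights = [rng.standard_normal((8, 1)), rng.standard_normal((4, 8))]
biases = [rng.standard_normal(8), rng.standard_normal(4)]

xs = np.linspace(-3.0, 3.0, 25)
ys = [monotone_mlp(x, weights, biases) for x in xs]
assert all(y0 <= y1 + 1e-12 for y0, y1 in zip(ys, ys[1:]))  # outputs are non-decreasing in x
```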
arXiv Detail & Related papers (2022-05-24T04:26:10Z) - Qualitative neural network approximation over R and C: Elementary proofs
for analytic and polynomial activation [0.0]
We prove approximations in classes of deep and shallow neural networks with analytic activation functions.
We show that fully connected and residual networks of large depth with such activation functions can approximate any function under certain width requirements.
arXiv Detail & Related papers (2022-03-25T01:36:13Z) - Abelian Neural Networks [48.52497085313911]
We first construct a neural network architecture for Abelian group operations and derive a universal approximation property.
We extend it to Abelian semigroup operations using the characterization of associative symmetric polynomials.
We train our models over fixed word embeddings and demonstrate improved performance over the original word2vec.
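As an illustration of the general idea, the sketch below builds a binary operation that is commutative by construction: both operands pass through a shared encoder, are aggregated with a symmetric operation (addition), and are decoded. The weights and sizes are hypothetical, and this generic symmetric construction is not necessarily the cited paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(3)

# hypothetical sizes: operands in R^d, latent space R^h
d, h = 4, 16
W_enc = rng.standard_normal((h, d))
W_dec = rng.standard_normal((d, h))

def commutative_op(x, y):
    """A binary operation on R^d that is commutative by construction:
    f(x, y) = dec(enc(x) + enc(y)). Addition in latent space is symmetric in
    the two encoded operands, so swapping x and y cannot change the output.
    """
    z = np.tanh(W_enc @ x) + np.tanh(W_enc @ y)   # symmetric aggregation of encoded operands
    return W_dec @ np.tanh(z)

x, y = rng.standard_normal(d), rng.standard_normal(d)
assert np.allclose(commutative_op(x, y), commutative_op(y, x))  # commutativity check
```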
arXiv Detail & Related papers (2021-02-24T11:52:21Z) - PDE constraints on smooth hierarchical functions computed by neural
networks [0.0]
An important problem in the theory of deep neural networks is expressivity.
We study real infinitely differentiable (smooth) hierarchical functions implemented by feedforward neural networks.
We conjecture that such PDE constraints, once accompanied by appropriate non-singularity conditions, guarantee that the smooth function under consideration can be represented by the network.
arXiv Detail & Related papers (2020-05-18T16:34:11Z) - On Sharpness of Error Bounds for Multivariate Neural Network
Approximation [0.0]
The paper deals with best non-linear approximation by sums of ridge functions.
Error bounds are presented in terms of moduli of smoothness.
arXiv Detail & Related papers (2020-04-05T14:00:52Z)