Universal Representation of Permutation-Invariant Functions on Vectors
and Tensors
- URL: http://arxiv.org/abs/2310.13829v1
- Date: Fri, 20 Oct 2023 22:00:59 GMT
- Title: Universal Representation of Permutation-Invariant Functions on Vectors
and Tensors
- Authors: Puoya Tabaghi, Yusu Wang
- Abstract summary: A main object of our study is multiset functions -- that is, permutation-invariant functions over inputs of varying sizes.
Deep Sets, proposed by \cite{zaheer2017deep}, provides a \emph{universal representation} for continuous multiset functions on scalars via a sum-decomposable model.
We prove that universal representation is guaranteed for continuous and discontinuous multiset functions through a latent space dimension of $O(N^D)$.
- Score: 11.345796608258434
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A main object of our study is multiset functions -- that is,
permutation-invariant functions over inputs of varying sizes. Deep Sets,
proposed by \cite{zaheer2017deep}, provides a \emph{universal representation}
for continuous multiset functions on scalars via a sum-decomposable model.
Restricting the domain of the functions to finite multisets of $D$-dimensional
vectors, Deep Sets also provides a \emph{universal approximation} that requires
a latent space dimension of $O(N^D)$ -- where $N$ is an upper bound on the size
of input multisets. In this paper, we strengthen this result by proving that
universal representation is guaranteed for continuous and discontinuous
multiset functions through a latent space dimension of $O(N^D)$. We then
introduce \emph{identifiable} multisets for which we can uniquely label their
elements using an identifier function; in particular, finite-precision vectors are
identifiable. Using our analysis on identifiable multisets, we prove that a
sum-decomposable model for general continuous multiset functions only requires
a latent dimension of $2DN$. We further show that both the encoder and the decoder
functions of the model are continuous -- our main contribution over existing
work, which lacks such a guarantee. This also provides a significant improvement
over the aforementioned $O(N^D)$ bound, which was derived for universal
representation of continuous and discontinuous multiset functions. We then
extend our results and provide special sum-decomposition structures to
universally represent permutation-invariant tensor functions on identifiable
tensors. These families of sum-decomposition models enable us to design deep
network architectures and deploy them on a variety of learning tasks on
sequences, images, and graphs.
Related papers
- Structure of universal formulas [13.794391803767617]
We introduce a hierarchy of classes connecting the global approximability property to the weaker property of infinite VC dimension.
We show that fixed-size neural networks with not more than one layer of neurons having activations cannot approximate functions on arbitrary finite sets.
We give examples of functional families, including two-hidden-layer neural networks, that approximate functions on arbitrary finite sets, but fail to do that on the whole domain of definition.
arXiv Detail & Related papers (2023-11-07T11:50:25Z) - Polynomial Width is Sufficient for Set Representation with
High-dimensional Features [69.65698500919869]
DeepSets is the most widely used neural network architecture for set representation.
We present two set-element embedding layers: (a) linear + power activation (LP) and (b) linear + exponential activation (LE).
arXiv Detail & Related papers (2023-07-08T16:00:59Z) - Universal approximation with complex-valued deep narrow neural networks [0.0]
We study the universality of complex-valued neural networks with bounded widths and arbitrary depths.
We show that deep narrow complex-valued networks are universal if and only if their activation function is neither holomorphic, nor antiholomorphic, nor $\mathbb{R}$-affine.
arXiv Detail & Related papers (2023-05-26T13:22:14Z) - Multi-block-Single-probe Variance Reduced Estimator for Coupled
Compositional Optimization [49.58290066287418]
We propose a novel method named Multi-block-Single-probe Variance Reduced (MSVR) estimator to alleviate the complexity of compositional problems.
Our results improve upon prior ones in several aspects, including the order of sample complexities and the dependence on the strong convexity parameter.
arXiv Detail & Related papers (2022-07-18T12:03:26Z) - Submodular + Concave [53.208470310734825]
It has been well established that first order optimization methods can converge to the maximal objective value of concave functions.
In this work, we initiate the study of maximizing smooth functions of the form $F(x) = G(x) + C(x)$ over a convex body.
This class of functions is an extension of both concave and continuous DR-submodular functions for which no guarantee is known.
arXiv Detail & Related papers (2021-06-09T01:59:55Z) - Learning Aggregation Functions [78.47770735205134]
We introduce LAF (Learning Aggregation Functions), a learnable aggregator for sets of arbitrary cardinality.
We report experiments on semi-synthetic and real data showing that LAF outperforms state-of-the-art sum- (max-) decomposition architectures.
arXiv Detail & Related papers (2020-12-15T18:28:53Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - From Sets to Multisets: Provable Variational Inference for Probabilistic
Integer Submodular Models [82.95892656532696]
Submodular functions have been studied extensively in machine learning and data mining.
In this work, we propose a continuous DR-submodular extension for integer submodular functions.
We formulate a new probabilistic model which is defined through integer submodular functions.
arXiv Detail & Related papers (2020-06-01T22:20:45Z) - Representing Unordered Data Using Complex-Weighted Multiset Automata [23.68657135308002]
We show how the multiset representations of certain existing neural architectures can be viewed as special cases of ours.
Namely, we provide a new theoretical and intuitive justification for the Transformer model's representation of positions using sinusoidal functions.
We extend the DeepSets model to use complex numbers, enabling it to outperform the existing model on an extension of one of their tasks.
arXiv Detail & Related papers (2020-01-02T20:04:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.