Universal Approximation of Functions on Sets
- URL: http://arxiv.org/abs/2107.01959v1
- Date: Mon, 5 Jul 2021 11:56:26 GMT
- Title: Universal Approximation of Functions on Sets
- Authors: Edward Wagstaff, Fabian B. Fuchs, Martin Engelcke, Michael A. Osborne,
Ingmar Posner
- Abstract summary: Deep Sets is known to be a universal approximator for continuous set functions.
Deep Sets may be viewed as the most efficient incarnation of the Janossy pooling paradigm.
- Score: 39.35754251872388
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modelling functions of sets, or equivalently, permutation-invariant
functions, is a long-standing challenge in machine learning. Deep Sets is a
popular method which is known to be a universal approximator for continuous set
functions. We provide a theoretical analysis of Deep Sets which shows that this
universal approximation property is only guaranteed if the model's latent space
is sufficiently high-dimensional. If the latent space is even one dimension
lower than necessary, there exist piecewise-affine functions for which Deep
Sets performs no better than a naïve constant baseline, as judged by
worst-case error. Deep Sets may be viewed as the most efficient incarnation of
the Janossy pooling paradigm. We identify this paradigm as encompassing most
currently popular set-learning methods. Based on this connection, we discuss
the implications of our results for set learning more broadly, and identify
some open questions on the universality of Janossy pooling in general.
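
For concreteness, the sum-decomposable model analysed here has the form f(X) = ρ(Σ_i φ(x_i)). The sketch below is an illustrative NumPy rendering of that form, not the authors' code; the random two-layer maps are assumptions, and `latent_dim` stands for the latent-space dimension whose size the paper shows is critical for universal approximation.

```python
# Minimal sketch of the Deep Sets sum-decomposable form f(X) = rho(sum_i phi(x_i)).
# Illustrative only: the fixed random two-layer maps are assumptions, not the
# authors' implementation. `latent_dim` is the latent-space dimension whose size
# the paper shows governs universal approximation.
import numpy as np

rng = np.random.default_rng(0)

def mlp(in_dim, out_dim):
    """Return a fixed random two-layer map from R^in_dim to R^out_dim."""
    W1 = rng.standard_normal((in_dim, 64))
    W2 = rng.standard_normal((64, out_dim))
    return lambda z: np.maximum(z @ W1, 0.0) @ W2

d, latent_dim = 3, 8          # per-element dimension, latent-space dimension
phi = mlp(d, latent_dim)      # element-wise encoder
rho = mlp(latent_dim, 1)      # decoder applied to the pooled latent vector

def deep_sets(X):
    """X: array of shape (n, d), one row per set element."""
    return rho(phi(X).sum(axis=0))   # summation makes the output permutation-invariant

X = rng.standard_normal((5, d))
print(np.allclose(deep_sets(X), deep_sets(X[::-1])))  # True: element order does not matter
```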
Related papers
- On the (Non) Injectivity of Piecewise Linear Janossy Pooling [3.396731589928944]
We consider the family of k-ary Janossy pooling, which includes many of the most popular multiset models, and prove that no piecewise linear Janossy pooling function can be injective.
On the positive side, we show that when restricted to multisets without multiplicities, even simple deep-sets models suffice for injectivity and bi-Lipschitzness.
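For reference (a standard statement of the paradigm, not notation taken from this paper), Janossy pooling symmetrises an arbitrary function by averaging it over orderings, and the k-ary restriction lets that function see only k elements at a time:

```latex
% Janossy pooling averages an arbitrary function g over orderings of the input;
% k-ary Janossy pooling restricts g to k elements at a time, so k = 1 recovers
% the Deep Sets sum/mean decomposition.
f(\{x_1,\dots,x_n\}) \;=\; \rho\!\Big(\tfrac{1}{|\Pi_k|}\sum_{(i_1,\dots,i_k)\in\Pi_k} g(x_{i_1},\dots,x_{i_k})\Big),
\qquad \Pi_k = \{\text{ordered $k$-tuples of distinct indices in } \{1,\dots,n\}\}.
```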
arXiv Detail & Related papers (2025-05-26T15:53:09Z)
- Enhancing Neural Subset Selection: Integrating Background Information into Set Representations [53.15923939406772]
We show that when the target value is conditioned on both the input set and subset, it is essential to incorporate an invariant sufficient statistic of the superset into the subset of interest.
This ensures that the output value remains invariant to permutations of the subset and its corresponding superset, enabling identification of the specific superset from which the subset originated.
arXiv Detail & Related papers (2024-02-05T16:09:35Z)
- Complexity-Theoretic Implications of Multicalibration [8.315801422499863]
Multicalibrated predictors satisfy a stronger condition: they are calibrated on each set in the collection.
This complexity-theoretic Regularity Lemma is known to have implications in different areas.
We show that every function (regardless of its hardness) has a small collection of disjoint hardcore sets.
arXiv Detail & Related papers (2023-12-28T18:53:21Z)
- Universal Representation of Permutation-Invariant Functions on Vectors and Tensors [11.345796608258434]
A main object of our study is multiset functions -- that is, permutation-invariant functions over inputs of varying sizes.
Deep Sets, proposed by Zaheer et al. (2017), provides a universal representation for continuous multiset functions on scalars via a sum-decomposable model.
We prove that universal representation is guaranteed for continuous and discontinuous multiset functions through a latent space dimension of $O(ND)$.
arXiv Detail & Related papers (2023-10-20T22:00:59Z)
- Deep Generalized Schrödinger Bridge [26.540105544872958]
Mean-Field Game serves as a crucial mathematical framework in modeling the collective behavior of individual agents.
We show that Schrödinger Bridge - as an entropy-regularized optimal transport model - can be generalized to accept mean-field structures.
Our method, named Deep Generalized Schrödinger Bridge (DeepGSB), outperforms prior methods in solving classical population navigation MFGs.
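For background (a standard formulation, not notation from this paper), the classical Schrödinger Bridge problem seeks the path measure closest in KL divergence to a reference diffusion subject to fixed endpoint marginals; DeepGSB generalizes this setting to mean-field interactions.

```latex
% Classical Schroedinger Bridge: entropy-regularized optimal transport between
% prescribed marginals, relative to a reference (e.g. Wiener) measure W.
\min_{P}\; D_{\mathrm{KL}}\!\left(P \,\|\, W\right)
\quad \text{subject to} \quad P_{0} = \mu_{0}, \qquad P_{1} = \mu_{1}.
```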
arXiv Detail & Related papers (2022-09-20T17:47:15Z)
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
- FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear Modulation [69.34011200590817]
We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation.
By modulating the network activations of a single deep network with FiLM, one obtains a model ensemble with high diversity.
We show that FiLM-Ensemble outperforms other implicit ensemble methods, and it comes very close to the upper bound of an explicit ensemble of networks.
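For intuition, feature-wise linear modulation scales and shifts each channel of a shared network's activations; giving every ensemble member its own modulation parameters turns one network into an implicit ensemble. The sketch below illustrates the operation only; the shapes and random parameters are assumptions, not the paper's implementation.

```python
# Sketch of feature-wise linear modulation (FiLM): scale and shift each channel
# of a shared network's activations. One (gamma, beta) pair per ensemble member
# yields an implicit ensemble from a single backbone. Shapes and the random
# parameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
num_members, channels = 4, 16

gamma = rng.standard_normal((num_members, channels))  # per-member scale
beta = rng.standard_normal((num_members, channels))   # per-member shift

def film(h, member):
    """Modulate activations h of shape (batch, channels) for one ensemble member."""
    return gamma[member] * h + beta[member]

h = rng.standard_normal((2, channels))                  # shared-backbone activations
predictions = [film(h, m) for m in range(num_members)]  # one modulated copy per member
```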
arXiv Detail & Related papers (2022-05-31T18:33:15Z)
- Learning Neural Set Functions Under the Optimal Subset Oracle [48.20868958542155]
We present a principled yet practical maximum likelihood learning framework, termed EquiVSet.
Our framework meets the desiderata of learning set functions under the OS oracle.
Thanks to the elegant combination of these advanced architectures, empirical studies on three real-world applications demonstrate that EquiVSet outperforms the baselines by a large margin.
arXiv Detail & Related papers (2022-03-03T12:59:00Z)
- Pareto Navigation Gradient Descent: a First-Order Algorithm for Optimization in Pareto Set [17.617944390196286]
Modern machine learning applications, such as multi-task learning, require finding optimal model parameters to trade off multiple objective functions.
We propose a first-order algorithm that approximately solves OPT-in-Pareto using only gradient information.
arXiv Detail & Related papers (2021-10-17T04:07:04Z)
- Learning Aggregation Functions [78.47770735205134]
We introduce LAF (Learning Aggregation Functions), a learnable aggregator for sets of arbitrary cardinality.
We report experiments on semi-synthetic and real data showing that LAF outperforms state-of-the-art sum- (max-) decomposition architectures.
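As a toy illustration of what "learning the aggregation function" means, the snippet below uses log-sum-exp pooling with a tunable temperature, which interpolates between the mean and the max of a set; this is a generic stand-in, not LAF's actual parameterization.

```python
# Generic learnable aggregation: log-sum-exp pooling with a temperature t
# interpolates between the mean (t -> 0) and the max (t -> infinity) of a set.
# This is only a toy stand-in for learning the aggregation function; it is not
# the LAF parameterization from the paper.
import numpy as np

def lse_pool(x, t):
    """Aggregate a 1-D array x with temperature t > 0."""
    return np.log(np.mean(np.exp(t * x))) / t

x = np.array([0.1, 0.5, 2.0])
print(lse_pool(x, 1e-3))  # close to mean(x)
print(lse_pool(x, 50.0))  # close to max(x)
```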
arXiv Detail & Related papers (2020-12-15T18:28:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.