Conditional Set Generation with Transformers
- URL: http://arxiv.org/abs/2006.16841v2
- Date: Wed, 1 Jul 2020 06:00:00 GMT
- Title: Conditional Set Generation with Transformers
- Authors: Adam R. Kosiorek, Hyunjik Kim, Danilo J. Rezende
- Abstract summary: A set is an unordered collection of unique elements.
Many machine learning models generate sets that impose an implicit or explicit ordering.
An alternative solution is to use a permutation-equivariant set generator, which does not specify an ordering.
We introduce the Transformer Set Prediction Network (TSPN), a flexible permutation-equivariant model for set prediction.
- Score: 15.315473956458227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A set is an unordered collection of unique elements, and yet many machine
learning models that generate sets impose an implicit or explicit ordering.
Since model performance can depend on the choice of order, any particular
ordering can lead to sub-optimal results. An alternative solution is to use a
permutation-equivariant set generator, which does not specify an ordering. An
example of such a generator is the DeepSet Prediction Network (DSPN). We
introduce the Transformer Set Prediction Network (TSPN), a flexible
permutation-equivariant model for set prediction based on the transformer, that
builds upon and outperforms DSPN in the quality of predicted set elements and
in the accuracy of their predicted sizes. We test our model on
MNIST-as-point-clouds (SET-MNIST) for point-cloud generation and on CLEVR for
object detection.
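The abstract rests on the fact that a transformer without positional encodings is permutation-equivariant: permuting the input set permutes the output in exactly the same way. A minimal NumPy sketch of this property (a single self-attention head with illustrative dimensions; this is not the authors' TSPN code):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # feature dimension (illustrative)

# Random projection weights for a single self-attention head.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def self_attention(X):
    """Single-head self-attention with no positional encodings.

    Without positional information, permuting the rows of X permutes
    the rows of the output identically: the layer is
    permutation-equivariant, the property TSPN relies on.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    # Row-wise softmax over attention scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

X = rng.normal(size=(5, d))  # a "set" of 5 elements
perm = rng.permutation(5)

out_then_perm = self_attention(X)[perm]   # attend, then permute
perm_then_out = self_attention(X[perm])   # permute, then attend
assert np.allclose(out_then_perm, perm_then_out)
```

Because the two orders of operations agree, no particular element ordering is privileged, which is why such a generator avoids the sub-optimality the abstract attributes to order-dependent models.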
Related papers
- Conformal Nucleus Sampling [67.5232384936661]
We assess whether a top-$p$ set is indeed aligned with its probabilistic meaning in various linguistic contexts.
We find that OPT models are overconfident, and that calibration shows a moderate inverse scaling with model size.
arXiv Detail & Related papers (2023-05-04T08:11:57Z)
- Sampled Transformer for Point Sets [80.66097006145999]
The sparse transformer can reduce the computational complexity of the self-attention layers to $O(n)$ while still being a universal approximator of continuous sequence-to-sequence functions.
We propose an $O(n)$ complexity sampled transformer that can process point set elements directly without any additional inductive bias.
arXiv Detail & Related papers (2023-02-28T06:38:05Z)
- Sequence-to-Set Generative Models [9.525560801277903]
We propose a sequence-to-set method to transform any sequence generative model into a set generative model.
We present GRU2Set, which is an instance of our sequence-to-set method and employs the famous GRU model as the sequence generative model.
A direct application of our models is to learn an order/set distribution from a collection of e-commerce orders.
arXiv Detail & Related papers (2022-09-19T07:13:51Z)
- Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets [18.582561853987027]
We show that existing permutation invariant architectures, Deep Sets and Set Transformer, can suffer from vanishing or exploding gradients when they are deep.
We introduce the clean path principle for equivariant residual connections and develop set norm, a normalization tailored for sets.
With these, we build Deep Sets++ and Set Transformer++, models that reach high depths with comparable or better performance than their original counterparts.
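The abstract names "set norm" but does not define it. One plausible formulation, sketched here as an assumption rather than the paper's exact recipe, standardizes each set using statistics pooled over both the element and feature dimensions (so the result is independent of element order and set size), followed by a per-feature affine transform:

```python
import numpy as np

def set_norm(X, gamma, beta, eps=1e-5):
    """Hypothetical set-tailored normalization (an assumption, not the
    paper's verified definition): standardize the whole set with mean
    and variance pooled over ALL elements and features, then apply a
    learnable per-feature scale (gamma) and shift (beta).
    """
    mu = X.mean()   # pooled over every element and feature
    var = X.var()
    return gamma * (X - mu) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 4))  # a set of 6 elements with 4 features
gamma, beta = np.ones(4), np.zeros(4)

Y = set_norm(X, gamma, beta)
# Pooled statistics are permutation-invariant, so the normalization
# is permutation-equivariant, as a set layer must be.
perm = rng.permutation(6)
assert np.allclose(set_norm(X[perm], gamma, beta), Y[perm])
```

Because the pooled statistics ignore element order, this normalization preserves the permutation equivariance that the paper's residual connections are designed to maintain.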
arXiv Detail & Related papers (2022-06-23T18:04:56Z)
- Conditional set generation using Seq2seq models [52.516563721766445]
Conditional set generation learns a mapping from an input sequence of tokens to a set.
Sequence-to-sequence (Seq2seq) models are a popular choice for modeling set generation.
We propose a novel algorithm for effectively sampling informative orders from the space of label orders.
arXiv Detail & Related papers (2022-05-25T04:17:50Z)
- Top-N: Equivariant set and graph generation without exchangeability [61.24699600833916]
We consider one-shot probabilistic decoders that map a vector-shaped prior to a distribution over sets or graphs.
These functions can be integrated into variational autoencoders (VAE), generative adversarial networks (GAN) or normalizing flows.
Top-N is a deterministic, non-exchangeable set creation mechanism that learns to select the most relevant points from a trainable reference set.
arXiv Detail & Related papers (2021-10-05T14:51:19Z)
- Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity [16.893758238773263]
When primed with only a handful of training samples, very large pretrained language models such as GPT-3 have shown competitive results.
We demonstrate that the order in which the samples are provided can be the difference between near state-of-the-art and random guess performance.
We use the generative nature of the language models to construct an artificial development set and, based on entropy statistics of the candidate permutations from this set, identify performant prompts.
arXiv Detail & Related papers (2021-04-18T09:29:16Z)
- IOT: Instance-wise Layer Reordering for Transformer Structures [173.39918590438245]
We break the assumption of the fixed layer order in the Transformer and introduce instance-wise layer reordering into the model structure.
Our method can also be applied to other architectures beyond Transformer.
arXiv Detail & Related papers (2021-03-05T03:44:42Z)
- Set Distribution Networks: a Generative Model for Sets of Images [22.405670277339023]
We introduce Set Distribution Networks (SDNs), a framework that learns to autoencode and freely generate sets.
We show that SDNs are able to reconstruct image sets that preserve salient attributes of the inputs in our benchmark datasets.
We examine the sets generated by SDNs with a pre-trained 3D reconstruction network and with a face verification network, as a novel way to evaluate the quality of generated sets of images.
arXiv Detail & Related papers (2020-06-18T17:38:56Z)
- Learn to Predict Sets Using Feed-Forward Neural Networks [63.91494644881925]
This paper addresses the task of set prediction using deep feed-forward neural networks.
We present a novel approach for learning to predict sets with unknown permutation and cardinality.
We demonstrate the validity of our set formulations on relevant vision problems.
arXiv Detail & Related papers (2020-01-30T01:52:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.