Set Interdependence Transformer: Set-to-Sequence Neural Networks for
Permutation Learning and Structure Prediction
- URL: http://arxiv.org/abs/2206.03720v1
- Date: Wed, 8 Jun 2022 07:46:49 GMT
- Title: Set Interdependence Transformer: Set-to-Sequence Neural Networks for
Permutation Learning and Structure Prediction
- Authors: Mateusz Jurewicz and Leon Derczynski
- Abstract summary: Set-to-sequence problems occur in natural language processing, computer vision and structure prediction.
Previous attention-based methods require $n$ layers of their set transformations to explicitly represent $n$-th order relations.
We propose a novel neural set encoding method called the Set Interdependence Transformer, capable of relating the set's permutation invariant representation to its elements within sets of any cardinality.
- Score: 6.396288020763144
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The task of learning to map an input set onto a permuted sequence of its
elements is challenging for neural networks. Set-to-sequence problems occur in
natural language processing, computer vision and structure prediction, where
interactions between elements of large sets define the optimal output. Models
must exhibit relational reasoning, handle varying cardinalities and manage
combinatorial complexity. Previous attention-based methods require $n$ layers
of their set transformations to explicitly represent $n$-th order relations.
Our aim is to enhance their ability to efficiently model higher-order
interactions through an additional interdependence component. We propose a
novel neural set encoding method called the Set Interdependence Transformer,
capable of relating the set's permutation invariant representation to its
elements within sets of any cardinality. We combine it with a permutation
learning module into a complete, 3-part set-to-sequence model and demonstrate
its state-of-the-art performance on a number of tasks. These range from
combinatorial optimization problems, through permutation learning challenges on
both synthetic and established NLP datasets for sentence ordering, to a novel
domain of product catalog structure prediction. Additionally, the network's
ability to generalize to unseen sequence lengths is investigated and a
comparative empirical analysis of the existing methods' ability to learn
higher-order interactions is provided.
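To make the three-part pipeline concrete, below is a minimal PyTorch sketch of a set-to-sequence model in this spirit: a permutation-equivariant set encoder, an interdependence step that cross-attends between the pooled permutation-invariant set summary and the individual element embeddings, and a pointer-network-style decoder that emits a permutation. The module names, mean-pooling choice and greedy decoding are illustrative assumptions for the sketch, not the Set Interdependence Transformer's exact formulation, and the weights are untrained.

```python
import torch
import torch.nn as nn

class SetEncoder(nn.Module):
    """Permutation-equivariant element encoder (illustrative: plain self-attention)."""
    def __init__(self, dim, heads=4, layers=2):
        super().__init__()
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, num_layers=layers)

    def forward(self, x):                      # x: (batch, n, dim), order-agnostic
        return self.encoder(x)

class InterdependenceBlock(nn.Module):
    """Relates the pooled, permutation-invariant set summary back to its elements
    via cross-attention (a stand-in for the interdependence component, not its
    exact formulation)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, elems):                  # elems: (batch, n, dim)
        summary = elems.mean(dim=1, keepdim=True)       # permutation-invariant pooling
        mixed, _ = self.attn(query=elems, key=summary, value=summary)
        return elems + mixed                   # elements enriched with whole-set context

class PointerDecoder(nn.Module):
    """Pointer-network-style permutation decoder: greedily selects one
    not-yet-chosen element per step."""
    def __init__(self, dim):
        super().__init__()
        self.rnn = nn.GRUCell(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, elems):                  # elems: (batch, n, dim)
        b, n, _ = elems.shape
        state = elems.mean(dim=1)              # start from the set summary
        chosen = torch.zeros(b, n, dtype=torch.bool, device=elems.device)
        order = []
        for _ in range(n):
            scores = torch.einsum("bd,bnd->bn", state, elems) * self.scale
            scores = scores.masked_fill(chosen, float("-inf"))
            idx = scores.argmax(dim=-1)        # (batch,) index of next element
            order.append(idx)
            chosen[torch.arange(b), idx] = True
            state = self.rnn(elems[torch.arange(b), idx], state)
        return torch.stack(order, dim=1)       # predicted permutation, (batch, n)

# Toy usage: order a set of 5 vectors per batch element.
dim = 32
model = nn.ModuleDict({
    "enc": SetEncoder(dim), "inter": InterdependenceBlock(dim), "dec": PointerDecoder(dim)
})
x = torch.randn(2, 5, dim)
perm = model["dec"](model["inter"](model["enc"](x)))
print(perm.shape)  # torch.Size([2, 5])
```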
Related papers
- Equivariant Transduction through Invariant Alignment [71.45263447328374]
We introduce a novel group-equivariant architecture that incorporates a group-invariant hard alignment mechanism.
We find that our network's structure allows it to develop stronger equivariant properties than existing group-equivariant approaches.
We additionally find that it outperforms previous group-equivariant networks empirically on the SCAN task.
arXiv Detail & Related papers (2022-09-22T11:19:45Z)
- Modeling Structure with Undirected Neural Networks [20.506232306308977]
We propose undirected neural networks, a flexible framework for specifying computations that can be performed in any order.
We demonstrate the effectiveness of undirected neural architectures, both unstructured and structured, on a range of tasks.
arXiv Detail & Related papers (2022-02-08T10:06:51Z)
- Discovering Non-monotonic Autoregressive Orderings with Variational Inference [67.27561153666211]
We develop an unsupervised parallelizable learner that discovers high-quality generation orders purely from training data.
We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass.
Empirical results in language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
arXiv Detail & Related papers (2021-10-27T16:08:09Z)
- Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions.
Existing neural models have been shown to lack this basic ability in learning symbolic structures.
We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- Set-to-Sequence Methods in Machine Learning: a Review [0.0]
Machine learning on sets towards sequential output is an important and ubiquitous task, with applications ranging from language modelling and meta-learning to multi-agent strategy games and power grid optimization.
This paper provides a comprehensive introduction to the field as well as an overview of important machine learning methods tackling its two key challenges: obtaining a permutation invariant representation of the input set and using that representation to predict a complex target permutation.
arXiv Detail & Related papers (2021-03-17T13:52:33Z)
- Set Representation Learning with Generalized Sliced-Wasserstein Embeddings [22.845403993200932]
We propose a geometrically-interpretable framework for learning representations from set-structured data.
In particular, we treat elements of a set as samples from a probability measure and propose an exact Euclidean embedding for the Generalized Sliced Wasserstein distance.
We evaluate our proposed framework on multiple supervised and unsupervised set learning tasks and demonstrate its superiority over state-of-the-art set representation learning approaches.
arXiv Detail & Related papers (2021-03-05T19:00:34Z)
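As a rough illustration of the sliced-Wasserstein embedding idea behind the entry above: each set can be projected onto random directions and the sorted projections concatenated into a fixed-size Euclidean vector, so that Euclidean distance between embeddings estimates the sliced Wasserstein-2 distance between equal-size sets. This uses plain linear slices rather than the paper's generalized (nonlinear) ones, and the helper name and dimensions are assumptions for the sketch.

```python
import torch

def sliced_embedding(points, directions):
    """Sliced-Wasserstein-style set embedding (illustrative, linear slices only).
    points: (n, d) set elements treated as an empirical measure;
    directions: (L, d) unit-norm projection directions (random slices)."""
    proj = points @ directions.T             # (n, L) 1-D projections, one per slice
    proj, _ = torch.sort(proj, dim=0)        # sorting gives the optimal 1-D transport order
    return proj.T.reshape(-1) / points.shape[0] ** 0.5   # fixed-size Euclidean vector

# Toy usage: distance between embeddings of two equal-size sets approximates
# their sliced Wasserstein-2 distance (Monte Carlo estimate over L slices).
torch.manual_seed(0)
d, L = 3, 64
dirs = torch.nn.functional.normalize(torch.randn(L, d), dim=1)
set_a, set_b = torch.randn(10, d), torch.randn(10, d) + 1.0
dist = torch.norm(sliced_embedding(set_a, dirs) - sliced_embedding(set_b, dirs)) / L ** 0.5
print(float(dist))
```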
- Learn to Predict Sets Using Feed-Forward Neural Networks [63.91494644881925]
This paper addresses the task of set prediction using deep feed-forward neural networks.
We present a novel approach for learning to predict sets with unknown permutation and cardinality.
We demonstrate the validity of our set formulations on relevant vision problems.
arXiv Detail & Related papers (2020-01-30T01:52:07Z)
- Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach [85.12934750565971]
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks.
To alleviate the exponential growth in parameters that comes with explicitly modelling every feature interaction, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
arXiv Detail & Related papers (2020-01-27T22:38:40Z)
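The entry above can be illustrated with a generic rank-R CP parameterisation of a feature-interaction model: instead of materialising the full interaction weight tensor, one factor matrix per feature is kept and contracted against a per-feature mapping. The dimensions and the polynomial feature map below are assumptions for the sketch, not the paper's setup.

```python
import torch

# Generic rank-R CP parameterisation of a multilinear interaction model.
# A dense weight tensor over D features, each mapped to an M-dimensional vector,
# would hold M**D parameters; the CP form stores only D * R * M.
D, M, R = 4, 8, 3                                  # features, feature-map size, CP rank
factors = [torch.randn(R, M, requires_grad=True) for _ in range(D)]

def feature_map(x):                                # x: (batch, D)
    """Map each scalar feature to an M-dimensional polynomial feature vector."""
    powers = torch.arange(M, dtype=x.dtype)
    return x.unsqueeze(-1) ** powers               # (batch, D, M)

def predict(x):
    """f(x) = sum_r prod_d <factors[d][r], phi(x)_d>: interactions of all orders
    up to D are captured without materialising the full weight tensor."""
    phi = feature_map(x)                           # (batch, D, M)
    prod = torch.ones(x.shape[0], R)
    for d in range(D):
        prod = prod * (phi[:, d, :] @ factors[d].T)   # (batch, R)
    return prod.sum(dim=-1)                        # (batch,)

y = predict(torch.randn(5, D))
print(y.shape)                                     # torch.Size([5])
```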