Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets
- URL: http://arxiv.org/abs/2206.11925v1
- Date: Thu, 23 Jun 2022 18:04:56 GMT
- Title: Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets
- Authors: Lily H. Zhang, Veronica Tozzo, John M. Higgins, and Rajesh Ranganath
- Abstract summary: We show that existing permutation invariant architectures, Deep Sets and Set Transformer, can suffer from vanishing or exploding gradients when they are deep.
We introduce the clean path principle for equivariant residual connections and develop set norm, a normalization tailored for sets.
With these, we build Deep Sets++ and Set Transformer++, models that reach high depths with comparable or better performance than their original counterparts.
- Score: 18.582561853987027
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Permutation invariant neural networks are a promising tool for making
predictions from sets. However, we show that existing permutation invariant
architectures, Deep Sets and Set Transformer, can suffer from vanishing or
exploding gradients when they are deep. Additionally, layer norm, the
normalization of choice in Set Transformer, can hurt performance by removing
information useful for prediction. To address these issues, we introduce the
clean path principle for equivariant residual connections and develop set norm,
a normalization tailored for sets. With these, we build Deep Sets++ and Set
Transformer++, models that reach high depths with comparable or better
performance than their original counterparts on a diverse suite of tasks. We
additionally introduce Flow-RBC, a new single-cell dataset and real-world
application of permutation invariant prediction. We open-source our data and
code here: https://github.com/rajesh-lab/deep_permutation_invariant.
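As a rough illustration of the abstract's two ingredients, the sketch below shows one plausible reading of set norm (per-set standardization over both elements and features, followed by a per-feature affine transform) and of the clean path principle for equivariant residual blocks (normalization and the element-wise transform sit on the residual branch, while the skip path carries the input unchanged). The class names, the MLP width, and the specific layer choices here are illustrative assumptions, not the authors' released implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn


class SetNorm(nn.Module):
    """Hedged sketch of set norm: normalize each set over all of its
    elements and features.

    Input shape: (batch, n_elements, n_features). Statistics are computed
    per set over the element and feature axes, so the layer remains
    permutation equivariant; gamma/beta are per-feature affine parameters.
    """

    def __init__(self, n_features: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(n_features))
        self.beta = nn.Parameter(torch.zeros(n_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mean = x.mean(dim=(1, 2), keepdim=True)
        var = x.var(dim=(1, 2), keepdim=True, unbiased=False)
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return x_hat * self.gamma + self.beta


class CleanPathResidualBlock(nn.Module):
    """Equivariant residual block with a 'clean' skip path (assumed reading
    of the clean path principle): normalization and the element-wise MLP
    are applied only on the residual branch, and the skip connection
    passes x through untouched.
    """

    def __init__(self, n_features: int, hidden: int = 128):
        super().__init__()
        self.norm = SetNorm(n_features)
        # Element-wise MLP applied identically to every set element,
        # the equivariant building block used in Deep Sets-style models.
        self.mlp = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.mlp(self.norm(x))


if __name__ == "__main__":
    x = torch.randn(4, 10, 32)  # 4 sets, 10 elements, 32 features each
    block = CleanPathResidualBlock(32)
    y = block(x)
    # Permutation equivariance check: permuting the elements before or
    # after the block gives the same result (up to numerical precision).
    perm = torch.randperm(10)
    assert torch.allclose(block(x[:, perm]), y[:, perm], atol=1e-5)
    print(y.shape)
```

Keeping the skip path free of normalization and nonlinearities gives every layer an identity route for the gradient, which is the usual argument for why such blocks stay trainable at large depth.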
Related papers
- Restore Translation Using Equivariant Neural Networks [7.78895108256899]
In this paper, we propose a pre-classifier restorer to recover translated (or even rotated) inputs to a convolutional neural network.
The restorer is based on a theoretical result that gives a necessary and sufficient condition for an affine operator to be translation equivariant on a tensor space.
arXiv Detail & Related papers (2023-06-29T13:34:35Z)
- Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance, and in particular the sample complexity, of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
arXiv Detail & Related papers (2023-03-02T20:44:45Z)
- Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation [105.22961467028234]
Skip connections and normalisation layers are ubiquitous for the training of Deep Neural Networks (DNNs).
Recent approaches such as Deep Kernel Shaping have made progress towards reducing our reliance on them.
But these approaches are incompatible with the self-attention layers present in transformers.
arXiv Detail & Related papers (2023-02-20T21:26:25Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- Top-N: Equivariant set and graph generation without exchangeability [61.24699600833916]
We consider one-shot probabilistic decoders that map a vector-shaped prior to a distribution over sets or graphs.
These functions can be integrated into variational autoencoders (VAEs), generative adversarial networks (GANs), or normalizing flows.
Top-n is a deterministic, non-exchangeable set creation mechanism which learns to select the most relevant points from a trainable reference set.
arXiv Detail & Related papers (2021-10-05T14:51:19Z)
- IOT: Instance-wise Layer Reordering for Transformer Structures [173.39918590438245]
We break the assumption of the fixed layer order in the Transformer and introduce instance-wise layer reordering into the model structure.
Our method can also be applied to other architectures beyond Transformer.
arXiv Detail & Related papers (2021-03-05T03:44:42Z)
- Conditional Set Generation with Transformers [15.315473956458227]
A set is an unordered collection of unique elements.
Many machine learning models that generate sets impose an implicit or explicit ordering.
An alternative solution is to use a permutation-equivariant set generator, which does not specify an ordering.
We introduce the Transformer Set Prediction Network (TSPN), a flexible permutation-equivariant model for set prediction.
arXiv Detail & Related papers (2020-06-26T17:52:27Z)
- Learn to Predict Sets Using Feed-Forward Neural Networks [63.91494644881925]
This paper addresses the task of set prediction using deep feed-forward neural networks.
We present a novel approach for learning to predict sets with unknown permutation and cardinality.
We demonstrate the validity of our set formulations on relevant vision problems.
arXiv Detail & Related papers (2020-01-30T01:52:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.