Translation Equivariant Transformer Neural Processes
- URL: http://arxiv.org/abs/2406.12409v1
- Date: Tue, 18 Jun 2024 08:58:59 GMT
- Title: Translation Equivariant Transformer Neural Processes
- Authors: Matthew Ashman, Cristiana Diaconu, Junhyuck Kim, Lakee Sivaraya, Stratis Markou, James Requeima, Wessel P. Bruinsma, Richard E. Turner
- Abstract summary: The effectiveness of neural processes (NPs) in modelling posterior prediction maps has significantly improved since their inception.
This improvement can be attributed to two principal factors: (1) advancements in the architecture of permutation invariant set functions, which are intrinsic to all NPs; and (2) leveraging symmetries present in the true posterior predictive map.
- Score: 22.463975744505717
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The effectiveness of neural processes (NPs) in modelling posterior prediction maps -- the mapping from data to posterior predictive distributions -- has significantly improved since their inception. This improvement can be attributed to two principal factors: (1) advancements in the architecture of permutation invariant set functions, which are intrinsic to all NPs; and (2) leveraging symmetries present in the true posterior predictive map, which are problem dependent. Transformers are a notable development in permutation invariant set functions, and their utility within NPs has been demonstrated through the family of models we refer to as TNPs. Despite significant interest in TNPs, little attention has been given to incorporating symmetries. Notably, the posterior prediction maps for data that are stationary -- a common assumption in spatio-temporal modelling -- exhibit translation equivariance. In this paper, we introduce a new family of translation equivariant TNPs (TE-TNPs). Through an extensive range of experiments on synthetic and real-world spatio-temporal data, we demonstrate the effectiveness of TE-TNPs relative to their non-translation-equivariant counterparts and other NP baselines.
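The two properties named in the abstract can be checked numerically on a toy predictor. The sketch below is not the TE-TNP architecture; it is a simple kernel smoother, chosen as an assumption for illustration because its weights depend on the inputs only through the differences between target and context locations, which is one standard way to obtain translation equivariance. The function name `kernel_smoother`, the `lengthscale` value, and the shift `tau` are all hypothetical.

```python
import numpy as np


def kernel_smoother(xc, yc, xt, lengthscale=0.5):
    """Toy predictor: softmax-weighted average of context outputs.

    The weights depend only on the differences xt - xc, so shifting
    all inputs by the same amount leaves the predictions unchanged.
    """
    diffs = xt[:, None] - xc[None, :]            # shape (n_target, n_context)
    logits = -0.5 * (diffs / lengthscale) ** 2
    logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ yc                          # shape (n_target,)


rng = np.random.default_rng(0)
xc = rng.uniform(-2.0, 2.0, size=8)   # context inputs
yc = np.sin(xc)                       # context outputs
xt = np.linspace(-2.0, 2.0, 5)        # target inputs
tau = 3.7                             # arbitrary translation

base = kernel_smoother(xc, yc, xt)

# Translation equivariance: shifting context and target inputs together
# yields the same predictions at the shifted target locations.
shifted = kernel_smoother(xc + tau, yc, xt + tau)
translation_equivariant = np.allclose(base, shifted)

# Permutation invariance: reordering the context set changes nothing.
perm = rng.permutation(len(xc))
permuted = kernel_smoother(xc[perm], yc[perm], xt)
permutation_invariant = np.allclose(base, permuted)

print(translation_equivariant, permutation_invariant)
```

A predictor that consumed the raw input locations directly would generally fail the first check, which is the gap the abstract points to in existing TNPs.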
Related papers
- Relative Representations: Topological and Geometric Perspectives [53.88896255693922]
Relative representations are an established approach to zero-shot model stitching.
We introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations.
We also propose deploying topological densification when fine-tuning relative representations: a topological regularization loss that encourages clustering within classes.
arXiv Detail & Related papers (2024-09-17T08:09:22Z) - Neural Functional Transformers [99.98750156515437]
This paper uses the attention mechanism to define a novel set of permutation equivariant weight-space layers called neural functional Transformers (NFTs).
NFTs respect weight-space permutation symmetries while incorporating the advantages of attention, which have exhibited remarkable success across multiple domains.
We also leverage NFTs to develop Inr2Array, a novel method for computing permutation invariant representations from the weights of implicit neural representations (INRs).
arXiv Detail & Related papers (2023-05-22T23:38:27Z) - Interrelation of equivariant Gaussian processes and convolutional neural networks [77.34726150561087]
Currently there exists a rather promising new trend in machine learning (ML) based on the relationship between neural networks (NNs) and Gaussian processes (GPs).
In this work we establish a relationship between the many-channel limit of CNNs that are equivariant with respect to the two-dimensional Euclidean group, with vector-valued neuron activations, and the corresponding independently introduced equivariant Gaussian processes (GPs).
arXiv Detail & Related papers (2022-09-17T17:02:35Z) - Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling [26.377099481072992]
We propose Transformer Neural Processes (TNPs) for uncertainty-aware meta learning.
We learn TNPs via an autoregressive likelihood-based objective and instantiate it with a novel transformer-based architecture.
We show that TNPs achieve state-of-the-art performance on various benchmark problems.
arXiv Detail & Related papers (2022-07-09T02:28:58Z) - Deformation Robust Roto-Scale-Translation Equivariant CNNs [10.44236628142169]
Group-equivariant convolutional neural networks (G-CNNs) achieve significantly improved generalization performance with intrinsic symmetry.
General theory and practical implementation of G-CNNs have been studied for planar images under either rotation or scaling transformation.
arXiv Detail & Related papers (2021-11-22T03:58:24Z) - Topographic VAEs learn Equivariant Capsules [84.33745072274942]
We introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables.
We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST.
We demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.
arXiv Detail & Related papers (2021-09-03T09:25:57Z) - Group Equivariant Conditional Neural Processes [30.134634059773703]
We present the group equivariant conditional neural process (EquivCNP).
We show that EquivCNP achieves comparable performance to conventional conditional neural processes in a 1D regression task.
arXiv Detail & Related papers (2021-02-17T13:50:07Z) - Equivariant Learning of Stochastic Fields: Gaussian Processes and Steerable Conditional Neural Processes [44.51932024971217]
We study the problem of learning fields, i.e. processes whose samples are fields like those occurring in physics and engineering.
We introduce Steerable Conditional Neural Processes (SteerCNPs), a new, fully equivariant member of the Neural Process family.
In experiments with Gaussian process vector fields, images, and real-world weather data, we observe that SteerCNPs significantly improve the performance of previous models.
arXiv Detail & Related papers (2020-11-25T18:00:40Z) - Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes [32.02612871707347]
We propose ConvNP, which endows Neural Processes (NPs) with translation equivariance and extends convolutional conditional NPs to allow for dependencies in the predictive distribution.
We demonstrate the strong performance and generalization capabilities of ConvNPs on 1D regression, image completion, and various tasks with real-world spatio-temporal data.
arXiv Detail & Related papers (2020-07-02T18:25:27Z) - NP-PROV: Neural Processes with Position-Relevant-Only Variances [113.20013269514327]
We present a new member of the neural process family named Neural Processes with Position-Relevant-Only Variances (NP-PROV).
NP-PROV hypothesizes that a target point close to a context point has small uncertainty, regardless of the function value at that position.
Our evaluation on synthetic and real-world datasets reveals that NP-PROV can achieve state-of-the-art likelihood while retaining a bounded variance.
arXiv Detail & Related papers (2020-06-15T06:11:21Z) - Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach [85.12934750565971]
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks.
To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
arXiv Detail & Related papers (2020-01-27T22:38:40Z)