Latent Bottlenecked Attentive Neural Processes
- URL: http://arxiv.org/abs/2211.08458v1
- Date: Tue, 15 Nov 2022 19:21:41 GMT
- Title: Latent Bottlenecked Attentive Neural Processes
- Authors: Leo Feng, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed
- Abstract summary: We present Latent Bottlenecked Attentive Neural Processes (LBANPs)
LBANPs have a querying computational complexity independent of the number of context datapoints.
We show LBANPs achieve results competitive with the state-of-the-art on meta-regression, image completion, and contextual multi-armed bandits.
- Score: 71.18817592128207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Processes (NPs) are popular methods in meta-learning that can estimate
predictive uncertainty on target datapoints by conditioning on a context
dataset. The previous state-of-the-art method, Transformer Neural Processes
(TNPs), achieves strong performance but requires quadratic computation with
respect to the number of context datapoints, significantly limiting its
scalability. Conversely, existing sub-quadratic NP variants perform
significantly worse than TNPs. Tackling this issue, we propose Latent
Bottlenecked Attentive Neural Processes (LBANPs), a new computationally
efficient sub-quadratic NP variant that has a querying computational
complexity independent of the number
of context datapoints. The model encodes the context dataset into a constant
number of latent vectors on which self-attention is performed. When making
predictions, the model retrieves higher-order information from the context
dataset via multiple cross-attention mechanisms on the latent vectors. We
empirically show that LBANPs achieve results competitive with the
state-of-the-art on meta-regression, image completion, and contextual
multi-armed bandits. We demonstrate that LBANPs can trade-off the computational
cost and performance according to the number of latent vectors. Finally, we
show LBANPs can scale beyond existing attention-based NP variants to larger
dataset settings.
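The querying mechanism described in the abstract can be illustrated with a minimal single-head NumPy sketch. This is a hedged illustration, not the paper's implementation: the real model uses learned projections, multiple layers, and multiple cross-attention retrievals, whereas here the dimensions, the `attention` helper, and the random embeddings are all illustrative assumptions. The point it shows is the complexity split: encoding touches the context once, while each prediction attends only to the constant-size latent set.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Single-head scaled dot-product attention (no learned projections).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
d, n_ctx, n_tgt, n_latent = 16, 100, 5, 8  # n_latent is a constant, independent of n_ctx

context = rng.normal(size=(n_ctx, d))    # embedded context datapoints
targets = rng.normal(size=(n_tgt, d))    # embedded target queries
latents = rng.normal(size=(n_latent, d)) # learned latent vectors (random here)

# Encoding: latents cross-attend to the context (cost O(n_latent * n_ctx)),
# then self-attend among themselves (cost O(n_latent^2)).
latents = attention(latents, context, context)
latents = attention(latents, latents, latents)

# Querying: targets cross-attend only to the latents, so per-query cost is
# O(n_latent), independent of the number of context datapoints.
out = attention(targets, latents, latents)
assert out.shape == (n_tgt, d)
```

Because `n_latent` is fixed, growing the context only affects the one-time encoding pass; query-time cost stays constant, which is the trade-off the abstract describes between the number of latent vectors and performance.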
Related papers
- In-Context In-Context Learning with Transformer Neural Processes [50.57807892496024]
We develop the in-context in-context learning pseudo-token TNP (ICICL-TNP)
The ICICL-TNP is capable of conditioning on both sets of datapoints and sets of datasets, enabling it to perform in-context in-context learning.
We demonstrate the importance of in-context in-context learning and the effectiveness of the ICICL-TNP in a number of experiments.
arXiv Detail & Related papers (2024-06-19T12:26:36Z) - Versatile Neural Processes for Learning Implicit Neural Representations [57.090658265140384]
We propose Versatile Neural Processes (VNP), which largely increases the capability of approximating functions.
Specifically, we introduce a bottleneck encoder that produces fewer and informative context tokens, relieving the high computational cost.
We demonstrate the effectiveness of the proposed VNP on a variety of tasks involving 1D, 2D and 3D signals.
arXiv Detail & Related papers (2023-01-21T04:08:46Z) - Sample-Then-Optimize Batch Neural Thompson Sampling [50.800944138278474]
We introduce two algorithms for black-box optimization based on the Thompson sampling (TS) policy.
To choose an input query, we only need to train an NN and then choose the query by maximizing the trained NN.
Our algorithms sidestep the need to invert the large parameter matrix yet still preserve the validity of the TS policy.
arXiv Detail & Related papers (2022-10-13T09:01:58Z) - Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling [26.377099481072992]
We propose Transformer Neural Processes (TNPs) for uncertainty-aware meta learning.
We learn TNPs via an autoregressive likelihood-based objective and instantiate it with a novel transformer-based architecture.
We show that TNPs achieve state-of-the-art performance on various benchmark problems.
arXiv Detail & Related papers (2022-07-09T02:28:58Z) - Practical Conditional Neural Processes Via Tractable Dependent Predictions [25.15531845287349]
Conditional Neural Processes (CNPs) are meta-learning models which leverage the flexibility of deep learning to produce well-calibrated predictions.
CNPs do not produce correlated predictions, making them inappropriate for many estimation and decision making tasks.
We present a new class of Neural Process models that make correlated predictions and support exact maximum likelihood training.
arXiv Detail & Related papers (2022-03-16T17:37:41Z) - Message Passing Neural Processes [3.0969191504482247]
We introduce Message Passing Neural Processes (MPNPs), which explicitly make use of relational structure within the model.
MPNPs thrive at lower sampling rates on existing benchmarks and on the newly proposed CA and Cora-Branched tasks.
We report strong generalisation over density-based CA rulesets and significant gains in challenging arbitrary-labelling and few-shot learning setups.
arXiv Detail & Related papers (2020-09-29T09:40:09Z) - Bootstrapping Neural Processes [114.97111530885093]
Neural Processes (NPs) implicitly define a broad class of processes with neural networks.
NPs still rely on an assumption that uncertainty in processes is modeled by a single latent variable.
We propose the Bootstrapping Neural Process (BNP), a novel extension of the NP family using the bootstrap.
arXiv Detail & Related papers (2020-08-07T02:23:34Z) - Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes [32.02612871707347]
We propose ConvNP, which endows Neural Processes (NPs) with translation equivariance and extends convolutional conditional NPs to allow for dependencies in the predictive distribution.
We demonstrate the strong performance and generalization capabilities of ConvNPs on 1D regression, image completion, and various tasks with real-world spatio-temporal data.
arXiv Detail & Related papers (2020-07-02T18:25:27Z) - Continual Learning using a Bayesian Nonparametric Dictionary of Weight Factors [75.58555462743585]
Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings.
We propose a principled nonparametric approach based on the Indian Buffet Process (IBP) prior, letting the data determine how much to expand the model complexity.
We demonstrate the effectiveness of our method on a number of continual learning benchmarks and analyze how weight factors are allocated and reused throughout the training.
arXiv Detail & Related papers (2020-04-21T15:20:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.