In-Context In-Context Learning with Transformer Neural Processes
- URL: http://arxiv.org/abs/2406.13493v1
- Date: Wed, 19 Jun 2024 12:26:36 GMT
- Title: In-Context In-Context Learning with Transformer Neural Processes
- Authors: Matthew Ashman, Cristiana Diaconu, Adrian Weller, Richard E. Turner
- Abstract summary: We develop the in-context in-context learning pseudo-token TNP (ICICL-TNP).
The ICICL-TNP is capable of conditioning on both sets of datapoints and sets of datasets, enabling it to perform in-context in-context learning.
We demonstrate the importance of in-context in-context learning and the effectiveness of the ICICL-TNP in a number of experiments.
- Score: 50.57807892496024
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural processes (NPs) are a powerful family of meta-learning models that seek to approximate the posterior predictive map of the ground-truth stochastic process from which each dataset in a meta-dataset is sampled. There are many cases in which practitioners, besides having access to the dataset of interest, may also have access to other datasets that share similarities with it. In this case, integrating these datasets into the NP can improve predictions. We equip NPs with this functionality and describe this paradigm as in-context in-context learning. Standard NP architectures, such as the convolutional conditional NP (ConvCNP) or the family of transformer neural processes (TNPs), are not capable of in-context in-context learning, as they are only able to condition on a single dataset. We address this shortcoming by developing the in-context in-context learning pseudo-token TNP (ICICL-TNP). The ICICL-TNP builds on the family of PT-TNPs, which utilise pseudo-token-based transformer architectures to sidestep the quadratic computational complexity associated with regular transformer architectures. Importantly, the ICICL-TNP is capable of conditioning on both sets of datapoints and sets of datasets, enabling it to perform in-context in-context learning. We demonstrate the importance of in-context in-context learning and the effectiveness of the ICICL-TNP in a number of experiments.
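The following is a minimal, illustrative sketch (not the authors' released code) of the two-level pseudo-token conditioning idea described in the abstract: a small set of learned pseudo-tokens cross-attends to the datapoints of each context dataset, and a second pseudo-token layer attends across the resulting per-dataset summaries, so attention cost scales linearly in the number of context points rather than quadratically. All module names, dimensions, and the exact layering here are assumptions made for illustration only.

```python
# Minimal sketch of pseudo-token conditioning over datapoints and over datasets.
# This is NOT the ICICL-TNP implementation; names and shapes are illustrative assumptions.
import torch
import torch.nn as nn


class PseudoTokenEncoder(nn.Module):
    """Cross-attend M learned pseudo-tokens to N context tokens (cost O(M*N), not O(N^2))."""

    def __init__(self, dim: int, num_pseudo: int = 16, num_heads: int = 4):
        super().__init__()
        self.pseudo = nn.Parameter(torch.randn(num_pseudo, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (batch, num_points, dim) -> summary: (batch, num_pseudo, dim)
        queries = self.pseudo.unsqueeze(0).expand(context.size(0), -1, -1)
        summary, _ = self.attn(queries, context, context)
        return summary


class ToyTwoLevelEncoder(nn.Module):
    """Two-level conditioning: datapoints -> per-dataset summaries -> across-dataset summary."""

    def __init__(self, x_dim: int, y_dim: int, dim: int = 64):
        super().__init__()
        self.embed = nn.Linear(x_dim + y_dim, dim)
        self.dataset_enc = PseudoTokenEncoder(dim)   # summarises the datapoints of one dataset
        self.across_enc = PseudoTokenEncoder(dim)    # summarises the set of dataset summaries

    def forward(self, datasets: list) -> torch.Tensor:
        # datasets: list of (x, y) context sets, each x: (batch, n_i, x_dim), y: (batch, n_i, y_dim)
        summaries = [self.dataset_enc(self.embed(torch.cat(d, dim=-1))) for d in datasets]
        stacked = torch.cat(summaries, dim=1)          # (batch, num_datasets * num_pseudo, dim)
        return self.across_enc(stacked)                # (batch, num_pseudo, dim)


if __name__ == "__main__":
    enc = ToyTwoLevelEncoder(x_dim=1, y_dim=1)
    related = [(torch.randn(2, 30, 1), torch.randn(2, 30, 1)) for _ in range(3)]
    print(enc(related).shape)  # torch.Size([2, 16, 64])
```

In a full model, target inputs would then cross-attend to the returned summary tokens to produce predictive distributions; that decoding step is omitted from this sketch.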
Related papers
- Making Pre-trained Language Models Great on Tabular Prediction [50.70574370855663]
The transferability of deep neural networks (DNNs) has driven significant progress in image and language processing.
We present TP-BERTa, a specifically pre-trained LM for tabular data prediction.
A novel relative magnitude tokenization converts scalar numerical feature values to finely discrete, high-dimensional tokens, and an intra-feature attention approach integrates feature values with the corresponding feature names.
arXiv Detail & Related papers (2024-03-04T08:38:56Z) - Latent Bottlenecked Attentive Neural Processes [71.18817592128207]
We present Latent Bottlenecked Attentive Neural Processes (LBANPs).
LBANPs have a querying computational complexity independent of the number of context datapoints.
We show LBANPs achieve results competitive with the state-of-the-art on meta-regression, image completion, and contextual multi-armed bandits.
arXiv Detail & Related papers (2022-11-15T19:21:41Z) - Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling [26.377099481072992]
We propose Transformer Neural Processes (TNPs) for uncertainty-aware meta learning.
We learn TNPs via an autoregressive likelihood-based objective and instantiate it with a novel transformer-based architecture.
We show that TNPs achieve state-of-the-art performance on various benchmark problems.
arXiv Detail & Related papers (2022-07-09T02:28:58Z) - NP-Match: When Neural Processes meet Semi-Supervised Learning [133.009621275051]
Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data.
In this work, we adjust neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match.
arXiv Detail & Related papers (2022-07-03T15:24:31Z) - Neural Processes with Stochastic Attention: Paying more attention to the context dataset [11.301294319986477]
Neural processes (NPs) aim to complete unseen data points based on a given context dataset.
We propose a stochastic attention mechanism for NPs to capture appropriate context information.
We empirically show that our approach substantially outperforms conventional NPs in various domains.
arXiv Detail & Related papers (2022-04-11T23:57:19Z) - Text-based NP Enrichment [51.403543011094975]
We establish the task of text-based NP enrichment (TNE), that is, enriching each NP with all the preposition-mediated relations that hold between it and the other NPs in the text.
Humans recover such relations seamlessly, while current state-of-the-art models struggle with them due to the implicit nature of the problem.
We build the first large-scale dataset for the problem, provide the formal framing and scope of annotation, analyze the data, and report the result of fine-tuned neural language models on the task.
arXiv Detail & Related papers (2021-09-24T17:23:25Z) - Message Passing Neural Processes [3.0969191504482247]
We introduce Message Passing Neural Processes (MPNPs), which explicitly make use of relational structure within the model.
MPNPs thrive at lower sampling rates on existing benchmarks and on the newly proposed CA and Cora-Branched tasks.
We report strong generalisation over density-based CA rulesets and significant gains in challenging arbitrary-labelling and few-shot learning setups.
arXiv Detail & Related papers (2020-09-29T09:40:09Z) - Bootstrapping Neural Processes [114.97111530885093]
Neural Processes (NPs) implicitly define a broad class of processes with neural networks.
NPs still rely on an assumption that uncertainty in processes is modeled by a single latent variable.
We propose the Bootstrapping Neural Process (BNP), a novel extension of the NP family using the bootstrap.
arXiv Detail & Related papers (2020-08-07T02:23:34Z)