In-Context In-Context Learning with Transformer Neural Processes
- URL: http://arxiv.org/abs/2406.13493v1
- Date: Wed, 19 Jun 2024 12:26:36 GMT
- Title: In-Context In-Context Learning with Transformer Neural Processes
- Authors: Matthew Ashman, Cristiana Diaconu, Adrian Weller, Richard E. Turner
- Abstract summary: We develop the in-context in-context learning pseudo-token TNP (ICICL-TNP).
The ICICL-TNP is capable of conditioning on both sets of datapoints and sets of datasets, enabling it to perform in-context in-context learning.
We demonstrate the importance of in-context in-context learning and the effectiveness of the ICICL-TNP in a number of experiments.
- Score: 50.57807892496024
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural processes (NPs) are a powerful family of meta-learning models that seek to approximate the posterior predictive map of the ground-truth stochastic process from which each dataset in a meta-dataset is sampled. There are many cases in which practitioners, besides having access to the dataset of interest, may also have access to other datasets that share similarities with it. In this case, integrating these datasets into the NP can improve predictions. We equip NPs with this functionality and describe this paradigm as in-context in-context learning. Standard NP architectures, such as the convolutional conditional NP (ConvCNP) or the family of transformer neural processes (TNPs), are not capable of in-context in-context learning, as they are only able to condition on a single dataset. We address this shortcoming by developing the in-context in-context learning pseudo-token TNP (ICICL-TNP). The ICICL-TNP builds on the family of PT-TNPs, which utilise pseudo-token-based transformer architectures to sidestep the quadratic computational complexity associated with regular transformer architectures. Importantly, the ICICL-TNP is capable of conditioning on both sets of datapoints and sets of datasets, enabling it to perform in-context in-context learning. We demonstrate the importance of in-context in-context learning and the effectiveness of the ICICL-TNP in a number of experiments.
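The following is a minimal, illustrative sketch (not the authors' released code) of the two-level pseudo-token conditioning idea described in the abstract: a small set of learned pseudo-tokens cross-attends to the datapoints of each context dataset, and a second pseudo-token layer attends across the resulting per-dataset summaries, so attention cost scales linearly in the number of context points rather than quadratically. All module names, dimensions, and the exact layering here are assumptions made for illustration only.

```python
# Minimal sketch of pseudo-token conditioning over datapoints and over datasets.
# This is NOT the ICICL-TNP implementation; names and shapes are illustrative assumptions.
import torch
import torch.nn as nn


class PseudoTokenEncoder(nn.Module):
    """Cross-attend M learned pseudo-tokens to N context tokens (cost O(M*N), not O(N^2))."""

    def __init__(self, dim: int, num_pseudo: int = 16, num_heads: int = 4):
        super().__init__()
        self.pseudo = nn.Parameter(torch.randn(num_pseudo, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (batch, num_points, dim) -> summary: (batch, num_pseudo, dim)
        queries = self.pseudo.unsqueeze(0).expand(context.size(0), -1, -1)
        summary, _ = self.attn(queries, context, context)
        return summary


class ToyTwoLevelEncoder(nn.Module):
    """Two-level conditioning: datapoints -> per-dataset summaries -> across-dataset summary."""

    def __init__(self, x_dim: int, y_dim: int, dim: int = 64):
        super().__init__()
        self.embed = nn.Linear(x_dim + y_dim, dim)
        self.dataset_enc = PseudoTokenEncoder(dim)   # summarises the datapoints of one dataset
        self.across_enc = PseudoTokenEncoder(dim)    # summarises the set of dataset summaries

    def forward(self, datasets: list) -> torch.Tensor:
        # datasets: list of (x, y) context sets, each x: (batch, n_i, x_dim), y: (batch, n_i, y_dim)
        summaries = [self.dataset_enc(self.embed(torch.cat(d, dim=-1))) for d in datasets]
        stacked = torch.cat(summaries, dim=1)          # (batch, num_datasets * num_pseudo, dim)
        return self.across_enc(stacked)                # (batch, num_pseudo, dim)


if __name__ == "__main__":
    enc = ToyTwoLevelEncoder(x_dim=1, y_dim=1)
    related = [(torch.randn(2, 30, 1), torch.randn(2, 30, 1)) for _ in range(3)]
    print(enc(related).shape)  # torch.Size([2, 16, 64])
```

In a full model, target inputs would then cross-attend to the returned summary tokens to produce predictive distributions; that decoding step is omitted from this sketch.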
Related papers
- Making Pre-trained Language Models Great on Tabular Prediction [50.70574370855663]
The transferability of deep neural networks (DNNs) has driven significant progress in image and language processing.
We present TP-BERTa, a specifically pre-trained LM for tabular data prediction.
A novel relative magnitude tokenization converts scalar numerical feature values to finely discrete, high-dimensional tokens, and an intra-feature attention approach integrates feature values with the corresponding feature names.
arXiv Detail & Related papers (2024-03-04T08:38:56Z) - Latent Bottlenecked Attentive Neural Processes [71.18817592128207]
We present Latent Bottlenecked Attentive Neural Processes (LBANPs).
LBANPs have a querying computational complexity independent of the number of context datapoints.
We show LBANPs achieve results competitive with the state-of-the-art on meta-regression, image completion, and contextual multi-armed bandits.
arXiv Detail & Related papers (2022-11-15T19:21:41Z) - Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling [26.377099481072992]
We propose Transformer Neural Processes (TNPs) for uncertainty-aware meta learning.
We learn TNPs via an autoregressive likelihood-based objective and instantiate it with a novel transformer-based architecture.
We show that TNPs achieve state-of-the-art performance on various benchmark problems.
arXiv Detail & Related papers (2022-07-09T02:28:58Z) - NP-Match: When Neural Processes meet Semi-Supervised Learning [133.009621275051]
Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data.
In this work, we adjust neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match.
arXiv Detail & Related papers (2022-07-03T15:24:31Z) - Neural Processes with Stochastic Attention: Paying more attention to the context dataset [11.301294319986477]
Neural processes (NPs) aim to complete unseen data points based on a given context dataset.
We propose a stochastic attention mechanism for NPs to capture appropriate context information.
We empirically show that our approach substantially outperforms conventional NPs in various domains.
arXiv Detail & Related papers (2022-04-11T23:57:19Z) - Text-based NP Enrichment [51.403543011094975]
We establish the task of text-based NP enrichment (TNE), that is, enriching each NP with all the preposition-mediated relations that hold between it and the other NPs in the text.
Humans recover such relations seamlessly, while current state-of-the-art models struggle with them due to the implicit nature of the problem.
We build the first large-scale dataset for the problem, provide the formal framing and scope of annotation, analyze the data, and report the result of fine-tuned neural language models on the task.
arXiv Detail & Related papers (2021-09-24T17:23:25Z) - Message Passing Neural Processes [3.0969191504482247]
We introduce Message Passing Neural Processes (MPNPs), which explicitly make use of relational structure within the model.
MPNPs thrive at lower sampling rates on existing benchmarks and on the newly proposed CA and Cora-Branched tasks.
We report strong generalisation over density-based CA rulesets and significant gains in challenging arbitrary-labelling and few-shot learning setups.
arXiv Detail & Related papers (2020-09-29T09:40:09Z) - Bootstrapping Neural Processes [114.97111530885093]
Neural Processes (NPs) implicitly define a broad class of processes with neural networks.
NPs still rely on an assumption that uncertainty in processes is modeled by a single latent variable.
We propose the Bootstrapping Neural Process (BNP), a novel extension of the NP family using the bootstrap.
arXiv Detail & Related papers (2020-08-07T02:23:34Z)