Incremental Transformer Neural Processes
- URL: http://arxiv.org/abs/2602.18955v1
- Date: Sat, 21 Feb 2026 20:30:04 GMT
- Title: Incremental Transformer Neural Processes
- Authors: Philip Mortimer, Cristiana Diaconu, Tommy Rochussen, Bruno Mlodozeniec, Richard E. Turner
- Abstract summary: We introduce the Incremental Transformer Neural Process (incTNP). incTNP matches the predictive performance of standard TNPs while reducing the computational cost of updates. We empirically evaluate our model on a range of synthetic and real-world tasks.
- Score: 19.42901413521077
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Processes (NPs), and specifically Transformer Neural Processes (TNPs), have demonstrated remarkable performance across tasks ranging from spatiotemporal forecasting to tabular data modelling. However, many of these applications are inherently sequential, involving continuous data streams such as real-time sensor readings or database updates. In such settings, models should support cheap, incremental updates rather than recomputing internal representations from scratch for every new observation -- a capability existing TNP variants lack. Drawing inspiration from Large Language Models, we introduce the Incremental TNP (incTNP). By leveraging causal masking, Key-Value (KV) caching, and a data-efficient autoregressive training strategy, incTNP matches the predictive performance of standard TNPs while reducing the computational cost of updates from quadratic to linear time complexity. We empirically evaluate our model on a range of synthetic and real-world tasks, including tabular regression and temperature prediction. Our results show that, surprisingly, incTNP delivers performance comparable to -- or better than -- non-causal TNPs while unlocking orders-of-magnitude speedups for sequential inference. Finally, we assess the consistency of the model's updates -- by adapting a metric of "implicit Bayesianness", we show that incTNP retains a prediction rule as implicitly Bayesian as standard non-causal TNPs, demonstrating that incTNP achieves the computational benefits of causal masking without sacrificing the consistency required for streaming inference.
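The mechanism described in the abstract (causal masking plus a Key-Value cache, so that each new observation is folded in at linear rather than quadratic cost) can be illustrated with a minimal sketch. The code below is not the authors' implementation; it is a hypothetical single-head attention layer in PyTorch, with illustrative names such as IncrementalCausalAttention and append_context, showing how caching keys and values turns a context update into a single attention row instead of a full re-encoding.

import torch
import torch.nn as nn
import torch.nn.functional as F

class IncrementalCausalAttention(nn.Module):
    """Single-head causal self-attention with a KV cache for streaming context updates."""

    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        self.scale = dim ** -0.5
        self.k_cache = None  # cached keys for all context points seen so far, shape (1, N, dim)
        self.v_cache = None  # cached values, shape (1, N, dim)

    def append_context(self, x_new):
        """Fold in one new context embedding of shape (1, 1, dim) using the cached history."""
        q, k, v = self.qkv(x_new).chunk(3, dim=-1)
        if self.k_cache is None:
            self.k_cache, self.v_cache = k, v
        else:
            self.k_cache = torch.cat([self.k_cache, k], dim=1)
            self.v_cache = torch.cat([self.v_cache, v], dim=1)
        # Causal masking: the new token attends only to itself and earlier tokens,
        # so an update is a single (1 x N) attention row -- linear, not quadratic.
        attn = F.softmax(q @ self.k_cache.transpose(-2, -1) * self.scale, dim=-1)
        return self.out(attn @ self.v_cache)

# Usage: stream context points one at a time; each call reuses the cache instead of
# re-encoding the full context set.
layer = IncrementalCausalAttention(dim=64)
stream = torch.randn(10, 1, 1, 64)  # ten incoming context-point embeddings
for x_t in stream:
    h_t = layer.append_context(x_t)  # cheap incremental representation update

In a full TNP, target points would additionally attend to this cached context representation to produce predictive distributions; the sketch only covers the incremental context encoding.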
Related papers
- Exploring Pseudo-Token Approaches in Transformer Neural Processes [0.0]
We introduce Induced Set Attentive Neural Processes (ISANPs). ISANPs perform competitively with Transformer Neural Processes (TNPs) and often surpass state-of-the-art models in 1D regression, image completion, contextual bandits, and Bayesian optimization. ISANPs offer a tunable balance between performance and computational complexity, which scales well to larger datasets.
arXiv Detail & Related papers (2025-04-19T22:47:59Z) - In-Context In-Context Learning with Transformer Neural Processes [50.57807892496024]
We develop the in-context in-context learning pseudo-token TNP (ICICL-TNP)
The ICICL-TNP is capable of conditioning on both sets of datapoints and sets of datasets, enabling it to perform in-context in-context learning.
We demonstrate the importance of in-context in-context learning and the effectiveness of the ICICL-TNP in a number of experiments.
arXiv Detail & Related papers (2024-06-19T12:26:36Z) - Rényi Neural Processes [14.11793373584558]
We show that Neural Processes enforce parameterization coupling between the conditional prior model and the posterior model. We propose Rényi Neural Processes (RNP), a method that replaces the standard KL divergence with the Rényi divergence. We show significant performance improvements of RNPs in real-world problems.
arXiv Detail & Related papers (2024-05-25T00:14:55Z) - Memory Efficient Neural Processes via Constant Memory Attention Block [55.82269384896986]
Constant Memory Attentive Neural Processes (CMANPs) are an NP variant that only requires constant memory.
We show CMANPs achieve state-of-the-art results on popular NP benchmarks while being significantly more memory efficient than prior methods.
arXiv Detail & Related papers (2023-05-23T23:10:19Z) - Latent Bottlenecked Attentive Neural Processes [71.18817592128207]
We present Latent Bottlenecked Attentive Neural Processes (LBANPs)
LBANPs have a querying computational complexity independent of the number of context datapoints.
We show LBANPs achieve results competitive with the state-of-the-art on meta-regression, image completion, and contextual multi-armed bandits.
arXiv Detail & Related papers (2022-11-15T19:21:41Z) - Sample-Then-Optimize Batch Neural Thompson Sampling [50.800944138278474]
We introduce two algorithms for black-box optimization based on the Thompson sampling (TS) policy.
To choose an input query, we only need to train an NN and then choose the query by maximizing the trained NN.
Our algorithms sidestep the need to invert the large parameter matrix yet still preserve the validity of the TS policy.
arXiv Detail & Related papers (2022-10-13T09:01:58Z) - Transformer Neural Processes: Uncertainty-Aware Meta Learning Via
Sequence Modeling [26.377099481072992]
We propose Transformer Neural Processes (TNPs) for uncertainty-aware meta learning.
We learn TNPs via an autoregressive likelihood-based objective and instantiate it with a novel transformer-based architecture.
We show that TNPs achieve state-of-the-art performance on various benchmark problems.
arXiv Detail & Related papers (2022-07-09T02:28:58Z) - Truncated tensor Schatten p-norm based approach for spatiotemporal
traffic data imputation with complicated missing patterns [77.34726150561087]
We introduce four complicated missing patterns, including missing and three fiber-like missing cases according to the mode-driven fibers.
Despite the nonconvexity of the objective function in our model, we derive the optimal solutions by integrating the alternating direction method of multipliers (ADMM).
arXiv Detail & Related papers (2022-05-19T08:37:56Z) - Message Passing Neural Processes [3.0969191504482247]
We introduce Message Passing Neural Processes (MPNPs), which explicitly makes use of relational structure within the model.
MPNPs thrive at lower sampling rates, on existing benchmarks and newly-proposed CA and Cora-Branched tasks.
We report strong generalisation over density-based CA rulesets and significant gains in challenging arbitrary-labelling and few-shot learning setups.
arXiv Detail & Related papers (2020-09-29T09:40:09Z) - Bootstrapping Neural Processes [114.97111530885093]
Neural Processes (NPs) implicitly define a broad class of processes with neural networks.
NPs still rely on an assumption that uncertainty in processes is modeled by a single latent variable.
We propose the Bootstrapping Neural Process (BNP), a novel extension of the NP family using the bootstrap.
arXiv Detail & Related papers (2020-08-07T02:23:34Z) - Meta-Learning Stationary Stochastic Process Prediction with
Convolutional Neural Processes [32.02612871707347]
We propose ConvNP, which endows Neural Processes (NPs) with translation equivariance and extends convolutional conditional NPs to allow for dependencies in the predictive distribution.
We demonstrate the strong performance and generalization capabilities of ConvNPs on 1D regression, image completion, and various tasks with real-world spatio-temporal data.
arXiv Detail & Related papers (2020-07-02T18:25:27Z) - Supervised Learning for Non-Sequential Data: A Canonical Polyadic
Decomposition Approach [85.12934750565971]
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks.
To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
arXiv Detail & Related papers (2020-01-27T22:38:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.