Scalable Spatiotemporal Inference with Biased Scan Attention Transformer Neural Processes
- URL: http://arxiv.org/abs/2506.09163v1
- Date: Tue, 10 Jun 2025 18:24:08 GMT
- Title: Scalable Spatiotemporal Inference with Biased Scan Attention Transformer Neural Processes
- Authors: Daniel Jenson, Jhonathan Navott, Piotr Grynfelder, Mengyan Zhang, Makkunda Sharma, Elizaveta Semenova, Seth Flaxman
- Abstract summary: We propose the Biased Scan Attention Transformer Neural Process (BSA-TNP). BSA-TNP is able to: (1) match or exceed the accuracy of the best models while often training in a fraction of the time, (2) exhibit translation invariance, enabling learning at multiple resolutions simultaneously, (3) transparently model processes that evolve in both space and time, (4) support high dimensional fixed effects, and (5) scale gracefully -- running inference with over 1M test points with 100K context points in under a minute on a single 24GB GPU.
- Score: 2.198760145670348
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Neural Processes (NPs) are a rapidly evolving class of models designed to directly model the posterior predictive distribution of stochastic processes. While early architectures were developed primarily as a scalable alternative to Gaussian Processes (GPs), modern NPs tackle far more complex and data hungry applications spanning geology, epidemiology, climate, and robotics. These applications have placed increasing pressure on the scalability of these models, with many architectures compromising accuracy for scalability. In this paper, we demonstrate that this tradeoff is often unnecessary, particularly when modeling fully or partially translation invariant processes. We propose a versatile new architecture, the Biased Scan Attention Transformer Neural Process (BSA-TNP), which introduces Kernel Regression Blocks (KRBlocks), group-invariant attention biases, and memory-efficient Biased Scan Attention (BSA). BSA-TNP is able to: (1) match or exceed the accuracy of the best models while often training in a fraction of the time, (2) exhibit translation invariance, enabling learning at multiple resolutions simultaneously, (3) transparently model processes that evolve in both space and time, (4) support high dimensional fixed effects, and (5) scale gracefully -- running inference with over 1M test points with 100K context points in under a minute on a single 24GB GPU.
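The abstract names group-invariant attention biases as one ingredient of BSA-TNP. Below is a minimal sketch of attention with a translation-invariant, distance-based additive bias; the RBF form of the bias and the names `rbf_bias` and `biased_attention` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def rbf_bias(query_coords, key_coords, lengthscale=1.0):
    """Distance-based additive attention bias. It depends only on coordinate
    differences, so it is unchanged when all points are translated together
    (an illustrative stand-in for the paper's group-invariant biases)."""
    diff = query_coords[:, None, :] - key_coords[None, :, :]  # (n_q, n_k, d)
    sq_dist = np.sum(diff ** 2, axis=-1)                      # (n_q, n_k)
    return -sq_dist / (2.0 * lengthscale ** 2)

def biased_attention(Q, K, V, query_coords, key_coords, lengthscale=1.0):
    """Single-head scaled dot-product attention with an additive spatial bias."""
    logits = Q @ K.T / np.sqrt(Q.shape[-1])                   # (n_q, n_k)
    logits = logits + rbf_bias(query_coords, key_coords, lengthscale)
    logits -= logits.max(axis=-1, keepdims=True)              # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)            # row-wise softmax
    return weights @ V                                        # (n_q, d_v)
```

Because the bias depends only on coordinate differences, shifting all context and target coordinates by the same offset leaves the attention weights unchanged, which is the translation invariance the abstract refers to.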
Related papers
- Transformer Neural Processes - Kernel Regression [2.309018557701645]
We introduce the Transformer Neural Process - Kernel Regression (TNP-KR), a scalable Neural Process (NP). TNP-KR features a Kernel Regression Block (KR-Block), a simple and parameter-efficient transformer block with complexity $O(n_c^2 + n_c n_t)$, and two novel attention mechanisms: scan attention (SA), a memory-efficient attention with a scan-based bias, and deep kernel attention (DKA), a Performer-style attention that implicitly incorporates a distance bias. These enhancements enable both TNP-KR variants to perform inference with 100K...
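The stated complexity $O(n_c^2 + n_c n_t)$ suggests attention among the $n_c$ context points plus cross-attention from the $n_t$ target points to the context, with no target-to-target attention. The sketch below illustrates that pattern; `kr_block` is a hypothetical reading of the KR-Block, not the authors' implementation.

```python
import numpy as np

def attend(Q, K, V):
    """Plain single-head scaled dot-product attention."""
    logits = Q @ K.T / np.sqrt(Q.shape[-1])
    logits -= logits.max(axis=-1, keepdims=True)
    w = np.exp(logits)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def kr_block(context, targets):
    """Hypothetical KR-Block-style update with cost O(n_c^2 + n_c * n_t):
    context points attend to one another, targets attend only to the context,
    and targets never attend to each other (which would add an O(n_t^2) term)."""
    context = context + attend(context, context, context)  # O(n_c^2)
    targets = targets + attend(targets, context, context)  # O(n_c * n_t)
    return context, targets
```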
arXiv Detail & Related papers (2024-11-19T13:40:49Z) - Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z) - Parameter Estimation of Long Memory Stochastic Processes with Deep Neural Networks [0.0]
We present a purely deep neural network-based approach for estimating long memory parameters of time series models.
Parameters, such as the Hurst exponent, are critical in characterizing the long-range dependence, roughness, and self-similarity of processes.
arXiv Detail & Related papers (2024-10-03T03:14:58Z) - Convolutional Conditional Neural Processes [6.532867867011488]
This thesis advances neural processes in three ways.
ConvNPs improve data efficiency by building in a symmetry called translation equivariance.
GNPs directly parametrise dependencies in the predictions of a neural process.
AR CNPs train a neural process without any modifications to the model or training procedure and, at test time, roll out the model in an autoregressive fashion.
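As a rough illustration of the AR CNP test-time procedure described above, the sketch below rolls out a generic conditional model one target at a time; the `predict` interface is a hypothetical placeholder, not the thesis's API.

```python
import numpy as np

def autoregressive_rollout(predict, x_context, y_context, x_targets, rng):
    """Test-time autoregressive rollout in the spirit of AR CNPs: predict one
    target, sample it, append the sample to the context, and move on to the
    next target. `predict(xc, yc, xt) -> (mean, std)` is a placeholder for
    any trained conditional neural process."""
    xs, ys = list(x_context), list(y_context)
    samples = []
    for x_t in x_targets:
        mean, std = predict(np.asarray(xs), np.asarray(ys), np.asarray([x_t]))
        y_t = rng.normal(mean[0], std[0])  # draw a sample for this target
        xs.append(x_t)                     # fold the sample back into the context
        ys.append(y_t)
        samples.append(y_t)
    return np.array(samples)
```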
arXiv Detail & Related papers (2024-08-18T19:53:38Z) - AD-NEv: A Scalable Multi-level Neuroevolution Framework for Multivariate Anomaly Detection [1.0323063834827415]
Anomaly detection tools and methods present a key capability in modern cyber-physical and failure prediction systems.
Model optimization for a given dataset is a cumbersome and time-consuming process.
We propose Anomaly Detection Neuroevolution (AD-NEv) - a scalable multi-level optimized neuroevolution framework.
arXiv Detail & Related papers (2023-05-25T21:52:38Z) - FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels [82.53569355337586]
This work offers an efficient solution to temporal point processes inference using general parametric kernels with finite support.
The method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG).
Results show that the proposed approach leads to better estimation of pattern latency than the state-of-the-art.
arXiv Detail & Related papers (2022-10-10T12:35:02Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted the attention of the multimedia and computer vision community.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver [59.13397937903832]
We introduce a deep learning-based corrector called Neural Vector (NeurVec).
NeurVec can compensate for integration errors and enable larger time step sizes in simulations.
Our experiments on a variety of complex dynamical system benchmarks demonstrate that NeurVec exhibits remarkable generalization capability.
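A minimal sketch of the underlying idea, a learned corrector added to a fixed-step solver so that larger steps remain accurate, is shown below; the Euler base step and the `corrector` interface are assumptions for illustration, not NeurVec's actual architecture.

```python
import numpy as np

def euler_step_with_corrector(f, corrector, y, h):
    """One explicit Euler step plus a learned correction term, in the spirit
    of a NeurVec-style corrector: the corrector is trained to absorb the
    local integration error so a larger step size h can be used.
    `corrector(y, h)` stands in for the trained network."""
    return y + h * f(y) + corrector(y, h)

# Toy usage with a zero corrector standing in for a trained network.
f = lambda y: -y                                # dy/dt = -y
zero_corrector = lambda y, h: np.zeros_like(y)
y = np.array([1.0])
for _ in range(10):
    y = euler_step_with_corrector(f, zero_corrector, y, h=0.2)
```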
arXiv Detail & Related papers (2022-08-07T09:02:18Z) - Multi-fidelity Hierarchical Neural Processes [79.0284780825048]
Multi-fidelity surrogate modeling reduces the computational cost by fusing different simulation outputs.
We propose Multi-fidelity Hierarchical Neural Processes (MF-HNP), a unified neural latent variable model for multi-fidelity surrogate modeling.
We evaluate MF-HNP on epidemiology and climate modeling tasks, achieving competitive performance in terms of accuracy and uncertainty estimation.
arXiv Detail & Related papers (2022-06-10T04:54:13Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Neural ODE Processes [64.10282200111983]
We introduce Neural ODE Processes (NDPs), a new class of processes determined by a distribution over Neural ODEs.
We show that our model can successfully capture the dynamics of low-dimensional systems from just a few data points.
arXiv Detail & Related papers (2021-03-23T09:32:06Z) - Learning Multivariate Hawkes Processes at Scale [17.17906360554892]
We show that our approach allows us to compute the exact likelihood and gradients of an MHP -- independently of the ambient dimensions of the underlying network.
We show on synthetic and real-world datasets that our model not only achieves state-of-the-art predictive results but also improves runtime performance by multiple orders of magnitude.
arXiv Detail & Related papers (2020-02-28T01:18:01Z)