Taylorformer: Probabilistic Modelling for Random Processes including Time Series
- URL: http://arxiv.org/abs/2305.19141v2
- Date: Mon, 23 Sep 2024 15:56:20 GMT
- Title: Taylorformer: Probabilistic Modelling for Random Processes including Time Series
- Authors: Omer Nivron, Raghul Parthipan, Damon J. Wischik
- Abstract summary: We propose the Taylorformer for random processes such as time series.
Its two key components are: 1) the LocalTaylor wrapper which adapts Taylor approximations for use in neural network-based probabilistic models, and 2) the MHA-X attention block which makes predictions in a way inspired by how Gaussian Processes' mean predictions are linear smoothings of contextual data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose the Taylorformer for random processes such as time series. Its two key components are: 1) the LocalTaylor wrapper which adapts Taylor approximations (used in dynamical systems) for use in neural network-based probabilistic models, and 2) the MHA-X attention block which makes predictions in a way inspired by how Gaussian Processes' mean predictions are linear smoothings of contextual data. Taylorformer outperforms the state-of-the-art in terms of log-likelihood on 5/6 classic Neural Process tasks such as meta-learning 1D functions, and has at least a 14% MSE improvement on forecasting tasks, including electricity, oil temperatures and exchange rates. Taylorformer approximates a consistent stochastic process and provides uncertainty-aware predictions. Our code is provided in the supplementary material.
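The claim about GP mean predictions can be made concrete: the posterior mean $m(x_*) = k(x_*, X)(K + \sigma^2 I)^{-1} y$ is a weighted (linear) combination of the observed context targets $y$, and an attention layer that places softmax weights directly over those targets performs the same kind of smoothing. Below is a minimal NumPy sketch of this correspondence; the kernel, noise level, and the softmax analogue are illustrative assumptions, not the paper's MHA-X implementation.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # k(a, b) = exp(-(a - b)^2 / (2 l^2)) for 1-D inputs
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_mean(x_ctx, y_ctx, x_tgt, noise=1e-2):
    # Posterior mean m(x*) = k(x*, X) (K + noise I)^{-1} y:
    # a linear smoothing of the observed context targets y.
    K = rbf_kernel(x_ctx, x_ctx) + noise * np.eye(len(x_ctx))
    weights = rbf_kernel(x_tgt, x_ctx) @ np.linalg.inv(K)
    return weights @ y_ctx

def attention_smoothing(x_ctx, y_ctx, x_tgt, temperature=0.2):
    # Softmax attention over context inputs, applied directly to the
    # context targets -- again a linear smoothing of observed outputs.
    logits = -0.5 * ((x_tgt[:, None] - x_ctx[None, :]) / temperature) ** 2
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    return w @ y_ctx

x_c = np.linspace(0, 1, 20)
y_c = np.sin(2 * np.pi * x_c) + 0.1 * np.random.default_rng(0).normal(size=20)
x_t = np.linspace(0, 1, 5)
print(gp_mean(x_c, y_c, x_t))
print(attention_smoothing(x_c, y_c, x_t))
```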
Related papers
- Learning-Order Autoregressive Models with Application to Molecular Graph Generation [52.44913282062524]
We introduce a variant of autoregressive models (ARMs) that generates high-dimensional data using a probabilistic ordering sequentially inferred from the data.
We demonstrate experimentally that our method can learn meaningful autoregressive orderings in image and graph generation.
arXiv Detail & Related papers (2025-03-07T23:24:24Z)
- von Mises Quasi-Processes for Bayesian Circular Regression [57.88921637944379]
We explore a family of expressive and interpretable distributions over circle-valued random functions.
The resulting probability model has connections with continuous spin models in statistical physics.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Markov Chain Monte Carlo sampling.
arXiv Detail & Related papers (2024-06-19T01:57:21Z)
- Fusion of Gaussian Processes Predictions with Monte Carlo Sampling [61.31380086717422]
In science and engineering, we often work with models designed for accurate prediction of variables of interest.
Recognizing that these models are approximations of reality, it becomes desirable to apply multiple models to the same data and integrate their outcomes.
arXiv Detail & Related papers (2024-03-03T04:21:21Z)
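One concrete way to integrate the outcomes of several models, as the paper above proposes, is Bayesian model averaging: weight each Gaussian process's predictive density by a posterior model probability and sample the resulting mixture by Monte Carlo. A minimal sketch, assuming Gaussian predictive distributions and made-up weights (not the paper's fusion rule):

```python
import numpy as np

# Fuse three models' Gaussian predictive distributions by Monte Carlo
# sampling from their weighted mixture. All numbers are placeholders.
rng = np.random.default_rng(0)
means = np.array([1.0, 1.2, 0.8])    # per-model predictive means
stds = np.array([0.3, 0.5, 0.4])     # per-model predictive std devs
weights = np.array([0.5, 0.3, 0.2])  # posterior model probabilities

idx = rng.choice(len(means), size=10_000, p=weights)  # choose a model
samples = rng.normal(means[idx], stds[idx])           # sample from it
print(samples.mean(), samples.std())  # moments of the fused prediction
```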
- Logistic-beta processes for dependent random probabilities with beta marginals [58.91121576998588]
We propose a novel process called the logistic-beta process, whose logistic transformation yields a process with common beta marginals.
It can model dependence on both discrete and continuous domains, such as space or time, and has a flexible dependence structure through correlation kernels.
We illustrate the benefits through nonparametric binary regression and conditional density estimation examples, both in simulation studies and in a pregnancy outcome application.
arXiv Detail & Related papers (2024-02-10T21:41:32Z)
- Likelihood-based inference and forecasting for trawl processes: a stochastic optimization approach [0.0]
We develop the first likelihood-based methodology for the inference of real-valued trawl processes.
We introduce novel deterministic and probabilistic forecasting methods.
We release a Python library which can be used to fit a large class of trawl processes.
arXiv Detail & Related papers (2023-08-30T15:37:48Z)
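For background on the class treated in the trawl-process paper above: a trawl process evaluates an independently scattered Lévy basis $L$ over a translating "trawl" set; a standard formulation (notation may differ from the paper's) is
$$X_t = L(A_t), \qquad A_t = A + (0, t),$$
so stationarity follows because every $A_t$ is a translate of the same set $A$, and the autocorrelation between $X_t$ and $X_s$ is governed by the overlap $\mathrm{Leb}(A_t \cap A_s)$.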
- A Heavy-Tailed Algebra for Probabilistic Programming [53.32246823168763]
We propose a systematic approach for analyzing the tails of random variables.
We show how this approach can be used during the static analysis (before drawing samples) pass of a probabilistic programming language compiler.
Our empirical results confirm that inference algorithms that leverage our heavy-tailed algebra attain superior performance across a number of density modeling and variational inference tasks.
arXiv Detail & Related papers (2023-06-15T16:37:36Z)
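To give a flavour of the tail rules such an algebra can propagate: for regularly varying (power-law) tails, the tail index of a sum, and, under Breiman-type conditions, a product, of independent variables is the minimum of the operands' indices. The toy sketch below propagates only this one rule; the paper's algebra is richer, and the class and examples are illustrative assumptions.

```python
import math

class Tail:
    """Toy tail descriptor: alpha is the regular-variation index;
    math.inf means lighter than any power law (e.g. Gaussian)."""
    def __init__(self, alpha):
        self.alpha = alpha

    def __add__(self, other):
        # Sum of independent regularly varying variables: the heavier
        # tail (smaller index) dominates.
        return Tail(min(self.alpha, other.alpha))

    def __mul__(self, other):
        # Product of independent positive regularly varying variables:
        # again governed by the smaller index (up to slowly varying terms).
        return Tail(min(self.alpha, other.alpha))

cauchy = Tail(1.0)      # Cauchy: tail index 1
student3 = Tail(3.0)    # Student-t with 3 degrees of freedom
gauss = Tail(math.inf)  # Gaussian: no power-law tail

print((cauchy + gauss).alpha)    # 1.0 -- the heavy tail survives addition
print((student3 * gauss).alpha)  # 3.0
```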
- MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z)
- Correcting Model Bias with Sparse Implicit Processes [0.9187159782788579]
We show that Sparse Implicit Processes (SIP) is capable of correcting model bias when the data generating mechanism differs strongly from the one implied by the model.
We use synthetic datasets to show that SIP provides predictive distributions that reflect the data better than the exact predictions of the initial, misspecified model.
arXiv Detail & Related papers (2022-07-21T18:00:01Z)
- Bézier Curve Gaussian Processes [8.11969931278838]
This paper proposes a new probabilistic sequence model building on probabilistic Bézier curves.
Combined with a mixture density network, Bayesian conditional inference can be performed without the need for mean-field variational approximation.
The model is used for pedestrian trajectory prediction, where a generated prediction also serves as a GP prior.
arXiv Detail & Related papers (2022-05-03T19:49:57Z)
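The probabilistic construction in the Bézier-curve model above rests on the fact that a Bézier curve is linear in its control points, so Gaussian control points induce Gaussian curve values. A sketch of this standard property (the paper's exact priors may differ):
$$B(t) = \sum_{i=0}^{n} b_{i,n}(t)\, P_i, \qquad b_{i,n}(t) = \binom{n}{i} t^i (1-t)^{n-i},$$
and with independent control points $P_i \sim \mathcal{N}(\mu_i, \Sigma_i)$,
$$B(t) \sim \mathcal{N}\Big(\textstyle\sum_i b_{i,n}(t)\,\mu_i,\; \sum_i b_{i,n}(t)^2\,\Sigma_i\Big).$$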
- On the Dynamics of Inference and Learning [0.0]
We present a treatment of the Bayesian updating process as a continuous dynamical system.
We show that when the Cramér-Rao bound is saturated the learning rate is governed by a simple $1/T$ power-law (see the derivation sketched below).
arXiv Detail & Related papers (2022-04-19T18:04:36Z)
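The $1/T$ law is the standard consequence of Fisher information accumulating linearly over observations: for $T$ i.i.d. samples with per-sample Fisher information $I(\theta)$, the Cramér-Rao bound reads
$$\mathrm{Var}(\hat\theta_T) \ge \frac{1}{T\, I(\theta)},$$
so when the bound is saturated the uncertainty, and with it the rate at which new data moves the estimate, decays as $1/T$. (This one-line argument is generic; the paper derives the law within its dynamical-systems treatment.)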
- Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression [51.770998056563094]
Probabilistic Gradient Boosting Machines (PGBM) is a method to create probabilistic predictions with a single ensemble of decision trees.
We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-03T08:32:13Z)
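In miniature, tree-based probabilistic prediction amounts to storing distributional statistics rather than point values in the leaves. The toy below does this for a single depth-1 "tree" with Gaussian leaves; it only illustrates the idea and is not the PGBM algorithm.

```python
import numpy as np

# Toy illustration (not PGBM itself): a depth-1 regression "tree" whose
# leaves store the mean AND variance of their training targets, so each
# prediction is a Gaussian rather than a point estimate.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
y = np.where(x < 0.5, 1.0, 3.0) + rng.normal(0, 0.2, 200)

split = 0.5
leaves = {}
for name, mask in [("left", x < split), ("right", x >= split)]:
    leaves[name] = (y[mask].mean(), y[mask].var())

def predict(x_new):
    mu, var = leaves["left"] if x_new < split else leaves["right"]
    return mu, var  # parameters of a Gaussian predictive distribution

print(predict(0.25), predict(0.75))
```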
- Deep Distributional Time Series Models and the Probabilistic Forecasting of Intraday Electricity Prices [0.0]
We propose two approaches to constructing deep time series probabilistic models.
In the first, the output layer of the echo state network (ESN) has disturbances and a shrinkage prior for additional regularization.
The second approach employs the implicit copula of an ESN with Gaussian disturbances, which is a deep copula process on the feature space.
arXiv Detail & Related papers (2020-10-05T08:02:29Z)
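For reference, a minimal echo state network with a Gaussian-disturbance output layer, as in the first approach above, looks as follows; the reservoir size, spectral scaling, and ridge penalty (a crude stand-in for the paper's shrinkage prior) are illustrative choices.

```python
import numpy as np

# Minimal echo state network (ESN) sketch with Gaussian output
# disturbances: y_t = w^T h_t + eps_t.
rng = np.random.default_rng(2)
n_res, T = 50, 300
W_in = rng.normal(0, 0.5, n_res)
W = rng.normal(0, 1.0, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # echo-state property

u = np.sin(np.linspace(0, 12 * np.pi, T + 1))    # toy input series
h = np.zeros((T, n_res))
state = np.zeros(n_res)
for t in range(T):
    state = np.tanh(W_in * u[t] + W @ state)     # reservoir update
    h[t] = state

# Ridge regression for the output layer; the residual variance
# estimates the Gaussian disturbance term.
target = u[1:]                                   # one-step-ahead target
w = np.linalg.solve(h.T @ h + 1e-2 * np.eye(n_res), h.T @ target)
resid_var = np.var(target - h @ w)
print("predictive std of output disturbances:", np.sqrt(resid_var))
```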
This list is automatically generated from the titles and abstracts of the papers on this site.