Locking and Quacking: Stacking Bayesian model predictions by log-pooling
and superposition
- URL: http://arxiv.org/abs/2305.07334v1
- Date: Fri, 12 May 2023 09:26:26 GMT
- Title: Locking and Quacking: Stacking Bayesian model predictions by log-pooling
and superposition
- Authors: Yuling Yao, Luiz Max Carvalho, Diego Mesquita, Yann McLatchie
- Abstract summary: We present two novel tools for combining predictions from different models.
These are generalisations of model stacking, but combine posterior densities by log-linear pooling and quantum superposition.
To optimise model weights while avoiding the burden of normalising constants, we investigate the Hyvärinen score of the combined posterior predictions.
- Score: 0.5735035463793007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Combining predictions from different models is a central problem in Bayesian
inference and machine learning more broadly. Currently, these predictive
distributions are almost exclusively combined using linear mixtures such as
Bayesian model averaging, Bayesian stacking, and mixture of experts. Such
linear mixtures impose idiosyncrasies that might be undesirable for some
applications, such as multi-modality. While there exist alternative strategies
(e.g. geometric bridge or superposition), optimising their parameters usually
involves computing an intractable normalising constant repeatedly. We present
two novel Bayesian model combination tools. These are generalisations of model
stacking, but combine posterior densities by log-linear pooling (locking) and
quantum superposition (quacking). To optimise model weights while avoiding the
burden of normalising constants, we investigate the Hyvärinen score of the
combined posterior predictions. We demonstrate locking with an illustrative
example and discuss its practical application with importance sampling.
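To make the two pooling rules concrete, here is a minimal numerical sketch (not the authors' code; the two Gaussian components, the weight grid, and the evaluation points are illustrative assumptions). It combines two unnormalised posterior densities by log-linear pooling (locking) and by superposition (quacking), and picks the locking weight by minimising a finite-difference Hyvärinen score, which involves only derivatives of the log density and is therefore unaffected by the unknown normalising constant.

```python
import numpy as np

# Two toy (unnormalised) log posterior densities for a scalar parameter:
# N(-1, 1) and N(2, 0.5^2), chosen purely for illustration.
def log_p1(theta):
    return -0.5 * (theta + 1.0) ** 2

def log_p2(theta):
    return -0.5 * ((theta - 2.0) / 0.5) ** 2

def locking_logpdf(theta, w):
    # Log-linear pooling ("locking"):
    # log p(theta) = w * log p1(theta) + (1 - w) * log p2(theta) + const.
    return w * log_p1(theta) + (1.0 - w) * log_p2(theta)

def quacking_pdf(theta, w):
    # Superposition ("quacking"): p(theta) is proportional to the squared
    # combination of "wave functions" (w * sqrt(p1) + (1 - w) * sqrt(p2))^2.
    amp = w * np.exp(0.5 * log_p1(theta)) + (1.0 - w) * np.exp(0.5 * log_p2(theta))
    return amp ** 2

def hyvarinen_score(logpdf, theta, eps=1e-3):
    # H = 2 * (d^2/dtheta^2) log p + ((d/dtheta) log p)^2, by central
    # differences. Both terms use only derivatives of log p, so the unknown
    # normalising constant cancels.
    grad = (logpdf(theta + eps) - logpdf(theta - eps)) / (2 * eps)
    lap = (logpdf(theta + eps) - 2 * logpdf(theta) + logpdf(theta - eps)) / eps**2
    return 2 * lap + grad ** 2

# Pick the locking weight minimising the mean Hyvärinen score at a few
# stand-in held-out evaluation points.
theta_eval = np.array([0.5, 1.0, 1.5])
grid = np.linspace(0.0, 1.0, 101)
scores = [hyvarinen_score(lambda t: locking_logpdf(t, w), theta_eval).mean()
          for w in grid]
print("best locking weight:", grid[int(np.argmin(scores))])
```

For simplicity the sketch restricts both rules to convex weights and a scalar parameter; the same score-matching criterion applies in higher dimensions with gradients and Laplacians in place of the scalar derivatives.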
Related papers
- von Mises Quasi-Processes for Bayesian Circular Regression [57.88921637944379]
We explore a family of expressive and interpretable distributions over circle-valued random functions.
The resulting probability model has connections with continuous spin models in statistical physics.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Markov Chain Monte Carlo sampling.
arXiv Detail & Related papers (2024-06-19T01:57:21Z)
- Predictive Modeling in the Reservoir Kernel Motif Space [0.9217021281095907]
This work proposes a time series prediction method based on the kernel view of linear reservoirs.
We provide a geometric interpretation of our approach, shedding light on how it relates to the core reservoir models.
Empirical experiments then compare the predictive performance of our suggested model with that of recent state-of-the-art transformer-based models.
arXiv Detail & Related papers (2024-05-11T16:12:25Z)
- BayesBlend: Easy Model Blending using Pseudo-Bayesian Model Averaging, Stacking and Hierarchical Stacking in Python [0.0]
We introduce the BayesBlend Python package to estimate weights and blend multiple (Bayesian) models' predictive distributions.
BayesBlend implements pseudo-Bayesian model averaging, stacking and, uniquely, hierarchical Bayesian stacking to estimate model weights.
We demonstrate the usage of BayesBlend with examples of insurance loss modeling.
arXiv Detail & Related papers (2024-04-30T19:15:33Z)
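To illustrate the pseudo-Bayesian model averaging weights that BayesBlend estimates, here is a rough sketch of the underlying computation (this is not the BayesBlend API; the per-observation log predictive densities are simulated stand-ins):

```python
import numpy as np

# Hypothetical per-observation log predictive densities (e.g. from LOO-CV)
# for three candidate models on the same 100 observations (simulated here).
rng = np.random.default_rng(0)
lpd = rng.normal(loc=[-1.0, -1.2, -1.1], scale=0.3, size=(100, 3))

# Pseudo-BMA: weight each model proportionally to exp(total elpd).
elpd = lpd.sum(axis=0)
w = np.exp(elpd - elpd.max())   # subtract the max for numerical stability
w /= w.sum()
print("pseudo-BMA weights:", np.round(w, 3))
# The blended predictive distribution is then the linear mixture
# sum_k w[k] * p_k(y_new | data).
```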
- Fusion of Gaussian Processes Predictions with Monte Carlo Sampling [61.31380086717422]
In science and engineering, we often work with models designed for accurate prediction of variables of interest.
Recognizing that these models are approximations of reality, it becomes desirable to apply multiple models to the same data and integrate their outcomes.
arXiv Detail & Related papers (2024-03-03T04:21:21Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
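As a concrete illustration of the churn quantity studied in the entry above (the toy labels are made up), predictive churn between two models can be estimated as the fraction of inputs on which their predictions disagree:

```python
import numpy as np

def churn(preds_a, preds_b):
    """Fraction of examples on which two models' predicted labels disagree."""
    preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
    return float(np.mean(preds_a != preds_b))

# Two near-optimal models can have similar accuracy yet still disagree
# on individual predictions:
print(churn([1, 0, 1, 1, 0], [1, 1, 1, 0, 0]))  # -> 0.4
```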
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the resulting tessellation and approximate the target distribution of the multiple hypotheses.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Bayesian Regression Approach for Building and Stacking Predictive Models in Time Series Analytics [0.0]
The paper describes the use of Bayesian regression for building time series models and for stacking different predictive models for time series.
This makes it possible to estimate the uncertainty of time series predictions and to calculate value-at-risk characteristics.
arXiv Detail & Related papers (2022-01-06T12:58:23Z)
- Mixtures of Gaussian Processes for regression under multiple prior distributions [0.0]
We extend the idea of mixture models for Gaussian process regression to work with multiple prior beliefs at once.
We also consider using our approach to account for prior misspecification in functional regression problems.
arXiv Detail & Related papers (2021-04-19T10:19:14Z)
- Bayesian hierarchical stacking [10.371079239965836]
We show that stacking is most effective when the models' predictive performance is heterogeneous across inputs.
With the input-varying yet partially-pooled model weights, hierarchical stacking improves average and conditional predictions.
arXiv Detail & Related papers (2021-01-22T05:19:49Z)
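A minimal sketch of the input-varying weight idea behind hierarchical stacking (the linear parameterisation with fixed coefficients is an illustrative assumption; the paper's hierarchical priors and posterior inference are omitted):

```python
import numpy as np

def input_varying_weights(X, alpha, beta):
    # Model weights as a softmax of linear functions of the input:
    # w_k(x) is proportional to exp(alpha_k + x . beta_k). In hierarchical
    # stacking these coefficients receive hierarchical priors (partial
    # pooling); here they are fixed constants for illustration.
    scores = alpha + X @ beta
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

X = np.array([[0.0], [1.0], [2.0]])   # one input feature, three data points
alpha = np.array([0.0, 0.0])          # two candidate models
beta = np.array([[1.0, -1.0]])        # weights shift as the input grows
print(input_varying_weights(X, alpha, beta))
```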
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization [56.69671152009899]
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
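A minimal sketch of the mechanism described in the entry above (shapes and probabilities are made up): a deep ensemble's predictive distribution is the average of its members' predictive distributions, which approximates marginalising over distinct posterior modes.

```python
import numpy as np

def ensemble_predictive(member_probs):
    # member_probs: array of shape (M, n, C) with class probabilities from
    # M independently trained networks. The uniform average approximates
    # Bayesian marginalisation over the distinct posterior modes (basins
    # of attraction) that the members settle into.
    return np.mean(member_probs, axis=0)

probs = np.array([              # two members, one input, three classes
    [[0.7, 0.2, 0.1]],
    [[0.4, 0.4, 0.2]],
])
print(ensemble_predictive(probs))   # -> [[0.55 0.3  0.15]]
```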
- Distributed Sketching Methods for Privacy Preserving Regression [54.51566432934556]
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
We derive novel approximation guarantees for classical sketching methods and analyze the accuracy of parameter averaging for distributed sketches.
We illustrate the performance of distributed sketches in a serverless computing platform with large scale experiments.
arXiv Detail & Related papers (2020-02-16T08:35:48Z)
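A rough sketch of the sketch-and-solve idea with parameter averaging from the entry above (the Gaussian sketch and all dimensions are illustrative assumptions; the paper's privacy and straggler analyses are not modelled here):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m, workers = 2000, 10, 200, 4   # rows, columns, sketch size, workers

A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.01 * rng.normal(size=n)

# Each worker solves a sketched least-squares problem min_x ||S_i (A x - b)||
# with its own Gaussian sketch S_i, so it never touches the raw data rows;
# averaging the workers' solutions (parameter averaging) reduces the
# sketching error.
solutions = []
for _ in range(workers):
    S = rng.normal(size=(m, n)) / np.sqrt(m)
    x_i, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    solutions.append(x_i)

x_avg = np.mean(solutions, axis=0)
print("parameter-averaging error:", np.linalg.norm(x_avg - x_true))
```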
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.