On the Laplace Approximation as Model Selection Criterion for Gaussian Processes
- URL: http://arxiv.org/abs/2403.09215v1
- Date: Thu, 14 Mar 2024 09:28:28 GMT
- Title: On the Laplace Approximation as Model Selection Criterion for Gaussian Processes
- Authors: Andreas Besginow, Jan David Hüwel, Thomas Pawellek, Christian Beecks, Markus Lange-Hegermann
- Abstract summary: We introduce multiple metrics based on the Laplace approximation.
Experiments show that our metrics are comparable in quality to the gold standard dynamic nested sampling.
- Score: 6.990493129893112
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model selection aims to find the best model in terms of accuracy, interpretability or simplicity, preferably all at once. In this work, we focus on evaluating model performance of Gaussian process models, i.e. finding a metric that provides the best trade-off between all those criteria. While previous work considers metrics like the likelihood, AIC or dynamic nested sampling, they either lack performance or have significant runtime issues, which severely limits applicability. We address these challenges by introducing multiple metrics based on the Laplace approximation, where we overcome a severe inconsistency occurring during naive application of the Laplace approximation. Experiments show that our metrics are comparable in quality to the gold standard dynamic nested sampling without compromising computational speed. Our model selection criteria allow significantly faster, high-quality model selection of Gaussian process models.
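For orientation, the textbook Laplace approximation to the log model evidence is log Z ≈ log p(y, θ̂) + (d/2) log 2π − ½ log det H, where θ̂ is the mode of the log joint and H is the negative Hessian at that mode. The sketch below illustrates only this standard formula on a toy conjugate Gaussian model, where the approximation happens to be exact. It is not the paper's corrected criteria, and all names in it (`laplace_log_evidence`, `log_joint`, the data `y`, `sigma`, `tau`) are illustrative assumptions.

```python
import numpy as np


def laplace_log_evidence(log_joint, theta_hat, neg_hessian):
    """Textbook Laplace estimate of log Z = log of the integral of exp(log_joint).

    theta_hat   : mode of log_joint, shape (d,)
    neg_hessian : negative Hessian of log_joint at theta_hat, shape (d, d)
    """
    d = theta_hat.size
    sign, logdet = np.linalg.slogdet(neg_hessian)
    if sign <= 0:
        # The formula is only meaningful at a proper local maximum.
        raise ValueError("negative Hessian must be positive definite at the mode")
    return log_joint(theta_hat) + 0.5 * d * np.log(2 * np.pi) - 0.5 * logdet


# Toy model (assumed for illustration): y_i ~ N(theta, sigma^2), theta ~ N(0, tau^2).
y = np.array([0.3, -0.1, 0.8, 0.5])
sigma, tau = 0.5, 2.0
n = y.size


def log_joint(theta):
    t = theta[0]
    log_lik = (-0.5 * np.sum((y - t) ** 2) / sigma**2
               - n * np.log(sigma) - 0.5 * n * np.log(2 * np.pi))
    log_prior = -0.5 * t**2 / tau**2 - np.log(tau) - 0.5 * np.log(2 * np.pi)
    return log_lik + log_prior


# The log joint is quadratic in theta, so mode and curvature are available in closed form.
precision = n / sigma**2 + 1.0 / tau**2
theta_hat = np.array([np.sum(y) / sigma**2 / precision])
neg_hessian = np.array([[precision]])

approx = laplace_log_evidence(log_joint, theta_hat, neg_hessian)

# Exact log evidence: marginally, y ~ N(0, sigma^2 I + tau^2 * ones * ones^T).
cov = sigma**2 * np.eye(n) + tau**2 * np.ones((n, n))
_, cov_logdet = np.linalg.slogdet(cov)
exact = -0.5 * (y @ np.linalg.solve(cov, y) + cov_logdet + n * np.log(2 * np.pi))
```

Here `approx` and `exact` agree up to floating-point error because the posterior is Gaussian. For Gaussian process hyperparameters the posterior is generally not Gaussian, so the estimate is only approximate, and a mode with an indefinite Hessian triggers the error branch.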
Related papers
- Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows [54.050498411883495]
We develop a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows.
We show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the PR-divergences.
Our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.
arXiv Detail & Related papers (2023-05-30T10:07:17Z)
- Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
- Out-of-sample scoring and automatic selection of causal estimators [0.0]
We propose novel scoring approaches for both the CATE case and an important subset of instrumental variable problems.
We implement these approaches in an open-source package that relies on the DoWhy and EconML libraries.
arXiv Detail & Related papers (2022-12-20T08:29:18Z)
- A moment-matching metric for latent variable generative models [0.0]
According to Goodhart's law, when a metric becomes a target, it ceases to be a good metric.
We propose a new metric for model comparison or regularization that relies on moments.
It is common to draw samples from the fitted distribution when evaluating latent variable models.
arXiv Detail & Related papers (2021-10-04T17:51:08Z)
- Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We show that this approach outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z)
- Community Detection in the Stochastic Block Model by Mixed Integer Programming [3.8073142980733]
The Degree-Corrected Stochastic Block Model (DCSBM) is a popular model to generate random graphs with community structure given an expected degree sequence.
The standard approach to community detection based on the DCSBM is to search for the model parameters that are most likely to have produced the observed network data through maximum likelihood estimation (MLE).
We present mathematical programming formulations and exact solution methods that can provably find the model parameters and community assignments of maximum likelihood given an observed graph.
arXiv Detail & Related papers (2021-01-26T22:04:40Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Referenced Thermodynamic Integration for Bayesian Model Selection: Application to COVID-19 Model Selection [1.9599274203282302]
We show how to compute the ratio of two models' normalising constants, known as the Bayes factor.
In this paper we apply a variation of the TI method, referred to as referenced TI, which computes a single model's normalising constant in an efficient way.
The approach is shown to be useful in practice when applied to a real problem - to perform model selection for a semi-mechanistic hierarchical Bayesian model of COVID-19 transmission in South Korea.
arXiv Detail & Related papers (2020-09-08T16:32:06Z)
- Maximum Entropy Model Rollouts: Fast Model Based Policy Optimization without Compounding Errors [10.906666680425754]
We propose a Dyna-style model-based reinforcement learning algorithm, which we call Maximum Entropy Model Rollouts (MEMR).
To eliminate the compounding errors, we only use our model to generate single-step rollouts.
arXiv Detail & Related papers (2020-06-08T21:38:15Z)
- Nonparametric Estimation in the Dynamic Bradley-Terry Model [69.70604365861121]
We develop a novel estimator that relies on kernel smoothing to pre-process the pairwise comparisons over time.
We derive time-varying oracle bounds for both the estimation error and the excess risk in the model-agnostic setting.
arXiv Detail & Related papers (2020-02-28T21:52:49Z)