Adaptive Cholesky Gaussian Processes
- URL: http://arxiv.org/abs/2202.10769v2
- Date: Wed, 23 Feb 2022 11:23:11 GMT
- Title: Adaptive Cholesky Gaussian Processes
- Authors: Simon Bartels, Kristoffer Stensbo-Smidt, Pablo Moreno-Muñoz, Wouter Boomsma, Jes Frellsen, Søren Hauberg
- Abstract summary: We present a method to fit exact Gaussian process models to large datasets by considering only a subset of the data.
Our approach is novel in that the size of the subset is selected on the fly during exact inference with little computational overhead.
- Score: 7.684183064816171
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a method to fit exact Gaussian process models to large datasets by
considering only a subset of the data. Our approach is novel in that the size
of the subset is selected on the fly during exact inference with little
computational overhead. From an empirical observation that the log-marginal
likelihood often exhibits a linear trend once a sufficient subset of a dataset
has been observed, we conclude that many large datasets contain redundant
information that only slightly affects the posterior. Based on this, we provide
probabilistic bounds on the full model evidence that can identify such subsets.
Remarkably, these bounds are largely composed of terms that appear in
intermediate steps of the standard Cholesky decomposition, allowing us to
modify the algorithm to adaptively stop the decomposition once enough data have
been observed. Empirically, we show that our method can be directly plugged
into well-known inference schemes to fit exact Gaussian process models to large
datasets.
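To make the mechanism concrete, below is a minimal sketch (not the authors' implementation) of the idea the abstract describes: a row-wise Cholesky factorisation of the kernel matrix that accumulates the two log-marginal-likelihood terms as it goes and stops once the per-point increment has stabilised, extrapolating the remaining trend linearly. The flatness test stands in for the paper's probabilistic bounds, and all function and variable names are illustrative.

```python
import numpy as np
from scipy.linalg import solve_triangular

def rbf_kernel(X, lengthscale=1.0, variance=1.0, noise=1e-2):
    """Squared-exponential kernel matrix with observation noise on the diagonal."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2) + noise * np.eye(len(X))

def adaptive_cholesky_lml(K, y, block=128, rel_tol=1e-3):
    """Row-wise Cholesky of K that accumulates the GP log-marginal likelihood
    (LML) and stops early once its per-point increment looks linear,
    extrapolating the trend to the full dataset.
    Returns (estimated LML, number of rows actually factorised).
    NOTE: the flatness check is a simplified stand-in for the paper's
    probabilistic bounds on the model evidence."""
    n = len(y)
    L = np.zeros_like(K)
    alpha = np.zeros(n)          # incrementally solves L @ alpha = y
    logdet = 0.0
    lml = []                     # LML of the first i + 1 data points
    prev_slope = None
    for i in range(n):
        if i > 0:
            # Forward substitution against the rows already factorised.
            L[i, :i] = solve_triangular(L[:i, :i], K[:i, i], lower=True)
        L[i, i] = np.sqrt(K[i, i] - L[i, :i] @ L[i, :i])
        alpha[i] = (y[i] - L[i, :i] @ alpha[:i]) / L[i, i]
        logdet += 2.0 * np.log(L[i, i])
        quad = alpha[: i + 1] @ alpha[: i + 1]
        lml.append(-0.5 * (quad + logdet + (i + 1) * np.log(2.0 * np.pi)))
        # Every `block` rows, compare the average per-point LML increment of
        # the last two blocks. A stable slope suggests the remaining data are
        # largely redundant, so stop and extrapolate the linear trend.
        if (i + 1) % block == 0 and (i + 1) >= 2 * block:
            slope = (lml[-1] - lml[-1 - block]) / block
            if prev_slope is not None and abs(slope - prev_slope) < rel_tol * abs(slope):
                return lml[-1] + (n - i - 1) * slope, i + 1
            prev_slope = slope
    return lml[-1], n            # processed everything: exact LML

# Toy example: a large, highly redundant dataset.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(2000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(len(X))
K = rbf_kernel(X)
lml_est, n_used = adaptive_cholesky_lml(K, y)
print(f"estimated LML {lml_est:.1f} using {n_used}/{len(y)} points")
```

In the paper the stopping decision comes from probabilistic bounds on the full model evidence rather than this slope comparison, and the factorisation would typically be blocked for efficiency; the sketch only illustrates how the relevant quantities fall out of the intermediate Cholesky steps.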
Related papers
- Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling [22.256068524699472]
In this work, we propose an Annealed Importance Sampling (AIS) approach to address the limitations of standard variational inference for Gaussian process latent variable models.
We combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution.
Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
arXiv Detail & Related papers (2024-08-13T08:09:05Z) - Implicit Manifold Gaussian Process Regression [49.0787777751317]
Gaussian process regression is widely used to provide well-calibrated uncertainty estimates.
It struggles with high-dimensional data, even when the data actually lie on an implicit low-dimensional manifold.
In this paper we propose a technique capable of inferring implicit structure directly from data (labeled and unlabeled) in a fully differentiable way.
arXiv Detail & Related papers (2023-10-30T09:52:48Z) - Manifold Learning with Sparse Regularised Optimal Transport [0.17205106391379024]
Real-world datasets are subject to noisy observations and sampling, so that distilling information about the underlying manifold is a major challenge.
We propose a method for manifold learning that utilises a symmetric version of optimal transport with a quadratic regularisation.
We prove that the resulting kernel is consistent with a Laplace-type operator in the continuous limit, establish robustness to heteroskedastic noise and exhibit these results in simulations.
arXiv Detail & Related papers (2023-07-19T08:05:46Z) - Sketched Gaussian Model Linear Discriminant Analysis via the Randomized Kaczmarz Method [7.593861427248019]
We present sketched linear discriminant analysis, an iterative randomized approach to binary-class Gaussian model linear discriminant analysis (LDA) for very large data.
We harness a least squares formulation and mobilize the gradient descent framework.
We present convergence guarantees for the sketched predictions on new data within a fixed number of iterations.
arXiv Detail & Related papers (2022-11-10T18:29:36Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data alone by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Probabilistic Registration for Gaussian Process 3D shape modelling in the presence of extensive missing data [63.8376359764052]
We propose a shape fitting/registration method based on a Gaussian process formulation, suitable for shapes with extensive regions of missing data.
Experiments are conducted both for a 2D small dataset with diverse transformations and a 3D dataset of ears.
arXiv Detail & Related papers (2022-03-26T16:48:27Z) - Regularization of Mixture Models for Robust Principal Graph Learning [0.0]
A regularized version of Mixture Models is proposed to learn a principal graph from a distribution of $D$-dimensional data points.
Parameters of the model are iteratively estimated through an Expectation-Maximization procedure.
arXiv Detail & Related papers (2021-06-16T18:00:02Z) - Evaluating State-of-the-Art Classification Models Against Bayes Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
arXiv Detail & Related papers (2021-06-07T06:21:20Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
arXiv Detail & Related papers (2020-02-21T00:43:06Z)
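As a small illustration of the Nyström method referenced in the last entry above, the following sketch builds the standard low-rank approximation from a uniformly sampled column subset. The cited paper is concerned with better subset-selection rules and their approximation guarantees; the names and the sampling rule below are purely illustrative and do not come from that work.

```python
import numpy as np

def nystrom_approximation(K, m, rng=None):
    """Approximate an n x n PSD kernel matrix K from m sampled columns:
    K ~= C @ pinv(W) @ C.T with C = K[:, idx] and W = K[idx][:, idx].
    Columns are drawn uniformly at random here for simplicity."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(K.shape[0], size=m, replace=False)
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    return C @ np.linalg.pinv(W) @ C.T

# Toy check: the relative error shrinks as more columns are kept.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3))
gamma = 0.5   # RBF parameter; the cited paper varies it to expose the multiple-descent curve
K = np.exp(-gamma * np.sum((X[:, None] - X[None, :]) ** 2, axis=-1))
for m in (10, 50, 200):
    err = np.linalg.norm(K - nystrom_approximation(K, m, rng=0), "fro") / np.linalg.norm(K, "fro")
    print(m, round(err, 4))
```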
This list is automatically generated from the titles and abstracts of the papers in this site.