Input Dependent Sparse Gaussian Processes
- URL: http://arxiv.org/abs/2107.07281v1
- Date: Thu, 15 Jul 2021 12:19:10 GMT
- Title: Input Dependent Sparse Gaussian Processes
- Authors: Bahram Jafrasteh and Carlos Villacampa-Calvo and Daniel Hernández-Lobato
- Abstract summary: We use a neural network that receives the observed data as an input and outputs the inducing point locations and the parameters of $q$.
We evaluate our method in several experiments, showing that it performs similarly to or better than other state-of-the-art sparse variational GP approaches.
- Score: 1.1470070927586014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian Processes (GPs) are Bayesian models that provide uncertainty
estimates associated with the predictions they make. They are also very flexible due
to their non-parametric nature. Nevertheless, GPs suffer from poor scalability
as the number of training instances $N$ increases. More precisely, they have a
cubic cost with respect to $N$. To overcome this problem, sparse GP
approximations are often used, where a set of $M \ll N$ inducing points is
introduced during training. The location of the inducing points is learned by
considering them as parameters of an approximate posterior distribution $q$.
Sparse GPs, combined with variational inference for inferring $q$, reduce the
training cost of GPs to $\mathcal{O}(M^3)$. Critically, the inducing points
determine the flexibility of the model and they are often located in regions of
the input space where the latent function changes. A limitation is, however,
that for some learning tasks a large number of inducing points may be required
to obtain good prediction performance. To address this limitation, we propose
here to amortize the computation of the inducing point locations, as well as
the parameters of the variational posterior approximation $q$. For this, we use a
neural network that receives the observed data as an input and outputs the
inducing point locations and the parameters of $q$. We evaluate our method in
several experiments, showing that it performs similarly to or better than other
state-of-the-art sparse variational GP approaches. However, with our method the
number of inducing points is reduced drastically due to their dependency on the
input data. This makes our method scale to larger datasets and have faster
training and prediction times.
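To make the amortization idea above concrete, the following is a minimal, self-contained sketch (not the authors' implementation) of a network that maps a mini-batch of inputs to input-dependent inducing point locations and to the mean and Cholesky factor of the variational distribution $q$. The class name, architecture, and dimensions are assumptions made purely for illustration.

```python
# Minimal sketch of amortized (input-dependent) inducing points.
# All names, dimensions, and architectural choices are assumptions for
# illustration; they are not the method's actual implementation.
import torch
import torch.nn as nn


class AmortizationNet(nn.Module):
    """Hypothetical network: maps each input point to M inducing point
    locations and to the parameters (mean, Cholesky factor) of q(u)."""

    def __init__(self, input_dim: int, num_inducing: int, hidden: int = 64):
        super().__init__()
        self.num_inducing = num_inducing
        self.input_dim = input_dim
        self.backbone = nn.Sequential(
            nn.Linear(input_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
        )
        # Separate heads for the inducing locations Z, the mean m of q(u),
        # and the entries of a lower-triangular factor L of its covariance.
        self.z_head = nn.Linear(hidden, num_inducing * input_dim)
        self.m_head = nn.Linear(hidden, num_inducing)
        self.diag_head = nn.Linear(hidden, num_inducing)
        self.offdiag_head = nn.Linear(hidden, num_inducing * (num_inducing - 1) // 2)

    def forward(self, x: torch.Tensor):
        h = self.backbone(x)                                   # (batch, hidden)
        z = self.z_head(h).view(-1, self.num_inducing, self.input_dim)
        m = self.m_head(h)                                     # (batch, M)
        # Assemble a lower-triangular Cholesky factor: strictly-lower entries
        # are unconstrained, the diagonal is kept positive via softplus.
        rows, cols = torch.tril_indices(self.num_inducing, self.num_inducing, offset=-1)
        L = x.new_zeros(x.shape[0], self.num_inducing, self.num_inducing)
        L[:, rows, cols] = self.offdiag_head(h)
        L = L + torch.diag_embed(nn.functional.softplus(self.diag_head(h)) + 1e-5)
        return z, m, L


net = AmortizationNet(input_dim=2, num_inducing=5)
z, m, L = net(torch.randn(8, 2))
print(z.shape, m.shape, L.shape)  # (8, 5, 2), (8, 5), (8, 5, 5)
```

In a full sparse variational GP, per-batch outputs of this kind would be plugged into the variational objective in place of globally optimized inducing points, which is what allows a small number of inducing points per input.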
Related papers
- Local Prediction-Powered Inference [7.174572371800217]
This paper introduces a specific algorithm for local multivariable regression using prediction-powered inference (PPI).
The confidence intervals, bias correction, and coverage probabilities are analyzed, proving the correctness and superiority of the algorithm.
arXiv Detail & Related papers (2024-09-26T22:15:53Z)
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z)
- A Coreset-based, Tempered Variational Posterior for Accurate and Scalable Stochastic Gaussian Process Inference [2.7855886538423187]
We present a novel variational Gaussian process ($\mathcal{GP}$) inference method, based on a posterior over a learnable set of weighted pseudo input-output points (coresets).
We derive a lower bound on the log-marginal likelihood via marginalization of the latent $\mathcal{GP}$ coreset variables.
arXiv Detail & Related papers (2023-11-02T17:22:22Z)
- Leveraging Locality and Robustness to Achieve Massively Scalable Gaussian Process Regression [1.3518297878940662]
We introduce a new perspective by exploring robustness properties and limiting behaviour of GP nearest-neighbour (GPnn) prediction.
As the data size $n$ increases, the accuracy of estimated parameters and GP model assumptions become increasingly irrelevant to GPnn predictive accuracy.
We show that this source of inaccuracy can be corrected for, thereby achieving both well-calibrated uncertainty measures and accurate predictions at remarkably low computational cost.
arXiv Detail & Related papers (2023-06-26T14:32:46Z)
- Revisiting Active Sets for Gaussian Process Decoders [0.0]
We develop a new estimate of the log-marginal likelihood based on recently discovered links to cross-validation.
We demonstrate that the resulting stochastic active sets (SAS) approximation significantly improves the robustness of GP decoder training.
arXiv Detail & Related papers (2022-09-10T10:49:31Z)
- Neural Greedy Pursuit for Feature Selection [72.4121881681861]
We propose a greedy algorithm to select $N$ important features among $P$ input features for a non-linear prediction problem.
We use neural networks as predictors in the algorithm to compute the loss.
arXiv Detail & Related papers (2022-07-19T16:39:16Z)
- Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
arXiv Detail & Related papers (2022-07-08T18:17:06Z)
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning in large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- Sparse within Sparse Gaussian Processes using Neighbor Information [23.48831040972227]
We introduce a novel hierarchical prior, which imposes sparsity on the set of inducing variables.
We experimentally show considerable computational gains compared to standard sparse GPs.
arXiv Detail & Related papers (2020-11-10T11:07:53Z)
- SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
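As a loose illustration of the Fourier-feature family that the last entry builds on, the sketch below approximates an RBF kernel with random Fourier features. SLEIPNIR itself uses quadrature Fourier features and handles derivative observations, which this simplified sketch does not; the function name and parameters are assumptions for illustration only.

```python
# Rough sketch: kernel approximation with random Fourier features.
# This is a simplified stand-in for the quadrature-feature construction
# used by SLEIPNIR, shown only to illustrate the general idea.
import numpy as np


def rbf_fourier_features(x: np.ndarray, num_features: int, lengthscale: float,
                         rng: np.random.Generator) -> np.ndarray:
    """Map inputs x (n, d) to features phi (n, 2 * num_features) so that
    phi(x) @ phi(y).T approximates the RBF kernel k(x, y)."""
    d = x.shape[1]
    # Spectral frequencies of the RBF kernel: Gaussian with std 1 / lengthscale.
    omega = rng.normal(scale=1.0 / lengthscale, size=(d, num_features))
    proj = x @ omega
    scale = 1.0 / np.sqrt(num_features)
    return scale * np.hstack([np.cos(proj), np.sin(proj)])


rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
phi = rbf_fourier_features(x, num_features=500, lengthscale=1.0, rng=rng)
k_approx = phi @ phi.T  # approximate Gram matrix, cost linear in n per feature
k_exact = np.exp(-0.5 * ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
print(np.abs(k_approx - k_exact).max())  # error shrinks as num_features grows
```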