Fast and Scalable Spike and Slab Variable Selection in High-Dimensional Gaussian Processes
- URL: http://arxiv.org/abs/2111.04558v1
- Date: Mon, 8 Nov 2021 15:13:24 GMT
- Title: Fast and Scalable Spike and Slab Variable Selection in High-Dimensional Gaussian Processes
- Authors: Hugh Dance and Brooks Paige
- Abstract summary: We develop a fast and scalable variational inference algorithm for the spike and slab GP that is tractable with arbitrary differentiable kernels.
In experiments our method consistently outperforms vanilla and sparse variational GPs whilst retaining similar runtimes.
- Score: 12.667478571732449
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variable selection in Gaussian processes (GPs) is typically undertaken by
thresholding the inverse lengthscales of 'automatic relevance determination'
kernels, but in high-dimensional datasets this approach can be unreliable. A
more probabilistically principled alternative is to use spike and slab priors
and infer a posterior probability of variable inclusion. However, existing
GP implementations are extremely costly to run on high-dimensional and
large-$n$ datasets, or are intractable for most kernels. Consequently, we develop a
fast and scalable variational inference algorithm for the spike and slab GP
that is tractable with arbitrary differentiable kernels. We improve our
algorithm's ability to adapt to the sparsity of relevant variables by Bayesian
model averaging over hyperparameters, and achieve substantial speed-ups using
zero-temperature posterior restrictions, dropout pruning, and nearest-neighbour
minibatching. In experiments our method consistently outperforms vanilla and
sparse variational GPs whilst retaining similar runtimes (even when $n=10^6$)
and performs competitively with a spike and slab GP using MCMC but runs up to
$1000$ times faster.
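To make the mechanics concrete, below is a minimal sketch (plain NumPy, not the authors' implementation) of the spike-and-slab idea: a Bernoulli gate per input dimension switches that dimension on or off inside an ARD kernel, and a continuous (concrete) relaxation of the gates keeps the Monte-Carlo objective differentiable. All function names and the choice of relaxation are assumptions for illustration; the paper's actual algorithm optimizes a full ELBO, averages over hyperparameters, and avoids the O(n^3) Cholesky below via nearest-neighbour minibatching. In practice the objective would be optimized with an autodiff framework; NumPy is used only to keep the sketch self-contained.

import numpy as np

def ard_rbf(X1, X2, inv_ls, var=1.0):
    """ARD RBF kernel with one inverse lengthscale per input dimension."""
    diff = (X1[:, None, :] - X2[None, :, :]) * inv_ls
    return var * np.exp(-0.5 * (diff ** 2).sum(axis=-1))

def sample_relaxed_gates(logits, tau, rng):
    """Concrete (Gumbel-sigmoid) relaxation of Bernoulli inclusion gates.
    As tau -> 0 the samples become hard 0/1, mirroring the paper's
    zero-temperature posterior restriction."""
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=logits.shape)
    logistic = np.log(u) - np.log1p(-u)
    return 1.0 / (1.0 + np.exp(-(logits + logistic) / tau))

def expected_log_marginal(X, y, inv_ls, gate_logits, noise, tau, rng, S=8):
    """Monte-Carlo estimate of E_q[log p(y | X, gamma)], where the gate
    gamma_d in [0, 1] switches dimension d on or off in the kernel."""
    n = X.shape[0]
    total = 0.0
    for _ in range(S):
        gamma = sample_relaxed_gates(gate_logits, tau, rng)
        K = ard_rbf(X, X, gamma * inv_ls) + noise * np.eye(n)
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
        total += (-0.5 * y @ alpha
                  - np.log(np.diag(L)).sum()
                  - 0.5 * n * np.log(2.0 * np.pi))
    return total / S

def inclusion_probs(gate_logits):
    """Posterior inclusion probabilities; dimensions whose probability
    collapses toward zero can be dropped (cf. dropout pruning)."""
    return 1.0 / (1.0 + np.exp(-gate_logits))

Driving tau to zero hardens the gates, and pruning dimensions whose inclusion probability collapses removes them from every subsequent kernel evaluation; these correspond in spirit to the zero-temperature restriction and dropout pruning named in the abstract.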
Related papers
- Sparse Kernel Gaussian Processes through Iterative Charted Refinement (ICR) [0.0]
We present a new, generative method named Iterative Charted Refinement (ICR) to model Gaussian Processes.
ICR represents long- and short-range correlations by combining views of the modeled locations at varying resolutions with a user-provided coordinate chart.
ICR outperforms existing methods in computational speed by one order of magnitude on both CPU and GPU.
arXiv Detail & Related papers (2022-06-21T18:00:01Z)
- Shallow and Deep Nonparametric Convolutions for Gaussian Processes [0.0]
We introduce a nonparametric process convolution formulation for GPs that alleviates the weaknesses of standard process convolutions by using a functional sampling approach.
We propose a composition of these nonparametric convolutions that serves as an alternative to classic deep GP models.
arXiv Detail & Related papers (2022-06-17T19:03:04Z)
- Exact Gaussian Processes for Massive Datasets via Non-Stationary Sparsity-Discovering Kernels [0.0]
We propose to take advantage of naturally-structured sparsity by allowing the kernel to discover -- instead of induce -- sparse structure.
The principle of ultra-flexible, compactly supported, and non-stationary kernels, combined with HPC and constrained optimization, lets us scale exact GPs well beyond 5 million data points (a toy sketch of the sparsity idea follows this entry).
arXiv Detail & Related papers (2022-05-18T16:56:53Z)
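To illustrate the idea above (a toy sketch under assumptions, not the paper's HPC implementation): with a compactly supported Wendland-type kernel, every pair of points farther apart than the support radius gets an exactly zero covariance, so the covariance matrix can be built and solved in sparse form without approximating the GP. The kernel form, support value, and data are choices made for this example.

import numpy as np
from scipy.sparse import csr_matrix, identity
from scipy.sparse.linalg import spsolve
from scipy.spatial import cKDTree

def sparse_wendland_cov(X, support, var=1.0):
    """Covariance from the compactly supported Wendland C2 kernel
    k(r) = var * (1 - r)^4 * (4r + 1) for r = dist/support <= 1, else 0."""
    tree = cKDTree(X)
    D = tree.sparse_distance_matrix(tree, max_distance=support).tocoo()
    r = D.data / support
    vals = var * (1.0 - r) ** 4 * (4.0 * r + 1.0)
    K = csr_matrix((vals, (D.row, D.col)), shape=(len(X), len(X))).tolil()
    K.setdiag(var)          # k(0) = var on the diagonal
    return K.tocsr()

rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 3))
y = np.sin(3.0 * X[:, 0]) + 0.01 * rng.standard_normal(2000)
K = sparse_wendland_cov(X, support=0.15)
print("nonzero fraction:", K.nnz / 2000 ** 2)      # exact, discovered sparsity
# exact GP representer weights (K + sigma^2 I)^{-1} y via a sparse solver
alpha = spsolve((K + 1e-4 * identity(2000)).tocsc(), y)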
- Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times [119.41129787351092]
We show that sequential black-box optimization based on GPs can be made efficient by sticking to a candidate solution for multiple evaluation steps.
We modify two well-established GP-Opt algorithms, GP-UCB and GP-EI, to adapt rules from batched GP-Opt (a toy sketch follows below).
arXiv Detail & Related papers (2022-01-30T20:42:14Z)
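A toy sketch of the sticking rule described above (a minimal version under assumptions; it omits the paper's adaptive switching criterion and regret analysis): the GP-UCB candidate is re-evaluated for a fixed number of steps before the posterior is refit, so the costly GP updates happen once per batch of evaluations. The objective f and all constants are invented for the example.

import numpy as np

def rbf(A, B, ls=0.2):
    """Unit-variance RBF kernel."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def gp_posterior(Xobs, yobs, Xcand, noise=1e-2):
    """Posterior mean and variance of the GP at the candidate points."""
    K = rbf(Xobs, Xobs) + noise * np.eye(len(Xobs))
    L = np.linalg.cholesky(K)
    A = np.linalg.solve(L, rbf(Xobs, Xcand))
    mu = A.T @ np.linalg.solve(L, yobs)
    var = np.maximum(1.0 - (A ** 2).sum(axis=0), 1e-12)
    return mu, var

def f(x, rng):
    """Hypothetical noisy black-box objective."""
    return np.sin(3.0 * x.sum()) + 0.1 * rng.standard_normal()

rng = np.random.default_rng(0)
Xcand = rng.uniform(size=(200, 2))
Xobs, yobs = [Xcand[0]], [f(Xcand[0], rng)]
beta, repeats = 2.0, 5
for t in range(10):                        # 10 GP refits cover 50 evaluations
    mu, var = gp_posterior(np.array(Xobs), np.array(yobs), Xcand)
    x = Xcand[np.argmax(mu + beta * np.sqrt(var))]   # UCB candidate
    for _ in range(repeats):               # stick: re-evaluate, no GP update
        Xobs.append(x)
        yobs.append(f(x, rng))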
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has a previously unexplored additional benefit: it can simultaneously reduce variance at essentially negligible cost (a sketch of preconditioned CG follows below).
arXiv Detail & Related papers (2021-07-01T06:43:11Z)
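To show where preconditioning enters GP computations, here is a generic sketch of preconditioned conjugate gradients for the solve (K + sigma^2 I) alpha = y, using a Nystrom low-rank-plus-noise preconditioner applied through the Woodbury identity. This is an illustration only: the paper's preconditioner construction and its variance-reduction argument for stochastic estimates are not reproduced here, and the landmark count and kernel settings are arbitrary.

import numpy as np

def rbf(A, B, ls=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def nystrom_preconditioner(K, noise, k, rng):
    """M = U U^T + noise*I, with U U^T the rank-k Nystrom approximation of K.
    Returns a function applying M^{-1} via the Woodbury identity."""
    idx = rng.choice(K.shape[0], size=k, replace=False)
    L = np.linalg.cholesky(K[np.ix_(idx, idx)] + 1e-10 * np.eye(k))
    U = np.linalg.solve(L, K[idx, :]).T        # U U^T = K_nk K_kk^{-1} K_kn
    inner = np.linalg.inv(noise * np.eye(k) + U.T @ U)
    return lambda v: (v - U @ (inner @ (U.T @ v))) / noise

def pcg(Kmat, b, Minv, tol=1e-6, iters=1000):
    """Preconditioned conjugate gradients; returns solution and iterations."""
    x = np.zeros_like(b)
    r = b.copy()                               # residual for x = 0
    z = Minv(r)
    p = z.copy()
    rz = r @ z
    for i in range(iters):
        Kp = Kmat @ p
        a = rz / (p @ Kp)
        x += a * p
        r -= a * Kp
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, i + 1
        z = Minv(r)
        rz, rz_old = r @ z, rz
        p = z + (rz / rz_old) * p
    return x, iters

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 4))
noise = 1e-2
K = rbf(X, X) + noise * np.eye(500)
y = rng.standard_normal(500)
_, it_plain = pcg(K, y, Minv=lambda v: v)      # unpreconditioned baseline
_, it_pre = pcg(K, y, nystrom_preconditioner(rbf(X, X), noise, 50, rng))
print(it_plain, it_pre)   # preconditioning typically cuts the iteration count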
- Kernel Identification Through Transformers [54.3795894579111]
Kernel selection plays a central role in determining the performance of Gaussian Process (GP) models.
This work addresses the challenge of constructing custom kernel functions for high-dimensional GP regression models.
We introduce a novel approach named KITT: Kernel Identification Through Transformers.
arXiv Detail & Related papers (2021-06-15T14:32:38Z)
- MuyGPs: Scalable Gaussian Process Hyperparameter Estimation Using Local Cross-Validation [1.2233362977312945]
We present MuyGPs, a novel, efficient GP hyperparameter estimation method.
MuyGPs builds upon prior methods that take advantage of the nearest neighbors structure of the data.
We show that our method outperforms all known competitors in both time-to-solution and root-mean-squared prediction error (a simplified sketch follows below).
arXiv Detail & Related papers (2021-04-29T18:10:21Z)
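A simplified sketch of the nearest-neighbours-plus-cross-validation idea (an illustrative reduction, not the MuyGPs estimator itself): candidate hyperparameters are scored by leave-one-out error in which each training point is predicted from a local GP over its k nearest neighbours only, so nothing larger than a k x k system is ever solved. The kernel, loss, and search grid are assumptions.

import numpy as np
from scipy.spatial import cKDTree

def rbf(A, B, ls):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def loo_cv_loss(Xtr, ytr, ls, k=30, noise=1e-3):
    """Mean squared leave-one-out error, each point predicted from the
    local GP over its k nearest neighbours (k x k solves only)."""
    _, nbrs = cKDTree(Xtr).query(Xtr, k=k + 1)   # column 0 is the point itself
    err = 0.0
    for i, idx in enumerate(nbrs):
        idx = idx[1:]                            # drop self
        Knn = rbf(Xtr[idx], Xtr[idx], ls) + noise * np.eye(k)
        ks = rbf(Xtr[i:i + 1], Xtr[idx], ls)[0]
        err += (ytr[i] - ks @ np.linalg.solve(Knn, ytr[idx])) ** 2
    return err / len(Xtr)

rng = np.random.default_rng(0)
Xtr = rng.uniform(size=(1000, 2))
ytr = np.sin(4.0 * Xtr[:, 0]) + 0.05 * rng.standard_normal(1000)
# grid search over the lengthscale; each candidate costs only k x k solves
best_ls = min([0.05, 0.1, 0.2, 0.5], key=lambda ls: loo_cv_loss(Xtr, ytr, ls))
print("selected lengthscale:", best_ls)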
- Likelihood-Free Inference with Deep Gaussian Processes [70.74203794847344]
Surrogate models have been successfully used in likelihood-free inference to decrease the number of simulator evaluations.
We propose a Deep Gaussian Process (DGP) surrogate model that can handle more irregularly behaved target distributions.
Our experiments show how DGPs can outperform GPs on objective functions with multimodal distributions and maintain a comparable performance in unimodal cases.
arXiv Detail & Related papers (2020-06-18T14:24:05Z)
- Quadruply Stochastic Gaussian Processes [10.152838128195466]
We introduce a variational inference procedure for training scalable Gaussian process (GP) models whose per-iteration complexity is independent of both the number of training points, $n$, and the number of basis functions used in the kernel approximation, $m$.
We demonstrate accurate inference on large classification and regression datasets using GPs and relevance vector machines with up to $m = 10^7$ basis functions.
arXiv Detail & Related papers (2020-06-04T17:06:25Z)
- Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification [119.41129787351092]
We introduce BBKB, the first no-regret GP optimization algorithm that provably runs in near-linear time and selects candidates in batches.
We show that the same bound can be used to adaptively delay costly updates to the sparse GP approximation, achieving a near-constant per-step amortized cost.
arXiv Detail & Related papers (2020-02-23T17:43:29Z)