Exact Gaussian Processes for Massive Datasets via Non-Stationary
Sparsity-Discovering Kernels
- URL: http://arxiv.org/abs/2205.09070v1
- Date: Wed, 18 May 2022 16:56:53 GMT
- Title: Exact Gaussian Processes for Massive Datasets via Non-Stationary
Sparsity-Discovering Kernels
- Authors: Marcus M. Noack, Harinarayan Krishnan, Mark D. Risser, Kristofer G.
Reyes
- Abstract summary: We propose to take advantage of naturally-occurring sparsity by allowing the kernel to discover -- instead of induce -- sparse structure.
The principle of ultra-flexible, compactly-supported, and non-stationary kernels, combined with HPC and constrained optimization, lets us scale exact GPs well beyond 5 million data points.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A Gaussian Process (GP) is a prominent mathematical framework for stochastic
function approximation in science and engineering applications. This success is
largely attributed to the GP's analytical tractability, robustness,
non-parametric structure, and natural inclusion of uncertainty quantification.
Unfortunately, the use of exact GPs is prohibitively expensive for large
datasets due to their unfavorable numerical complexity of $O(N^3)$ in
computation and $O(N^2)$ in storage. All existing methods addressing this issue
utilize some form of approximation -- usually considering subsets of the full
dataset or finding representative pseudo-points that render the covariance
matrix well-structured and sparse. These approximate methods can lead to
inaccuracies in function approximations and often limit the user's flexibility
in designing expressive kernels. Instead of inducing sparsity via data-point
geometry and structure, we propose to take advantage of naturally-occurring
sparsity by allowing the kernel to discover -- instead of induce -- sparse
structure. The premise of this paper is that GPs, in their most native form,
are often naturally sparse, but commonly-used kernels do not allow us to
exploit this sparsity. The core concept of exact, and at the same time sparse
GPs relies on kernel definitions that provide enough flexibility to learn and
encode not only non-zero but also zero covariances. This principle of
ultra-flexible, compactly-supported, and non-stationary kernels, combined with
HPC and constrained optimization, lets us scale exact GPs well beyond 5 million
data points.
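To make the core concept concrete, the sketch below is a minimal illustration, not the authors' implementation: the Wendland C^2 base function, the hand-coded support field, and all names are assumptions, and it glosses over the constrained optimization the paper uses to keep the learned kernel valid. It shows how a compactly-supported, non-stationary kernel can encode exact zero covariances, so the covariance matrix is sparse and the exact GP training solve can use a sparse factorization instead of a dense $O(N^3)$ Cholesky.

```python
# Minimal sketch of a sparsity-discovering, compactly-supported, non-stationary kernel.
# Assumptions: the Wendland C^2 base function, the hand-coded "learned" support field,
# and all names below are illustrative; this is not the authors' implementation.
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import splu

def wendland_c2(r):
    """Wendland C^2 base function: exactly zero for r >= 1 (compact support)."""
    r = np.clip(r, 0.0, 1.0)
    return (1.0 - r) ** 4 * (4.0 * r + 1.0)

def sparse_nonstationary_cov(X, support_radius, signal_var=1.0):
    """Covariance matrix with an input-dependent support radius.
    Pairs farther apart than their averaged local radii get an exact zero,
    so the matrix can be stored and factorized sparsely."""
    n = len(X)
    radii = support_radius(X)                 # stands in for a learned support field
    rows, cols, vals = [], [], []
    for i in range(n):
        for j in range(i, n):
            s = 0.5 * (radii[i] + radii[j])   # symmetrized local support
            r = np.linalg.norm(X[i] - X[j]) / s
            if r < 1.0:                       # outside the support: exact zero, not stored
                k = signal_var * wendland_c2(r)
                rows.append(i); cols.append(j); vals.append(k)
                if i != j:
                    rows.append(j); cols.append(i); vals.append(k)
    return sparse.csr_matrix((vals, (rows, cols)), shape=(n, n))

rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(0.0, 100.0, size=(n, 1))
y = np.sin(0.3 * X[:, 0]) + 0.1 * rng.standard_normal(n)

# Hypothetical support field a hyperparameter optimization might have discovered:
# short-range correlations on one half of the domain, longer-range on the other.
support = lambda X: np.where(X[:, 0] < 50.0, 2.0, 6.0)

K = sparse_nonstationary_cov(X, support) + 1e-2 * sparse.identity(n)
print(f"fraction of nonzero covariance entries: {K.nnz / n**2:.2%}")

# Exact GP training solve via a sparse factorization in place of a dense O(N^3) Cholesky.
alpha = splu(K.tocsc()).solve(y)              # alpha = K^{-1} y, used for the posterior mean
```

In the paper, the zero/non-zero structure is learned from data through flexible, non-stationary hyperparameters and HPC-backed constrained optimization; the hand-coded support field above only stands in for that learned component.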
Related papers
- Exploiting Hankel-Toeplitz Structures for Fast Computation of Kernel Precision Matrices [14.25435308779899]
The Hilbert-space Gaussian Process (HGP) approach offers a hyperparameter-independent basis function approximation for speeding up GP inference.
In this paper, we lower this dominating computational complexity to $\mathcal{O}(NM)$ with no additional approximations.
Our contribution provides a pure speed-up of several existing, widely used GP approximations, without further approximations.
arXiv Detail & Related papers (2024-08-05T09:45:31Z) - Higher-order topological kernels via quantum computation [68.8204255655161]
Topological data analysis (TDA) has emerged as a powerful tool for extracting meaningful insights from complex data.
We propose a quantum approach to defining Betti kernels, which is based on constructing Betti curves with increasing order.
arXiv Detail & Related papers (2023-07-14T14:48:52Z) - Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation
Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method that achieved the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z) - FaDIn: Fast Discretized Inference for Hawkes Processes with General
Parametric Kernels [82.53569355337586]
This work offers an efficient solution to temporal point processes inference using general parametric kernels with finite support.
The method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG).
Results show that the proposed approach leads to better estimation of pattern latency than the state-of-the-art.
arXiv Detail & Related papers (2022-10-10T12:35:02Z) - Sparse Kernel Gaussian Processes through Iterative Charted Refinement
(ICR) [0.0]
We present a new, generative method named Iterative Charted Refinement (ICR) to model Gaussian Processes.
ICR represents long- and short-range correlations by combining views of the modeled locations at varying resolutions with a user-provided coordinate chart.
ICR outperforms existing methods in terms of computational speed by one order of magnitude on the CPU and GPU.
arXiv Detail & Related papers (2022-06-21T18:00:01Z) - Shallow and Deep Nonparametric Convolutions for Gaussian Processes [0.0]
We introduce a nonparametric process convolution formulation for GPs that alleviates weaknesses by using a functional sampling approach.
We propose a composition of these nonparametric convolutions that serves as an alternative to classic deep GP models.
arXiv Detail & Related papers (2022-06-17T19:03:04Z) - Nyström Kernel Mean Embeddings [92.10208929236826]
We propose an efficient approximation procedure based on the Nyström method.
It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate.
We discuss applications of this result for the approximation of the maximum mean discrepancy and quadrature rules; a minimal illustrative sketch of the Nyström embedding idea appears after this list.
arXiv Detail & Related papers (2022-01-31T08:26:06Z) - Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z) - Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform online prediction and model update with scalability, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
arXiv Detail & Related papers (2021-10-13T15:11:25Z) - Sparse Gaussian Processes via Parametric Families of Compactly-supported
Kernels [0.6091702876917279]
We propose a method for deriving parametric families of kernel functions with compact support.
The parameters of this family of kernels can be learned from data using maximum likelihood estimation.
We show that these approximations incur minimal error over the exact models when modeling data drawn directly from a target GP.
arXiv Detail & Related papers (2020-06-05T20:44:09Z) - Quadruply Stochastic Gaussian Processes [10.152838128195466]
We introduce a variational inference procedure for training scalable Gaussian process (GP) models whose per-iteration complexity is independent of both the number of training points, $n$, and the number of basis functions used in the kernel approximation, $m$.
We demonstrate accurate inference on large classification and regression datasets using GPs and relevance vector machines with up to $m = 10^7$ basis functions.
arXiv Detail & Related papers (2020-06-04T17:06:25Z)
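As a concrete companion to the Nyström Kernel Mean Embeddings entry above, here is a hedged sketch of embedding two samples on a small landmark subsample and comparing them via the maximum mean discrepancy. The RBF kernel, the function names, and the sample sizes are illustrative assumptions, not the paper's estimator or its convergence analysis.

```python
# Hedged sketch of a Nyström-approximated kernel mean embedding and an MMD estimate.
# Illustrative only: kernel choice, names, and sizes are assumptions.
import numpy as np

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d2)

def nystrom_kme_weights(X, Z, reg=1e-8):
    """Weights beta so that mean_i k(x_i, .) is approximated by sum_j beta_j k(z_j, .)."""
    K_zz = rbf(Z, Z) + reg * np.eye(len(Z))
    K_zx = rbf(Z, X)
    return np.linalg.solve(K_zz, K_zx.mean(axis=1))

def mmd2_nystrom(X, Y, Z):
    """Squared MMD between samples X and Y, both embedded on the landmarks Z."""
    d = nystrom_kme_weights(X, Z) - nystrom_kme_weights(Y, Z)
    return float(d @ rbf(Z, Z) @ d)

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(5000, 2))
Y = rng.normal(0.5, 1.0, size=(5000, 2))
Z = X[rng.choice(len(X), size=100, replace=False)]   # small subsample of landmarks
print("approximate MMD^2:", mmd2_nystrom(X, Y, Z))
```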
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.