Exploiting Hankel-Toeplitz Structures for Fast Computation of Kernel Precision Matrices
- URL: http://arxiv.org/abs/2408.02346v1
- Date: Mon, 5 Aug 2024 09:45:31 GMT
- Title: Exploiting Hankel-Toeplitz Structures for Fast Computation of Kernel Precision Matrices
- Authors: Frida Viset, Anton Kullberg, Frederiek Wesel, Arno Solin
- Abstract summary: The Hilbert-space Gaussian Process (HGP) approach offers a hyperparameter-independent basis function approximation for speeding up GP inference.
In this paper, we lower this dominating computational complexity to $\mathcal{O}(NM)$ with no additional approximations.
Our contribution provides a pure speed-up of several existing, widely used, GP approximations, without further approximations.
- Score: 14.25435308779899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Hilbert-space Gaussian Process (HGP) approach offers a hyperparameter-independent basis function approximation for speeding up Gaussian Process (GP) inference by projecting the GP onto M basis functions. These properties result in a favorable data-independent $\mathcal{O}(M^3)$ computational complexity during hyperparameter optimization but require a dominating one-time precomputation of the precision matrix costing $\mathcal{O}(NM^2)$ operations. In this paper, we lower this dominating computational complexity to $\mathcal{O}(NM)$ with no additional approximations. We can do this because we realize that the precision matrix can be split into a sum of Hankel-Toeplitz matrices, each having $\mathcal{O}(M)$ unique entries. Based on this realization we propose computing only these unique entries at $\mathcal{O}(NM)$ costs. Further, we develop two theorems that prescribe sufficient conditions for the complexity reduction to hold generally for a wide range of other approximate GP models, such as the Variational Fourier Feature (VFF) approach. The two theorems do this with no assumptions on the data and no additional approximations of the GP models themselves. Thus, our contribution provides a pure speed-up of several existing, widely used, GP approximations, without further approximations.
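To make the Hankel-Toeplitz split concrete, below is a minimal 1-D sketch (not code from the paper; the function names and the choice of basis are illustrative assumptions) using the standard HGP sinusoidal basis $\phi_j(x) = \sin(\pi j (x+L)/(2L))/\sqrt{L}$ on $[-L, L]$. The product-to-sum identity $\sin a \sin b = \tfrac{1}{2}(\cos(a-b) - \cos(a+b))$ splits every entry of the precision matrix $\Phi^\top \Phi$ into a part depending only on $j-k$ (Toeplitz) and a part depending only on $j+k$ (Hankel), so only $\mathcal{O}(M)$ cosine sums over the $N$ data points are needed instead of all $\mathcal{O}(M^2)$ entries.

```python
import numpy as np

def hgp_features(x, M, L):
    """Standard 1-D HGP basis: phi_j(x) = sin(pi*j*(x+L)/(2L)) / sqrt(L), j = 1..M."""
    j = np.arange(1, M + 1)
    return np.sin(np.pi * np.outer(x + L, j) / (2 * L)) / np.sqrt(L)

def precision_naive(x, M, L):
    """Direct O(N M^2) computation of Phi^T Phi."""
    Phi = hgp_features(x, M, L)
    return Phi.T @ Phi

def precision_hankel_toeplitz(x, M, L):
    """O(N M) computation exploiting the Hankel-Toeplitz split.

    With sin(a)sin(b) = (cos(a-b) - cos(a+b)) / 2, each entry is
    [Phi^T Phi]_{jk} = (c_{|j-k|} - c_{j+k}) / (2L), where
    c_d = sum_n cos(pi * d * (x_n + L) / (2L)) has only O(M) unique values.
    """
    d = np.arange(0, 2 * M + 1)                                    # all lags needed
    c = np.cos(np.pi * np.outer(x + L, d) / (2 * L)).sum(axis=0)   # O(N M) work total
    j = np.arange(1, M + 1)
    T = c[np.abs(j[:, None] - j[None, :])]                         # Toeplitz part (j - k)
    H = c[j[:, None] + j[None, :]]                                 # Hankel part (j + k)
    return (T - H) / (2 * L)

# Quick check that both routes agree
rng = np.random.default_rng(0)
x = rng.uniform(-0.8, 0.8, size=1000)
A1 = precision_naive(x, M=64, L=1.0)
A2 = precision_hankel_toeplitz(x, M=64, L=1.0)
print(np.max(np.abs(A1 - A2)))   # agreement up to floating-point error
```

Under these assumptions, the cosine-sum route reproduces the naive precision matrix to numerical precision while touching the data only $\mathcal{O}(M)$ times. For higher-dimensional inputs the abstract states that the precision matrix splits into a sum of Hankel-Toeplitz matrices, which suggests the same trick applies blockwise; those details are in the paper rather than this sketch.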
Related papers
- Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation [53.17668583030862]
We study infinite-horizon average-reward Markov decision processes (AMDPs) in the context of general function approximation.
We propose a novel algorithmic framework named Local-fitted Optimization with OPtimism (LOOP)
We show that LOOP achieves a sublinear $\tilde{\mathcal{O}}(\mathrm{poly}(d, \mathrm{sp}(V^*)) \sqrt{T\beta})$ regret, where $d$ and $\beta$ correspond to AGEC and the log-covering number of the hypothesis class, respectively.
arXiv Detail & Related papers (2024-04-19T06:24:22Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Efficient displacement convex optimization with particle gradient descent [57.88860627977882]
Particle gradient descent is widely used to optimize functions of probability measures.
This paper considers particle gradient descent with a finite number of particles and establishes its theoretical guarantees to optimize functions that are displacement convex in measures.
arXiv Detail & Related papers (2023-02-09T16:35:59Z) - Sparse Kernel Gaussian Processes through Iterative Charted Refinement (ICR) [0.0]
We present a new, generative method named Iterative Charted Refinement (ICR) to model Gaussian Processes.
ICR represents long- and short-range correlations by combining views of the modeled locations at varying resolutions with a user-provided coordinate chart.
ICR outperforms existing methods in terms of computational speed by one order of magnitude on the CPU and GPU.
arXiv Detail & Related papers (2022-06-21T18:00:01Z) - Exact Gaussian Processes for Massive Datasets via Non-Stationary Sparsity-Discovering Kernels [0.0]
We propose to take advantage of naturally-structured sparsity by allowing the kernel to discover -- instead of induce -- sparse structure.
The principle of ultra-flexible, compactly-supported, and non-stationary kernels, combined with HPC and constrained optimization, lets us scale exact GPs well beyond 5 million data points.
arXiv Detail & Related papers (2022-05-18T16:56:53Z) - Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z) - High-Dimensional Gaussian Process Inference with Derivatives [90.8033626920884]
We show that in the low-data regime $N < D$, the Gram matrix can be decomposed in a manner that reduces the cost of inference to $\mathcal{O}(N^2D + (N^2)^3)$.
We demonstrate this potential in a variety of tasks relevant for machine learning, such as optimization and Hamiltonian Monte Carlo with predictive gradients.
arXiv Detail & Related papers (2021-02-15T13:24:41Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z) - Multi-Scale Zero-Order Optimization of Smooth Functions in an RKHS [19.252319300590653]
A black-box function $f:\mathcal{X} \mapsto \mathbb{R}$ is optimized under the assumption that $f$ is Hölder smooth and has bounded norm in the RKHS associated with a given kernel $K$.
In this paper, we propose a new algorithm (LP-GP-UCB) where the usual GP surrogate model is augmented with Local Polynomial (LP) estimators of the Hölder smooth function $f$.
arXiv Detail & Related papers (2020-05-11T01:55:39Z) - Efficient algorithms for multivariate shape-constrained convex regression problems [9.281671380673306]
We prove that the least squares estimator is computable via solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints.
For solving the generally very large-scale convex QP, we design two efficient algorithms: one is the symmetric Gauss-Seidel based alternating direction method of multipliers (sGS-ADMM), and the other is the proximal augmented Lagrangian method (pALM) with the subproblems solved by the semismooth Newton method.
arXiv Detail & Related papers (2020-02-26T11:18:43Z)