Kernel Interpolation with Sparse Grids
- URL: http://arxiv.org/abs/2305.14451v1
- Date: Tue, 23 May 2023 18:17:49 GMT
- Title: Kernel Interpolation with Sparse Grids
- Authors: Mohit Yadav, Daniel Sheldon, Cameron Musco
- Abstract summary: We propose the use of sparse grids within the Structured kernel amenable (SKI) framework.
These grids enable accurate algebra, but with a number of points growing more slowly with dimension.
We show that SKI can be scaled to higher dimensions while maintaining accuracy.
- Score: 25.167018679176838
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Structured kernel interpolation (SKI) accelerates Gaussian process (GP)
inference by interpolating the kernel covariance function using a dense grid of
inducing points, whose corresponding kernel matrix is highly structured and
thus amenable to fast linear algebra. Unfortunately, SKI scales poorly in the
dimension of the input points, since the dense grid size grows exponentially
with the dimension. To mitigate this issue, we propose the use of sparse grids
within the SKI framework. These grids enable accurate interpolation, but with a
number of points growing more slowly with dimension. We contribute a novel
nearly linear time matrix-vector multiplication algorithm for the sparse grid
kernel matrix. Next, we describe how sparse grids can be combined with an
efficient interpolation scheme based on simplices. With these changes, we
demonstrate that SKI can be scaled to higher dimensions while maintaining
accuracy.
Related papers
- Optimized Sparse Matrix Operations for Reverse Mode Automatic
Differentiation [3.72826300260966]
We present an implementation of a CSR-based sparse matrix wrapper for PyTorch with acceleration for basic matrix operations, as well as automatic differentiability.
We also present several applications of the resulting sparse kernels to optimization problems, demonstrating ease of implementation and performance measurements versus their dense counterparts.
arXiv Detail & Related papers (2022-12-10T00:46:51Z) - Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z) - High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
arXiv Detail & Related papers (2022-02-25T16:35:26Z) - Learning in High-Dimensional Feature Spaces Using ANOVA-Based Fast
Matrix-Vector Multiplication [0.0]
kernel matrix is typically dense and large-scale. Depending on the dimension of the feature space even the computation of all of its entries in reasonable time becomes a challenging task.
We propose the use of an ANOVA kernel, where we construct several kernels based on lower-dimensional feature spaces for which we provide fast algorithms realizing the matrix-vector products.
Based on a feature grouping approach, we then show how the fast matrix-vector products can be embedded into a learning method choosing kernel ridge regression and the preconditioned conjugate gradient solver.
arXiv Detail & Related papers (2021-11-19T10:29:39Z) - Sparse Quadratic Optimisation over the Stiefel Manifold with Application
to Permutation Synchronisation [71.27989298860481]
We address the non- optimisation problem of finding a matrix on the Stiefel manifold that maximises a quadratic objective function.
We propose a simple yet effective sparsity-promoting algorithm for finding the dominant eigenspace matrix.
arXiv Detail & Related papers (2021-09-30T19:17:35Z) - SKIing on Simplices: Kernel Interpolation on the Permutohedral Lattice
for Scalable Gaussian Processes [39.821400341226315]
Structured Kernel Interpolation (SKI) framework is used to perform efficient matrix vector multiplies (MVMs) on a grid.
We develop a connection between SKI and the permutohedral lattice used for high-dimensional fast bilateral filtering.
Using a sparse simplicial grid instead of a dense rectangular one, we can perform GP inference exponentially faster in the dimension than SKI.
We additionally provide a implementation of Simplex-GP, which enables significant acceleration of MVM based inference.
arXiv Detail & Related papers (2021-06-12T06:04:56Z) - The Fast Kernel Transform [21.001203328543006]
We propose the Fast Kernel Transform (FKT), a general algorithm to compute matrix-vector multiplications for datasets in moderate dimensions with quasilinear complexity.
The FKT is easily applied to a broad class of kernels, including Gaussian, Matern, and Rational Quadratic covariance functions and physically motivated Green's functions.
We illustrate the efficacy and versatility of the FKT by providing timing and accuracy benchmarks and by applying it to scale the neighborhood embedding (t-SNE) and Gaussian processes to large real-world data sets.
arXiv Detail & Related papers (2021-06-08T16:15:47Z) - VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose textitVersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
textitVersaGNN achieves on average 3712$times$ speedup with 1301.25$times$ energy reduction on CPU, and 35.4$times$ speedup with 17.66$times$ energy reduction on GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of fully-connected ReLU network.
We show that dimension of the resulting features is much smaller than other baseline feature map constructions to achieve comparable error bounds both in theory and practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - Square Root Bundle Adjustment for Large-Scale Reconstruction [56.44094187152862]
We propose a new formulation for the bundle adjustment problem which relies on nullspace marginalization of landmark variables by QR decomposition.
Our approach, which we call square root bundle adjustment, is algebraically equivalent to the commonly used Schur complement trick.
We show in real-world experiments with the BAL datasets that even in single precision the proposed solver achieves on average equally accurate solutions.
arXiv Detail & Related papers (2021-03-02T16:26:20Z) - Faster Kernel Interpolation for Gaussian Processes [30.04235162264955]
Key challenge in scaling Process (GP) regression to massive datasets is that exact inference requires a dense n x n kernel matrix.
Structured kernel (SKI) is among the most scalable methods.
We show that SKI can be reduced to O(m log m) after a single O(n) time precomputation step.
We demonstrate speedups in practice for a wide range of m and n and apply the method to GP inference on a three-dimensional weather radar dataset with over 100 million points.
arXiv Detail & Related papers (2021-01-28T00:09:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.