High-Dimensional Gaussian Process Regression with Soft Kernel Interpolation
- URL: http://arxiv.org/abs/2410.21419v2
- Date: Sat, 10 May 2025 01:41:15 GMT
- Title: High-Dimensional Gaussian Process Regression with Soft Kernel Interpolation
- Authors: Chris CamaƱo, Daniel Huang,
- Abstract summary: We introduce Soft Kernel Interpolation (SoftKI), a method that combines aspects of Structured Kernel Interpolation (SKI) and variational inducing point methods.<n>SoftKI approximates a kernel via softmax from a smaller number of points learned.<n>We demonstrate the effectiveness of SoftKI across various examples and show that it is competitive with other approximated GP methods when the data dimensionality is modest.
- Score: 0.8057006406834466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Soft Kernel Interpolation (SoftKI), a method that combines aspects of Structured Kernel Interpolation (SKI) and variational inducing point methods, to achieve scalable Gaussian Process (GP) regression on high-dimensional datasets. SoftKI approximates a kernel via softmax interpolation from a smaller number of interpolation points learned by optimizing a combination of the SoftKI marginal log-likelihood (MLL), and when needed, an approximate MLL for improved numerical stability. Consequently, it can overcome the dimensionality scaling challenges that SKI faces when interpolating from a dense and static lattice while retaining the flexibility of variational methods to adapt inducing points to the dataset. We demonstrate the effectiveness of SoftKI across various examples and show that it is competitive with other approximated GP methods when the data dimensionality is modest (around 10).
Related papers
- Vecchia-Inducing-Points Full-Scale Approximations for Gaussian Processes [9.913418444556486]
We propose Vecchia-inducing-points full-scale (VIF) approximations for Gaussian processes.<n>We show that VIF approximations are both computationally efficient as well as more accurate and numerically stable than state-of-the-art alternatives.<n>All methods are implemented in the open source C++ library GPBoost with high-level Python and R interfaces.
arXiv Detail & Related papers (2025-07-07T14:49:06Z) - Scaling Gaussian Process Regression with Full Derivative Observations [0.951828574518325]
We present a scalable Gaussian Process (GP) method that can fit and predict full derivative observations called DSoftKI.<n>DSoftKI extends SoftKI, a method that approximates a kernel via softmax from learned point locations, to the setting with derivatives.<n>We evaluate DSoftKI on a synthetic function benchmark and high-dimensional molecular force field prediction (100-1000 dimensions)
arXiv Detail & Related papers (2025-05-14T04:35:26Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized server (DFL) eliminates reliance on client-client architecture.<n>Non-smooth regularization is often incorporated into machine learning tasks.<n>We propose a novel novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - Adaptive Consensus Gradients Aggregation for Scaled Distributed Training [6.234802839923543]
We analyze the distributed gradient aggregation process through the lens of subspace optimization.
Our method demonstrates improved performance over the ubiquitous averaging on multiple tasks while remaining extremely efficient in both communicational and computational complexity.
arXiv Detail & Related papers (2024-11-06T08:16:39Z) - Large-scale magnetic field maps using structured kernel interpolation
for Gaussian process regression [0.9208007322096532]
We present a mapping algorithm to compute large-scale magnetic field maps in indoor environments.
In our simulations, we show that our method achieves better accuracy than current state-of-the-art methods on magnetic field maps with a growing mapping area.
arXiv Detail & Related papers (2023-10-25T11:58:18Z) - SIGMA: Scale-Invariant Global Sparse Shape Matching [50.385414715675076]
We propose a novel mixed-integer programming (MIP) formulation for generating precise sparse correspondences for non-rigid shapes.
We show state-of-the-art results for sparse non-rigid matching on several challenging 3D datasets.
arXiv Detail & Related papers (2023-08-16T14:25:30Z) - Constrained Optimization via Exact Augmented Lagrangian and Randomized
Iterative Sketching [55.28394191394675]
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonIBS optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LVM, and a PDE-constrained problem.
arXiv Detail & Related papers (2023-05-28T06:33:37Z) - Interactive Segmentation as Gaussian Process Classification [58.44673380545409]
Click-based interactive segmentation (IS) aims to extract the target objects under user interaction.
Most of the current deep learning (DL)-based methods mainly follow the general pipelines of semantic segmentation.
We propose to formulate the IS task as a Gaussian process (GP)-based pixel-wise binary classification model on each image.
arXiv Detail & Related papers (2023-02-28T14:01:01Z) - Inverse Kernel Decomposition [3.066967635405937]
We propose a novel nonlinear dimensionality reduction method -- Inverse Kernel Decomposition (IKD)
IKD is based on an eigen-decomposition of the sample covariance matrix of data.
We use synthetic datasets and four real-world datasets to show that IKD is a better dimensionality reduction method than other eigen-decomposition-based methods.
arXiv Detail & Related papers (2022-11-11T02:14:29Z) - Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z) - Exact Gaussian Processes for Massive Datasets via Non-Stationary
Sparsity-Discovering Kernels [0.0]
We propose to take advantage of naturally-structured sparsity by allowing the kernel to discover -- instead of induce -- sparse structure.
The principle of ultra-flexible, compactly-supported, and non-stationary kernels, combined with HPC and constrained optimization, lets us scale exact GPs well beyond 5 million data points.
arXiv Detail & Related papers (2022-05-18T16:56:53Z) - Robust Training in High Dimensions via Block Coordinate Geometric Median
Descent [69.47594803719333]
Geometric median (textGm) is a classical method in statistics for achieving a robust estimation of the uncorrupted data.
In this paper, we show that by that by applying textscGm to only a chosen block of coordinates at a time, one can retain a breakdown point of 0.5 judiciously for smooth nontext problems.
arXiv Detail & Related papers (2021-06-16T15:55:50Z) - SKIing on Simplices: Kernel Interpolation on the Permutohedral Lattice
for Scalable Gaussian Processes [39.821400341226315]
Structured Kernel Interpolation (SKI) framework is used to perform efficient matrix vector multiplies (MVMs) on a grid.
We develop a connection between SKI and the permutohedral lattice used for high-dimensional fast bilateral filtering.
Using a sparse simplicial grid instead of a dense rectangular one, we can perform GP inference exponentially faster in the dimension than SKI.
We additionally provide a implementation of Simplex-GP, which enables significant acceleration of MVM based inference.
arXiv Detail & Related papers (2021-06-12T06:04:56Z) - Scalable Variational Gaussian Processes via Harmonic Kernel
Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of fully-connected ReLU network.
We show that dimension of the resulting features is much smaller than other baseline feature map constructions to achieve comparable error bounds both in theory and practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - Faster Kernel Interpolation for Gaussian Processes [30.04235162264955]
Key challenge in scaling Process (GP) regression to massive datasets is that exact inference requires a dense n x n kernel matrix.
Structured kernel (SKI) is among the most scalable methods.
We show that SKI can be reduced to O(m log m) after a single O(n) time precomputation step.
We demonstrate speedups in practice for a wide range of m and n and apply the method to GP inference on a three-dimensional weather radar dataset with over 100 million points.
arXiv Detail & Related papers (2021-01-28T00:09:22Z) - Fast Gravitational Approach for Rigid Point Set Registration with
Ordinary Differential Equations [79.71184760864507]
This article introduces a new physics-based method for rigid point set alignment called Fast Gravitational Approach (FGA)
In FGA, the source and target point sets are interpreted as rigid particle swarms with masses interacting in a globally multiply-linked manner while moving in a simulated gravitational force field.
We show that the new method class has characteristics not found in previous alignment methods.
arXiv Detail & Related papers (2020-09-28T15:05:39Z) - Sparse Gaussian Processes via Parametric Families of Compactly-supported
Kernels [0.6091702876917279]
We propose a method for deriving parametric families of kernel functions with compact support.
The parameters of this family of kernels can be learned from data using maximum likelihood estimation.
We show that these approximations incur minimal error over the exact models when modeling data drawn directly from a target GP.
arXiv Detail & Related papers (2020-06-05T20:44:09Z) - Improved guarantees and a multiple-descent curve for Column Subset
Selection and the Nystr\"om method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
arXiv Detail & Related papers (2020-02-21T00:43:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.