The Geometry of the Pivot: A Note on Lazy Pivoted Cholesky and Farthest Point Sampling
- URL: http://arxiv.org/abs/2601.03706v2
- Date: Thu, 08 Jan 2026 12:49:45 GMT
- Title: The Geometry of the Pivot: A Note on Lazy Pivoted Cholesky and Farthest Point Sampling
- Authors: Gil Shabat
- Abstract summary: The Pivoted Cholesky decomposition is a standard tool for this task. We elucidate the geometric interpretation of the algorithm within the Reproducing Kernel Hilbert Space. We provide a concise derivation and a minimalist Python implementation to bridge the gap between theory and practice.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-rank approximations of large kernel matrices are ubiquitous in machine learning, particularly for scaling Gaussian Processes to massive datasets. The Pivoted Cholesky decomposition is a standard tool for this task, offering a computationally efficient, greedy low-rank approximation. While its algebraic properties are well-documented in numerical linear algebra, its geometric intuition within the context of kernel methods often remains obscure. In this note, we elucidate the geometric interpretation of the algorithm within the Reproducing Kernel Hilbert Space (RKHS). We demonstrate that the pivotal selection step is mathematically equivalent to Farthest Point Sampling (FPS) using the kernel metric, and that the Cholesky factor construction is an implicit Gram-Schmidt orthogonalization. We provide a concise derivation and a minimalist Python implementation to bridge the gap between theory and practice.
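The equivalence described in the abstract is easy to make concrete. Below is a minimal Python sketch of a lazy pivoted Cholesky factorization in the spirit of the note, not the paper's own code: the RBF kernel, function names, and parameters are illustrative choices. "Lazy" means only the pivot columns of the kernel matrix are ever formed. The residual diagonal `d` equals each point's squared RKHS distance to the span of the selected pivots, so the argmax pivot rule is farthest point sampling in the kernel metric, and each new column is an implicit Gram-Schmidt step.

```python
import numpy as np

def rbf_column(X, i, gamma=1.0):
    """Column i of the RBF kernel matrix, computed on demand (lazily)."""
    d2 = ((X - X[i]) ** 2).sum(axis=1)
    return np.exp(-gamma * d2)

def lazy_pivoted_cholesky(X, rank, tol=1e-10, gamma=1.0):
    n = X.shape[0]
    d = np.ones(n)                # residual diagonal; k(x, x) = 1 for RBF
    L = np.zeros((n, rank))       # factor with K ~= L @ L.T
    pivots = []
    for j in range(rank):
        i = int(np.argmax(d))     # FPS step: farthest point in the kernel metric
        if d[i] <= tol:
            return L[:, :j], pivots
        pivots.append(i)
        # Gram-Schmidt: residual of column i after projecting out prior columns
        col = rbf_column(X, i, gamma) - L[:, :j] @ L[i, :j]
        L[:, j] = col / np.sqrt(d[i])
        d -= L[:, j] ** 2         # update squared RKHS distances to the span
        d[i] = 0.0                # guard against round-off at the pivot
    return L, pivots

# Usage: rank-10 approximation without forming the full 500 x 500 kernel matrix
X = np.random.default_rng(0).normal(size=(500, 3))
L, piv = lazy_pivoted_cholesky(X, rank=10)
```

The returned `piv` is the FPS ordering under the kernel metric, and `L @ L.T` coincides with the Nyström approximation built from those pivot columns.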
Related papers
- Classifiers in High Dimensional Hilbert Metrics [3.2990108763152164]
Classifying points in high dimensional spaces is a fundamental geometric problem in machine learning. We first present an efficient LP-based algorithm in the Hilbert metric for the large-margin SVM problem. We also present efficient algorithms for the soft-margin SVM problem and for nearest neighbor-based classification in the Hilbert metric.
arXiv Detail & Related papers (2026-01-19T21:42:02Z)
- Tensor Sketch: Fast and Scalable Polynomial Kernel Approximation [19.363672064425504]
Approximation of non-linear kernels using random feature maps has become a powerful technique for scaling kernel methods to large datasets. We provide theoretical guarantees on the approximation error, ensuring the fidelity of the resulting kernel function estimates.
arXiv Detail & Related papers (2025-05-13T00:47:17Z)
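For context on the entry above: the standard construction behind fast polynomial kernel approximation with random feature maps is TensorSketch (Pham and Pagh, 2013), which combines CountSketches in the Fourier domain. The sketch below is a generic textbook version under that assumption, not code from the paper; the sketch size `m` and the sanity check are illustrative.

```python
import numpy as np

def tensor_sketch(X, degree=2, m=4096, seed=0):
    """TensorSketch features z(x) with <z(x), z(y)> ~= <x, y> ** degree.

    Each of the `degree` CountSketches hashes coordinates into m signed
    buckets; multiplying the sketches in the Fourier domain implicitly
    sketches the degree-fold tensor product of the input.
    """
    n, d = X.shape
    rng = np.random.default_rng(seed)
    prod = np.ones((n, m), dtype=complex)
    for _ in range(degree):
        h = rng.integers(0, m, size=d)       # hash bucket per coordinate
        s = rng.choice([-1.0, 1.0], size=d)  # random sign per coordinate
        cs = np.zeros((n, m))
        np.add.at(cs.T, h, (X * s).T)        # CountSketch: signed bucket sums
        prod *= np.fft.fft(cs, axis=1)
    return np.real(np.fft.ifft(prod, axis=1))

# Sanity check: sketch inner products approximate the polynomial kernel
X = np.random.default_rng(1).normal(size=(4, 20))
Z = tensor_sketch(X, degree=2)
err = np.abs(Z @ Z.T - (X @ X.T) ** 2).max() / np.abs((X @ X.T) ** 2).max()
print(f"max relative error: {err:.3f}")   # small for large enough m
```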
- Contraction rates for conjugate gradient and Lanczos approximate posteriors in Gaussian process regression [0.0]
We analyze a class of recently proposed approximation algorithms from the field of probabilistic numerics.
We combine results from the numerical analysis literature with state-of-the-art concentration results for spectra of kernel matrices to obtain minimax contraction rates.
arXiv Detail & Related papers (2024-06-18T14:50:42Z)
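The approximate posteriors whose contraction rates are analyzed above come from truncating an iterative solver on the kernel system. A minimal sketch of a conjugate-gradient GP posterior mean, assuming an RBF kernel and SciPy's `cg`; the kernel, noise level, and iteration cap are illustrative, not the paper's setting.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def cg_posterior_mean(K, y, noise=1e-2, maxiter=50):
    """Approximate GP posterior mean weights via conjugate gradients.

    Solves (K + noise * I) alpha = y iteratively; stopping CG after
    `maxiter` steps yields the kind of approximate posterior whose
    contraction the paper studies.
    """
    n = K.shape[0]
    A = LinearOperator((n, n), matvec=lambda v: K @ v + noise * v)
    alpha, info = cg(A, y, maxiter=maxiter)
    return alpha

# Toy usage with an RBF kernel (illustrative choice)
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(200, 1))
K = np.exp(-((X - X.T) ** 2))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=200)
alpha = cg_posterior_mean(K, y)
mean_at_train = K @ alpha      # approximate posterior mean at the inputs
```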
- Sketching the Heat Kernel: Using Gaussian Processes to Embed Data [4.220336689294244]
We introduce a novel, non-deterministic method for embedding data in low-dimensional Euclidean space based on realizations of a Gaussian process depending on the geometry of the data.
Our method demonstrates further advantage in its robustness to outliers.
arXiv Detail & Related papers (2024-03-01T22:56:19Z)
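A toy rendition of the idea in the entry above: draw a few independent realizations of a Gaussian process whose covariance reflects the data geometry and use them as embedding coordinates. The RBF covariance below is a stand-in for the heat kernel used in the paper; all names and sizes are illustrative.

```python
import numpy as np

def gp_embedding(K, dim=2, seed=0):
    """Embed points via realizations of a Gaussian process with covariance K.

    Each independent draw f ~ N(0, K) assigns one coordinate to every data
    point; stacking `dim` draws gives a random Euclidean embedding whose
    geometry is inherited from the kernel.
    """
    rng = np.random.default_rng(seed)
    w, V = np.linalg.eigh(K)                  # K = V diag(w) V.T
    w = np.clip(w, 0.0, None)                 # clip tiny negative eigenvalues
    Z = rng.normal(size=(K.shape[0], dim))
    return V @ (np.sqrt(w)[:, None] * Z)      # columns are N(0, K) draws

# Stand-in covariance: an RBF kernel on the data points
X = np.random.default_rng(1).normal(size=(100, 5))
K = np.exp(-0.5 * ((X[:, None] - X[None]) ** 2).sum(-1))
Y = gp_embedding(K, dim=2)    # 2-D coordinates, e.g. for visualization
```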
- Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We give a polynomial-time algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain an algorithm that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z)
- Sparse Cholesky Factorization for Solving Nonlinear PDEs via Gaussian Processes [3.750429354590631]
We present a sparse Cholesky factorization algorithm for dense kernel matrices.
We numerically illustrate our algorithm's near-linear space/time complexity for a broad class of nonlinear PDEs.
arXiv Detail & Related papers (2023-04-03T18:35:28Z)
- Isotropic Gaussian Processes on Finite Spaces of Graphs [71.26737403006778]
We propose a principled way to define Gaussian process priors on various sets of unweighted graphs.
We go further to consider sets of equivalence classes of unweighted graphs and define the appropriate versions of priors thereon.
Inspired by applications in chemistry, we illustrate the proposed techniques on a real molecular property prediction task in the small data regime.
arXiv Detail & Related papers (2022-11-03T10:18:17Z)
- Geometry-aware Bayesian Optimization in Robotics using Riemannian Matérn Kernels [64.62221198500467]
We show how to implement geometry-aware kernels for Bayesian optimization.
This technique can be used for control parameter tuning, parametric policy adaptation, and structure design in robotics.
arXiv Detail & Related papers (2021-11-02T09:47:22Z)
- Hybrid Trilinear and Bilinear Programming for Aligning Partially Overlapping Point Sets [85.71360365315128]
In many applications, we need algorithms which can align partially overlapping point sets and are invariant to the corresponding transformations.
We first show that the objective is a cubic polynomial function. We then utilize the convex envelopes of trilinear and bilinear monomials to derive its lower bound.
We next develop a branch-and-bound (BnB) algorithm which only branches over the transformation variables and runs efficiently.
arXiv Detail & Related papers (2021-01-19T04:24:23Z)
- Online Dense Subgraph Discovery via Blurred-Graph Feedback [87.9850024070244]
We introduce a novel learning problem for dense subgraph discovery.
We first propose an algorithm that obtains a nearly-optimal solution with high probability.
We then design a more scalable algorithm with a theoretical guarantee.
arXiv Detail & Related papers (2020-06-24T11:37:33Z)
- Fourier Sparse Leverage Scores and Approximate Kernel Learning [29.104055676527913]
We prove new explicit upper bounds on the leverage scores of Fourier functions under both the Gaussian and Laplace measures.
Our bounds are motivated by two important applications in machine learning.
arXiv Detail & Related papers (2020-06-12T17:25:39Z)
- Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
arXiv Detail & Related papers (2020-02-21T00:43:06Z)
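This last entry connects back to the main note: pivoted Cholesky is a greedy column subset selection, and its low-rank factor coincides with the Nyström approximation built from the pivot columns. A minimal Nyström sketch with uniform landmark sampling and an RBF kernel, purely illustrative and not the paper's improved selection scheme; the loop varies the RBF parameter, echoing the experiment mentioned above.

```python
import numpy as np

def nystrom(X, k, gamma=1.0, seed=0):
    """Rank-k Nystrom approximation of an RBF kernel matrix: K ~= F @ F.T.

    Landmarks are sampled uniformly at random; the paper's improved
    guarantees concern smarter column subset selection.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=k, replace=False)
    d2 = ((X[:, None, :] - X[None, idx, :]) ** 2).sum(-1)
    C = np.exp(-gamma * d2)              # n x k block of kernel columns
    W = C[idx]                           # k x k landmark block
    w, V = np.linalg.eigh(W)
    w = np.clip(w, 1e-12, None)          # regularize for the inverse root
    return C @ (V / np.sqrt(w)) @ V.T    # F = C @ W^{-1/2}

# Varying the RBF parameter changes the effective rank of K
X = np.random.default_rng(1).normal(size=(300, 5))
for gamma in (0.01, 0.1, 1.0):
    K = np.exp(-gamma * ((X[:, None] - X[None]) ** 2).sum(-1))
    F = nystrom(X, k=30, gamma=gamma)
    print(gamma, np.linalg.norm(K - F @ F.T) / np.linalg.norm(K))
```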
This list is automatically generated from the titles and abstracts of the papers in this site.