Sub-quadratic Algorithms for Kernel Matrices via Kernel Density
Estimation
- URL: http://arxiv.org/abs/2212.00642v1
- Date: Thu, 1 Dec 2022 16:42:56 GMT
- Title: Sub-quadratic Algorithms for Kernel Matrices via Kernel Density
Estimation
- Authors: Ainesh Bakshi, Piotr Indyk, Praneeth Kacham, Sandeep Silwal and Samson
Zhou
- Abstract summary: We develop efficient reductions from $\textit{weighted edge sampling}$ on kernel graphs, $\textit{simulating random walks}$ on kernel graphs, and $\textit{weighted sampling}$ on matrices to Kernel Density Estimation.
Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest.
- Score: 24.166833799353476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Kernel matrices, as well as weighted graphs represented by them, are
ubiquitous objects in machine learning, statistics and other related fields.
The main drawback of using kernel methods (learning and inference using kernel
matrices) is efficiency -- given $n$ input points, most kernel-based algorithms
need to materialize the full $n \times n$ kernel matrix before performing any
subsequent computation, thus incurring $\Omega(n^2)$ runtime. Breaking this
quadratic barrier for various problems has therefore been the subject of
extensive research efforts.
We break the quadratic barrier and obtain $\textit{subquadratic}$ time
algorithms for several fundamental linear-algebraic and graph processing
primitives, including approximating the top eigenvalue and eigenvector,
spectral sparsification, solving linear systems, local clustering, low-rank
approximation, arboricity estimation and counting weighted triangles. We build
on the recent Kernel Density Estimation framework, which (after preprocessing
in time subquadratic in $n$) can return estimates of row/column sums of the
kernel matrix. In particular, we develop efficient reductions from
$\textit{weighted vertex}$ and $\textit{weighted edge sampling}$ on kernel
graphs, $\textit{simulating random walks}$ on kernel graphs, and
$\textit{importance sampling}$ on matrices to Kernel Density Estimation and
show that we can generate samples from these distributions in
$\textit{sublinear}$ (in the support of the distribution) time. Our reductions
are the central ingredient in each of our applications and we believe they may
be of independent interest. We empirically demonstrate the efficacy of our
algorithms on low-rank approximation (LRA) and spectral sparsification, where
we observe a $\textbf{9x}$ decrease in the number of kernel evaluations over
baselines for LRA and a $\textbf{41x}$ reduction in the graph size for spectral
sparsification.
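To make the sampling reductions concrete, here is a minimal Python/NumPy sketch. The exact `kde_oracle` below is a naive stand-in for the fast KDE data structures the paper builds on (which answer each row-sum query in time sublinear in $n$ after subquadratic preprocessing); the Gaussian kernel and the two-stage vertex-then-neighbor sampler are illustrative assumptions, not the paper's actual algorithms.

```python
import numpy as np

def kde_oracle(q, X, bandwidth=1.0):
    """Naive KDE 'oracle': returns the row sum sum_j k(q, x_j) exactly in
    O(n d) time. The paper instead assumes a fast KDE data structure whose
    queries run in sublinear time after subquadratic preprocessing."""
    return np.exp(-np.sum((X - q) ** 2, axis=1) / bandwidth ** 2).sum()

def weighted_vertex_sample(X, rng, bandwidth=1.0):
    """Sample vertex i with probability proportional to its kernel degree
    d_i = sum_j k(x_i, x_j), using one KDE (row-sum) query per point."""
    degrees = np.array([kde_oracle(x, X, bandwidth) for x in X])
    return rng.choice(len(X), p=degrees / degrees.sum())

def random_walk_step(i, X, rng, bandwidth=1.0):
    """One kernel-graph random-walk step from vertex i: pick j with
    probability k(x_i, x_j) / d_i by materializing row i of the kernel
    matrix."""
    row = np.exp(-np.sum((X - X[i]) ** 2, axis=1) / bandwidth ** 2)
    return rng.choice(len(X), p=row / row.sum())

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))   # 500 points in R^5
i = weighted_vertex_sample(X, rng)  # vertex drawn ~ kernel degree
j = random_walk_step(i, X, rng)     # one walk step yields a weighted edge (i, j)
print(f"sampled edge ({i}, {j})")
```

Note that even this crude walk step touches only one row of the kernel matrix, i.e. $O(n)$ work, which is already sublinear in the $n^2$-sized support of the edge distribution; the paper's reductions obtain such samples through KDE queries instead.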
Related papers
- Data-Driven Linear Complexity Low-Rank Approximation of General Kernel Matrices: A Geometric Approach [0.9453554184019107]
A kernel matrix may be defined as $K_{ij} = \kappa(x_i, y_j)$ where $\kappa(x, y)$ is a kernel function.
We seek a low-rank approximation to a kernel matrix where the sets of points $X$ and $Y$ are large.
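For concreteness, a minimal NumPy sketch of this setup; the Gaussian kernel is an assumed illustrative choice of $\kappa$, and the truncated SVD stands in as the textbook baseline rather than the cited paper's geometric method.

```python
import numpy as np

def kernel_matrix(X, Y, bandwidth=1.0):
    """K[i, j] = kappa(x_i, y_j) for a Gaussian kernel, one illustrative
    choice of kappa; the definition applies to any kernel function."""
    sq_dists = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=2)
    return np.exp(-sq_dists / bandwidth ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
Y = rng.standard_normal((300, 3))
K = kernel_matrix(X, Y)  # 200 x 300 kernel matrix

# Rank-r truncated SVD: the standard (cubic-cost) baseline that the cited
# paper's geometric approach aims to replace at linear complexity.
r = 10
U, s, Vt = np.linalg.svd(K, full_matrices=False)
K_r = (U[:, :r] * s[:r]) @ Vt[:r, :]
print(np.linalg.norm(K - K_r) / np.linalg.norm(K))  # relative error
```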
arXiv Detail & Related papers (2022-12-24T07:15:00Z)
- Scalable First-Order Bayesian Optimization via Structured Automatic Differentiation [4.061135251278187]
We show that a wide range of kernels gives rise to structured matrices, enabling an exact $\mathcal{O}(n^2 d)$ matrix-vector multiply for gradient observations and $\mathcal{O}(n^2 d^2)$ for Hessian observations.
Our methods apply to virtually all canonical kernels and automatically extend to complex kernels, like the neural network, radial basis function network, and spectral mixture kernels.
arXiv Detail & Related papers (2022-06-16T17:59:48Z)
- Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time [54.65688986250061]
We give an input sparsity time sampling algorithm for approximating the Gram matrix corresponding to the $q$-fold column-wise tensor product of $q$ matrices.
Our sampling technique relies on a collection of $q$ partially correlated random projections which can be simultaneously applied to a dataset $X$ in input-sparsity time.
arXiv Detail & Related papers (2022-02-09T15:26:03Z)
- Gaussian Process Bandit Optimization with Few Batches [49.896920704012395]
We introduce a batch algorithm inspired by finite-arm bandit algorithms.
We show that it achieves the cumulative regret upper bound $O^\ast(\sqrt{T\gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$.
In addition, we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches.
arXiv Detail & Related papers (2021-10-15T00:54:04Z)
- Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z)
- Two-way kernel matrix puncturing: towards resource-efficient PCA and spectral clustering [43.50783459690612]
The method consists in randomly "puncturing" both the data matrix $X \in \mathbb{C}^{p \times n}$ and its corresponding kernel (Gram) matrix $K$ through Bernoulli masks.
We empirically confirm on GAN-generated image databases that it is possible to drastically puncture the data, thereby providing possibly huge computational and storage gains.
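A minimal sketch of this two-way puncturing, under assumptions not in the summary: real-valued data in place of $\mathbb{C}^{p \times n}$, and an illustrative $1/\varepsilon$ rescaling convention.

```python
import numpy as np

def puncture(M, eps, rng):
    """Zero out each entry of M independently with probability 1 - eps
    (a Bernoulli mask), rescaling kept entries by 1/eps; the rescaling
    convention here is illustrative, not the paper's."""
    return M * (rng.random(M.shape) < eps) / eps

def puncture_symmetric(K, eps, rng):
    """Same, but with a symmetric mask so the punctured Gram matrix
    stays symmetric."""
    mask = rng.random(K.shape) < eps
    mask = np.triu(mask) | np.triu(mask, 1).T
    return K * mask / eps

rng = np.random.default_rng(0)
p, n, eps = 50, 400, 0.2
X = rng.standard_normal((p, n))            # real-valued stand-in for X in C^{p x n}
X_punct = puncture(X, eps, rng)            # (i) puncture the data matrix ...
K = X_punct.T @ X_punct / p                # ... and form its Gram matrix
K_punct = puncture_symmetric(K, eps, rng)  # (ii) puncture K itself

# Downstream PCA / spectral clustering would use the eigenvectors of K_punct.
print(np.linalg.eigvalsh(K_punct)[-5:])    # dominant eigenvalues
```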
arXiv Detail & Related papers (2021-02-24T14:01:58Z)
- Faster Kernel Matrix Algebra via Density Estimation [46.253698241653254]
We study fast algorithms for computing fundamental properties of a positive semidefinite kernel matrix $K \in \mathbb{R}^{n \times n}$ corresponding to $n$ points.
arXiv Detail & Related papers (2021-02-16T18:25:47Z)
- Quantum algorithms for spectral sums [50.045011844765185]
We propose new quantum algorithms for estimating spectral sums of positive semi-definite (PSD) matrices.
We show how the algorithms and techniques used in this work can be applied to three problems in spectral graph theory.
arXiv Detail & Related papers (2020-11-12T16:29:45Z)
- Learning Sparse Graph Laplacian with K Eigenvector Prior via Iterative GLASSO and Projection [58.5350491065936]
We consider a structural assumption on the graph Laplacian matrix $L$.
The first $K$ eigenvectors of $L$ are pre-selected, e.g., based on domain-specific criteria.
We design an efficient hybrid graphical lasso/projection algorithm to compute the most suitable graph Laplacian matrix $L^* \in H_u^+$ given $\bar{C}$.
arXiv Detail & Related papers (2020-10-25T18:12:50Z)
- Fourier Sparse Leverage Scores and Approximate Kernel Learning [29.104055676527913]
We prove new explicit upper bounds on the leverage scores of Fourier functions under both the Gaussian and Laplace measures.
Our bounds are motivated by two important applications in machine learning.
arXiv Detail & Related papers (2020-06-12T17:25:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.