Fast Spectrum Estimation of Some Kernel Matrices
- URL: http://arxiv.org/abs/2411.00657v1
- Date: Fri, 01 Nov 2024 15:19:54 GMT
- Title: Fast Spectrum Estimation of Some Kernel Matrices
- Authors: Mikhail Lepilov
- Abstract summary: We introduce a new eigenvalue quantile estimation framework for some kernel matrices.
This framework gives meaningful bounds for all the eigenvalues of a kernel matrix while avoiding the cost of constructing the full matrix.
We prove the efficacy of this framework given certain bounds on the kernel function, and we provide empirical evidence for its accuracy.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In data science, individual observations are often assumed to come independently from an underlying probability space. Kernel matrices formed from large sets of such observations arise frequently, for example during classification tasks. It is desirable to know the eigenvalue decay properties of these matrices without explicitly forming them, such as when determining if a low-rank approximation is feasible. In this work, we introduce a new eigenvalue quantile estimation framework for some kernel matrices. This framework gives meaningful bounds for all the eigenvalues of a kernel matrix while avoiding the cost of constructing the full matrix. The kernel matrices under consideration come from a kernel with quick decay away from the diagonal applied to uniformly-distributed sets of points in Euclidean space of any dimension. We prove the efficacy of this framework given certain bounds on the kernel function, and we provide empirical evidence for its accuracy. In the process, we also prove a very general interlacing-type theorem for finite sets of numbers. Additionally, we indicate an application of this framework to the study of the intrinsic dimension of data, as well as several other directions in which to generalize this work.
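As a minimal illustration of the setting (not the authors' estimator), the sketch below compares the top eigenvalues of a Gaussian-kernel matrix on uniformly distributed points with those of a much smaller principal submatrix; the sizes, bandwidth, and rescaling heuristic are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(X, Y, h=0.5):
    """Dense Gaussian kernel matrix K_ij = exp(-|x_i - y_j|^2 / (2 h^2))."""
    sq = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2 * h**2))

n, m, d = 2000, 200, 2               # full size, submatrix size, dimension
X = rng.uniform(size=(n, d))         # uniformly distributed points, as in the paper

K = gaussian_kernel(X, X)
S = X[rng.choice(n, size=m, replace=False)]
K_sub = gaussian_kernel(S, S)

# Integral-operator heuristic: the spectrum of a uniformly sampled m x m
# principal submatrix, rescaled by n/m, tracks the top of the full spectrum,
# which is the kind of statement the paper makes rigorous for all quantiles.
print(np.linalg.eigvalsh(K)[-5:])
print((n / m) * np.linalg.eigvalsh(K_sub)[-5:])
```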
Related papers
- Spectral Estimation with Free Decompression [47.81955761814048]
We introduce a novel method of "free decompression" to estimate the spectrum of very large (impalpable) matrices.
The method extrapolates from the empirical spectral densities of small submatrices to infer the eigenspectrum of matrices too large to form explicitly.
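Free decompression itself relies on free-probability machinery, but the premise it builds on is easy to see numerically: empirical spectral densities of random principal submatrices stabilize under the right rescaling. A hedged sketch with a Wigner matrix (all sizes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2000
A = rng.standard_normal((N, N))
W = (A + A.T) / np.sqrt(2 * N)       # Wigner matrix; semicircle spectrum on [-2, 2]

# Empirical spectral densities of random principal submatrices: after the
# standard sqrt(N/n) rescaling they all approach the same limiting histogram,
# which is what lets one extrapolate from small submatrices to the full matrix.
for n in (125, 500, 2000):
    idx = rng.choice(N, size=n, replace=False)
    evals = np.sqrt(N / n) * np.linalg.eigvalsh(W[np.ix_(idx, idx)])
    hist, _ = np.histogram(evals, bins=16, range=(-2.5, 2.5), density=True)
    print(n, np.round(hist, 2))
```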
arXiv Detail & Related papers (2025-06-13T17:49:25Z)
- Determinant Estimation under Memory Constraints and Neural Scaling Laws [48.68885778257016]
We derive a novel hierarchical algorithm for large-scale log-determinant calculation in memory-constrained settings.
We show that the ratio of pseudo-determinants satisfies a power-law relationship, allowing us to derive corresponding scaling laws.
This enables accurate estimation of NTK log-determinants from a tiny fraction of the full dataset.
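The paper's hierarchical algorithm is more involved, but the identity such schemes rest on is the block factorization of a determinant through a Schur complement. A minimal sketch, assuming a symmetric positive definite input:

```python
import numpy as np

def blockwise_logdet(K, block=256):
    """log det of an SPD matrix via repeated Schur complements.

    Peels off one leading block at a time, using
      log det K = log det K11 + log det (K22 - K21 K11^{-1} K12),
    so only one block row/column needs to be factorized per step.
    """
    total = 0.0
    while K.shape[0] > block:
        K11, K12 = K[:block, :block], K[:block, block:]
        K21, K22 = K[block:, :block], K[block:, block:]
        total += np.linalg.slogdet(K11)[1]
        K = K22 - K21 @ np.linalg.solve(K11, K12)   # Schur complement
    return total + np.linalg.slogdet(K)[1]

rng = np.random.default_rng(2)
A = rng.standard_normal((1000, 1000))
K = A @ A.T + 1000 * np.eye(1000)                   # well-conditioned SPD test matrix
print(blockwise_logdet(K), np.linalg.slogdet(K)[1])  # the two values should agree
```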
arXiv Detail & Related papers (2025-03-06T13:32:13Z)
- Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry [63.694184882697435]
Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations.
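The normalizations under study are matrix functions of the pooled covariance, applied through its eigendecomposition. A minimal sketch of matrix square-root and matrix-logarithm normalization (the feature shapes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
F = rng.standard_normal((64, 196))       # hypothetical features: 64 channels, 196 positions
C = np.cov(F) + 1e-5 * np.eye(64)        # pooled channel covariance (SPD after jitter)

# Matrix function normalization: apply a scalar function to the eigenvalues.
evals, evecs = np.linalg.eigh(C)
C_sqrt = (evecs * np.sqrt(evals)) @ evecs.T   # matrix square-root normalization
C_log = (evecs * np.log(evals)) @ evecs.T     # matrix logarithm normalization
print(np.allclose(C_sqrt @ C_sqrt, C))        # True: a genuine matrix square root
```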
arXiv Detail & Related papers (2024-07-15T07:11:44Z)
- Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition.
A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues.
We validate our theory with an empirical study of a collection of synthetic and real-world datasets.
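A quick way to see the quantity being bounded: form the rank-r truncated eigendecomposition of a kernel matrix and measure the worst-case entrywise error. A sketch with an illustrative Gaussian kernel:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((500, 3))
sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
K = np.exp(-sq)                                  # Gaussian kernel matrix

evals, evecs = np.linalg.eigh(K)                 # eigenvalues in ascending order
for r in (10, 25, 50):
    V = evecs[:, -r:]                            # top-r eigenvectors
    K_r = V @ np.diag(evals[-r:]) @ V.T          # truncated eigendecomposition
    print(r, np.abs(K - K_r).max())              # entrywise (max-norm) error
```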
arXiv Detail & Related papers (2024-05-23T12:26:25Z)
- Higher-order topological kernels via quantum computation [68.8204255655161]
Topological data analysis (TDA) has emerged as a powerful tool for extracting meaningful insights from complex data.
We propose a quantum approach to defining Betti kernels, which is based on constructing Betti curves with increasing order.
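The quantum construction is beyond a few lines of code, but the classical object it encodes, a Betti curve, is simple at order zero: count connected components of the distance filtration as the scale grows. A hypothetical sketch using union-find (higher orders need a TDA library):

```python
import numpy as np

def betti0_curve(X, thresholds):
    """Betti-0 curve: number of connected components of the graph linking
    points at distance <= t, for each threshold t (given in ascending order)."""
    n = len(X)
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    i, j = np.triu_indices(n, k=1)
    edges = sorted(zip(dist[i, j], i, j))        # edges by increasing length

    parent = list(range(n))
    def find(a):                                 # union-find with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    curve, components, e = [], n, 0
    for t in thresholds:
        while e < len(edges) and edges[e][0] <= t:
            ra, rb = find(edges[e][1]), find(edges[e][2])
            if ra != rb:
                parent[ra] = rb
                components -= 1
            e += 1
        curve.append(components)
    return np.array(curve)

rng = np.random.default_rng(5)
X, Y = rng.uniform(size=(60, 2)), rng.uniform(size=(60, 2))
ts = np.linspace(0.0, 0.5, 26)
# A simple "Betti kernel" between two point clouds: an inner product of curves.
print(np.dot(betti0_curve(X, ts), betti0_curve(Y, ts)))
```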
arXiv Detail & Related papers (2023-07-14T14:48:52Z)
- Interpolation with the polynomial kernels [5.8720142291102135]
Polynomial kernels are widely used in machine learning and are one of the default choices for building kernel-based regression models.
They are, however, rarely used or even studied in numerical analysis, owing to their lack of strict positive definiteness.
This paper is devoted to establishing some initial results for the study of these kernels and their related algorithms.
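Because a polynomial kernel is only positive semi-definite, the interpolation matrix can be singular, so a least-squares solve is the defensive choice. A minimal sketch (degree, nodes, and target are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, size=(30, 2))             # interpolation nodes
f = np.sin(X[:, 0]) + X[:, 1] ** 2               # target values at the nodes

def poly_kernel(A, B, degree=3, c=1.0):
    """Polynomial kernel k(x, y) = (x.y + c)^degree: PSD but not strictly PD."""
    return (A @ B.T + c) ** degree

K = poly_kernel(X, X)
# K is rank-deficient here (its rank is at most the dimension of the degree-3
# polynomial feature space, 10 < 30), so use least squares, not a direct solve.
alpha, *_ = np.linalg.lstsq(K, f, rcond=None)

X_test = rng.uniform(-1, 1, size=(5, 2))
print(poly_kernel(X_test, X) @ alpha)            # interpolant at new points
```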
arXiv Detail & Related papers (2022-05-12T18:50:21Z)
- An Equivalence Principle for the Spectrum of Random Inner-Product Kernel Matrices with Polynomial Scalings [21.727073594338297]
This study is motivated by applications in machine learning and statistics.
We establish the weak limit of the empirical eigenvalue distribution of these random matrices in a polynomial scaling regime.
The limiting law can be characterized as the free additive convolution of a Marchenko-Pastur law and a semicircle law.
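As a hedged numerical peek at the matrix model (the exact normalization and limit depend on the scaling regime identified in the paper), the following sketch builds an inner-product kernel random matrix and its empirical spectral density:

```python
import numpy as np

rng = np.random.default_rng(7)
d, n = 100, 1000                      # a polynomial scaling, here n ~ d^1.5
X = rng.standard_normal((n, d))

# Inner-product kernel random matrix: a nonlinearity applied entrywise to the
# rescaled Gram entries, with a common 1/sqrt(n) normalization, zero diagonal.
K = np.tanh(X @ X.T / np.sqrt(d)) / np.sqrt(n)
np.fill_diagonal(K, 0.0)

hist, _ = np.histogram(np.linalg.eigvalsh(K), bins=30, density=True)
print(np.round(hist, 2))              # empirical spectral density of the model
```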
arXiv Detail & Related papers (2022-12-15T08:30:23Z)
- Revisiting Memory Efficient Kernel Approximation: An Indefinite Learning Perspective [0.8594140167290097]
Matrix approximations are a key element in large-scale machine learning approaches.
We extend MEKA (Memory Efficient Kernel Approximation) to apply not only to shift-invariant kernels but also to non-stationary kernels.
We present a Lanczos-based estimation of a spectrum shift to develop a stable positive semi-definite MEKA approximation.
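The repair step described, estimating the most negative eigenvalue with a Lanczos solver and shifting the spectrum, can be sketched directly with scipy; this is a generic shift, not the MEKA-specific construction:

```python
import numpy as np
from scipy.sparse.linalg import eigsh

rng = np.random.default_rng(8)
A = rng.standard_normal((500, 500))
K = (A + A.T) / 2                                 # symmetric but indefinite

# Lanczos estimate of the smallest (algebraic) eigenvalue.
lam_min = eigsh(K, k=1, which="SA", return_eigenvectors=False)[0]

# Shift the spectrum so the approximation becomes positive semi-definite.
K_psd = K + max(0.0, -lam_min) * np.eye(K.shape[0])
print(lam_min, np.linalg.eigvalsh(K_psd).min())   # new minimum is ~0
```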
arXiv Detail & Related papers (2021-12-18T10:01:34Z)
- Learning in High-Dimensional Feature Spaces Using ANOVA-Based Fast Matrix-Vector Multiplication [0.0]
The kernel matrix is typically dense and large-scale. Depending on the dimension of the feature space, even computing all of its entries in reasonable time becomes a challenging task.
We propose the use of an ANOVA kernel, where we construct several kernels based on lower-dimensional feature spaces for which we provide fast algorithms realizing the matrix-vector products.
Based on a feature grouping approach, we then show how the fast matrix-vector products can be embedded into a learning method, choosing kernel ridge regression with a preconditioned conjugate gradient solver.
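The learning step is standard kernel ridge regression solved iteratively, where the solver only needs matrix-vector products. In the sketch below a dense product stands in for the paper's fast ANOVA-based multiplication:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(9)
X = rng.uniform(size=(400, 5))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(400)

sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
K = np.exp(-sq)                        # Gaussian kernel (dense stand-in here)
lam = 1e-3                             # ridge regularization

# CG only ever needs the product v -> (K + lam*I) v; in the paper, that product
# is supplied by the fast ANOVA-based multiplication instead of a dense matmul.
op = LinearOperator((400, 400), matvec=lambda v: K @ v + lam * v)
alpha, info = cg(op, y)
print(info, np.linalg.norm(K @ alpha + lam * alpha - y))   # info == 0: converged
```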
arXiv Detail & Related papers (2021-11-19T10:29:39Z)
- Robust 1-bit Compressive Sensing with Partial Gaussian Circulant Matrices and Generative Priors [54.936314353063494]
We provide recovery guarantees for a correlation-based optimization algorithm for robust 1-bit compressive sensing.
We make use of a practical iterative algorithm, and perform numerical experiments on image datasets to corroborate our results.
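The simplest correlation-based estimator in 1-bit compressive sensing recovers the direction of a signal from sign measurements alone. A minimal sketch under a Gaussian measurement matrix (not the paper's generative-prior algorithm, and all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(10)
n, m, s = 200, 5000, 5                # dimension, measurements, sparsity

x = np.zeros(n)
x[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
x /= np.linalg.norm(x)                # 1-bit data only determines the direction

A = rng.standard_normal((m, n))       # Gaussian measurement matrix
y = np.sign(A @ x)                    # 1-bit (sign-only) measurements

# Correlation estimator: A^T y is, in expectation, proportional to x.
x_hat = A.T @ y
x_hat /= np.linalg.norm(x_hat)
print(x @ x_hat)                      # cosine similarity, close to 1
```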
arXiv Detail & Related papers (2021-08-08T05:28:06Z)
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their means in an infinite-dimensional reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
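The finite-sample object behind these results is the empirical kernel mean embedding; the RKHS distance between two of them is the maximum mean discrepancy (MMD). A minimal sketch with a Gaussian kernel:

```python
import numpy as np

def gram(A, B, h=1.0):
    """Gaussian kernel Gram matrix between two samples."""
    sq = np.sum((A[:, None] - B[None, :]) ** 2, axis=-1)
    return np.exp(-sq / (2 * h**2))

def mmd2(X, Y, h=1.0):
    """Squared MMD = |mu_X - mu_Y|^2 in the RKHS, via empirical embeddings."""
    return gram(X, X, h).mean() - 2 * gram(X, Y, h).mean() + gram(Y, Y, h).mean()

rng = np.random.default_rng(11)
X = rng.normal(0.0, 1.0, size=(300, 1))
Y = rng.normal(0.5, 1.0, size=(300, 1))
print(mmd2(X, X[::-1]), mmd2(X, Y))   # exactly 0 vs clearly positive
```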
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
- High-Dimensional Gaussian Process Inference with Derivatives [90.8033626920884]
We show that in the low-data regime $N<D$, the Gram matrix can be decomposed in a manner that reduces the cost of inference to $\mathcal{O}(N^2D + (N^2)^3)$.
We demonstrate this potential in a variety of tasks relevant for machine learning, such as optimization and Hamiltonian Monte Carlo with predictive gradients.
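For intuition about the cost being reduced: with derivative observations, the joint RBF Gram matrix over values and gradients is N(D+1) x N(D+1), so naive inference scales cubically in both N and D. A sketch of building it from the standard closed-form derivative blocks (not the paper's decomposition):

```python
import numpy as np

def rbf_joint_gram(X, ell=1.0):
    """Joint Gram over function values and all partial derivatives for the
    RBF kernel k(x, x') = exp(-|x - x'|^2 / (2 ell^2)).

    Returns an N(D+1) x N(D+1) matrix; naive inference with it costs
    O((ND)^3), which the low-data decomposition avoids."""
    N, D = X.shape
    diff = (X[:, None, :] - X[None, :, :]) / ell**2           # (N, N, D)
    k = np.exp(-0.5 * ell**2 * np.sum(diff**2, axis=-1))      # (N, N)

    G = np.zeros((N * (D + 1), N * (D + 1)))
    G[:N, :N] = k
    for i in range(D):
        bi = slice(N * (i + 1), N * (i + 2))
        G[:N, bi] = diff[:, :, i] * k                         # cov(f, df/dx'_i)
        G[bi, :N] = -diff[:, :, i] * k                        # cov(df/dx_i, f)
        for j in range(D):
            bj = slice(N * (j + 1), N * (j + 2))
            G[bi, bj] = ((i == j) / ell**2
                         - diff[:, :, i] * diff[:, :, j]) * k  # mixed 2nd derivs
    return G

rng = np.random.default_rng(12)
G = rbf_joint_gram(rng.standard_normal((20, 4)))   # N=20 points, D=4
print(G.shape, np.linalg.eigvalsh(G).min())        # (100, 100), PSD up to rounding
```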
arXiv Detail & Related papers (2021-02-15T13:24:41Z)