Non-asymptotic spectral bounds on the $\varepsilon$-entropy of kernel classes
- URL: http://arxiv.org/abs/2204.04512v2
- Date: Mon, 30 Dec 2024 17:41:16 GMT
- Title: Non-asymptotic spectral bounds on the $\varepsilon$-entropy of kernel classes
- Authors: Rustem Takhanov,
- Abstract summary: This topic is an important direction in the modern statistical theory of kernel-based methods.
We discuss a number of consequences of our bounds and show that they are substantially tighter than bounds for general kernels.
- Score: 4.178980693837599
- License:
- Abstract: Let $K: \boldsymbol{\Omega}\times \boldsymbol{\Omega}$ be a continuous Mercer kernel defined on a compact subset of ${\mathbb R}^n$ and $\mathcal{H}_K$ be the reproducing kernel Hilbert space (RKHS) associated with $K$. Given a finite measure $\nu$ on $\boldsymbol{\Omega}$, we investigate upper and lower bounds on the $\varepsilon$-entropy of the unit ball of $\mathcal{H}_K$ in the space $L_p(\nu)$. This topic is an important direction in the modern statistical theory of kernel-based methods. We prove sharp upper and lower bounds for $p\in [1,+\infty]$. For $p\in [1,2]$, the upper bounds are determined solely by the eigenvalue behaviour of the corresponding integral operator $\phi\to \int_{\boldsymbol{\Omega}} K(\cdot,{\mathbf y})\phi({\mathbf y})d\nu({\mathbf y})$. In constrast, for $p>2$, the bounds additionally depend on the convergence rate of the truncated Mercer series to the kernel $K$ in the $L_p(\nu)$-norm. We discuss a number of consequences of our bounds and show that they are substantially tighter than previous bounds for general kernels. Furthermore, for specific cases, such as zonal kernels and the Gaussian kernel on a box, our bounds are asymptotically tight as $\varepsilon\to +0$.
Related papers
- Sparsifying Suprema of Gaussian Processes [6.638504164134713]
We show that there is an $O_varepsilon(1)$-size subset $S subseteq T$ and a set of real values $c_s_s in S$.
We also use our sparsification result for suprema of centered Gaussian processes to give a sparsification lemma for convex sets of bounded geometric width.
arXiv Detail & Related papers (2024-11-22T01:43:58Z) - The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $Omega(frac1kcdot n2log|mathbbF|)$.
As an application, we obtain an $Omega(frac1kcdot n2log|mathbbF|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - Dimension Independent Disentanglers from Unentanglement and Applications [55.86191108738564]
We construct a dimension-independent k-partite disentangler (like) channel from bipartite unentangled input.
We show that to capture NEXP, it suffices to have unentangled proofs of the form $| psi rangle = sqrta | sqrt1-a | psi_+ rangle where $| psi_+ rangle has non-negative amplitudes.
arXiv Detail & Related papers (2024-02-23T12:22:03Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $mathbfx*$ rather than for all $mathbfx*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - For Kernel Range Spaces a Constant Number of Queries Are Sufficient [13.200502573462712]
A kernel range space concerns a set of points $X subset mathbbRd$ and the space of all queries by a fixed kernel.
Anvarepsilon$-cover is a subset of points $Q subset mathbbRd$ for any $p in mathbbRd$ that $frac1n |R_p - R_q|leq varepsilon$ for some $q in Q$.
arXiv Detail & Related papers (2023-06-28T19:19:33Z) - On the speed of uniform convergence in Mercer's theorem [6.028247638616059]
A continuous positive definite kernel $K(mathbf x, mathbf y)$ on a compact set can be represented as $sum_i=1infty lambda_iphi_i(mathbf x)phi_i(mathbf y)$ where $(lambda_i,phi_i)$ are eigenvalue-eigenvector pairs of the corresponding integral operator.
We estimate the speed of this convergence in terms of the decay rate of eigenvalues and demonstrate that for $3m
arXiv Detail & Related papers (2022-05-01T15:07:57Z) - On the Self-Penalization Phenomenon in Feature Selection [69.16452769334367]
We describe an implicit sparsity-inducing mechanism based on over a family of kernels.
As an application, we use this sparsity-inducing mechanism to build algorithms consistent for feature selection.
arXiv Detail & Related papers (2021-10-12T09:36:41Z) - Kernel Thinning [26.25415159542831]
kernel thinning is a new procedure for compressing a distribution $mathbbP$ more effectively than i.i.d. sampling or standard thinning.
We derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Mat'ern, and B-spline kernels.
arXiv Detail & Related papers (2021-05-12T17:56:42Z) - Linear Bandits on Uniformly Convex Sets [88.3673525964507]
Linear bandit algorithms yield $tildemathcalO(nsqrtT)$ pseudo-regret bounds on compact convex action sets.
Two types of structural assumptions lead to better pseudo-regret bounds.
arXiv Detail & Related papers (2021-03-10T07:33:03Z) - Optimal Coreset for Gaussian Kernel Density Estimation [0.8376091455761259]
Given a point set $Psubset mathbbRd$, the kernel density estimate of $P$ is defined as [ overlinemathcalG_P(x) = frac1left|Pright|sum_pin Pe-leftlVert x-p rightrVert2 ] for any $xinmathbbRd$.
We study how to construct a small subset $Q$ of $P
arXiv Detail & Related papers (2020-07-15T22:58:50Z) - On the Complexity of Minimizing Convex Finite Sums Without Using the
Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n2)$-upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both emphcompatible with oracles, and achieves complexity bounds of $tildeO(n2+nsqrtL/mu)log (1/epsilon)$ and $O(nsqrtL/epsilon)$, for $mu>0$ and $mu=0$
arXiv Detail & Related papers (2020-02-09T03:39:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.