Related papers: On the speed of uniform convergence in Mercer's theorem

URL: http://arxiv.org/abs/2205.00487v1
Date: Sun, 1 May 2022 15:07:57 GMT
Title: On the speed of uniform convergence in Mercer's theorem
Authors: Rustem Takhanov
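Abstract summary: A continuous positive definite kernel $K(\mathbf{x}, \mathbf{y})$ on a compact set can be represented as $\sum_{i=1}^\infty \lambda_i\phi_i(\mathbf{x})\phi_i(\mathbf{y})$, where $(\lambda_i,\phi_i)$ are eigenvalue-eigenvector pairs of the corresponding integral operator. We estimate the speed of this convergence in terms of the decay rate of eigenvalues and demonstrate that for $3m$ times differentiable kernels the first $N$ terms of the series approximate $K$ as $\mathcal{O}\big((\sum_{i=N+1}^\infty\lambda_i)^{\frac{m}{m+n}}\big)$ or $\mathcal{O}\big((\sum_{i=N+1}^\infty\lambda^2_i)^{\frac{m}{2m+n}}\big)$.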
Score: 6.028247638616059
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The classical Mercer's theorem claims that a continuous positive definite kernel $K({\mathbf x}, {\mathbf y})$ on a compact set can be represented as $\sum_{i=1}^\infty \lambda_i\phi_i({\mathbf x})\phi_i({\mathbf y})$ where $\{(\lambda_i,\phi_i)\}$ are eigenvalue-eigenvector pairs of the corresponding integral operator. This infinite representation is known to converge uniformly to the kernel $K$. We estimate the speed of this convergence in terms of the decay rate of eigenvalues and demonstrate that for $3m$ times differentiable kernels the first $N$ terms of the series approximate $K$ as $\mathcal{O}\big((\sum_{i=N+1}^\infty\lambda_i)^{\frac{m}{m+n}}\big)$ or $\mathcal{O}\big((\sum_{i=N+1}^\infty\lambda^2_i)^{\frac{m}{2m+n}}\big)$.
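
The truncation described in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's method: it assumes a Gaussian kernel $K(x, y) = e^{-(x-y)^2}$ on $[0, 1]$, a uniform grid, and a Nystrom-style discretisation of the integral operator. The function name `mercer_truncation_errors` and its parameters are hypothetical. It compares the uniform error of the first $N$ terms on the grid with the eigenvalue tail sum $\sum_{i=N+1}^\infty\lambda_i$ that appears in the bound.

```python
import numpy as np


def mercer_truncation_errors(n_grid=200, ranks=(1, 2, 3, 4)):
    """Return (N, sup_error, tail_sum) for a discretised Gaussian kernel on [0, 1]."""
    x = np.linspace(0.0, 1.0, n_grid)
    # Assumed example kernel (not from the paper): K(x, y) = exp(-(x - y)^2).
    K = np.exp(-(x[:, None] - x[None, :]) ** 2)

    # Nystrom-style discretisation of the integral operator, uniform weights 1/n.
    lam, U = np.linalg.eigh(K / n_grid)
    lam, U = lam[::-1], U[:, ::-1]      # eigenvalues in decreasing order
    phi = np.sqrt(n_grid) * U           # phi[:, i] approximates phi_i on the grid

    results = []
    for N in ranks:
        # First N terms of sum_i lam_i * phi_i(x) * phi_i(y), evaluated on the grid.
        K_N = (phi[:, :N] * lam[:N]) @ phi[:, :N].T
        sup_error = np.max(np.abs(K - K_N))   # uniform error over grid points
        tail_sum = lam[N:].sum()              # discrete analogue of sum_{i>N} lam_i
        results.append((N, sup_error, tail_sum))
    return results


if __name__ == "__main__":
    for N, sup_error, tail_sum in mercer_truncation_errors():
        print(f"N={N}  sup error={sup_error:.3e}  tail sum={tail_sum:.3e}")
```

In this discretised setting the average of the diagonal error equals the tail sum, so the uniform error on the grid is never smaller than the tail sum. The sketch only shows how the two quantities are computed side by side; it does not check the exponents $\frac{m}{m+n}$ or $\frac{m}{2m+n}$ from the abstract.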

Related papers

On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm [54.28350823319057]
This paper establishes the convergence rate $\frac{1}{K}\sum_{k=1}^{K}E\left[\|\nabla f(x^k)\|_1\right]\leq O(\frac{\sqrt{d}C}{K^{1/4}})$ for AdamW measured by the $\ell_1$ norm, where $K$ represents the iteration number, $d$ denotes the model dimension, and $C$ matches the constant in the optimal convergence rate of SGD.
arXiv Detail & Related papers (2025-05-17T05:02:52Z)
Infinite series involving special functions obtained using simple one-dimensional quantum mechanical problems [0.0]
In this paper, certain classes of infinite sums involving special functions are evaluated analytically. $L_\nu2n+1-\nu\left(\frac{\nu+2}{2};\frac{3}{2};\frac{1}{2}\right)$ is a generalized hypergeometric function; $L_\nu2n+1-\nu\left(\frac{\nu+2}{2};\frac{3}{2};\frac{1}{2}\right)$ is calculated for integer $\nu$.
arXiv Detail & Related papers (2024-11-15T11:51:36Z)
The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$. As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z)
Quantum Algorithms and Lower Bounds for Finite-Sum Optimization [22.076317220348145]
We give a quantum algorithm with complexity $\tilde{O}\big(n+\sqrt{d}+\sqrt{\ell/\mu}\big)$, improving the classical tight bound $\tilde{\Theta}\big(n+\sqrt{n\ell/\mu}\big)$. We also prove a quantum lower bound $\tilde{\Omega}(n+n^{3/4}(\ell/\mu)^{1/4})$ when $d$ is large enough.
arXiv Detail & Related papers (2024-06-05T07:13:52Z)
Provably learning a multi-head attention layer [55.2904547651831]
The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models. In this work, we initiate the study of provably learning a multi-head attention layer from random examples. We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z)
On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm [59.65871549878937]
This paper considers RMSProp and its momentum extension and establishes the convergence rate of $\frac{1}{T}\sum_{k=1}^{T}\dots$. Our convergence rate matches the lower bound with respect to all the coefficients except the dimension $d$. Our convergence rate can be considered to be analogous to the $\frac{1}{T}\sum_{k=1}^{T}\dots$.
arXiv Detail & Related papers (2024-02-01T07:21:32Z)
Spectral Statistics of the Sample Covariance Matrix for High Dimensional Linear Gaussians [12.524855369455421]
Performance of the ordinary least squares (OLS) method for the estimation of a high dimensional stable state transition matrix. The OLS estimator incurs a phase transition and becomes transient: increasing only worsens estimation error.
arXiv Detail & Related papers (2023-12-10T06:55:37Z)
A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously. Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples. We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
Classical shadows of fermions with particle number symmetry [0.0]
We provide an estimator for any $k$-RDM with $\mathcal{O}(k^2\eta)$ classical complexity. Our method, in the worst case of half-filling, still provides a factor of $4^k$ advantage in sample complexity.
arXiv Detail & Related papers (2022-08-18T17:11:12Z)
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations. The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbb{w})=C, \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z)
Non-asymptotic spectral bounds on the $\varepsilon$-entropy of kernel classes [4.178980693837599]
This topic is an important direction in the modern statistical theory of kernel-based methods. We discuss a number of consequences of our bounds and show that they are substantially tighter than bounds for general kernels.
arXiv Detail & Related papers (2022-04-09T16:45:22Z)
Topological entanglement and hyperbolic volume [1.1909611351044664]
Chern-Simons theory provides a setting to visualise the $m$-moment of the reduced density matrix as a three-manifold invariant $Z(M_{\mathcal{K}_m})$. For the SU(2) group, we show that $Z(M_{\mathcal{K}_m})$ can grow at mostly in $k$. We conjecture that $\ln Z(M_{\mathcal{K}_m})$ is the hyperbolic volume of the knot complement $S^3\backslash \mathcal{K}_m$.
arXiv Detail & Related papers (2021-06-07T07:51:03Z)
Kernel Thinning [26.25415159542831]
Kernel thinning is a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning. We derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels.
arXiv Detail & Related papers (2021-05-12T17:56:42Z)
Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist. We design an estimator which attains the smallest possible confidence interval as a function of $n,d,\delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z)
On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$-upper bound under the global oracle model. Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with oracles, and achieves complexity bounds of $\tilde{O}(n^2+n\sqrt{L/\mu})\log(1/\epsilon)$ and $O(n\sqrt{L/\epsilon})$, for $\mu>0$ and $\mu=0$, respectively.
arXiv Detail & Related papers (2020-02-09T03:39:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.