On the speed of uniform convergence in Mercer's theorem
- URL: http://arxiv.org/abs/2205.00487v1
- Date: Sun, 1 May 2022 15:07:57 GMT
- Title: On the speed of uniform convergence in Mercer's theorem
- Authors: Rustem Takhanov
- Abstract summary: A continuous positive definite kernel $K(\mathbf{x}, \mathbf{y})$ on a compact set can be represented as $\sum_{i=1}^\infty \lambda_i\phi_i(\mathbf{x})\phi_i(\mathbf{y})$ where $(\lambda_i,\phi_i)$ are eigenvalue-eigenvector pairs of the corresponding integral operator.
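We estimate the speed of this convergence in terms of the decay rate of eigenvalues and demonstrate that for $3m$ times differentiable kernels the first $N$ terms of the series approximate $K$ as $\mathcal{O}\big((\sum_{i=N+1}^\infty\lambda_i)^{\frac{m}{m+n}}\big)$ or $\mathcal{O}\big((\sum_{i=N+1}^\infty\lambda^2_i)^{\frac{m}{2m+n}}\big)$.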
- Score: 6.028247638616059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The classical Mercer's theorem claims that a continuous positive definite
kernel $K({\mathbf x}, {\mathbf y})$ on a compact set can be represented as
$\sum_{i=1}^\infty \lambda_i\phi_i({\mathbf x})\phi_i({\mathbf y})$ where
$\{(\lambda_i,\phi_i)\}$ are eigenvalue-eigenvector pairs of the corresponding
integral operator. This infinite representation is known to converge uniformly
to the kernel $K$. We estimate the speed of this convergence in terms of the
decay rate of eigenvalues and demonstrate that for $3m$ times differentiable
kernels the first $N$ terms of the series approximate $K$ as
$\mathcal{O}\big((\sum_{i=N+1}^\infty\lambda_i)^{\frac{m}{m+n}}\big)$ or
$\mathcal{O}\big((\sum_{i=N+1}^\infty\lambda^2_i)^{\frac{m}{2m+n}}\big)$.
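The quantities in the abstract can be explored numerically. The following is a minimal sketch, not the paper's method: it assumes a Gaussian kernel on $[0,1]$, a uniform grid, and a Nyström-style discretization of the integral operator. The function name `mercer_truncation_errors` and all parameter values are hypothetical choices made for illustration. For each truncation level $N$ it reports the sup-norm error of the truncated series on the grid next to the eigenvalue tail sum $\sum_{i>N}\lambda_i$.

```python
import numpy as np


def mercer_truncation_errors(n_grid=200, length_scale=0.2, ranks=(1, 2, 4, 8)):
    """Return (N, sup-norm error on the grid, eigenvalue tail sum) per rank N.

    Assumptions (illustrative only): Gaussian kernel on [0, 1], uniform grid,
    integral operator approximated by the matrix K / n_grid.
    """
    x = np.linspace(0.0, 1.0, n_grid)
    K = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * length_scale**2))

    # Eigenpairs of the discretized integral operator (uniform weights 1/n).
    evals, evecs = np.linalg.eigh(K / n_grid)
    order = np.argsort(evals)[::-1]          # sort eigenvalues in decreasing order
    evals = np.clip(evals[order], 0.0, None)  # remove tiny negative round-off
    # Rescale so that (1/n) * sum_j phi_i(x_j)^2 = 1, mimicking L2 normalization.
    phis = evecs[:, order] * np.sqrt(n_grid)

    results = []
    for N in ranks:
        # Truncated series: sum_{i<=N} lambda_i phi_i(x) phi_i(y) on the grid.
        K_N = (phis[:, :N] * evals[:N]) @ phis[:, :N].T
        sup_err = float(np.max(np.abs(K - K_N)))
        tail = float(evals[N:].sum())
        results.append((N, sup_err, tail))
    return results


if __name__ == "__main__":
    for N, sup_err, tail in mercer_truncation_errors():
        print(f"N={N:2d}  sup error={sup_err:.3e}  tail sum={tail:.3e}")
```

In this discretized setting the sup-norm error cannot be smaller than the tail sum, because the grid average of the diagonal residual equals the tail sum. The abstract's bounds go in the other direction, controlling the sup-norm error from above by a power of the tail.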
Related papers
- The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$.
As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - Quantum Algorithms and Lower Bounds for Finite-Sum Optimization [22.076317220348145]
We give a quantum algorithm with complexity $\tilde{O}\big(n+\sqrt{d}+\sqrt{\ell/\mu}\big)$, improving the classical tight bound $\tilde{\Theta}\big(n+\sqrt{n\ell/\mu}\big)$.
We also prove a quantum lower bound $\tilde{\Omega}(n+n^{3/4}(\ell/\mu)^{1/4})$ when $d$ is large enough.
arXiv Detail & Related papers (2024-06-05T07:13:52Z) - Provably learning a multi-head attention layer [55.2904547651831]
Multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples.
We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z) - On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm [59.65871549878937]
This paper considers RMSProp and its momentum extension and establishes the convergence rate of $\frac{1}{T}\sum_{k=1}^{T}\ldots$
Our convergence rate matches the lower bound with respect to all the coefficients except the dimension $d$.
Our convergence rate can be considered to be analogous to the $\frac{1}{T}\sum_{k=1}^{T}\ldots$
arXiv Detail & Related papers (2024-02-01T07:21:32Z) - Spectral Statistics of the Sample Covariance Matrix for High Dimensional
Linear Gaussians [12.524855369455421]
Performance of ordinary least squares (OLS) method for the estimation of high dimensional stable state transition matrix.
OLS estimator incurs a phase transition and becomes transient: increasing only worsens estimation error.
arXiv Detail & Related papers (2023-12-10T06:55:37Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Classical shadows of fermions with particle number symmetry [0.0]
We provide an estimator for any $k$-RDM with $\mathcal{O}(k^2\eta)$ classical complexity.
Our method, in the worst case of half-filling, still provides a factor of $4^k$ advantage in sample complexity.
arXiv Detail & Related papers (2022-08-18T17:11:12Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w})=C\,\mathrm{OPT}+\epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Topological entanglement and hyperbolic volume [1.1909611351044664]
Chern-Simons theory provides a setting to visualise the $m$-moment of reduced density matrix as a three-manifold invariant $Z(M_{\mathcal{K}_m})$.
For SU(2) group, we show that $Z(M_{\mathcal{K}_m})$ can grow at most polynomially in $k$.
We conjecture that $\ln Z(M_{\mathcal{K}_m})$ is the hyperbolic volume of the knot complement $S^3\backslash \mathcal{K}_m$
arXiv Detail & Related papers (2021-06-07T07:51:03Z) - Kernel Thinning [26.25415159542831]
Kernel thinning is a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning.
We derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels.
arXiv Detail & Related papers (2021-05-12T17:56:42Z) - On the Complexity of Minimizing Convex Finite Sums Without Using the
Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$-upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with oracles, and achieves complexity bounds of $\tilde{O}(n^2+n\sqrt{L/\mu})\log(1/\epsilon)$ and $O(n\sqrt{L/\epsilon})$, for $\mu>0$ and $\mu=0$
arXiv Detail & Related papers (2020-02-09T03:39:46Z)