Related papers: Non-asymptotic spectral bounds on the $\varepsilon$-entropy of kernel classes

Related papers

Sparsifying Suprema of Gaussian Processes [6.638504164134713]
We show that there is an $O_\varepsilon(1)$-size subset $S \subseteq T$ and a set of real values $\{c_s\}_{s \in S}$. We also use our sparsification result for suprema of centered Gaussian processes to give a sparsification lemma for convex sets of bounded geometric width.
arXiv Detail & Related papers (2024-11-22T01:43:58Z)
The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$. As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z)
Gaussian kernel expansion with basis functions uniformly bounded in $\mathcal{L}_{\infty}$ [0.6138671548064355]
Kernel expansions are a topic of considerable interest in machine learning. Recent work in the literature has derived some of these results by assuming uniformly bounded basis functions in $\mathcal{L}_\infty$. Our main result is the construction on $\mathbb{R}^2$ of a Gaussian kernel expansion with weights in $\ell_p$ for any $p>1$.
arXiv Detail & Related papers (2024-10-02T10:10:30Z)
KPZ scaling from the Krylov space [83.88591755871734]
Recently, a superdiffusion exhibiting the Kardar-Parisi-Zhang scaling in late-time correlators and autocorrelators has been reported. Inspired by these results, we explore the KPZ scaling in correlation functions using their realization in the Krylov operator basis.
arXiv Detail & Related papers (2024-06-04T20:57:59Z)
Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks. In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
Dimension Independent Disentanglers from Unentanglement and Applications [55.86191108738564]
We construct a dimension-independent $k$-partite disentangler (like) channel from bipartite unentangled input. We show that to capture NEXP, it suffices to have unentangled proofs of the form $|\psi\rangle = \sqrt{a}\,|$ ... $\sqrt{1-a}\,|\psi_+\rangle$, where $|\psi_+\rangle$ has non-negative amplitudes.
arXiv Detail & Related papers (2024-02-23T12:22:03Z)
Provably learning a multi-head attention layer [55.2904547651831]
The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models. In this work, we initiate the study of provably learning a multi-head attention layer from random examples. We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z)
A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously. Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples. We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
Kernel $\epsilon$-Greedy for Contextual Bandits [4.1347433277076036]
We consider a kernelized version of the $\epsilon$-greedy strategy for contextual bandits. We propose an online weighted kernel ridge regression estimator for the reward functions.
arXiv Detail & Related papers (2023-06-29T22:48:34Z)
For Kernel Range Spaces a Constant Number of Queries Are Sufficient [13.200502573462712]
A kernel range space concerns a set of points $X \subset \mathbb{R}^d$ and the space of all queries by a fixed kernel. An $\varepsilon$-cover is a subset of points $Q \subset \mathbb{R}^d$ such that for any $p \in \mathbb{R}^d$, $\frac{1}{n} |R_p - R_q| \leq \varepsilon$ for some $q \in Q$.
arXiv Detail & Related papers (2023-06-28T19:19:33Z)
Gaussian random field approximation via Stein's method with applications to wide random neural networks [20.554836643156726]
We develop a novel Gaussian smoothing technique that allows us to transfer a bound in a smoother metric to the $W_$ distance. We obtain the first bounds on the Gaussian random field approximation of wide random neural networks. Our bounds are explicitly expressed in terms of the widths of the network and moments of the random weights.
arXiv Detail & Related papers (2023-06-28T15:35:10Z)
A Nearly Tight Bound for Fitting an Ellipsoid to Gaussian Random Points [50.90125395570797]
This nearly establishes a conjecture of Saunderson et al. (2012), within logarithmic factors. The latter conjecture has attracted significant attention over the past decade, due to its connections to machine learning and sum-of-squares lower bounds for certain statistical problems.
arXiv Detail & Related papers (2022-12-21T17:48:01Z)
Sobolev Spaces, Kernels and Discrepancies over Hyperspheres [4.521119623956821]
This work provides theoretical foundations for kernel methods in the hyperspherical context. We characterise the native spaces (reproducing kernel Hilbert spaces) and the Sobolev spaces associated with kernels defined over hyperspheres. Our results have direct consequences for kernel cubature, determining the rate of convergence of the worst case error, and expanding the applicability of cubature algorithms.
arXiv Detail & Related papers (2022-11-16T20:31:38Z)
Continuous percolation in a Hilbert space for a large system of qubits [58.720142291102135]
The percolation transition is defined through the appearance of the infinite cluster. We show that the exponentially increasing dimensionality of the Hilbert space makes its covering by finite-size hyperspheres inefficient. Our approach to the percolation transition in compact metric spaces may prove useful for its rigorous treatment in other contexts.
arXiv Detail & Related papers (2022-10-15T13:53:21Z)
On the speed of uniform convergence in Mercer's theorem [6.028247638616059]
A continuous positive definite kernel $K(\mathbf{x}, \mathbf{y})$ on a compact set can be represented as $\sum_{i=1}^{\infty} \lambda_i \phi_i(\mathbf{x}) \phi_i(\mathbf{y})$ where $(\lambda_i, \phi_i)$ are eigenvalue-eigenvector pairs of the corresponding integral operator. We estimate the speed of this convergence in terms of the decay rate of eigenvalues and demonstrate that for $3m
arXiv Detail & Related papers (2022-05-01T15:07:57Z)
Unique Games hardness of Quantum Max-Cut, and a conjectured vector-valued Borell's inequality [6.621324975749854]
We show that the noise stability of a function $f:\mathbb{R}^n \to \{-1, 1\}$ is the expected value of $f(\boldsymbol{x}) \cdot f(\boldsymbol{y})$. We conjecture that the expected value of $\langle f(\boldsymbol{x}), f(\boldsymbol{y})\rangle$ is minimized by the function $f(x) = x_{\leq k} / \Vert x_{\leq k} /
arXiv Detail & Related papers (2021-11-01T20:45:42Z)
On the Self-Penalization Phenomenon in Feature Selection [69.16452769334367]
We describe an implicit sparsity-inducing mechanism based on minimization over a family of kernels. As an application, we use this sparsity-inducing mechanism to build algorithms consistent for feature selection.
arXiv Detail & Related papers (2021-10-12T09:36:41Z)
Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals. Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z)
Kernel Thinning [26.25415159542831]
Kernel thinning is a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning. We derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels.
arXiv Detail & Related papers (2021-05-12T17:56:42Z)
Linear Bandits on Uniformly Convex Sets [88.3673525964507]
Linear bandit algorithms yield $\tilde{\mathcal{O}}(n\sqrt{T})$ pseudo-regret bounds on compact convex action sets. Two types of structural assumptions lead to better pseudo-regret bounds.
arXiv Detail & Related papers (2021-03-10T07:33:03Z)
High-Dimensional Gaussian Process Inference with Derivatives [90.8033626920884]
We show that in the low-data regime $N < D$, the Gram matrix can be decomposed in a manner that reduces the cost of inference to $\mathcal{O}(N^2 D + (N^2)^3)$. We demonstrate this potential in a variety of tasks relevant for machine learning, such as optimization and Hamiltonian Monte Carlo with predictive gradients.
arXiv Detail & Related papers (2021-02-15T13:24:41Z)
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS [10.578438886506076]
We prove that the exponential power kernel with a smaller power (making the kernel less smooth) leads to a larger RKHS. We also prove that the reproducing kernel Hilbert spaces (RKHS) of a deep neural tangent kernel and the Laplace kernel include the same set of functions.
arXiv Detail & Related papers (2020-09-22T16:58:26Z)
Optimal Coreset for Gaussian Kernel Density Estimation [0.8376091455761259]
Given a point set $P \subset \mathbb{R}^d$, the kernel density estimate of $P$ is defined as \[ \overline{\mathcal{G}}_P(x) = \frac{1}{\left|P\right|}\sum_{p\in P} e^{-\left\lVert x-p \right\rVert^2} \] for any $x \in \mathbb{R}^d$. We study how to construct a small subset $Q$ of $P
arXiv Detail & Related papers (2020-07-15T22:58:50Z)
Near-Optimal SQ Lower Bounds for Agnostically Learning Halfspaces and ReLUs under Gaussian Marginals [49.60752558064027]
We study the fundamental problems of agnostically learning halfspaces and ReLUs under Gaussian marginals. Our lower bounds provide strong evidence that current upper bounds for these tasks are essentially best possible.
arXiv Detail & Related papers (2020-06-29T17:10:10Z)
Kernel-Based Reinforcement Learning: A Finite-Time Analysis [53.47210316424326]
We introduce Kernel-UCBVI, a model-based optimistic algorithm that leverages the smoothness of the MDP and a non-parametric kernel estimator of the rewards. We empirically validate our approach in continuous MDPs with sparse rewards.
arXiv Detail & Related papers (2020-04-12T12:23:46Z)
On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$-upper bound under the global oracle model. Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with oracles, and achieves complexity bounds of $\tilde{O}(n^2+n\sqrt{L/\mu})\log(1/\epsilon)$ and $O(n\sqrt{L/\epsilon})$, for $\mu>0$ and $\mu=0$
arXiv Detail & Related papers (2020-02-09T03:39:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.