Upper Bounds for Learning in Reproducing Kernel Hilbert Spaces for Orbits of an Iterated Function System
- URL: http://arxiv.org/abs/2410.08361v1
- Date: Thu, 10 Oct 2024 20:34:22 GMT
- Title: Upper Bounds for Learning in Reproducing Kernel Hilbert Spaces for Orbits of an Iterated Function System
- Authors: Priyanka Roy, Susanne Saminger-Platz,
- Abstract summary: A key problem in learning theory is to compute a function $f$ that closely approximates the relationship between some input $x$ and corresponding output $y$.
This approximation is based on sample points $(x_t,y_t)_t=1m$, where the function $f$ can be approximated within reproducing kernel Hilbert spaces.
- Score: 1.1510009152620668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the key problems in learning theory is to compute a function $f$ that closely approximates the relationship between some input $x$ and corresponding output $y$, such that $y\approx f(x)$. This approximation is based on sample points $(x_t,y_t)_{t=1}^{m}$, where the function $f$ can be approximated within reproducing kernel Hilbert spaces using various learning algorithms. In the context of learning theory, it is usually customary to assume that the sample points are drawn independently and identically distributed (i.i.d.) from an unknown underlying distribution. However, we relax this i.i.d. assumption by considering an input sequence $(x_t)_{t\in {\mathbb N}}$ as a trajectory generated by an iterated function system, which forms a particular Markov chain, with $(y_t)_{t\in {\mathbb N}}$ corresponding to an observation sequence when the model is in the corresponding state $x_t$. For such a process, we approximate the function $f$ using the Markov chain stochastic gradient algorithm and estimate the error by deriving upper bounds within reproducing kernel Hilbert spaces.
Related papers
- Near-Optimal and Tractable Estimation under Shift-Invariance [0.21756081703275998]
Class of all such signals is but extremely rich: it contains all exponential oscillations over $mathbbCn$ with total degree $s$.
We show that the statistical complexity of this class, as measured by the radius squared minimax frequencies of the $(delta)$-confidence $ell$-ball, is nearly the same as for the class of $s$-sparse signals, namely $Oleft(slog(en) + log(delta-1)right) cdot log(en/s)
arXiv Detail & Related papers (2024-11-05T18:11:23Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $mathbfxmapstosigma(mathbfwcdotmathbfx)$ for monotone activations.
The goal of the learner is to output a hypothesis vector $mathbfw$ that $F(mathbbw)=C, epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - On the Self-Penalization Phenomenon in Feature Selection [69.16452769334367]
We describe an implicit sparsity-inducing mechanism based on over a family of kernels.
As an application, we use this sparsity-inducing mechanism to build algorithms consistent for feature selection.
arXiv Detail & Related papers (2021-10-12T09:36:41Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $bf K$ is independent of the distribution of the i.i.d. entries of $bf w$.
We propose a novel random technique, called Ternary Random Feature (TRF)
The computation of the proposed random features requires no multiplication and a factor of $b$ less bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - Optimal Spectral Recovery of a Planted Vector in a Subspace [80.02218763267992]
We study efficient estimation and detection of a planted vector $v$ whose $ell_4$ norm differs from that of a Gaussian vector with the same $ell$ norm.
We show that in the regime $n rho gg sqrtN$, any spectral method from a large class (and more generally, any low-degree of the input) fails to detect the planted vector.
arXiv Detail & Related papers (2021-05-31T16:10:49Z) - Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing a $ntimes n$ kernel matrix that is repeatedly applied to a vector.
We propose to use instead ground costs of the form $c(x,y)=-logdotpvarphi(x)varphi(y)$ where $varphi$ is a map from the ground space onto the positive orthant $RRr_+$, with $rll n$.
arXiv Detail & Related papers (2020-06-12T10:21:40Z) - Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation [30.137884459159107]
We consider the question of learning $Q$-function in a sample efficient manner for reinforcement learning with continuous state and action spaces.
We develop a simple, iterative learning algorithm that finds $epsilon$-Schmidt $Q$-function with sample complexity of $widetildeO(frac1epsilonmax(d1), d_2)+2)$ when the optimal $Q$-function has low rank $r$ and the factor $$ is below a certain threshold.
arXiv Detail & Related papers (2020-06-11T00:55:35Z) - Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonsmooth functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.