Optimal Rates for Regularized Conditional Mean Embedding Learning
- URL: http://arxiv.org/abs/2208.01711v3
- Date: Tue, 12 Dec 2023 10:06:41 GMT
- Title: Optimal Rates for Regularized Conditional Mean Embedding Learning
- Authors: Zhu Li, Dimitri Meunier, Mattes Mollenhauer, Arthur Gretton
- Abstract summary: We derive a novel and adaptive statistical learning rate for the empirical CME estimator under the misspecified setting.
Our analysis reveals that our rates match the optimal $O(\log n / n)$ rates without assuming $\mathcal{H}_Y$ to be finite dimensional.
- Score: 32.870965795423366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We address the consistency of a kernel ridge regression estimate of the
conditional mean embedding (CME), which is an embedding of the conditional
distribution of $Y$ given $X$ into a target reproducing kernel Hilbert space
$\mathcal{H}_Y$. The CME allows us to take conditional expectations of target
RKHS functions, and has been employed in nonparametric causal and Bayesian
inference. We address the misspecified setting, where the target CME is in the
space of Hilbert-Schmidt operators acting from an input interpolation space
between $\mathcal{H}_X$ and $L_2$, to $\mathcal{H}_Y$. This space of operators
is shown to be isomorphic to a newly defined vector-valued interpolation space.
Using this isomorphism, we derive a novel and adaptive statistical learning
rate for the empirical CME estimator under the misspecified setting. Our
analysis reveals that our rates match the optimal $O(\log n / n)$ rates without
assuming $\mathcal{H}_Y$ to be finite dimensional. We further establish a lower
bound on the learning rate, which shows that the obtained upper bound is
optimal.
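To make the object of study concrete, the following is a minimal NumPy sketch of the empirical CME estimator in its usual kernel ridge regression form, $\hat{\mu}_{Y|X=x} = \sum_{i=1}^n \beta_i(x)\, k_Y(y_i, \cdot)$ with $\beta(x) = (K_X + n\lambda I)^{-1} k_X(x)$. The Gaussian kernels, the fixed regularization constant, and the `EmpiricalCME` class are illustrative assumptions, not part of the paper; the sketch does not implement the paper's interpolation-space analysis or any rate-optimal choice of $\lambda$.

```python
# Minimal sketch (not the paper's code) of the regularized empirical CME estimator
# in its kernel ridge regression form. Assumed choices: Gaussian kernels on X and Y,
# a fixed regularization constant, and plain dense linear algebra.
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gram matrix with entries exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * bandwidth**2))

class EmpiricalCME:
    """mu_hat_{Y|X=x} = sum_i beta_i(x) k_Y(y_i, .), with beta(x) = (K_X + n*lam*I)^{-1} k_X(x)."""

    def __init__(self, reg=1e-3, bandwidth_x=1.0, bandwidth_y=1.0):
        self.reg, self.bw_x, self.bw_y = reg, bandwidth_x, bandwidth_y

    def fit(self, X, Y):
        self.X, self.Y = X, Y
        n = X.shape[0]
        # Regularized ridge system shared by every test point x.
        self.A = gaussian_kernel(X, X, self.bw_x) + n * self.reg * np.eye(n)
        return self

    def weights(self, x_test):
        """Columns are beta(x) for each row of x_test; shape (n, m)."""
        return np.linalg.solve(self.A, gaussian_kernel(self.X, x_test, self.bw_x))

    def conditional_expectation(self, f, x_test):
        """Estimate E[f(Y) | X = x] by sum_i beta_i(x) f(y_i) for a target-RKHS function f."""
        return f(self.Y) @ self.weights(x_test)

    def embedding_at(self, y_query, x_test):
        """Evaluate the embedded conditional distribution mu_hat_{Y|X=x}(y') at query points y'."""
        return gaussian_kernel(y_query, self.Y, self.bw_y) @ self.weights(x_test)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(500, 1))
    Y = np.sin(X) + 0.1 * rng.standard_normal(X.shape)
    cme = EmpiricalCME(reg=1e-3, bandwidth_x=0.5, bandwidth_y=0.5).fit(X, Y)
    x0 = np.array([[1.0]])
    print(cme.conditional_expectation(lambda y: y[:, 0], x0))  # estimate of E[Y | X = 1]; true value sin(1) ~ 0.84
```

Taking conditional expectations of target RKHS functions then reduces to the weighted sum $\sum_i \beta_i(x) f(y_i)$, which is the use of the CME that the abstract refers to.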
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications [2.0584253077707477]
We study the convergence properties of the Stochastic Gradient Descent (SGD) method for finding a stationary point of an objective function $J(\cdot)$.
Our results apply to a class of "invex" functions, which have the property that every stationary point is also a global minimizer.
arXiv Detail & Related papers (2023-12-05T15:22:39Z) - Rates of Convergence in Certain Native Spaces of Approximations used in
Reinforcement Learning [0.0]
This paper studies convergence rates for some value function approximations that arise in a collection of reproducing kernel Hilbert spaces (RKHS) $H(\Omega)$.
Explicit upper bounds on the error in value function and controller approximations are derived in terms of the power function $\mathcal{P}_{H,N}$ for the space of finite-dimensional approximants $H_N$ in the native space $H(\Omega)$.
arXiv Detail & Related papers (2023-09-14T02:02:08Z) - Statistical Learning under Heterogeneous Distribution Shift [71.8393170225794]
The ground-truth predictor is additive: $\mathbb{E}[\mathbf{z} \mid \mathbf{x}, \mathbf{y}] = f_\star(\mathbf{x}) + g_\star(\mathbf{y})$.
arXiv Detail & Related papers (2023-02-27T16:34:21Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (a generic kernel LSTD sketch in this spirit appears after this list).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Provably Breaking the Quadratic Error Compounding Barrier in Imitation
Learning, Optimally [58.463668865380946]
We study the statistical limits of Imitation Learning in episodic Markov Decision Processes (MDPs) with a state space $\mathcal{S}$.
We establish an upper bound of $O(|\mathcal{S}| H^{3/2} / N)$ on the suboptimality using the Mimic-MD algorithm in Rajaraman et al. (2020).
We show the minimax suboptimality grows as $\Omega(H^{3/2} / N)$ when $|\mathcal{S}| \geq 3$, while the unknown-transition setting suffers from a larger sharp rate.
arXiv Detail & Related papers (2021-02-25T15:50:19Z) - Nonparametric approximation of conditional expectation operators [0.3655021726150368]
We investigate the approximation of the $L^2$-operator defined by $[Pf](x) := \mathbb{E}[f(Y) \mid X = x]$ under minimal assumptions.
We prove that $P$ can be arbitrarily well approximated in operator norm by Hilbert-Schmidt operators acting on a reproducing kernel Hilbert space.
arXiv Detail & Related papers (2020-12-23T19:06:12Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP).
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5 \varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z) - Learning the Hypotheses Space from data Part II: Convergence and
Feasibility [0.0]
We show consistency of a model selection framework based on Learning Spaces.
We show that it is feasible to learn the Hypotheses Space from data.
arXiv Detail & Related papers (2020-01-30T21:48:37Z)
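As flagged in the kernel-based temporal difference entry above, the following is a hedged sketch of a generic regularized kernel LSTD estimator for policy evaluation in a discounted Markov reward process: it represents $V(s) = \sum_j \alpha_j k(s_j, s)$ and solves the empirical Bellman system $(K - \gamma K' + n\lambda I)\alpha = r$. The Gaussian kernel, the regularization constant, and the toy MRP are assumptions for illustration, and this is not necessarily the exact estimator analyzed in that paper.

```python
# Hedged sketch of regularized kernel LSTD for policy evaluation of a discounted
# Markov reward process, fitted from one-step transitions (s_i, r_i, s'_i).
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * bandwidth**2))

def kernel_lstd(S, R, S_next, gamma=0.9, reg=1e-2, bandwidth=0.5):
    """Return V_hat(s) = sum_j alpha_j k(s_j, s) solving the regularized Bellman system."""
    n = S.shape[0]
    K = gaussian_kernel(S, S, bandwidth)            # K[i, j]      = k(s_i,  s_j)
    K_next = gaussian_kernel(S_next, S, bandwidth)  # K_next[i, j] = k(s'_i, s_j)
    # Empirical Bellman fixed-point system with ridge regularization.
    alpha = np.linalg.solve(K - gamma * K_next + n * reg * np.eye(n), R)
    return lambda s: gaussian_kernel(s, S, bandwidth) @ alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S = rng.uniform(-1, 1, size=(400, 1))                            # states visited
    S_next = np.clip(S + 0.1 * rng.standard_normal(S.shape), -1, 1)  # next states
    R = S[:, 0] ** 2                                                 # observed rewards
    V_hat = kernel_lstd(S, R, S_next, gamma=0.9)
    print(V_hat(np.array([[0.0], [0.5]])))  # estimated values at two query states
```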