Sample Complexity of Kernel-Based Q-Learning
- URL: http://arxiv.org/abs/2302.00727v1
- Date: Wed, 1 Feb 2023 19:46:25 GMT
- Title: Sample Complexity of Kernel-Based Q-Learning
- Authors: Sing-Yuan Yeh, Fu-Chieh Chang, Chang-Wei Yueh, Pei-Yuan Wu, Alberto
Bernacchia, Sattar Vakili
- Abstract summary: We propose a nonparametric Q-learning algorithm which finds an $\epsilon$-optimal policy in an arbitrarily large-scale discounted MDP.
To the best of our knowledge, this is the first result showing a finite sample complexity under such a general model.
- Score: 11.32718794195643
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern reinforcement learning (RL) often faces an enormous state-action
space. Existing analytical results are typically for settings with a small
number of state-actions, or simple models such as linearly modeled Q-functions.
To derive statistically efficient RL policies handling large state-action
spaces, with more general Q-functions, some recent works have considered
nonlinear function approximation using kernel ridge regression. In this work,
we derive sample complexities for kernel-based Q-learning when a generative
model exists. We propose a nonparametric Q-learning algorithm which finds an
$\epsilon$-optimal policy in an arbitrarily large-scale discounted MDP. The
sample complexity of the proposed algorithm is order optimal with respect to
$\epsilon$ and the complexity of the kernel (in terms of its information gain).
To the best of our knowledge, this is the first result showing a finite sample
complexity under such a general model.
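As a concrete, deliberately hedged illustration of the setting described above, the sketch below runs approximate Q-iteration in which each Bellman backup is fit by kernel ridge regression on samples drawn from a generative-model oracle. The RBF kernel, the regularization weight, the discretized action set, and the `sample(s, a)` oracle are assumptions of this sketch, not the authors' algorithm or analysis.

```python
# Minimal sketch: approximate Q-iteration with kernel ridge regression under a
# generative model. Kernel, regularization, and sampler are illustrative
# assumptions; this is not the paper's exact algorithm.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def kernel_q_iteration(sample, states, actions, gamma=0.9, n_iters=30, n_next=10):
    """sample(s, a) -> (reward, next_state): an assumed generative-model oracle.
    states: array of representative states; actions: array of candidate actions."""
    # State-action design matrix over all representative pairs.
    sa = np.array([np.concatenate([s, a]) for s in states for a in actions])
    q_model = None  # current kernel ridge estimate of Q

    def q_values(s):
        # Evaluate the current Q estimate at state s for every candidate action.
        if q_model is None:
            return np.zeros(len(actions))
        x = np.array([np.concatenate([s, a]) for a in actions])
        return q_model.predict(x)

    for _ in range(n_iters):
        targets = []
        for s in states:
            for a in actions:
                # Monte Carlo Bellman backup using the generative model.
                backup = 0.0
                for _ in range(n_next):
                    r, s_next = sample(s, a)
                    backup += r + gamma * np.max(q_values(s_next))
                targets.append(backup / n_next)
        # Fit Q(s, a) to the backed-up targets by kernel ridge regression.
        q_model = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.5)
        q_model.fit(sa, np.array(targets))

    # Greedy policy with respect to the final Q estimate.
    return lambda s: actions[int(np.argmax(q_values(s)))]
```

The paper's contribution is the sample-complexity analysis (order optimal in $\epsilon$ and in the kernel's information gain); the sketch only illustrates the generative-model access pattern and the role of kernel ridge regression as the Q-function approximator.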
Related papers
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
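For readers unfamiliar with the model class, a hedged restatement of the usual Gaussian single-index setting is shown below; the planted direction $w^\star$, link $f$, and exponent $k^\star$ follow standard conventions, and their precise definitions are in the cited paper.

```latex
% Illustrative restatement of the assumed Gaussian single-index setting
% (standard conventions, not verbatim from the paper):
\[
  x \sim \mathcal{N}(0, I_d), \qquad
  y = f(\langle w^\star, x \rangle) + \text{noise}, \qquad
  n = \Omega\!\bigl(d^{k^\star/2}\bigr)
  \ \text{samples for SQ / low-degree polynomial algorithms.}
\]
```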
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - On Representation Complexity of Model-based and Model-free Reinforcement
Learning [11.843778337443824]
We study the representation complexity of model-based and model-free reinforcement learning (RL) in the context of circuit complexity.
We prove theoretically that there exists a broad class of MDPs such that their underlying transition and reward functions can be represented by constant-depth circuits of polynomial size.
By drawing attention to the approximation errors and building connections to complexity theory, our theory provides unique insights into why model-based algorithms usually enjoy better sample complexity than model-free algorithms from a novel representation complexity perspective.
arXiv Detail & Related papers (2023-10-03T00:01:58Z) - Sample-efficient Model-based Reinforcement Learning for Quantum Control [0.2999888908665658]
We propose a model-based reinforcement learning (RL) approach for noisy time-dependent gate optimization.
We show an order of magnitude advantage in the sample complexity of our method over standard model-free RL.
Our algorithm is well suited for controlling partially characterised one- and two-qubit systems.
arXiv Detail & Related papers (2023-04-19T15:05:19Z) - PAC Reinforcement Learning for Predictive State Representations [60.00237613646686]
We study online Reinforcement Learning (RL) in partially observable dynamical systems.
We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models.
We develop a novel model-based algorithm for PSRs that can learn a near-optimal policy with sample complexity scaling polynomially in the relevant problem parameters.
arXiv Detail & Related papers (2022-07-12T17:57:17Z) - Sample-Efficient Reinforcement Learning Is Feasible for Linearly
Realizable MDPs with Limited Revisiting [60.98700344526674]
Low-complexity models such as linear function representation play a pivotal role in enabling sample-efficient reinforcement learning.
In this paper, we investigate a new sampling protocol, which draws samples in an online/exploratory fashion but allows one to backtrack and revisit previous states in a controlled and infrequent manner.
We develop an algorithm tailored to this setting, achieving a sample complexity that scales polynomially with the feature dimension, the horizon, and the inverse sub-optimality gap, but not the size of the state/action space.
arXiv Detail & Related papers (2021-05-17T17:22:07Z) - Bilinear Classes: A Structural Framework for Provable Generalization in
RL [119.42509700822484]
Bilinear Classes is a new structural framework which permits generalization in reinforcement learning.
The framework incorporates nearly all existing models in which a polynomial sample complexity is achievable.
Our main result provides an RL algorithm with polynomial sample complexity for Bilinear Classes.
arXiv Detail & Related papers (2021-03-19T16:34:20Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the difficulty of the learning problem.
Our regret bounds do not depend on the size of the state space.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - Measuring Model Complexity of Neural Networks with Curve Activation
Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L^1$ and $L^2$ regularizations suppress the increase of model complexity (a rough sketch of the piecewise-linear approximation idea follows this list).
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
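As a rough, hedged illustration of the piecewise-linear approximation idea behind LANN in the last entry above: the sigmoid target, the equal-width breakpoints, and the sup-norm tolerance are assumptions of this sketch, not the paper's construction.

```python
# Hedged sketch: piecewise-linear approximation of a curve activation (sigmoid),
# loosely in the spirit of LANN. Breakpoints and the error measure are
# illustrative assumptions, not the paper's construction.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pwl_segments_needed(f, lo=-6.0, hi=6.0, tol=1e-2, max_segments=64):
    """Smallest number of equal-width linear segments whose interpolation of f
    on [lo, hi] stays within tol (sup-norm measured on a fine grid)."""
    grid = np.linspace(lo, hi, 2001)
    target = f(grid)
    for k in range(1, max_segments + 1):
        knots = np.linspace(lo, hi, k + 1)
        approx = np.interp(grid, knots, f(knots))  # piecewise-linear interpolant
        if np.max(np.abs(approx - target)) <= tol:
            return k
    return max_segments

if __name__ == "__main__":
    # Tighter tolerances need more segments: a crude proxy for how much linear
    # structure is required to mimic the curve activation.
    for tol in (1e-1, 1e-2, 1e-3):
        print(tol, pwl_segments_needed(sigmoid, tol=tol))
```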
This list is automatically generated from the titles and abstracts of the papers in this site.