Finite-Time Analysis of Asynchronous Q-learning under Diminishing
Step-Size from Control-Theoretic View
- URL: http://arxiv.org/abs/2207.12217v1
- Date: Mon, 25 Jul 2022 14:15:55 GMT
- Title: Finite-Time Analysis of Asynchronous Q-learning under Diminishing
Step-Size from Control-Theoretic View
- Authors: Han-Dong Lim, Donghwan Lee
- Abstract summary: This paper investigates new finite-time analysis of asynchronous Q-learning under Markovian observation models.
In particular, we introduce a discrete-time time-varying switching system model of Q-learning with diminishing step-sizes.
The proposed analysis brings in additional insights, covers different scenarios, and provides new simplified templates for analysis.
- Score: 3.5823366350053325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Q-learning has long been one of the most popular reinforcement learning
algorithms, and theoretical analysis of Q-learning has been an active research
topic for decades. Although research on asymptotic convergence analysis of
Q-learning has a long tradition, non-asymptotic convergence has only recently
come under active study. The main goal of this paper is to investigate new
finite-time analysis of asynchronous Q-learning under Markovian observation
models via a control system viewpoint. In particular, we introduce a
discrete-time time-varying switching system model of Q-learning with
diminishing step-sizes for our analysis, which significantly improves recent
development of the switching system analysis with constant step-sizes, and
leads to \(\mathcal{O}\left( \sqrt{\frac{\log k}{k}} \right)\) convergence rate
that is comparable to or better than most state-of-the-art results in
the literature. Meanwhile, a similarity transformation technique is newly
applied to avoid the difficulty in the analysis posed
by diminishing step-sizes. The proposed analysis brings in additional insights,
covers different scenarios, and provides new simplified templates for analysis
to deepen our understanding of Q-learning via its unique connection to
discrete-time switching systems.
Related papers
- Unified ODE Analysis of Smooth Q-Learning Algorithms [5.152147416671501]
Recently, a convergence analysis for Q-learning was introduced using a switching system framework.
We present a more general and unified convergence analysis that improves upon the switching system approach.
arXiv Detail & Related papers (2024-04-20T01:16:27Z) - Finite-Time Error Analysis of Soft Q-Learning: Switching System Approach [4.36117236405564]
Soft Q-learning is a variation of Q-learning designed to solve entropy regularized Markov decision problems.
This paper aims to offer a novel and unified finite-time, control-theoretic analysis of soft Q-learning algorithms.
arXiv Detail & Related papers (2024-03-11T01:36:37Z) - It Is Time To Steer: A Scalable Framework for Analysis-driven Attack Graph Generation [50.06412862964449]
Attack Graph (AG) represents the best-suited solution to support cyber risk assessment for multi-step attacks on computer networks.
Current solutions propose to address the generation problem from the algorithmic perspective and postulate the analysis only after the generation is complete.
This paper rethinks the classic AG analysis through a novel workflow in which the analyst can query the system anytime.
arXiv Detail & Related papers (2023-12-27T10:44:58Z) - Unifying (Quantum) Statistical and Parametrized (Quantum) Algorithms [65.268245109828]
We take inspiration from Kearns' SQ oracle and Valiant's weak evaluation oracle.
We introduce an extensive yet intuitive framework that yields unconditional lower bounds for learning from evaluation queries.
arXiv Detail & Related papers (2023-10-26T18:23:21Z) - Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum
Markov Games: Switching System Approach [3.5823366350053325]
This paper investigates the finite-time analysis of a Q-learning algorithm applied to two-player zero-sum Markov games.
We employ the switching system model of minimax Q-learning and the associated value iteration.
arXiv Detail & Related papers (2023-06-09T06:39:37Z) - Classical-to-Quantum Transfer Learning Facilitates Machine Learning with Variational Quantum Circuit [62.55763504085508]
We prove that a classical-to-quantum transfer learning architecture using a Variational Quantum Circuit (VQC) improves the representation and generalization (estimation error) capabilities of the VQC model.
We show that the architecture of classical-to-quantum transfer learning leverages pre-trained classical generative AI models, making it easier to find the optimal parameters for the VQC in the training stage.
arXiv Detail & Related papers (2023-05-18T03:08:18Z) - On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances.
We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, Attr, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
arXiv Detail & Related papers (2023-02-05T01:39:21Z) - An Analysis of Quantile Temporal-Difference Learning [53.36758478669685]
Quantile temporal-difference learning (QTD) has proven to be a key component in several successful large-scale applications of reinforcement learning.
Unlike classical TD learning, QTD updates do not approximate contraction mappings, are highly non-linear, and may have multiple fixed points.
This paper proves convergence of QTD, with probability 1, to the fixed points of a related family of dynamic programming procedures.
arXiv Detail & Related papers (2023-01-11T13:41:56Z) - Finite-Time Analysis of Asynchronous Q-Learning with Discrete-Time
Switching System Models [6.85316573653194]
We prove that Q-learning with a constant step-size can be naturally formulated as discrete-time switched linear systems.
It offers novel and intuitive insights on Q-learning, mainly based on control-theoretic frameworks.
arXiv Detail & Related papers (2021-02-17T05:32:07Z) - Momentum Q-learning with Finite-Sample Convergence Guarantee [49.38471009162477]
This paper analyzes a class of momentum-based Q-learning algorithms with finite-sample guarantee.
We establish the convergence guarantee for MomentumQ with linear function approximations and Markovian sampling.
We demonstrate through various experiments that the proposed MomentumQ outperforms other momentum-based Q-learning algorithms.
arXiv Detail & Related papers (2020-07-30T12:27:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.