Temporal-Differential Learning in Continuous Environments
- URL: http://arxiv.org/abs/2006.00997v1
- Date: Mon, 1 Jun 2020 15:01:03 GMT
- Title: Temporal-Differential Learning in Continuous Environments
- Authors: Tao Bian and Zhong-Ping Jiang
- Abstract summary: A new reinforcement learning (RL) method, known as the method of temporal differential, is introduced.
It provides the basis for developing novel RL techniques for continuous environments.
- Score: 12.982941756429952
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, a new reinforcement learning (RL) method known as the method
of temporal differential is introduced. Unlike the traditional temporal-difference
learning method, which operates in discrete time, it provides a basis for developing
novel RL techniques for continuous environments. In particular, the
continuous-time least squares policy evaluation (CT-LSPE) and the
continuous-time temporal-differential (CT-TD) learning methods are developed.
Both theoretical and empirical evidence is provided to demonstrate the
effectiveness of the proposed temporal-differential learning methodology.
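The abstract does not spell out the update rules, but the flavour of temporal-differential policy evaluation can be illustrated on a continuous-time linear-quadratic problem. The sketch below is a minimal illustration under assumed dynamics, not the paper's CT-TD or CT-LSPE algorithms: it fits a quadratic value function V(x) = x^T P x by stochastic steps on the differential Bellman error dV/dt + r(x) along Euler-simulated trajectories; all problem matrices (A, B, K, Q, R), step sizes, and episode counts are hypothetical.

```python
# Minimal sketch (not the paper's CT-TD/CT-LSPE algorithms) of
# temporal-differential-style policy evaluation for the continuous-time
# linear system dx/dt = (A - B K) x with cost rate r(x) = x^T (Q + K^T R K) x.
# The value function is parameterised as V(x) = x^T P x and P is adjusted
# along the differential Bellman error dV/dt + r(x), which is zero at the
# true value function.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)

# Hypothetical problem data (illustrative only).
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
K = np.array([[1.0, 1.5]])           # fixed policy u = -K x
Q, R = np.eye(2), np.eye(1)
Acl = A - B @ K                      # closed-loop dynamics
Qcl = Q + K.T @ R @ K                # effective cost-rate matrix

P = np.zeros((2, 2))                 # value parameters, V(x) = x^T P x
dt, lr = 1e-3, 0.5

for episode in range(300):
    x = rng.normal(size=2)
    for _ in range(2000):
        x_next = x + dt * (Acl @ x)                   # Euler step of the ODE
        dV = (x_next @ P @ x_next - x @ P @ x) / dt   # finite-difference dV/dt
        delta = dV + x @ Qcl @ x                      # differential Bellman error
        P += lr * dt * delta * np.outer(x, x)         # semi-gradient TD-style step
        P = 0.5 * (P + P.T)                           # keep the estimate symmetric
        x = x_next

# The exact cost-to-go satisfies the Lyapunov equation Acl^T P + P Acl + Qcl = 0.
P_true = solve_continuous_lyapunov(Acl.T, -Qcl)
print("learned P:\n", P, "\nexact P:\n", P_true)
```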
Related papers
- A Unified and General Framework for Continual Learning [58.72671755989431]
Continual Learning (CL) focuses on learning from dynamic and changing data distributions while retaining previously acquired knowledge.
Various methods have been developed to address the challenge of catastrophic forgetting, including regularization-based, Bayesian-based, and memory-replay-based techniques.
This research introduces a comprehensive and overarching framework that encompasses and reconciles these existing methodologies.
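For context on the memory-replay family named above, here is a minimal sketch of rehearsal-based continual training with a reservoir-sampled buffer. It illustrates that one family only, not the unified framework proposed in the paper; the `train_step` callback and the task structure are hypothetical placeholders.

```python
# Minimal sketch of memory-replay-based continual learning: a small
# reservoir-sampled buffer of past examples is mixed into every update on
# the current task to mitigate catastrophic forgetting.
import random

class ReservoirBuffer:
    """Keeps a uniform random sample of all examples seen so far."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            j = random.randrange(self.seen)      # reservoir sampling
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

def continual_training(tasks, buffer, train_step, replay_size=32):
    """tasks: iterable of iterables of (x, y) pairs seen in sequence.
    train_step: hypothetical callback applying one update on a list of examples."""
    for task in tasks:
        for example in task:
            batch = [example] + buffer.sample(replay_size)   # mix new + replayed data
            train_step(batch)
            buffer.add(example)
```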
arXiv Detail & Related papers (2024-03-20T02:21:44Z)
- Revisiting the Temporal Modeling in Spatio-Temporal Predictive Learning under A Unified View [73.73667848619343]
We introduce USTEP (Unified Spatio-TEmporal Predictive learning), an innovative framework that reconciles recurrent-based and recurrent-free methods by integrating both micro-temporal and macro-temporal scales.
arXiv Detail & Related papers (2023-10-09T16:17:42Z)
- The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation [53.53493178394081]
We analyse the use of a distributional reinforcement learning algorithm, quantile temporal-difference learning (QTD).
Even if a practitioner has no interest in the return distribution beyond the mean, QTD may offer performance superior to approaches such as classical TD learning.
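The QTD(0) update it analyses can be sketched in tabular form as follows. The environment, step size, and number of quantiles here are illustrative assumptions, and the notation is the generic form of the update rather than the paper's exact presentation.

```python
# Minimal tabular sketch of quantile temporal-difference learning, QTD(0).
# Each state keeps m quantile estimates of the return distribution; the mean
# of those quantiles serves as the value estimate compared against classical TD.
import numpy as np

rng = np.random.default_rng(1)

n_states, m = 5, 32                     # small Markov reward process, m quantiles
gamma, alpha = 0.9, 0.05
tau = (2 * np.arange(1, m + 1) - 1) / (2 * m)   # quantile levels (2i - 1) / (2m)
theta = np.zeros((n_states, m))         # quantile estimates per state

def step(s):
    """Hypothetical environment: move to a random neighbour, noisy reward."""
    s_next = (s + rng.integers(0, 2)) % n_states
    r = float(s) + rng.normal(scale=0.5)
    return r, s_next

s = 0
for _ in range(100_000):
    r, s_next = step(s)
    target = r + gamma * theta[s_next]            # one target per next-state quantile j
    # QTD(0): each quantile i moves up by tau_i or down by (1 - tau_i),
    # depending on whether it exceeds the targets (averaged over the m targets).
    indicator = (target[None, :] < theta[s][:, None]).mean(axis=1)
    theta[s] += alpha * (tau - indicator)
    s = s_next

value_estimate = theta.mean(axis=1)               # mean of quantiles ~ state value
print(value_estimate)
```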
arXiv Detail & Related papers (2023-05-28T10:52:46Z)
- Backstepping Temporal Difference Learning [3.5823366350053325]
We propose a new convergent algorithm for off-policy TD-learning.
Our method relies on the backstepping technique, which is widely used in nonlinear control theory.
Convergence of the proposed algorithm is experimentally verified in environments where standard TD-learning is known to be unstable.
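For reference, the baseline whose possible divergence motivates such work is standard off-policy TD(0) with linear function approximation; a minimal sketch is below. This is the classical semi-gradient update, not the backstepping method, and the feature map and sampling scheme are assumptions.

```python
# Classical semi-gradient off-policy TD(0) with linear features: the baseline
# update that can diverge under off-policy sampling (e.g. Baird's counterexample).
import numpy as np

def off_policy_td0(phi, samples, dim, gamma=0.99, alpha=0.01):
    """phi: feature map, state -> np.ndarray of length dim.
    samples: iterable of (s, r, s_next, rho) from a behaviour policy, where
    rho = pi(a|s) / mu(a|s) is the importance-sampling ratio of the action taken."""
    theta = np.zeros(dim)
    for s, r, s_next, rho in samples:
        delta = r + gamma * phi(s_next) @ theta - phi(s) @ theta   # TD error
        theta = theta + alpha * rho * delta * phi(s)               # semi-gradient step
    return theta
```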
arXiv Detail & Related papers (2023-02-20T10:06:49Z)
- Finite-Time Analysis of Temporal Difference Learning: Discrete-Time Linear System Perspective [3.5823366350053325]
TD-learning is a fundamental algorithm in the field of reinforcement learning (RL).
Recent research has uncovered guarantees concerning its statistical efficiency by developing finite-time error bounds.
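The viewpoint named in the title can be stated compactly: TD(0) with linear function approximation is a stochastic discrete-time linear system, which is what enables finite-time (and control-theoretic) analysis. The notation below is the standard one and may differ from the paper's symbols.

```latex
% TD(0) with linear features \phi(s) and step size \alpha, viewed as a
% stochastic discrete-time linear system (standard notation, assumed here).
\begin{aligned}
\theta_{k+1} &= \theta_k
  + \alpha\,\bigl(r_k + \gamma\,\phi(s_{k+1})^{\top}\theta_k
  - \phi(s_k)^{\top}\theta_k\bigr)\,\phi(s_k) \\
             &= \theta_k + \alpha\,(A\,\theta_k + b) + \alpha\, w_k,
\qquad
A = \mathbb{E}\bigl[\phi(s)\,\bigl(\gamma\,\phi(s') - \phi(s)\bigr)^{\top}\bigr],
\quad
b = \mathbb{E}\bigl[r\,\phi(s)\bigr],
\end{aligned}
```

where w_k is a zero-mean (Markovian) noise term; finite-time error bounds then follow from the stability of the deterministic recursion theta_{k+1} = (I + alpha A) theta_k + alpha b.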
arXiv Detail & Related papers (2022-04-22T03:21:30Z)
- Control Theoretic Analysis of Temporal Difference Learning [7.191780076353627]
TD-learning serves as a cornerstone in the realm of reinforcement learning.
We introduce a finite-time, control-theoretic framework for analyzing TD-learning.
arXiv Detail & Related papers (2021-12-29T06:43:29Z)
- Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning [90.59143158534849]
The recent emergence of reinforcement learning has created a demand for robust statistical inference methods.
Existing methods for statistical inference in online learning are restricted to settings involving independently sampled observations.
The online bootstrap is a flexible and efficient approach for statistical inference in linear approximation algorithms, but its efficacy in settings involving Markov noise has yet to be explored.
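The general idea of an online (multiplier) bootstrap for TD policy evaluation can be sketched as follows: alongside the ordinary iterate, keep B perturbed copies whose updates are rescaled by i.i.d. positive weights with unit mean, and read confidence intervals off their spread. This is a generic illustration of the technique, not the paper's exact estimator or its analysis under Markov noise; the feature map, weight distribution, and hyperparameters are assumptions.

```python
# Generic sketch of online bootstrap inference around a TD(0) iterate.
import numpy as np

def td0_with_online_bootstrap(phi, samples, dim, B=200, gamma=0.99, alpha=0.01, seed=0):
    """phi: feature map, state -> np.ndarray of length dim.
    samples: iterable of (s, r, s_next) transitions."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)                 # ordinary TD(0) iterate
    boot = np.zeros((B, dim))             # B bootstrap copies
    for s, r, s_next in samples:
        f, f_next = phi(s), phi(s_next)
        delta = r + gamma * f_next @ theta - f @ theta
        theta = theta + alpha * delta * f
        # Each copy re-weights the same update by a random multiplier W >= 0
        # with E[W] = 1 (here: Exponential(1)).
        w = rng.exponential(1.0, size=B)
        delta_b = r + gamma * boot @ f_next - boot @ f            # per-copy TD errors
        boot = boot + alpha * (w * delta_b)[:, None] * f[None, :]
    # Per-coordinate 95% confidence intervals from the bootstrap spread.
    lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
    return theta, (lo, hi)
```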
arXiv Detail & Related papers (2021-08-08T18:26:35Z)
- Towards Continual Reinforcement Learning: A Review and Perspectives [69.48324517535549]
We aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL).
While still in its early days, the study of continual RL promises to yield better incremental reinforcement learners.
Such learners have potential applications in fields such as healthcare, education, logistics, and robotics.
arXiv Detail & Related papers (2020-12-25T02:35:27Z)
- Heterogeneous Knowledge Distillation using Information Flow Modeling [82.83891707250926]
We propose a novel KD method that works by modeling the information flow through the various layers of the teacher model.
The proposed method overcomes the limitations of existing KD approaches by using an appropriate supervision scheme during the different phases of the training process.
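For contrast, the classical soft-target distillation loss that such methods build on can be sketched as follows. This is the standard KD baseline, not the information-flow method proposed in the paper; the temperature and mixing weight are illustrative.

```python
# Classical soft-target knowledge-distillation loss: cross-entropy on hard
# labels plus KL(teacher || student) on temperature-softened outputs.
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, lam=0.5):
    """T is the softening temperature; lam balances the hard-label and KD terms."""
    p_t = softmax(teacher_logits, T)
    log_p_s = np.log(softmax(student_logits, T) + 1e-12)
    kd = (T ** 2) * np.mean(np.sum(p_t * (np.log(p_t + 1e-12) - log_p_s), axis=-1))
    log_p_hard = np.log(softmax(student_logits) + 1e-12)
    ce = -np.mean(log_p_hard[np.arange(len(labels)), labels])
    return (1 - lam) * ce + lam * kd
```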
arXiv Detail & Related papers (2020-05-02T06:56:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.