A Unifying Framework for Reinforcement Learning and Planning
- URL: http://arxiv.org/abs/2006.15009v4
- Date: Thu, 31 Mar 2022 08:06:35 GMT
- Title: A Unifying Framework for Reinforcement Learning and Planning
- Authors: Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
- Abstract summary: This paper presents a unifying algorithmic framework for reinforcement learning and planning (FRAP).
At the end of the paper, we compare a variety of well-known planning, model-free and model-based RL algorithms along these dimensions.
- Score: 2.564530030795554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequential decision making, commonly formalized as optimization of a Markov
Decision Process, is a key challenge in artificial intelligence. Two successful
approaches to MDP optimization are reinforcement learning and planning, which
both largely have their own research communities. However, if both research
fields solve the same problem, then we might be able to disentangle the common
factors in their solution approaches. Therefore, this paper presents a unifying
algorithmic framework for reinforcement learning and planning (FRAP), which
identifies underlying dimensions on which MDP planning and learning algorithms
have to decide. At the end of the paper, we compare a variety of well-known
planning, model-free and model-based RL algorithms along these dimensions.
Altogether, the framework may help provide deeper insight into the algorithmic
design space of planning and reinforcement learning.
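Both fields ultimately optimize the same MDP objective: a policy that maximizes expected cumulative (discounted) reward. As a minimal illustrative sketch, not taken from the paper, the Python snippet below runs value iteration, a classic dynamic-programming planning method, on a hypothetical toy tabular MDP; model-free RL methods such as Q-learning estimate the same optimal values from sampled transitions instead of a known model. All names and numbers in the sketch are assumptions for illustration.

```python
# Minimal sketch (not from the paper): value iteration on a hypothetical toy MDP.
# Planning assumes the transition model P and reward R are known; model-free RL
# estimates the same optimal values from sampled transitions instead.
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9

# Toy model (made-up numbers): P[s, a, s'] transition probabilities, R[s, a] rewards.
P = np.zeros((n_states, n_actions, n_states))
P[0, 0, 1] = 1.0
P[0, 1, 2] = 1.0
P[1, :, 0] = 1.0
P[2, :, 2] = 1.0                      # state 2 is absorbing
R = np.array([[0.0, 1.0],
              [5.0, 5.0],
              [0.0, 0.0]])

V = np.zeros(n_states)
for _ in range(500):
    Q = R + gamma * (P @ V)           # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s']
    V_new = Q.max(axis=1)             # Bellman optimality backup
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)             # greedy policy with respect to the converged values
print("V*:", V.round(3), "policy:", policy)
```

The same Bellman backup appears, in sampled form, in temporal-difference learning; whether the backup uses a known model or sampled transitions is one example of the kind of design choice such a framework can organize.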
Related papers
- A Survey of Contextual Optimization Methods for Decision Making under
Uncertainty [47.73071218563257]
This review article identifies three main frameworks for learning policies from data and discusses their strengths and limitations.
We present the existing models and methods under a uniform notation and terminology and classify them according to the three main frameworks.
arXiv Detail & Related papers (2023-06-17T15:21:02Z)
- The Statistical Complexity of Interactive Decision Making [126.04974881555094]
We provide a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning.
A unified algorithm design principle, Estimation-to-Decisions (E2D), transforms any algorithm for supervised estimation into an online algorithm for decision making.
arXiv Detail & Related papers (2021-12-27T02:53:44Z)
- Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning [52.74071439183113]
We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) solved via reinforcement learning.
Two significant computational challenges arise in applying decision-focused learning to MDPs.
arXiv Detail & Related papers (2021-06-06T23:53:31Z)
- A Two-stage Framework and Reinforcement Learning-based Optimization Algorithms for Complex Scheduling Problems [54.61091936472494]
We develop a two-stage framework in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined.
The scheduling problem is solved in two stages: a finite Markov decision process (MDP) followed by a mixed-integer programming process.
Results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems.
arXiv Detail & Related papers (2021-03-10T03:16:12Z)
- Decision-Making Algorithms for Learning and Adaptation with Application to COVID-19 Data [46.71828464689144]
This work focuses on the development of a new family of decision-making algorithms for adaptation and learning.
A key observation is that estimation and decision problems are structurally different and, therefore, algorithms that have proven successful for the former need not perform well when adjusted for decision problems.
arXiv Detail & Related papers (2020-12-14T18:24:45Z)
- Abstract Value Iteration for Hierarchical Reinforcement Learning [23.08652058034536]
We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces.
A key challenge is that the abstract decision process (ADP) may not be Markov, which we address by proposing two algorithms for planning in the ADP.
Our approach outperforms state-of-the-art hierarchical reinforcement learning algorithms on several challenging benchmarks.
arXiv Detail & Related papers (2020-10-29T14:41:42Z)
- Model-based Reinforcement Learning: A Survey [2.564530030795554]
Sequential decision making, commonly formalized as Markov Decision Process (MDP) optimization, is an important challenge in artificial intelligence.
Two key approaches to this problem are reinforcement learning (RL) and planning.
This paper presents a survey of the integration of both fields, better known as model-based reinforcement learning.
arXiv Detail & Related papers (2020-06-30T12:10:07Z)
- Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes [53.72166325215299]
We study minimax optimal reinforcement learning in episodic factored Markov decision processes (FMDPs) and propose two algorithms.
The first achieves minimax optimal regret guarantees for a rich class of factored structures.
The second enjoys better computational complexity at the cost of slightly worse regret (see the illustrative sketch of the factored structure below).
arXiv Detail & Related papers (2020-06-24T00:50:17Z)
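For context on the factored structure these results exploit, a minimal illustrative sketch (not taken from the paper): in a factored MDP the state is a vector of components, and the transition model factorizes into per-component conditionals that each depend only on a small scope of parent components. The scopes, probabilities, and function names below are hypothetical.

```python
# Minimal sketch (not from the paper): the factored transition structure of an FMDP.
# The state is a tuple of binary components; each component's next value depends only
# on a small "scope" of parents, so P(s' | s, a) factorizes into a product of conditionals.
from itertools import product

SCOPES = {0: (0,), 1: (0, 1), 2: (2,)}   # hypothetical parent components per state factor

def component_prob(i, next_value, parents, action):
    """Toy conditional P(s'_i = next_value | s[scope_i], a); the numbers are made up."""
    p_one = 0.2 + 0.6 * (sum(parents) + action) / (len(parents) + 1)
    return p_one if next_value == 1 else 1.0 - p_one

def transition_prob(state, action, next_state):
    """Full transition probability as a product of per-component conditionals."""
    prob = 1.0
    for i, next_value in enumerate(next_state):
        parents = tuple(state[j] for j in SCOPES[i])
        prob *= component_prob(i, next_value, parents, action)
    return prob

state = (1, 0, 1)
total = sum(transition_prob(state, 1, nxt) for nxt in product((0, 1), repeat=3))
print("probabilities sum to one:", round(total, 6))
```

Because each conditional depends on only a few parents, such a model needs far fewer parameters than a flat transition table; this is the structure that regret analyses for factored MDPs exploit.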
This list is automatically generated from the titles and abstracts of the papers in this site.