A Unifying Framework for Reinforcement Learning and Planning
- URL: http://arxiv.org/abs/2006.15009v4
- Date: Thu, 31 Mar 2022 08:06:35 GMT
- Title: A Unifying Framework for Reinforcement Learning and Planning
- Authors: Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
- Abstract summary: This paper presents a unifying algorithmic framework for reinforcement learning and planning (FRAP).
At the end of the paper, we compare a variety of well-known planning, model-free and model-based RL algorithms along these dimensions.
- Score: 2.564530030795554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequential decision making, commonly formalized as optimization of a Markov
Decision Process, is a key challenge in artificial intelligence. Two successful
approaches to MDP optimization are reinforcement learning and planning, which
both largely have their own research communities. However, if both research
fields solve the same problem, then we might be able to disentangle the common
factors in their solution approaches. Therefore, this paper presents a unifying
algorithmic framework for reinforcement learning and planning (FRAP), which
identifies underlying dimensions on which MDP planning and learning algorithms
have to decide. At the end of the paper, we compare a variety of well-known
planning, model-free and model-based RL algorithms along these dimensions.
Altogether, the framework may help provide deeper insight into the algorithmic
design space of planning and reinforcement learning.
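Both fields ultimately optimize the same MDP objective: a policy that maximizes expected cumulative (discounted) reward. As a minimal illustrative sketch, not taken from the paper, the Python snippet below runs value iteration, a classic dynamic-programming planning method, on a hypothetical toy tabular MDP; model-free RL methods such as Q-learning estimate the same optimal values from sampled transitions instead of a known model. All names and numbers in the sketch are assumptions for illustration.

```python
# Minimal sketch (not from the paper): value iteration on a hypothetical toy MDP.
# Planning assumes the transition model P and reward R are known; model-free RL
# estimates the same optimal values from sampled transitions instead.
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9

# Toy model (made-up numbers): P[s, a, s'] transition probabilities, R[s, a] rewards.
P = np.zeros((n_states, n_actions, n_states))
P[0, 0, 1] = 1.0
P[0, 1, 2] = 1.0
P[1, :, 0] = 1.0
P[2, :, 2] = 1.0                      # state 2 is absorbing
R = np.array([[0.0, 1.0],
              [5.0, 5.0],
              [0.0, 0.0]])

V = np.zeros(n_states)
for _ in range(500):
    Q = R + gamma * (P @ V)           # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s']
    V_new = Q.max(axis=1)             # Bellman optimality backup
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)             # greedy policy with respect to the converged values
print("V*:", V.round(3), "policy:", policy)
```

The same Bellman backup appears, in sampled form, in temporal-difference learning; whether the backup uses a known model or sampled transitions is one example of the kind of design choice such a framework can organize.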
Related papers
- A Survey of Contextual Optimization Methods for Decision Making under
Uncertainty [47.73071218563257]
This review article identifies three main frameworks for learning policies from data and discusses their strengths and limitations.
We present the existing models and methods under a uniform notation and terminology and classify them according to the three main frameworks.
arXiv Detail & Related papers (2023-06-17T15:21:02Z)
- The Statistical Complexity of Interactive Decision Making [126.04974881555094]
We provide a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning.
A unified algorithm design principle, Estimation-to-Decisions (E2D), transforms any algorithm for supervised estimation into an online algorithm for decision making.
arXiv Detail & Related papers (2021-12-27T02:53:44Z)
- Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning [52.74071439183113]
We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) solved via reinforcement learning.
Two significant computational challenges arise in applying decision-focused learning to MDPs.
arXiv Detail & Related papers (2021-06-06T23:53:31Z)
- A Two-stage Framework and Reinforcement Learning-based Optimization Algorithms for Complex Scheduling Problems [54.61091936472494]
We develop a two-stage framework in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined.
The scheduling problem is solved in two stages: a finite Markov decision process (MDP) followed by a mixed-integer programming process.
Results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems.
arXiv Detail & Related papers (2021-03-10T03:16:12Z)
- Decision-Making Algorithms for Learning and Adaptation with Application to COVID-19 Data [46.71828464689144]
This work focuses on the development of a new family of decision-making algorithms for adaptation and learning.
A key observation is that estimation and decision problems are structurally different and, therefore, algorithms that have proven successful for the former need not perform well when adjusted for decision problems.
arXiv Detail & Related papers (2020-12-14T18:24:45Z)
- Abstract Value Iteration for Hierarchical Reinforcement Learning [23.08652058034536]
We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces.
A key challenge is that the abstract decision process (ADP) may not be Markov, which we address by proposing two algorithms for planning in the ADP.
Our approach outperforms state-of-the-art hierarchical reinforcement learning algorithms on several challenging benchmarks.
arXiv Detail & Related papers (2020-10-29T14:41:42Z)
- Model-based Reinforcement Learning: A Survey [2.564530030795554]
Sequential decision making, commonly formalized as Markov Decision Process (MDP) optimization, is an important challenge in artificial intelligence.
Two key approaches to this problem are reinforcement learning (RL) and planning.
This paper presents a survey of the integration of both fields, better known as model-based reinforcement learning.
arXiv Detail & Related papers (2020-06-30T12:10:07Z)
- Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes [53.72166325215299]
We study minimax optimal reinforcement learning in episodic factored Markov decision processes (FMDPs) and propose two algorithms.
The first achieves minimax optimal regret guarantees for a rich class of factored structures.
The second enjoys better computational complexity at the cost of slightly worse regret (see the illustrative sketch of the factored structure below).
arXiv Detail & Related papers (2020-06-24T00:50:17Z)
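For context on the factored structure these results exploit, a minimal illustrative sketch (not taken from the paper): in a factored MDP the state is a vector of components, and the transition model factorizes into per-component conditionals that each depend only on a small scope of parent components. The scopes, probabilities, and function names below are hypothetical.

```python
# Minimal sketch (not from the paper): the factored transition structure of an FMDP.
# The state is a tuple of binary components; each component's next value depends only
# on a small "scope" of parents, so P(s' | s, a) factorizes into a product of conditionals.
from itertools import product

SCOPES = {0: (0,), 1: (0, 1), 2: (2,)}   # hypothetical parent components per state factor

def component_prob(i, next_value, parents, action):
    """Toy conditional P(s'_i = next_value | s[scope_i], a); the numbers are made up."""
    p_one = 0.2 + 0.6 * (sum(parents) + action) / (len(parents) + 1)
    return p_one if next_value == 1 else 1.0 - p_one

def transition_prob(state, action, next_state):
    """Full transition probability as a product of per-component conditionals."""
    prob = 1.0
    for i, next_value in enumerate(next_state):
        parents = tuple(state[j] for j in SCOPES[i])
        prob *= component_prob(i, next_value, parents, action)
    return prob

state = (1, 0, 1)
total = sum(transition_prob(state, 1, nxt) for nxt in product((0, 1), repeat=3))
print("probabilities sum to one:", round(total, 6))
```

Because each conditional depends on only a few parents, such a model needs far fewer parameters than a flat transition table; this is the structure that regret analyses for factored MDPs exploit.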
This list is automatically generated from the titles and abstracts of the papers in this site.