Compositional Reinforcement Learning from Logical Specifications
- URL: http://arxiv.org/abs/2106.13906v1
- Date: Fri, 25 Jun 2021 22:54:28 GMT
- Title: Compositional Reinforcement Learning from Logical Specifications
- Authors: Kishor Jothimurugan, Suguman Bansal, Osbert Bastani and Rajeev Alur
- Abstract summary: Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy.
We develop a compositional learning approach, called DiRL, that interleaves high-level planning and reinforcement learning.
Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph.
- Score: 21.193231846438895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning control policies for complex tasks given by
logical specifications. Recent approaches automatically generate a reward
function from a given specification and use a suitable reinforcement learning
algorithm to learn a policy that maximizes the expected reward. These
approaches, however, scale poorly to complex tasks that require high-level
planning. In this work, we develop a compositional learning approach, called
DiRL, that interleaves high-level planning and reinforcement learning. First,
DiRL encodes the specification as an abstract graph; intuitively, vertices and
edges of the graph correspond to regions of the state space and simpler
sub-tasks, respectively. Our approach then incorporates reinforcement learning
to learn neural network policies for each edge (sub-task) within a
Dijkstra-style planning algorithm to compute a high-level plan in the graph. An
evaluation of the proposed approach on a set of challenging control benchmarks
with continuous state and action spaces demonstrates that it outperforms
state-of-the-art baselines.
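To make the planning/learning interface concrete, here is a minimal Python sketch of the Dijkstra-style high-level planning step, under assumptions of ours rather than the paper's code: the `edge_success_prob` callback is a hypothetical stand-in for however one estimates the success probability of each edge's learned sub-task policy, and minimizing summed -log probabilities makes the shortest path the plan with the highest product of sub-task success probabilities.
```python
import heapq
import math

def dijkstra_plan(graph, source, target, edge_success_prob):
    """Dijkstra-style search over an abstract graph.

    graph: dict mapping vertex -> iterable of successor vertices.
    edge_success_prob(u, v): estimated probability that the learned
    policy for edge (u, v) completes its sub-task (a stand-in here).
    Minimizing the sum of -log probabilities maximizes the product of
    per-edge success probabilities along the plan.
    """
    dist = {source: 0.0}
    parent = {}
    frontier = [(0.0, source)]
    done = set()
    while frontier:
        d, u = heapq.heappop(frontier)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v in graph.get(u, ()):
            p = edge_success_prob(u, v)
            if p <= 0.0:
                continue  # edge policy deemed hopeless; skip
            nd = d - math.log(p)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(frontier, (nd, v))
    if target not in dist:
        return None  # no feasible high-level plan found
    plan, node = [], target
    while node != source:
        plan.append((parent[node], node))
        node = parent[node]
    return plan[::-1]  # sequence of edges (sub-tasks) to execute
```
A full DiRL-style loop would interleave this search with reinforcement learning on the edge policies, re-running the planner as the success-probability estimates improve.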
Related papers
- Can Graph Learning Improve Planning in LLM-based Agents? [61.47027387839096]
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs).
In this paper, we explore graph learning-based methods for task planning, a direction orthogonal to the prevalent focus on prompt design.
Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs.
arXiv Detail & Related papers (2024-05-29T14:26:24Z) - Learning Logic Specifications for Policy Guidance in POMDPs: an
Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics, within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z) - Learning High-level Semantic-Relational Concepts for SLAM [10.528810470934781]
We propose an algorithm for learning high-level semantic-relational concepts that can be inferred from the low-level factor graph.
We validate our method in both simulated and real datasets demonstrating improved performance over two baseline approaches.
arXiv Detail & Related papers (2023-09-30T14:54:31Z) - Feudal Graph Reinforcement Learning [18.069747511100132]
Graph-based representations and message-passing modular policies constitute prominent approaches to tackling composable control problems in reinforcement learning (RL).
We propose a novel methodology, named Feudal Graph Reinforcement Learning (FGRL), that addresses such challenges by relying on hierarchical RL and a pyramidal message-passing architecture.
In particular, FGRL defines a hierarchy of policies where high-level commands are propagated from the top of the hierarchy down through a layered graph structure.
arXiv Detail & Related papers (2023-04-11T09:51:13Z) - Imitating Graph-Based Planning with Goal-Conditioned Policies [72.61631088613048]
We present a self-imitation scheme which distills a subgoal-conditioned policy into the target-goal-conditioned policy.
We empirically show that our method can significantly boost the sample-efficiency of the existing goal-conditioned RL methods.
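A rough PyTorch sketch of what such a distillation step could look like, under our own assumptions (a discrete action space, and a `goal_policy(states, goal)` callable returning logits; none of these names come from the paper):
```python
import torch.nn.functional as F

def distill_step(goal_policy, optimizer, states, actions, final_goal):
    """Self-imitation as behavior cloning (hypothetical sketch).

    states:  (B, state_dim) tensor of states visited while a
             subgoal-conditioned policy followed a planned subgoal path.
    actions: (B,) tensor of the actions that policy took.
    The target-goal-conditioned policy learns to reproduce those actions
    conditioned only on the final goal, distilling the planner away.
    """
    logits = goal_policy(states, final_goal)  # assumed to return (B, num_actions)
    loss = F.cross_entropy(logits, actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```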
arXiv Detail & Related papers (2023-03-20T14:51:10Z) - Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories.
We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning.
In experiments, the algorithm excels at solving complex, long-horizon decision-making problems, outperforming state-of-the-art methods.
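The vector-quantization step at the heart of this idea can be illustrated with a small NumPy stand-in (hypothetical, not the paper's model): each continuous subgoal embedding is snapped to its nearest entry in a learned codebook, yielding a discrete subgoal vocabulary over which subgoal-level planning can search.
```python
import numpy as np

def quantize_subgoal(subgoal, codebook):
    """Nearest-neighbor lookup in a learned codebook (hypothetical sketch).

    subgoal:  (D,) continuous subgoal embedding.
    codebook: (K, D) array of learned code vectors.
    Returns the discrete code index and its vector, so planning can
    operate over K discrete subgoals instead of a continuous space.
    """
    dists = np.linalg.norm(codebook - subgoal, axis=1)
    k = int(np.argmin(dists))
    return k, codebook[k]
```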
arXiv Detail & Related papers (2023-01-30T15:04:39Z) - Goal Agnostic Planning using Maximum Likelihood Paths in Hypergraph
World Models [1.370633147306388]
We present a hypergraph-based machine learning algorithm, a data-structure-driven maintenance method, and a planning algorithm based on a probabilistic application of Dijkstra's algorithm.
We prove that the algorithm determines optimal solutions within the problem space, mathematically bound its learning performance, and supply a mathematical model analyzing system state progression through time.
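A count-based sketch of how such a maintained world model can feed a probabilistic Dijkstra search, assuming discrete states and actions (our illustration, not the paper's data structure): empirical transition frequencies supply -log-probability edge weights, so a shortest path is a maximum-likelihood plan, as in the DiRL sketch above.
```python
from collections import defaultdict
import math

class TransitionModel:
    """Empirical estimate of P(next_state | state, action) from observations."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, state, action, next_state):
        self.counts[(state, action)][next_state] += 1

    def neg_log_prob(self, state, action, next_state):
        """Edge weight for maximum-likelihood planning via Dijkstra."""
        bucket = self.counts[(state, action)]
        total = sum(bucket.values())
        if total == 0 or bucket[next_state] == 0:
            return math.inf  # unobserved transition
        return -math.log(bucket[next_state] / total)
```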
arXiv Detail & Related papers (2021-10-18T16:22:33Z) - Abstract Value Iteration for Hierarchical Reinforcement Learning [23.08652058034536]
We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces.
A key challenge is that the abstract decision process (ADP) may not be Markov, which we address by proposing two algorithms for planning in the ADP.
Our approach outperforms state-of-the-art hierarchical reinforcement learning algorithms on several challenging benchmarks.
arXiv Detail & Related papers (2020-10-29T14:41:42Z) - Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning [78.65083326918351]
We consider alternatives to an implicit sequential planning assumption.
We propose Divide-and-Conquer Monte Carlo Tree Search (DC-MCTS) for approximating the optimal plan.
We show that this algorithmic flexibility over planning order leads to improved results in navigation tasks in grid-worlds.
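The divide-and-conquer recursion can be pictured with a greedy stand-in (DC-MCTS itself drives the choice of split points with Monte Carlo Tree Search; `propose_midpoints` and `is_feasible` below are hypothetical helpers): instead of extending a plan left to right, pick a midpoint subgoal and recursively solve both halves.
```python
def dc_plan(start, goal, propose_midpoints, is_feasible, depth=0, max_depth=8):
    """Greedy divide-and-conquer planning sketch (DC-MCTS uses tree search).

    propose_midpoints(start, goal): candidate subgoals splitting the task,
        e.g. samples from a learned proposal network (hypothetical).
    is_feasible(start, goal): whether a low-level policy can bridge the
        two states directly (hypothetical).
    Returns a list of waypoints from start to goal, or None on failure.
    """
    if is_feasible(start, goal):
        return [start, goal]
    if depth >= max_depth:
        return None
    for mid in propose_midpoints(start, goal):
        left = dc_plan(start, mid, propose_midpoints, is_feasible, depth + 1, max_depth)
        if left is None:
            continue
        right = dc_plan(mid, goal, propose_midpoints, is_feasible, depth + 1, max_depth)
        if right is None:
            continue
        return left[:-1] + right  # join halves, dropping the duplicated midpoint
    return None
```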
arXiv Detail & Related papers (2020-04-23T18:08:58Z) - Graph Ordering: Towards the Optimal by Learning [69.72656588714155]
Graph representation learning has achieved remarkable success in many graph-based applications, such as node classification, link prediction, and community detection.
However, for some kinds of graph applications, such as graph compression and edge partition, it is very hard to reduce them to graph representation learning tasks.
In this paper, we propose to attack the graph ordering problem behind such applications by a novel learning approach.
arXiv Detail & Related papers (2020-01-18T09:14:16Z)