Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning
- URL: http://arxiv.org/abs/2004.11410v1
- Date: Thu, 23 Apr 2020 18:08:58 GMT
- Title: Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning
- Authors: Giambattista Parascandolo, Lars Buesing, Josh Merel, Leonard
Hasenclever, John Aslanides, Jessica B. Hamrick, Nicolas Heess, Alexander
Neitz, Theophane Weber
- Abstract summary: We consider alternatives to an implicit sequential planning assumption.
We propose Divide-and-Conquer Monte Carlo Tree Search (DC-MCTS) for approximating the optimal plan.
We show that this algorithmic flexibility over planning order leads to improved results in navigation tasks in grid-worlds.
- Score: 78.65083326918351
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Standard planners for sequential decision making (including Monte Carlo
planning, tree search, dynamic programming, etc.) are constrained by an
implicit sequential planning assumption: The order in which a plan is
constructed is the same in which it is executed. We consider alternatives to
this assumption for the class of goal-directed Reinforcement Learning (RL)
problems. Instead of an environment transition model, we assume an imperfect,
goal-directed policy. This low-level policy can be improved by a plan,
consisting of an appropriate sequence of sub-goals that guide it from the start
to the goal state. We propose a planning algorithm, Divide-and-Conquer Monte
Carlo Tree Search (DC-MCTS), for approximating the optimal plan by means of
proposing intermediate sub-goals which hierarchically partition the initial
tasks into simpler ones that are then solved independently and recursively. The
algorithm critically makes use of a learned sub-goal proposal for finding
appropriate partition trees of new tasks based on prior experience. Different
strategies for learning sub-goal proposals give rise to different planning
strategies that strictly generalize sequential planning. We show that this
algorithmic flexibility over planning order leads to improved results in
navigation tasks in grid-worlds as well as in challenging continuous control
environments.
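The recursion at the heart of DC-MCTS (split a task at a proposed sub-goal, then solve the two halves independently and recursively) can be sketched as follows. This is an illustrative simplification, not the paper's implementation: `can_reach` and `propose_subgoals` are hypothetical stand-ins for the learned value estimate and the learned sub-goal proposal network, and the search here is a plain depth-limited recursion rather than MCTS over partition trees.

```python
def divide_and_conquer_plan(start, goal, can_reach, propose_subgoals,
                            depth=0, max_depth=4):
    """Return a list of sub-goals guiding a low-level policy from start to goal.

    can_reach(a, b)        -> bool: can the low-level policy go from a to b directly?
    propose_subgoals(a, b) -> iterable of candidate intermediate states.
    Both are placeholders for the learned components in DC-MCTS.
    """
    if can_reach(start, goal):
        return [goal]                  # base case: the low-level policy solves it
    if depth >= max_depth:
        return None                    # give up on this branch of the partition tree
    for mid in propose_subgoals(start, goal):
        # divide: plan start -> mid and mid -> goal independently
        left = divide_and_conquer_plan(start, mid, can_reach, propose_subgoals,
                                       depth + 1, max_depth)
        if left is None:
            continue
        right = divide_and_conquer_plan(mid, goal, can_reach, propose_subgoals,
                                        depth + 1, max_depth)
        if right is not None:
            return left + right        # conquer: concatenate the two half-plans
    return None


# Toy usage: a number line where the low-level policy can only move one step.
can_reach = lambda a, b: abs(a - b) <= 1
propose_subgoals = lambda a, b: [(a + b) // 2]  # always propose the midpoint
plan = divide_and_conquer_plan(0, 8, can_reach, propose_subgoals)
# plan lists the sub-goals in execution order, read off the partition tree in-order
```

Note that the plan is assembled in divide-and-conquer order, not left to right: the sub-goal at the root of the partition tree (here, 4) is committed to before any of the earlier steps, which is exactly the departure from the sequential planning assumption that the abstract describes.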
Related papers
- Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving the planning capabilities of large language models (LLMs).
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z)
- Learning adaptive planning representations with natural language guidance [90.24449752926866]
This paper describes Ada, a framework for automatically constructing task-specific planning representations.
Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks.
arXiv Detail & Related papers (2023-12-13T23:35:31Z)
- Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty [56.30846158280031]
Task planning for embodied AI has been one of the most challenging problems.
We propose a task-agnostic method named 'planning as in-painting'.
The proposed framework achieves promising performance in various embodied AI tasks.
arXiv Detail & Related papers (2023-12-02T10:07:17Z)
- Lifted Sequential Planning with Lazy Constraint Generation Solvers [28.405198103927955]
This paper studies the possibilities opened up by the use of Lazy Clause Generation (LCG) based approaches to Constraint Programming (CP).
We propose a novel CP model based on seminal ideas on so-called lifted causal encodings for planning as satisfiability.
We report that for planning problem instances requiring fewer plan steps our methods compare very well with the state of the art in optimal sequential planning.
arXiv Detail & Related papers (2023-07-17T04:54:58Z)
- Imitating Graph-Based Planning with Goal-Conditioned Policies [72.61631088613048]
We present a self-imitation scheme which distills a subgoal-conditioned policy into the target-goal-conditioned policy.
We empirically show that our method can significantly boost the sample efficiency of existing goal-conditioned RL methods.
arXiv Detail & Related papers (2023-03-20T14:51:10Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- Visual scoping operations for physical assembly [0.0]
We propose visual scoping, a strategy that interleaves planning and acting by alternately defining a spatial region as the next subgoal.
We find that visual scoping achieves comparable task performance to the subgoal planner while requiring only a fraction of the total computational cost.
arXiv Detail & Related papers (2021-06-10T10:50:35Z)
- Extended Task and Motion Planning of Long-horizon Robot Manipulation [28.951816622135922]
Task and Motion Planning (TAMP) requires the integration of symbolic reasoning with metric motion planning.
Most TAMP approaches fail to provide feasible solutions when there is missing knowledge about the environment at the symbolic level.
We propose a novel approach for decision-making on extended decision spaces over plan skeletons and action parameters.
arXiv Detail & Related papers (2021-03-09T14:44:08Z)
- Robust Hierarchical Planning with Policy Delegation [6.1678491628787455]
We propose a novel framework and algorithm for hierarchical planning based on the principle of delegation.
We show this planning approach is experimentally very competitive with classic planning and reinforcement learning techniques on a variety of domains.
arXiv Detail & Related papers (2020-10-25T04:36:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.