Graph Value Iteration
- URL: http://arxiv.org/abs/2209.09608v1
- Date: Tue, 20 Sep 2022 10:45:03 GMT
- Title: Graph Value Iteration
- Authors: Dieqiao Feng, Carla P. Gomes, Bart Selman
- Abstract summary: deep Reinforcement Learning (RL) has been successful in various search domains, such as two-player games and scientific discovery.
One major difficulty is that without a human-crafted function, reward signals remain zero unless the learning framework discovers any solution plan.
We propose a domain-independent method that augments graph search with graph value iteration to solve hard planning instances.
- Score: 35.87805182676444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, deep Reinforcement Learning (RL) has been successful in
various combinatorial search domains, such as two-player games and scientific
discovery. However, directly applying deep RL in planning domains is still
challenging. One major difficulty is that without a human-crafted heuristic
function, reward signals remain zero unless the learning framework discovers
any solution plan. Search space becomes \emph{exponentially larger} as the
minimum length of plans grows, which is a serious limitation for planning
instances with a minimum plan length of hundreds to thousands of steps.
Previous learning frameworks that augment graph search with deep neural
networks and extra generated subgoals have achieved success in various
challenging planning domains. However, generating useful subgoals requires
extensive domain knowledge. We propose a domain-independent method that
augments graph search with graph value iteration to solve hard planning
instances that are out of reach for domain-specialized solvers. In particular,
instead of receiving learning signals only from discovered plans, our approach
also learns from failed search attempts where no goal state has been reached.
The graph value iteration component can exploit the graph structure of local
search space and provide more informative learning signals. We also show how we
use a curriculum strategy to smooth the learning process and perform a full
analysis of how graph value iteration scales and enables learning.
Related papers
- A Schema-aware Logic Reformulation for Graph Reachability [0.0]
We propose a strategy to automatically exclude and sort certain graph paths by exploiting the higher-level conceptualization of instances.
The aim is to obtain a new first-order logic reformulation of the graph reachability scenario, capable of improving the traditional algorithms in terms of time, space requirements, and number of backtracks.
arXiv Detail & Related papers (2024-10-03T14:39:49Z) - Can Graph Learning Improve Planning in LLM-based Agents? [61.47027387839096]
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs)
In this paper, we explore graph learning-based methods for task planning, a direction that is to the prevalent focus on prompt design.
Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs.
arXiv Detail & Related papers (2024-05-29T14:26:24Z) - Learning Graph Search Heuristics [48.83557172525969]
We present PHIL (Path Heuristic with Imitation Learning), a novel neural architecture and a training algorithm for discovering graph search and navigations from data.
Our function learns graph embeddings useful for inferring node distances, runs in constant time independent of graph sizes, and can be easily incorporated in an algorithm such as A* at test time.
Experiments show that PHIL reduces the number of explored nodes compared to state-of-the-art methods on benchmark datasets by 58.5% on average.
arXiv Detail & Related papers (2022-12-07T22:28:00Z) - Reinforced Continual Learning for Graphs [18.64268861430314]
This paper proposes a graph continual learning strategy that combines the architecture-based and memory-based approaches.
It is numerically validated with several graph continual learning benchmark problems in both task-incremental learning and class-incremental learning settings.
arXiv Detail & Related papers (2022-09-04T07:49:59Z) - C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose the distant goal-reaching task by using search at training time to automatically generate intermediate states.
E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints.
arXiv Detail & Related papers (2021-10-22T22:05:31Z) - Graph Self-Supervised Learning: A Survey [73.86209411547183]
Self-supervised learning (SSL) has become a promising and trending learning paradigm for graph data.
We present a timely and comprehensive review of the existing approaches which employ SSL techniques for graph data.
arXiv Detail & Related papers (2021-02-27T03:04:21Z) - World Model as a Graph: Learning Latent Landmarks for Planning [12.239590266108115]
Planning is a hallmark of human intelligence.
One prominent framework, Model-Based RL, learns a world model and plans using step-by-step virtual rollouts.
We propose to learn graph-structured world models composed of sparse, multi-step transitions.
arXiv Detail & Related papers (2020-11-25T02:49:21Z) - Self-supervised Learning on Graphs: Deep Insights and New Direction [66.78374374440467]
Self-supervised learning (SSL) aims to create domain specific pretext tasks on unlabeled data.
There are increasing interests in generalizing deep learning to the graph domain in the form of graph neural networks (GNNs)
arXiv Detail & Related papers (2020-06-17T20:30:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.