Goal-Space Planning with Subgoal Models
- URL: http://arxiv.org/abs/2206.02902v5
- Date: Tue, 27 Feb 2024 06:15:53 GMT
- Title: Goal-Space Planning with Subgoal Models
- Authors: Chunlok Lo, Kevin Roice, Parham Mohammad Panahi, Scott Jordan, Adam
White, Gabor Mihucz, Farzane Aminmansour, Martha White
- Abstract summary: This paper investigates a new approach to model-based reinforcement learning using background planning.
We show that our GSP algorithm can propagate value from an abstract space in a manner that helps a variety of base learners learn significantly faster in different domains.
- Score: 18.43265820052893
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates a new approach to model-based reinforcement learning
using background planning: mixing (approximate) dynamic programming updates and
model-free updates, similar to the Dyna architecture. Background planning with
learned models is often worse than model-free alternatives, such as Double DQN,
even though the former uses significantly more memory and computation. The
fundamental problem is that learned models can be inaccurate and often generate
invalid states, especially when iterated many steps. In this paper, we avoid
this limitation by constraining background planning to a set of (abstract)
subgoals and learning only local, subgoal-conditioned models. This goal-space
planning (GSP) approach is more computationally efficient, naturally
incorporates temporal abstraction for faster long-horizon planning and avoids
learning the transition dynamics entirely. We show that our GSP algorithm can
propagate value from an abstract space in a manner that helps a variety of base
learners learn significantly faster in different domains.
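To make the idea above concrete, here is a minimal, hypothetical Python sketch of goal-space planning: local subgoal-conditioned models (a reward-to-subgoal model and a discount/duration model) are assumed given, approximate value iteration runs only over the small subgoal space, and the resulting subgoal values are used to bootstrap a base learner's update target. The model names, shapes, and the exact way value is propagated are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

num_subgoals = 5
gamma = 0.99

# Hypothetical *local* subgoal-to-subgoal models (random numbers here,
# learned from data in practice). r_model[g, g2] ~ expected discounted
# reward accumulated while travelling from subgoal g to subgoal g2;
# d_model[g, g2] ~ expected discount (gamma^steps) of that journey,
# which is where temporal abstraction enters.
rng = np.random.default_rng(0)
r_model = rng.uniform(0.0, 1.0, size=(num_subgoals, num_subgoals))
d_model = rng.uniform(0.5, 0.95, size=(num_subgoals, num_subgoals))

# Background planning: approximate value iteration over the small subgoal
# space only -- no full transition dynamics are learned or rolled forward.
v_subgoal = np.zeros(num_subgoals)
for _ in range(100):
    v_subgoal = np.max(r_model + d_model * v_subgoal[None, :], axis=1)

def shaped_target(reward, q_next, r_to_subgoal, d_to_subgoal):
    """Hypothetical base-learner target: bootstrap off whichever is larger,
    the usual model-free estimate or a 'reach a subgoal, then follow the
    abstract plan' estimate derived from v_subgoal."""
    goal_value = np.max(r_to_subgoal + d_to_subgoal * v_subgoal)
    return reward + gamma * max(np.max(q_next), goal_value)

# Example: a transition with reward 0, model-free next-state action values,
# and (hypothetical) local model predictions from the next state to each subgoal.
target = shaped_target(
    reward=0.0,
    q_next=np.array([0.1, 0.3]),
    r_to_subgoal=rng.uniform(0.0, 1.0, size=num_subgoals),
    d_to_subgoal=rng.uniform(0.5, 0.95, size=num_subgoals),
)
```

Because planning happens entirely in the subgoal space, this sketch never iterates a learned transition model over raw states, which is where compounding model error usually arises.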
Related papers
- A New View on Planning in Online Reinforcement Learning [19.35031543927374]
This paper investigates a new approach to model-based reinforcement learning using background planning.
We show that our GSP algorithm can propagate value from an abstract space in a manner that helps a variety of base learners learn significantly faster in different domains.
arXiv Detail & Related papers (2024-06-03T17:45:19Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models.
We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectories, human motion, driving scenes, traffic flow and weather forecasting.
We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
- PDSketch: Integrated Planning Domain Programming and Learning [86.07442931141637]
We present a new domain definition language, named PDSketch.
It allows users to flexibly define high-level structures in the transition models.
Details of the transition model will be filled in by trainable neural networks.
arXiv Detail & Related papers (2023-03-09T18:54:12Z)
- SAGE: Generating Symbolic Goals for Myopic Models in Deep Reinforcement Learning [18.37286885057802]
We propose an algorithm combining learning and planning to exploit a previously unusable class of incomplete models.
This combines the strengths of symbolic planning and neural learning approaches in a novel way that outperforms competing methods on variations of taxi world and Minecraft.
arXiv Detail & Related papers (2022-03-09T22:55:53Z)
- Visual Learning-based Planning for Continuous High-Dimensional POMDPs [81.16442127503517]
Visual Tree Search (VTS) is a learning and planning procedure that combines generative models learned offline with online model-based POMDP planning.
VTS bridges offline model training and online planning by utilizing a set of deep generative observation models to predict and evaluate the likelihood of image observations in a Monte Carlo tree search planner.
We show that VTS is robust to different observation noises and, since it utilizes online, model-based planning, can adapt to different reward structures without the need to re-train.
arXiv Detail & Related papers (2021-12-17T11:53:31Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Model-Based Reinforcement Learning via Latent-Space Collocation [110.04005442935828]
We argue that it is easier to solve long-horizon tasks by planning sequences of states rather than just actions.
We adapt the idea of collocation, which has shown good results on long-horizon tasks in the optimal control literature, to the image-based setting by utilizing learned latent state space models.
arXiv Detail & Related papers (2021-06-24T17:59:18Z)
- World Model as a Graph: Learning Latent Landmarks for Planning [12.239590266108115]
Planning is a hallmark of human intelligence.
One prominent framework, Model-Based RL, learns a world model and plans using step-by-step virtual rollouts.
We propose to learn graph-structured world models composed of sparse, multi-step transitions; a minimal sketch of planning over such a landmark graph appears after this list.
arXiv Detail & Related papers (2020-11-25T02:49:21Z)
- PLOP: Learning without Forgetting for Continual Semantic Segmentation [44.49799311137856]
Continual learning for semantic segmentation (CSS) is an emerging trend that consists of updating an old model by sequentially adding new classes.
In this paper, we propose Local POD, a multi-scale pooling distillation scheme that preserves long- and short-range spatial relationships at feature level.
We also design an entropy-based pseudo-labelling of the background w.r.t. classes predicted by the old model to deal with background shift and avoid catastrophic forgetting of the old classes.
arXiv Detail & Related papers (2020-11-23T13:35:03Z)
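As referenced in the "World Model as a Graph" entry above, one way to plan with a graph-structured world model is to treat learned latent landmarks as nodes, multi-step reachability estimates as edge costs, and run a shortest-path search to choose the next landmark for a goal-conditioned policy. The sketch below is a simplified, hypothetical illustration under those assumptions; the landmark names, edge costs, and helper function are not from the paper.

```python
import heapq

# Hypothetical landmark graph: edge cost ~ estimated number of steps
# needed to travel between two learned landmarks.
landmark_graph = {
    "start":    {"door": 4.0, "corridor": 7.0},
    "door":     {"corridor": 2.0, "key": 3.0},
    "corridor": {"key": 3.0},
    "key":      {"goal": 4.0},
    "goal":     {},
}

def shortest_landmark_path(graph, source, target):
    """Dijkstra over the landmark graph: returns the landmark sequence
    with the lowest total estimated step count."""
    frontier = [(0.0, source, [source])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, step_cost in graph[node].items():
            if nbr not in visited:
                heapq.heappush(frontier, (cost + step_cost, nbr, path + [nbr]))
    return float("inf"), []

cost, plan = shortest_landmark_path(landmark_graph, "start", "goal")
# The landmark after the current one would be handed to a goal-conditioned
# low-level policy as its next subgoal.
print(plan, cost)  # ['start', 'door', 'key', 'goal'] 11.0
```

Dijkstra is used here only because the toy edge costs are nonnegative and hand-specified; the actual method learns the landmarks and their pairwise distances from data rather than assuming them.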