Creativity of AI: Hierarchical Planning Model Learning for Facilitating
Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2112.09836v2
- Date: Fri, 7 Jul 2023 17:09:00 GMT
- Title: Creativity of AI: Hierarchical Planning Model Learning for Facilitating
Deep Reinforcement Learning
- Authors: Hankz Hankui Zhuo, Shuting Deng, Mu Jin, Zhihao Ma, Kebing Jin, Chen
Chen, Chao Yu
- Abstract summary: We introduce a novel deep reinforcement learning framework with symbolic options.
Our framework features a loop training procedure that guides policy improvement.
We conduct experiments on two domains, Montezuma's Revenge and Office World.
- Score: 19.470693909025798
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Despite achieving great success in real-world applications, Deep
Reinforcement Learning (DRL) still suffers from three critical issues:
data inefficiency, lack of interpretability, and lack of transferability. Recent
research shows that embedding symbolic knowledge into DRL is promising in
addressing those challenges. Inspired by this, we introduce a novel deep
reinforcement learning framework with symbolic options. Our framework features
a loop training procedure that guides policy improvement by planning with
planning models (including action models and hierarchical task network models)
and symbolic options, both learned automatically from interactive
trajectories. The learned symbolic options reduce the need for extensive
expert domain knowledge and make policies inherently interpretable.
Moreover, transferability and data efficiency can be further improved by
planning with the symbolic planning models. To validate the effectiveness of
our framework, we conduct experiments on two domains, Montezuma's Revenge and
Office World. The results demonstrate comparable performance together with
improved data efficiency, interpretability, and transferability.
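The symbolic side of the loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `SymbolicOption` class, the `learn_options` and `plan` functions, the fact names, and the toy transitions are all hypothetical, and the deep RL policy that would execute each option is omitted. The sketch only shows how option models with preconditions and effects might be induced from observed transitions and then used by a planner to produce an option sequence.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolicOption:
    """Hypothetical option model: a name, a precondition, and add/delete effects."""
    name: str
    precondition: frozenset
    add_effects: frozenset
    del_effects: frozenset

    def applicable(self, state):
        return self.precondition <= state

    def apply(self, state):
        return (state - self.del_effects) | self.add_effects


def learn_options(transitions):
    """Induce one option per name from (name, before, after) symbolic transitions.

    Toy rule (an assumption, not the paper's learner): the precondition is the
    intersection of all observed before-states; effects are the observed diffs.
    """
    pre, add, dele = {}, {}, {}
    for name, before, after in transitions:
        before, after = frozenset(before), frozenset(after)
        pre[name] = pre[name] & before if name in pre else before
        add[name] = add.get(name, frozenset()) | (after - before)
        dele[name] = dele.get(name, frozenset()) | (before - after)
    return [SymbolicOption(n, pre[n], add[n], dele[n]) for n in pre]


def plan(options, start, goal):
    """Breadth-first search over symbolic states; returns option names or None."""
    start, goal = frozenset(start), frozenset(goal)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:
            return path
        for opt in options:
            if opt.applicable(state):
                nxt = opt.apply(state)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [opt.name]))
    return None


# Made-up transitions loosely inspired by a key-and-door task.
TRANSITIONS = [
    ("get_key", {"at_start"}, {"at_start", "has_key"}),
    ("open_door", {"at_start", "has_key"}, {"at_start", "door_open"}),
]

options = learn_options(TRANSITIONS)
print(plan(options, {"at_start"}, {"door_open"}))  # ['get_key', 'open_door']
```

In a full loop, the returned option sequence would tell the low-level policy which option to train on next, and the trajectories collected while executing it would be fed back into `learn_options` to refine the models.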
Related papers
- A Novel Neural-symbolic System under Statistical Relational Learning [50.747658038910565]
We propose a general bi-level probabilistic graphical reasoning framework called GBPGR.
In GBPGR, the results of symbolic reasoning are utilized to refine and correct the predictions made by the deep learning models.
Our approach achieves high performance and exhibits effective generalization in both transductive and inductive tasks.
arXiv Detail & Related papers (2023-09-16T09:15:37Z)
- Reinforcement Learning in Robotic Motion Planning by Combined Experience-based Planning and Self-Imitation Learning [7.919213739992465]
High-quality and representative data is essential for both Imitation Learning (IL)- and Reinforcement Learning (RL)-based motion planning tasks.
We propose self-imitation learning by planning plus (SILP+) algorithm, which embeds experience-based planning into the learning architecture.
Various experimental results show that SILP+ achieves better training efficiency and a higher, more stable success rate in complex motion planning tasks.
arXiv Detail & Related papers (2023-06-11T19:47:46Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning [2.642698101441705]
Problems which require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents.
We introduce a novel hierarchical reinforcement learning agent which links temporally extended skills for continuous control with a forward model in a symbolic abstraction of the environment's state for planning.
arXiv Detail & Related papers (2022-07-11T17:13:10Z)
- Critic PI2: Master Continuous Planning via Policy Improvement with Path Integrals and Deep Actor-Critic Reinforcement Learning [23.25444331531546]
Tree-based planning methods have enjoyed huge success in discrete domains, such as chess and Go.
In this paper, we present Critic PI2, which combines the benefits from trajectory optimization, deep actor-critic learning, and model-based reinforcement learning.
Our work opens a new direction toward learning the components of a model-based planning system and how to use them.
arXiv Detail & Related papers (2020-11-13T04:14:40Z)
- Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
arXiv Detail & Related papers (2020-10-27T13:17:18Z)
- Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning [72.18725551199842]
We propose a novel model-based reinforcement learning algorithm called BrIdging Reality and Dream (BIRD).
It maximizes the mutual information between imaginary and real trajectories so that the policy improvement learned from imaginary trajectories can be easily generalized to real trajectories.
We demonstrate that our approach improves sample efficiency of model-based planning, and achieves state-of-the-art performance on challenging visual control benchmarks.
arXiv Detail & Related papers (2020-10-23T03:22:01Z)
- Delta Schema Network in Model-based Reinforcement Learning [125.99533416395765]
This work addresses an unresolved problem of Artificial General Intelligence: the inefficiency of transfer learning.
We extend the schema networks method, which allows extracting the logical relationships between objects and actions from environment data.
We present algorithms for training a Delta Schema Network (DSN), predicting future states of the environment, and planning actions that will lead to positive reward.
arXiv Detail & Related papers (2020-06-17T15:58:25Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.