Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph
- URL: http://arxiv.org/abs/2103.01350v1
- Date: Mon, 1 Mar 2021 23:21:46 GMT
- Title: Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph
- Authors: Xin Ye and Yezhou Yang
- Abstract summary: We present a novel two-layer hierarchical learning approach equipped with a Goals Relational Graph (GRG).
Our GRG captures the underlying relations of all goals in the goal space through a Dirichlet-categorical process.
Our experimental results show that our approach exhibits superior generalization on both unseen environments and new goals.
- Score: 21.260858893505183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel two-layer hierarchical reinforcement learning approach
equipped with a Goals Relational Graph (GRG) for tackling the partially
observable goal-driven task, such as goal-driven visual navigation. Our GRG
captures the underlying relations of all goals in the goal space through a
Dirichlet-categorical process that facilitates: 1) the high-level network
raising a sub-goal towards achieving a designated final goal; 2) the low-level
network towards an optimal policy; and 3) the overall system generalizing
to unseen environments and goals. We evaluate our approach with two settings of
partially observable goal-driven tasks -- a grid-world domain and a robotic
object search task. Our experimental results show that our approach exhibits
superior generalization performance on both unseen environments and new goals.
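For intuition, a Dirichlet-categorical process over the goal space can be read as maintaining pseudo-counts of how often goals are observed to be related, with the posterior mean giving the edge weights of the graph. Below is a minimal sketch of that bookkeeping, not the authors' implementation; the class and method names are illustrative.

```python
import numpy as np

class GoalsRelationalGraph:
    """Illustrative Dirichlet-categorical estimate of goal-to-goal relations."""

    def __init__(self, num_goals: int, alpha: float = 1.0):
        # Symmetric Dirichlet prior: alpha pseudo-counts on every edge.
        self.counts = np.full((num_goals, num_goals), alpha, dtype=np.float64)

    def update(self, goal_i: int, goal_j: int) -> None:
        # Observing goals i and j related in experience bumps the count.
        self.counts[goal_i, goal_j] += 1.0

    def edge_probs(self, goal_i: int) -> np.ndarray:
        # Posterior mean of the categorical over goals related to goal_i.
        row = self.counts[goal_i]
        return row / row.sum()

# Hypothetical usage: record related goals from experience, then query the
# graph to propose a sub-goal on the way to a designated final goal.
graph = GoalsRelationalGraph(num_goals=5)
graph.update(0, 2)
graph.update(0, 2)
graph.update(0, 1)
print(graph.edge_probs(0))  # goal 2 is now the most probable relation of goal 0
```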
Related papers
- Imitating Graph-Based Planning with Goal-Conditioned Policies [72.61631088613048]
We present a self-imitation scheme which distills a subgoal-conditioned policy into the target-goal-conditioned policy.
We empirically show that our method can significantly boost the sample-efficiency of the existing goal-conditioned RL methods.
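As a rough sketch of the distillation step, assuming a discrete action space and a policy network that returns action logits (the paper's exact objective may differ):

```python
import torch
import torch.nn.functional as F

def self_imitation_distill_loss(policy, states, subgoals, final_goals):
    """Push pi(a | s, final_goal) toward pi(a | s, subgoal).

    The subgoal-conditioned branch (easier to learn) acts as the teacher
    for the target-goal-conditioned branch. `policy(s, g)` is an assumed
    interface returning action logits.
    """
    with torch.no_grad():
        teacher_logits = policy(states, subgoals)   # subgoal-conditioned
    student_logits = policy(states, final_goals)    # goal-conditioned
    return F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )
```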
arXiv Detail & Related papers (2023-03-20T14:51:10Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally demonstrate improved expected return on out-of-distribution goals, while still allowing goals to be specified with expressive combinatorial structure.
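A common way to realize a discretizing bottleneck is vector quantization with a straight-through gradient; the sketch below assumes that form, and all names are illustrative rather than taken from the paper.

```python
import torch

def discretize_goal(goal_embedding, codebook):
    """Snap each continuous goal embedding to its nearest codebook entry.

    `goal_embedding` is (batch, dim) and `codebook` is (num_codes, dim);
    downstream policies then condition on the discrete goal codes.
    """
    dists = torch.cdist(goal_embedding, codebook)   # (batch, num_codes)
    codes = dists.argmin(dim=-1)                    # discrete goal ids
    quantized = codebook[codes]                     # snapped embeddings
    # Straight-through estimator so gradients still reach the encoder.
    quantized = goal_embedding + (quantized - goal_embedding).detach()
    return quantized, codes
```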
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Goal-Conditioned Q-Learning as Knowledge Distillation [136.79415677706612]
We explore a connection between off-policy reinforcement learning in goal-conditioned settings and knowledge distillation.
We empirically show that this can improve the performance of goal-conditioned off-policy reinforcement learning when the space of goals is high-dimensional.
We also show that this technique can be adapted to allow for efficient learning in the case of multiple simultaneous sparse goals.
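On one reading of this connection, the frozen target network plays the teacher in an ordinary TD update; the hedged sketch below follows that reading, with all batch fields and network interfaces assumed rather than drawn from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_td_loss(q_net, target_q_net, batch, gamma=0.99):
    """Distillation view of goal-conditioned TD learning (sketch).

    The bootstrapped value from the frozen target network is the teacher
    signal that the student Q-network regresses onto.
    """
    s, a, r, s_next, g, done = (batch[k] for k in
                                ("s", "a", "r", "s_next", "g", "done"))
    with torch.no_grad():
        next_q = target_q_net(s_next, g).max(dim=-1).values
        teacher = r + gamma * (1.0 - done) * next_q
    student = q_net(s, g).gather(-1, a.unsqueeze(-1)).squeeze(-1)
    return F.mse_loss(student, teacher)
```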
arXiv Detail & Related papers (2022-08-28T22:01:10Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
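A rough sketch of the execution loop such a method implies, assuming a learned planner `plan_subgoals` that proposes intermediate goals (e.g. in a latent space) and a goal-conditioned policy; every name here is illustrative.

```python
def reach_long_horizon_goal(env, policy, plan_subgoals, final_goal,
                            steps_per_subgoal=50):
    """Chain short goal-conditioned rollouts through planned subgoals."""
    obs = env.reset()
    for subgoal in plan_subgoals(obs, final_goal) + [final_goal]:
        for _ in range(steps_per_subgoal):
            action = policy(obs, subgoal)
            obs, reward, done, info = env.step(action)
            if done:                # episode ended (success or failure)
                return obs
    return obs
```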
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- Visual Goal-Directed Meta-Learning with Contextual Planning Networks [0.0]
We introduce contextual planning networks (CPN) to generalize to new goals and tasks on the first attempt.
We evaluate CPN along with several other approaches adapted for zero-shot goal-directed meta-learning.
arXiv Detail & Related papers (2021-11-18T19:11:01Z)
- Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning [15.33496710690063]
We propose a goal-aware cross-entropy (GACE) loss that can be utilized in a self-supervised way.
We then devise goal-discriminative attention networks (GDAN), which use the goal-relevant information to focus on the given instruction.
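A minimal sketch of the self-supervised signal behind a goal-aware cross-entropy loss: states collected while pursuing goal g are labeled with g itself, so the labels come for free. The classifier interface is an assumption.

```python
import torch.nn.functional as F

def gace_loss(goal_classifier, states, goal_ids):
    """Auxiliary loss: predict which goal a state was collected under.

    Keeping this loss low forces the encoder to retain goal-relevant
    information, which the attention network can then exploit.
    """
    logits = goal_classifier(states)           # (batch, num_goals)
    return F.cross_entropy(logits, goal_ids)   # self-supervised labels
```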
arXiv Detail & Related papers (2021-10-25T14:24:39Z)
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose to solve the distant goal-reaching task by using search at training time to automatically generate a curriculum of intermediate states.
The E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints.
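A minimal sketch of such an E-step, assuming waypoint candidates sampled from a replay buffer and an `edge_cost` reachability estimate (e.g. derived from a goal-conditioned value function); this is not the paper's implementation.

```python
import networkx as nx

def plan_waypoints(states, edge_cost, start_idx, goal_idx):
    """Shortest-path search over buffer states to pick waypoints."""
    graph = nx.DiGraph()
    for i, u in enumerate(states):
        for j, v in enumerate(states):
            if i != j:
                graph.add_edge(i, j, weight=edge_cost(u, v))
    idx_path = nx.shortest_path(graph, start_idx, goal_idx, weight="weight")
    # The M-step would train the policy to reach each waypoint from its
    # predecessor in this sequence.
    return [states[i] for i in idx_path]
```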
arXiv Detail & Related papers (2021-10-22T22:05:31Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
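One way to operationalize this curriculum is to sample training goals in proportion to the disagreement of a value-function ensemble, as in the sketch below; the ensemble interface is an assumption.

```python
import numpy as np

def sample_training_goals(candidate_goals, value_ensemble, state, k=8):
    """Prefer goals where the value ensemble disagrees most.

    High disagreement marks the frontier of the agent's competence:
    neither reliably solved nor hopelessly out of reach.
    """
    values = np.array([[v(state, g) for g in candidate_goals]
                       for v in value_ensemble])   # (ensemble, num_goals)
    disagreement = values.std(axis=0)
    probs = disagreement / disagreement.sum()
    idx = np.random.choice(len(candidate_goals), size=k, p=probs)
    return [candidate_goals[i] for i in idx]
```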
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- Sub-Goal Trees -- a Framework for Goal-Based Reinforcement Learning [20.499747716864686]
Many AI problems, in robotics and other domains, are goal-based, essentially seeking trajectories leading to various goal states.
We propose a new RL framework, derived from a dynamic programming equation for the all pairs shortest path (APSP) problem.
We show that this approach has computational benefits for both standard and approximate dynamic programming.
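The recursion behind this framework picks a midpoint sub-goal and solves both halves, doubling the effective horizon at each tree level. A minimal sketch over a finite state set, with an assumed primitive cost function:

```python
def subgoal_tree_cost(cost, states, i, j, depth):
    """All-pairs-shortest-path-style recursion over midpoints (sketch).

    Returns the best cost from states[i] to states[j] and the chosen
    sub-goals, covering up to 2**depth primitive segments.
    """
    if depth == 0:
        return cost(states[i], states[j]), []
    best, best_path = float("inf"), []
    for m in range(len(states)):
        left, lpath = subgoal_tree_cost(cost, states, i, m, depth - 1)
        right, rpath = subgoal_tree_cost(cost, states, m, j, depth - 1)
        if left + right < best:
            best = left + right
            best_path = lpath + [states[m]] + rpath
    return best, best_path
```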
arXiv Detail & Related papers (2020-02-27T12:32:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.