Understanding and Controlling a Maze-Solving Policy Network
- URL: http://arxiv.org/abs/2310.08043v1
- Date: Thu, 12 Oct 2023 05:33:54 GMT
- Title: Understanding and Controlling a Maze-Solving Policy Network
- Authors: Ulisse Mini, Peli Grietzer, Mrinank Sharma, Austin Meek, Monte
MacDiarmid, Alexander Matt Turner
- Abstract summary: We study a pretrained reinforcement learning policy that solves mazes by navigating to a range of target squares.
We find this network pursues multiple context-dependent goals, and we identify circuits within the network that correspond to one of these goals.
We show that this network contains redundant, distributed, and retargetable goal representations, shedding light on the nature of goal-direction in trained policy networks.
- Score: 44.19448448073822
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To understand the goals and goal representations of AI systems, we carefully
study a pretrained reinforcement learning policy that solves mazes by
navigating to a range of target squares. We find this network pursues multiple
context-dependent goals, and we further identify circuits within the network
that correspond to one of these goals. In particular, we identify eleven
channels that track the location of the goal. By modifying these channels,
either with hand-designed interventions or by combining forward passes, we can
partially control the policy. We show that this network contains redundant,
distributed, and retargetable goal representations, shedding light on the
nature of goal-direction in trained policy networks.
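As a rough illustration of the kind of channel-level intervention the abstract describes, the sketch below patches selected activation channels of a hypothetical convolutional policy during a forward pass. The network, layer choice, channel indices, and target coordinates are illustrative assumptions, not the values reported in the paper.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the pretrained maze-solving policy's conv trunk;
# the real network, layer, and channel indices differ.
policy = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
)

GOAL_CHANNELS = [4, 7, 11]  # illustrative; the paper reports eleven such channels
TARGET_YX = (2, 5)          # maze square we want the policy to treat as the goal

def retarget_goal(module, inputs, output):
    # Zero each goal-tracking channel, then write a peak at the desired
    # square, mimicking a hand-designed "move the goal" intervention.
    patched = output.clone()
    for c in GOAL_CHANNELS:
        patched[:, c] = 0.0
        patched[:, c, TARGET_YX[0], TARGET_YX[1]] = output[:, c].max()
    return patched  # returning a tensor replaces the layer's output

handle = policy[2].register_forward_hook(retarget_goal)
obs = torch.randn(1, 3, 8, 8)  # dummy maze observation
logits = policy(obs)           # forward pass now runs on patched activations
handle.remove()
```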
Related papers
- What Planning Problems Can A Relational Neural Network Solve? [91.53684831950612]
We present a circuit complexity analysis for relational neural networks representing policies for planning problems.
We show that there are three general classes of planning problems, in terms of the growth of circuit width and depth.
We also illustrate the utility of this analysis for designing neural networks for policy learning.
arXiv Detail & Related papers (2023-12-06T18:47:28Z)
- Imitating Graph-Based Planning with Goal-Conditioned Policies [72.61631088613048]
We present a self-imitation scheme which distills a subgoal-conditioned policy into the target-goal-conditioned policy.
We empirically show that our method can significantly boost the sample-efficiency of the existing goal-conditioned RL methods.
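A minimal sketch of what such a distillation step could look like, assuming a toy goal-conditioned policy: the subgoal-conditioned action distribution acts as a frozen teacher for the target-goal-conditioned one. The architecture and data here are placeholders, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GoalConditionedPolicy(nn.Module):
    """Toy policy: action logits from a state and a goal vector."""
    def __init__(self, state_dim=8, goal_dim=8, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )
    def forward(self, state, goal):
        return self.net(torch.cat([state, goal], dim=-1))

policy = GoalConditionedPolicy()
state = torch.randn(32, 8)
subgoal = torch.randn(32, 8)     # nearby subgoal proposed by a planner
final_goal = torch.randn(32, 8)  # distant target goal

# Distillation: make the target-goal-conditioned action distribution match
# the (easier) subgoal-conditioned one, which acts as a frozen teacher here.
with torch.no_grad():
    teacher = F.log_softmax(policy(state, subgoal), dim=-1)
student = F.log_softmax(policy(state, final_goal), dim=-1)
loss = F.kl_div(student, teacher, log_target=True, reduction="batchmean")
loss.backward()
```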
arXiv Detail & Related papers (2023-03-20T14:51:10Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally show improved expected return on out-of-distribution goals, while still allowing goals to be specified with expressive structure.
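One way to picture a discretizing bottleneck, assuming a VQ-style nearest-code lookup over goal embeddings; the codebook size and dimensions are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class GoalDiscretizer(nn.Module):
    """VQ-style bottleneck: snap each goal embedding to its nearest code."""
    def __init__(self, n_codes=64, dim=16):
        super().__init__()
        self.codebook = nn.Parameter(torch.randn(n_codes, dim))

    def forward(self, goal_emb):
        # Pairwise distances to every code, then nearest-neighbour lookup.
        dists = torch.cdist(goal_emb, self.codebook)
        codes = dists.argmin(dim=-1)
        quantized = self.codebook[codes]
        # Straight-through estimator so gradients reach the encoder.
        return goal_emb + (quantized - goal_emb).detach(), codes

disc = GoalDiscretizer()
goal_emb = torch.randn(32, 16)          # continuous goal embeddings
quantized_goal, codes = disc(goal_emb)  # discrete goals fed to the policy
```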
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
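One way to picture "composing goals in latent space": recursively propose intermediate subgoals between the current latent state and the latent goal. The midpoint interpolation below is a naive stand-in for the paper's learned conditional subgoal generator.

```python
import torch

def plan_subgoals(z_start, z_goal, depth=2):
    """Recursively split a long-horizon goal into a chain of latent subgoals.

    Midpoint interpolation stands in for a learned subgoal generator; a real
    implementation would sample from that model instead.
    """
    if depth == 0:
        return []
    z_mid = 0.5 * (z_start + z_goal)
    left = plan_subgoals(z_start, z_mid, depth - 1)
    right = plan_subgoals(z_mid, z_goal, depth - 1)
    return left + [z_mid] + right

z_start, z_goal = torch.randn(16), torch.randn(16)
subgoals = plan_subgoals(z_start, z_goal, depth=2)  # 3 intermediate subgoals
```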
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning [15.33496710690063]
We propose a goal-aware cross-entropy (GACE) loss that can be utilized in a self-supervised way.
We then devise goal-discriminative attention networks (GDAN), which use goal-relevant information to focus on the given instruction.
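A loose sketch in the spirit of a goal-aware auxiliary loss, not the paper's exact objective: classify which goal an observation belongs to, using the pursued goal's index as a free self-supervised label. The encoder, head, and data are toy placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Auxiliary goal classifier: predict which goal a (near-goal) observation
# belongs to; the goal index itself serves as a self-supervised label.
encoder = nn.Sequential(nn.Linear(8, 64), nn.ReLU())
goal_head = nn.Linear(64, 5)           # 5 possible target goals (toy)

obs = torch.randn(32, 8)               # observations collected near goals
goal_ids = torch.randint(0, 5, (32,))  # which goal each episode pursued

features = encoder(obs)
gace_loss = F.cross_entropy(goal_head(features), goal_ids)
gace_loss.backward()                   # shapes goal-discriminative features
```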
arXiv Detail & Related papers (2021-10-25T14:24:39Z)
- Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph [21.260858893505183]
We present a novel two-layer hierarchical learning approach equipped with a Goals Relational Graph (GRG).
Our GRG captures the underlying relations of all goals in the goal space through a Dirichlet-categorical process.
Our experimental results show that our approach exhibits superior generalization on both unseen environments and new goals.
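The Dirichlet-categorical idea can be illustrated with plain transition counting: observed goal-to-goal transitions plus a symmetric Dirichlet prior yield smoothed edge probabilities for the goal graph. The prior strength and toy sequences below are assumptions for illustration.

```python
import numpy as np

N_GOALS = 4
ALPHA = 1.0                            # symmetric Dirichlet prior (assumed)
counts = np.zeros((N_GOALS, N_GOALS))  # observed goal-to-goal transitions

# Accumulate counts from experienced goal sequences.
for seq in [[0, 1, 2], [0, 1, 3], [1, 2, 3]]:
    for a, b in zip(seq, seq[1:]):
        counts[a, b] += 1

# Posterior-predictive edge probabilities of the goals relational graph:
# Dirichlet prior + categorical likelihood gives smoothed normalized counts.
edge_probs = (counts + ALPHA) / (counts + ALPHA).sum(axis=1, keepdims=True)
print(edge_probs[0])  # distribution over goals likely to follow goal 0
```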
arXiv Detail & Related papers (2021-03-01T23:21:46Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
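A minimal sketch of a value-disagreement curriculum, assuming an ensemble of goal-conditioned value heads: goals where ensemble members disagree most sit at the frontier of the agent's competence, so the curriculum samples them more often. The ensemble and goal embeddings are toy stand-ins.

```python
import torch
import torch.nn as nn

# Ensemble of goal-conditioned value heads (toy stand-ins).
ensemble = [nn.Linear(16, 1) for _ in range(5)]

candidate_goals = torch.randn(100, 16)  # goal embeddings to choose among
with torch.no_grad():
    values = torch.stack([v(candidate_goals).squeeze(-1) for v in ensemble])

# Disagreement = std across ensemble members; sample training goals in
# proportion to it rather than uniformly.
disagreement = values.std(dim=0)
probs = disagreement / disagreement.sum()
next_goals = candidate_goals[torch.multinomial(probs, num_samples=8)]
```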
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.