Walk the Random Walk: Learning to Discover and Reach Goals Without
Supervision
- URL: http://arxiv.org/abs/2206.11733v1
- Date: Thu, 23 Jun 2022 14:29:36 GMT
- Title: Walk the Random Walk: Learning to Discover and Reach Goals Without
Supervision
- Authors: Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Karteek Alahari
- Abstract summary: We propose a novel method for training such a goal-conditioned agent without any external rewards or any domain knowledge.
We use random walks to train a reachability network that predicts the similarity between two states.
This reachability network is then used to build a goal memory containing past observations that are diverse and well-balanced.
All the components are kept updated throughout training as the agent discovers and learns new goals.
- Score: 21.72567982148215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning a diverse set of skills by interacting with an environment without
any external supervision is an important challenge. In particular, obtaining a
goal-conditioned agent that can reach any given state is useful in many
applications. We propose a novel method for training such a goal-conditioned
agent without any external rewards or any domain knowledge. We use random walks
to train a reachability network that predicts the similarity between two
states. This reachability network is then used to build a goal memory
containing past observations that are diverse and well-balanced. Finally, we
train a goal-conditioned policy network with goals sampled from the goal memory,
rewarding it using the reachability network and the goal memory. All the
components are kept updated throughout training as the agent discovers and
learns new goals. We apply our method to continuous-control navigation and
robotic manipulation tasks.
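The abstract describes two learned components that a short sketch can make concrete: a reachability network trained on pairs of states from the same random walk (temporally close pairs labeled reachable, distant pairs not), and a goal memory kept diverse by filtering out observations the network judges too similar to stored goals. The sketch below is a minimal PyTorch illustration of these two ideas; the network sizes, the step threshold `k`, the similarity threshold, and all function names are our assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): a reachability network trained
# from random-walk state pairs, plus a diversity filter for the goal memory.
import random
import torch
import torch.nn as nn

class ReachabilityNet(nn.Module):
    """Scores how likely state_b is reachable from (similar to) state_a."""
    def __init__(self, obs_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s_a: torch.Tensor, s_b: torch.Tensor) -> torch.Tensor:
        # Returns a logit; apply sigmoid for a reachability probability.
        return self.net(torch.cat([s_a, s_b], dim=-1)).squeeze(-1)

def make_pairs(walk, k: int, n_pairs: int):
    """Sample (state_i, state_j, label) pairs from one random-walk trajectory:
    pairs at most k steps apart are labeled reachable (1), others not (0)."""
    pairs = []
    for _ in range(n_pairs):
        i, j = random.randrange(len(walk)), random.randrange(len(walk))
        pairs.append((walk[i], walk[j], float(abs(i - j) <= k)))
    return pairs

def reachability_train_step(model, optimizer, walk, k=5, n_pairs=64):
    """One binary-classification update on pairs from a random walk."""
    s_a, s_b, y = zip(*make_pairs(walk, k, n_pairs))
    logits = model(torch.stack(s_a), torch.stack(s_b))
    loss = nn.functional.binary_cross_entropy_with_logits(
        logits, torch.tensor(y))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def maybe_add_goal(model, memory, obs, threshold=0.5):
    """Keep the goal memory diverse: store obs only if no stored goal is
    judged too similar by the reachability network (an assumed rule)."""
    with torch.no_grad():
        for g in memory:
            if torch.sigmoid(model(obs, g)) > threshold:
                return False  # too close to an existing goal; skip it
    memory.append(obs)
    return True
```

The goal-conditioned policy would then be trained on goals drawn from `memory`, using the reachability score of the reached state against the sampled goal as an intrinsic reward, which is how the abstract describes the components fitting together.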
Related papers
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We empirically demonstrate improved expected return on out-of-distribution goals, while still allowing goals with expressive structure to be specified.
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Goal-Conditioned Q-Learning as Knowledge Distillation [136.79415677706612]
We explore a connection between off-policy reinforcement learning in goal-conditioned settings and knowledge distillation.
We empirically show that this can improve the performance of goal-conditioned off-policy reinforcement learning when the space of goals is high-dimensional.
We also show that this technique can be adapted to allow for efficient learning in the case of multiple simultaneous sparse goals.
arXiv Detail & Related papers (2022-08-28T22:01:10Z)
- Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning [71.52722621691365]
Building generalizable goal-conditioned agents from rich observations is key to enabling reinforcement learning (RL) to solve real-world problems.
We propose a new form of state abstraction called goal-conditioned bisimulation.
We learn this representation using a metric form of this abstraction, and show its ability to generalize to new goals in simulated manipulation tasks.
arXiv Detail & Related papers (2022-04-27T17:00:11Z)
- Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills [93.12417203541948]
We propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset.
We find that our method can operate on high-dimensional camera images and learn a variety of skills on real robots that generalize to previously unseen scenes and objects.
arXiv Detail & Related papers (2021-04-15T20:10:11Z)
- GRIMGEP: Learning Progress for Robust Goal Sampling in Visual Deep Reinforcement Learning [21.661530291654692]
We propose a framework that allows agents to autonomously identify and ignore noisy distracting regions.
Our framework can be combined with any state-of-the-art novelty-seeking goal exploration approach.
arXiv Detail & Related papers (2020-08-10T19:50:06Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum over the goals that the agent needs to solve (a minimal sampling sketch follows this list).
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- Generating Automatic Curricula via Self-Supervised Active Domain Randomization [11.389072560141388]
We extend the self-play framework to jointly learn a goal and environment curriculum.
Our method generates a coupled goal-task curriculum, where agents learn through progressively more difficult tasks and environment variations.
Our results show that co-evolving the difficulty of the environment with the difficulty of the goals set in each environment provides practical benefits on the goal-directed tasks tested.
arXiv Detail & Related papers (2020-02-18T22:45:29Z)
- Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning [102.05692309417047]
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
In the natural world, intelligent organisms learn from internal drives, bypassing the need for external signals.
We propose to formulate an intrinsic objective as the mutual information between the goal states and the controllable states (a worked form of this objective follows this list).
arXiv Detail & Related papers (2020-02-05T19:21:20Z)
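For the value-disagreement curriculum above, here is a minimal sketch of the goal-sampling rule it suggests: goals on which an ensemble of value estimates disagrees most are sampled most often, concentrating practice on goals of intermediate difficulty. The ensemble interface and the proportional weighting are our assumptions for illustration, not the paper's exact procedure.

```python
# Hypothetical sketch: curriculum goal sampling via value disagreement.
import numpy as np

def sample_goal(candidate_goals, value_ensemble, rng=None):
    """Pick one goal, weighting candidates by ensemble disagreement."""
    if rng is None:
        rng = np.random.default_rng()
    # values[i][j]: value estimate of ensemble member i for candidate goal j
    values = np.array([[v(g) for g in candidate_goals] for v in value_ensemble])
    disagreement = values.std(axis=0)  # per-goal epistemic-uncertainty proxy
    total = disagreement.sum()
    if total <= 0:  # ensemble agrees everywhere: fall back to uniform sampling
        return candidate_goals[rng.integers(len(candidate_goals))]
    probs = disagreement / total       # normalize to a sampling distribution
    return candidate_goals[rng.choice(len(candidate_goals), p=probs)]
```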
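And for the mutual-information objective above, the quantity being maximized can be written out using the standard identity I(X; Y) = H(X) - H(X | Y); the notation here (S_g for goal states, S_c for controllable states, pi for the policy) is assumed for illustration, not the paper's own.

```latex
% Intrinsic objective (notation assumed for illustration):
\max_{\pi} \; I(S_g; S_c) \;=\; H(S_g) \;-\; H(S_g \mid S_c)
```

Maximizing this term both spreads out the goal states the agent experiences (high H(S_g)) and makes them predictable from the states it can control (low H(S_g | S_c)), so no external reward signal is needed.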
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.