Walk the Random Walk: Learning to Discover and Reach Goals Without
Supervision
- URL: http://arxiv.org/abs/2206.11733v1
- Date: Thu, 23 Jun 2022 14:29:36 GMT
- Title: Walk the Random Walk: Learning to Discover and Reach Goals Without
Supervision
- Authors: Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Karteek Alahari
- Abstract summary: We propose a novel method for training such a goal-conditioned agent without any external rewards or any domain knowledge.
We use random walks to train a reachability network that predicts the similarity between two states.
This reachability network is then used to build a goal memory containing past observations that are diverse and well-balanced.
All the components are kept updated throughout training as the agent discovers and learns new goals.
- Score: 21.72567982148215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning a diverse set of skills by interacting with an environment without
any external supervision is an important challenge. In particular, obtaining a
goal-conditioned agent that can reach any given state is useful in many
applications. We propose a novel method for training such a goal-conditioned
agent without any external rewards or any domain knowledge. We use random walks
to train a reachability network that predicts the similarity between two
states. This reachability network is then used to build a goal memory
containing past observations that are diverse and well-balanced. Finally, we
train a goal-conditioned policy network with goals sampled from the goal memory,
rewarding it using the reachability network and the goal memory. All the
components are kept updated throughout training as the agent discovers and
learns new goals. We apply our method to continuous-control navigation and
robotic manipulation tasks.
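The abstract describes two learned components that a short sketch can make concrete: a reachability network trained on pairs of states from the same random walk (temporally close pairs labeled reachable, distant pairs not), and a goal memory kept diverse by filtering out observations the network judges too similar to stored goals. The sketch below is a minimal PyTorch illustration of these two ideas; the network sizes, the step threshold `k`, the similarity threshold, and all function names are our assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): a reachability network trained
# from random-walk state pairs, plus a diversity filter for the goal memory.
import random
import torch
import torch.nn as nn

class ReachabilityNet(nn.Module):
    """Scores how likely state_b is reachable from (similar to) state_a."""
    def __init__(self, obs_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s_a: torch.Tensor, s_b: torch.Tensor) -> torch.Tensor:
        # Returns a logit; apply sigmoid for a reachability probability.
        return self.net(torch.cat([s_a, s_b], dim=-1)).squeeze(-1)

def make_pairs(walk, k: int, n_pairs: int):
    """Sample (state_i, state_j, label) pairs from one random-walk trajectory:
    pairs at most k steps apart are labeled reachable (1), others not (0)."""
    pairs = []
    for _ in range(n_pairs):
        i, j = random.randrange(len(walk)), random.randrange(len(walk))
        pairs.append((walk[i], walk[j], float(abs(i - j) <= k)))
    return pairs

def reachability_train_step(model, optimizer, walk, k=5, n_pairs=64):
    """One binary-classification update on pairs from a random walk."""
    s_a, s_b, y = zip(*make_pairs(walk, k, n_pairs))
    logits = model(torch.stack(s_a), torch.stack(s_b))
    loss = nn.functional.binary_cross_entropy_with_logits(
        logits, torch.tensor(y))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def maybe_add_goal(model, memory, obs, threshold=0.5):
    """Keep the goal memory diverse: store obs only if no stored goal is
    judged too similar by the reachability network (an assumed rule)."""
    with torch.no_grad():
        for g in memory:
            if torch.sigmoid(model(obs, g)) > threshold:
                return False  # too close to an existing goal; skip it
    memory.append(obs)
    return True
```

The goal-conditioned policy would then be trained on goals drawn from `memory`, using the reachability score of the reached state against the sampled goal as an intrinsic reward, which is how the abstract describes the components fitting together.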
Related papers
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We empirically demonstrate improved expected return on out-of-distribution goals, while still allowing goals with expressive structure to be specified.
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Goal-Conditioned Q-Learning as Knowledge Distillation [136.79415677706612]
We explore a connection between off-policy reinforcement learning in goal-conditioned settings and knowledge distillation.
We empirically show that this can improve the performance of goal-conditioned off-policy reinforcement learning when the space of goals is high-dimensional.
We also show that this technique can be adapted to allow for efficient learning in the case of multiple simultaneous sparse goals.
arXiv Detail & Related papers (2022-08-28T22:01:10Z)
- Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning [71.52722621691365]
Building generalizable goal-conditioned agents from rich observations is key to enabling reinforcement learning (RL) to solve real-world problems.
We propose a new form of state abstraction called goal-conditioned bisimulation.
We learn this representation using a metric form of this abstraction, and show its ability to generalize to new goals in simulated manipulation tasks.
arXiv Detail & Related papers (2022-04-27T17:00:11Z)
- Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills [93.12417203541948]
We propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset.
We find that our method can operate on high-dimensional camera images and learn a variety of skills on real robots that generalize to previously unseen scenes and objects.
arXiv Detail & Related papers (2021-04-15T20:10:11Z)
- GRIMGEP: Learning Progress for Robust Goal Sampling in Visual Deep Reinforcement Learning [21.661530291654692]
We propose a framework that allows agents to autonomously identify and ignore noisy distracting regions.
Our framework can be combined with any state-of-the-art novelty-seeking goal exploration approach.
arXiv Detail & Related papers (2020-08-10T19:50:06Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum over the goals that the agent needs to solve (a minimal sampling sketch follows this list).
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- Generating Automatic Curricula via Self-Supervised Active Domain Randomization [11.389072560141388]
We extend the self-play framework to jointly learn a goal and environment curriculum.
Our method generates a coupled goal-task curriculum, where agents learn through progressively more difficult tasks and environment variations.
Our results show that co-evolving the difficulty of the environment with the difficulty of the goals set in each environment provides practical benefits on the goal-directed tasks tested.
arXiv Detail & Related papers (2020-02-18T22:45:29Z)
- Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning [102.05692309417047]
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
In the natural world, intelligent organisms learn from internal drives, bypassing the need for external signals.
We propose to formulate an intrinsic objective as the mutual information between the goal states and the controllable states (a worked form of this objective follows this list).
arXiv Detail & Related papers (2020-02-05T19:21:20Z)
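For the value-disagreement curriculum above, here is a minimal sketch of the goal-sampling rule it suggests: goals on which an ensemble of value estimates disagrees most are sampled most often, concentrating practice on goals of intermediate difficulty. The ensemble interface and the proportional weighting are our assumptions for illustration, not the paper's exact procedure.

```python
# Hypothetical sketch: curriculum goal sampling via value disagreement.
import numpy as np

def sample_goal(candidate_goals, value_ensemble, rng=None):
    """Pick one goal, weighting candidates by ensemble disagreement."""
    if rng is None:
        rng = np.random.default_rng()
    # values[i][j]: value estimate of ensemble member i for candidate goal j
    values = np.array([[v(g) for g in candidate_goals] for v in value_ensemble])
    disagreement = values.std(axis=0)  # per-goal epistemic-uncertainty proxy
    total = disagreement.sum()
    if total <= 0:  # ensemble agrees everywhere: fall back to uniform sampling
        return candidate_goals[rng.integers(len(candidate_goals))]
    probs = disagreement / total       # normalize to a sampling distribution
    return candidate_goals[rng.choice(len(candidate_goals), p=probs)]
```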
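And for the mutual-information objective above, the quantity being maximized can be written out using the standard identity I(X; Y) = H(X) - H(X | Y); the notation here (S_g for goal states, S_c for controllable states, pi for the policy) is assumed for illustration, not the paper's own.

```latex
% Intrinsic objective (notation assumed for illustration):
\max_{\pi} \; I(S_g; S_c) \;=\; H(S_g) \;-\; H(S_g \mid S_c)
```

Maximizing this term both spreads out the goal states the agent experiences (high H(S_g)) and makes them predictable from the states it can control (low H(S_g | S_c)), so no external reward signal is needed.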
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.