Novelty Search in Representational Space for Sample Efficient
Exploration
- URL: http://arxiv.org/abs/2009.13579v3
- Date: Fri, 15 Apr 2022 16:11:46 GMT
- Title: Novelty Search in Representational Space for Sample Efficient
Exploration
- Authors: Ruo Yu Tao, Vincent François-Lavet, Joelle Pineau
- Abstract summary: We present a new approach for efficient exploration which leverages a low-dimensional encoding of the environment learned with a combination of model-based and model-free objectives.
Our approach uses intrinsic rewards based on the distance to nearest neighbors in the low-dimensional representational space to gauge novelty.
We then leverage these intrinsic rewards for sample-efficient exploration with planning routines in representational space for hard exploration tasks with sparse rewards.
- Score: 38.2027946450689
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new approach for efficient exploration which leverages a
low-dimensional encoding of the environment learned with a combination of
model-based and model-free objectives. Our approach uses intrinsic rewards that
are based on the distance to nearest neighbors in the low-dimensional
representational space to gauge novelty. We then leverage these intrinsic
rewards for sample-efficient exploration with planning routines in
representational space for hard exploration tasks with sparse rewards. One key
element of our approach is the use of information-theoretic principles to shape
our representations so that our novelty reward goes beyond pixel
similarity. We test our approach on a number of maze tasks, as well as a
control problem and show that our exploration approach is more sample-efficient
compared to strong baselines.
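As a rough illustration of the nearest-neighbor novelty reward described above, the sketch below computes an intrinsic bonus from the average Euclidean distance between an encoded state and its k nearest neighbors among previously visited encodings. The encoder is assumed to be given, and the buffer layout, k, and the reward scale are illustrative placeholders rather than the paper's exact architecture or hyperparameters.

    import numpy as np

    def knn_novelty_reward(z, visited_z, k=10, scale=1.0):
        """Intrinsic reward from the average distance to the k nearest
        neighbors of the encoded state `z` among visited encodings.

        z         : (d,) latent encoding of the current state (assumed to come
                    from a learned low-dimensional encoder).
        visited_z : (N, d) encodings of previously visited states.
        k, scale  : illustrative hyperparameters, not taken from the paper.
        """
        if len(visited_z) == 0:
            return scale  # everything is novel before any state has been visited
        dists = np.linalg.norm(visited_z - z, axis=1)   # distances to all visited encodings
        k = min(k, len(dists))
        nearest = np.partition(dists, k - 1)[:k]        # the k smallest distances
        return scale * float(nearest.mean())            # larger distance -> more novel

    # Toy usage: 2-D latents; a latent far from the explored cluster gets a larger bonus.
    visited = np.random.randn(500, 2) * 0.1             # a well-explored cluster near the origin
    print(knn_novelty_reward(np.array([0.0, 0.0]), visited))  # small bonus
    print(knn_novelty_reward(np.array([3.0, 3.0]), visited))  # large bonus

In the paper these bonuses are further combined with planning routines in the learned representational space; the sketch covers only the reward computation.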
Related papers
- Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement
Learning [20.0888026410406]
We show that counts can be derived by averaging samples from the Rademacher distribution (see the numerical sketch after this list).
We show that our method is significantly more effective at deducing ground-truth visitation counts than previous work.
arXiv Detail & Related papers (2023-06-05T18:56:48Z)
- Self-supervised Sequential Information Bottleneck for Robust Exploration
in Deep Reinforcement Learning [28.75574762244266]
In this work, we introduce the sequential information bottleneck objective for learning compressed and temporally coherent representations.
For efficient exploration in noisy environments, we further construct intrinsic rewards that capture task-relevant state novelty.
arXiv Detail & Related papers (2022-09-12T15:41:10Z)
- On Reward-Free RL with Kernel and Neural Function Approximations:
Single-Agent MDP and Markov Game [140.19656665344917]
We study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function.
We tackle this problem under the context of function approximation, leveraging powerful function approximators.
We establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.
arXiv Detail & Related papers (2021-10-19T07:26:33Z)
- Pure Exploration in Kernel and Neural Bandits [90.23165420559664]
We study pure exploration in bandits, where the dimension of the feature representation can be much larger than the number of arms.
To overcome the curse of dimensionality, we propose to adaptively embed the feature representation of each arm into a lower-dimensional space.
arXiv Detail & Related papers (2021-06-22T19:51:59Z)
- MADE: Exploration via Maximizing Deviation from Explored Regions [48.49228309729319]
In online reinforcement learning (RL), efficient exploration remains challenging in high-dimensional environments with sparse rewards.
We propose a new exploration approach via maximizing the deviation of the occupancy of the next policy from the explored regions.
Our approach significantly improves sample efficiency over state-of-the-art methods.
arXiv Detail & Related papers (2021-06-18T17:57:00Z)
- Batch Exploration with Examples for Scalable Robotic Reinforcement
Learning [63.552788688544254]
Batch Exploration with Examples (BEE) explores relevant regions of the state space, guided by a modest number of human-provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
arXiv Detail & Related papers (2020-10-22T17:49:25Z)
- Latent World Models For Intrinsically Motivated Exploration [140.21871701134626]
We present a self-supervised representation learning method for image-based observations.
We consider episodic and life-long uncertainties to guide the exploration of partially observable environments.
arXiv Detail & Related papers (2020-10-05T19:47:04Z)
- RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated
Environments [15.736899098702972]
We propose a novel type of intrinsic reward which encourages the agent to take actions that lead to significant changes in its learned state representation.
We evaluate our method on multiple challenging procedurally-generated tasks in MiniGrid.
arXiv Detail & Related papers (2020-02-27T18:03:16Z)
- Long-Term Visitation Value for Deep Exploration in Sparse Reward
Reinforcement Learning [34.38011902445557]
Reinforcement learning with sparse rewards is still an open challenge.
We present a novel approach that plans exploration actions far into the future by using a long-term visitation count.
Contrary to existing methods which use models of reward and dynamics, our approach is off-policy and model-free.
arXiv Detail & Related papers (2020-01-01T01:01:15Z)
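As a hedged aside on the coin-flipping entry above: for n i.i.d. Rademacher draws (each -1 or +1 with equal probability), the expected square of their average equals 1/n, which is the identity that lets visitation counts be recovered from averaged coin flips. The snippet below only checks this identity numerically; the regression model and exploration bonus used in that paper are not reproduced here.

    import numpy as np

    rng = np.random.default_rng(0)

    # For n i.i.d. Rademacher draws c_i in {-1, +1}, E[(mean of c_i)^2] = 1/n,
    # so squaring a predicted average coin flip recovers an inverse visit count.
    for n in (1, 4, 16, 64):
        flips = rng.choice([-1.0, 1.0], size=(100_000, n))  # 100k simulated states, each visited n times
        est = (flips.mean(axis=1) ** 2).mean()               # Monte Carlo estimate of E[(mean)^2]
        print(f"n={n:3d}  E[(mean)^2] ~ {est:.4f}   1/n = {1/n:.4f}")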
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.