On the Importance of Exploration for Generalization in Reinforcement
Learning
- URL: http://arxiv.org/abs/2306.05483v1
- Date: Thu, 8 Jun 2023 18:07:02 GMT
- Title: On the Importance of Exploration for Generalization in Reinforcement
Learning
- Authors: Yiding Jiang, J. Zico Kolter, Roberta Raileanu
- Abstract summary: We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
- Score: 89.63074327328765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing approaches for improving generalization in deep reinforcement
learning (RL) have mostly focused on representation learning, neglecting
RL-specific aspects such as exploration. We hypothesize that the agent's
exploration strategy plays a key role in its ability to generalize to new
environments. Through a series of experiments in a tabular contextual MDP, we
show that exploration is helpful not only for efficiently finding the optimal
policy for the training environments but also for acquiring knowledge that
helps decision making in unseen environments. Based on these observations, we
propose EDE: Exploration via Distributional Ensemble, a method that encourages
exploration of states with high epistemic uncertainty through an ensemble of
Q-value distributions. Our algorithm is the first value-based approach to
achieve state-of-the-art on both Procgen and Crafter, two benchmarks for
generalization in RL with high-dimensional observations. The open-sourced
implementation can be found at https://github.com/facebookresearch/ede .
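As a rough illustration of the mechanism the abstract describes, the sketch below scores actions by the ensemble-mean Q-value plus an ensemble-disagreement bonus. The network shapes, the UCB-style rule, and the coefficient `beta` are illustrative assumptions, not the paper's exact algorithm; see the linked repository for the actual implementation.

```python
import torch

# Minimal sketch of exploration via disagreement across an ensemble of
# distributional Q-networks, in the spirit of EDE. Shapes and the bonus
# coefficient are assumptions for illustration only.

def ucb_action(q_ensemble, state, beta=1.0):
    """Pick an action by mean Q-value plus an epistemic-uncertainty bonus.

    q_ensemble: list of distributional Q-networks, each mapping a state to
        a (num_actions, num_quantiles) tensor of return quantiles.
    """
    with torch.no_grad():
        # (ensemble_size, num_actions): averaging over quantiles gives
        # each ensemble member's Q-value estimate per action.
        q_values = torch.stack([net(state).mean(dim=-1) for net in q_ensemble])
    q_mean = q_values.mean(dim=0)    # ensemble-average Q per action
    epistemic = q_values.std(dim=0)  # member disagreement ~ epistemic uncertainty
    return int(torch.argmax(q_mean + beta * epistemic))
```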
Related papers
- Search Inspired Exploration in Reinforcement Learning [5.411688702405822]
We propose a novel method that actively guides exploration by setting sub-goals based on the agent's learning progress. Inspired by search, sub-goals are prioritized from the frontier based on estimates of cost-to-come and cost-to-go. In experiments on challenging sparse-reward environments, SIERL outperforms dominant baselines in both achieving the main task goal and generalizing to reach arbitrary states in the environment.
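A minimal sketch of that A*-style prioritization, where `cost_to_come` and `cost_to_go` are hypothetical stand-ins for the learned estimates the paper describes:

```python
# Hypothetical sketch of search-inspired sub-goal selection: rank frontier
# states by estimated cost-to-come plus cost-to-go, as in A* search.

def select_subgoal(frontier, cost_to_come, cost_to_go):
    """Pick the frontier state minimizing the estimated total cost."""
    return min(frontier, key=lambda s: cost_to_come(s) + cost_to_go(s))
```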
arXiv Detail & Related papers (2026-01-31T02:24:22Z) - Sample-Efficient Neurosymbolic Deep Reinforcement Learning [49.60927398960061]
We propose a neuro-symbolic deep RL approach that integrates background symbolic knowledge to improve sample efficiency. Online reasoning is performed to guide the training process through two mechanisms. We show improved performance over a state-of-the-art reward machine baseline.
arXiv Detail & Related papers (2026-01-06T09:28:53Z) - Diversity-Incentivized Exploration for Versatile Reasoning [63.653348177250756]
We propose DIVER (Diversity-Incentivized Exploration for Versatile Reasoning), an innovative framework that highlights the pivotal role of global sequence-level diversity to incentivize deep exploration for versatile reasoning.
arXiv Detail & Related papers (2025-09-30T13:11:46Z) - From Trial-and-Error to Improvement: A Systematic Analysis of LLM Exploration Mechanisms in RLVR [92.51110344832178]
Reinforcement learning with verifiable rewards (RLVR) has emerged as a powerful paradigm for enhancing the reasoning capabilities of large language models (LLMs). This technical report presents a systematic investigation of exploration capacities in RLVR, covering four main aspects.
arXiv Detail & Related papers (2025-08-11T01:26:16Z) - On Efficient Bayesian Exploration in Model-Based Reinforcement Learning [0.24578723416255752]
We address the challenge of data-efficient exploration in reinforcement learning by examining existing principled, information-theoretic approaches to intrinsic motivation. We prove that exploration bonuses naturally signal epistemic information gains and converge to zero once the agent becomes sufficiently certain about the environment's dynamics and rewards. We then outline a general framework, Predictive Trajectory Sampling with Bayesian Exploration (PTS-BE), which integrates model-based planning with information-theoretic bonuses to achieve sample-efficient deep exploration.
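In symbols (notation assumed here, not taken from the report), the kind of bonus described is an epistemic information gain about the model parameters $\theta$:

$$ b_t(s_t, a_t) \;=\; \mathbb{I}\big(\theta;\,(s_{t+1}, r_t)\,\big|\, s_t, a_t, \mathcal{D}_t\big), $$

which measures how much the next transition is expected to teach the agent about the dynamics and reward; once the posterior over $\theta$ given the collected data $\mathcal{D}_t$ concentrates, this mutual information, and hence the bonus, goes to zero.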
arXiv Detail & Related papers (2025-07-03T14:03:47Z) - Efficient Exploration in Deep Reinforcement Learning: A Novel Bayesian Actor-Critic Algorithm [0.195804735329484]
Reinforcement learning (RL) and deep reinforcement learning (DRL) have the potential to disrupt, and are already changing, the way we interact with the world.
One of the key indicators of their applicability is their ability to scale and work in real-world scenarios.
arXiv Detail & Related papers (2024-08-19T14:50:48Z) - Random Latent Exploration for Deep Reinforcement Learning [71.88709402926415]
This paper introduces a new exploration technique called Random Latent Exploration (RLE).
RLE combines the strengths of bonus-based and noise-based exploration strategies, two popular approaches to effective exploration in deep RL.
We evaluate it on the challenging Atari and IsaacGym benchmarks and show that RLE achieves higher overall scores across all tasks than other approaches.
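As a rough, assumed illustration of pursuing randomly drawn latent directions (the paper's exact parameterization differs, and the names below are hypothetical):

```python
import numpy as np

# Assumed sketch of random-latent reward shaping: each episode draws a
# random latent vector z and perturbs the task reward with a state
# feature's alignment to z, steering the agent toward varied behaviors.

def shaped_reward(task_reward, state_features, z, alpha=0.1):
    """Task reward plus a random latent-conditioned bonus."""
    bonus = float(np.dot(state_features, z))  # alignment with the latent direction
    return task_reward + alpha * bonus

rng = np.random.default_rng(0)
z = rng.standard_normal(16)    # resampled at the start of each episode
z /= np.linalg.norm(z)         # keep the latent direction unit-norm
```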
arXiv Detail & Related papers (2024-07-18T17:55:22Z) - Sample Efficient Myopic Exploration Through Multitask Reinforcement
Learning with Diverse Tasks [53.44714413181162]
This paper shows that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with myopic exploration design can be sample-efficient.
To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of MTRL.
arXiv Detail & Related papers (2024-03-03T22:57:44Z) - DEIR: Efficient and Robust Exploration through
Discriminative-Model-Based Episodic Intrinsic Rewards [2.09711130126031]
Exploration is a fundamental aspect of reinforcement learning (RL), and its effectiveness is a deciding factor in the performance of RL algorithms.
Recent studies have shown the effectiveness of encouraging exploration with intrinsic rewards estimated from novelties in observations.
We propose DEIR, a novel method whose intrinsic reward is theoretically derived from a conditional mutual information term.
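A hypothetical sketch of the genre of episodic, discriminative-model-scaled intrinsic reward; DEIR's actual derivation (the conditional mutual information term) is more involved, so treat this purely as an illustration:

```python
import numpy as np

# Hypothetical episodic novelty bonus: distance to the nearest embedding
# seen this episode, scaled by how genuinely new a learned discriminator
# deems the step. Not DEIR's exact formulation.

def episodic_bonus(embedding, episodic_memory, discriminator_score):
    """Episodic novelty scaled by a discriminative-model confidence."""
    if not episodic_memory:
        return 1.0
    dists = [np.linalg.norm(embedding - m) for m in episodic_memory]
    return min(dists) * discriminator_score
```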
arXiv Detail & Related papers (2023-04-21T06:39:38Z) - Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
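A minimal sketch of the two-policy idea: a guide policy acts for the first `h` steps of an episode, then the learning policy takes over, and shrinking `h` as performance improves yields a curriculum. The gym-style step/reset interface is an assumption here, not the paper's code:

```python
# Assumed gym-style environment API: reset() -> obs,
# step(action) -> (obs, reward, done, info).

def jumpstart_episode(env, guide_policy, explore_policy, h):
    """Roll in with the guide policy for h steps, then explore."""
    obs, done, t, total = env.reset(), False, 0, 0.0
    while not done:
        policy = guide_policy if t < h else explore_policy
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
        t += 1
    return total
```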
arXiv Detail & Related papers (2022-04-05T17:25:22Z) - Case-based Reasoning for Better Generalization in Text-Adventure Games [15.652823459179048]
We propose a general method inspired by case-based reasoning to train agents that generalize outside the training distribution.
Our experiments show that the proposed approach consistently improves existing methods, obtains good out-of-distribution generalization, and achieves new state-of-the-art results on widely used environments.
arXiv Detail & Related papers (2021-10-16T04:51:34Z) - Exploration in Deep Reinforcement Learning: A Comprehensive Survey [24.252352133705735]
Deep Reinforcement Learning (DRL) and Deep Multi-agent Reinforcement Learning (MARL) have achieved significant success across a wide range of domains, such as game AI, autonomous vehicles, robotics and finance.
DRL and deep MARL agents are widely known to be sample-inefficient and millions of interactions are usually needed even for relatively simple game settings.
This paper provides a comprehensive survey on existing exploration methods in DRL and deep MARL.
arXiv Detail & Related papers (2021-09-14T13:16:33Z) - Variational Empowerment as Representation Learning for Goal-Based
Reinforcement Learning [114.07623388322048]
We discuss how the standard goal-conditioned RL (GCRL) objective is encapsulated by variational empowerment.
Our work lays a novel foundation from which to evaluate, analyze, and develop representation learning techniques in goal-based RL.
arXiv Detail & Related papers (2021-06-02T18:12:26Z) - Reannealing of Decaying Exploration Based On Heuristic Measure in Deep
Q-Network [82.20059754270302]
We propose an algorithm based on the idea of reannealing that aims to encourage exploration only when it is needed.
We perform an illustrative case study showing that it has potential to both accelerate training and obtain a better policy.
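An illustrative sketch of reannealing a decaying epsilon-greedy schedule: epsilon decays as usual but is reset upward whenever a heuristic signals that more exploration is needed. The trigger and constants below are assumptions, not the paper's heuristic measure:

```python
# Hypothetical reannealing rule for an epsilon-greedy schedule.

def update_epsilon(epsilon, needs_exploration, decay=0.999,
                   floor=0.05, reanneal_to=0.5):
    """Decay epsilon, but jump it back up when exploration is needed."""
    if needs_exploration:           # heuristic trigger, e.g. learning has stalled
        return reanneal_to          # reanneal: restore a high exploration rate
    return max(floor, epsilon * decay)
```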
arXiv Detail & Related papers (2020-09-29T20:40:00Z) - Dynamics Generalization via Information Bottleneck in Deep Reinforcement
Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
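A compact sketch of one way such a regularizer can look: the RL loss is augmented with an annealed KL penalty on a stochastic state encoding, gradually squeezing out task-irrelevant information. The distribution choices and linear warm-up schedule are illustrative assumptions:

```python
import torch

# Assumed information-bottleneck penalty: RL loss + beta * KL(q(z|s) || p(z)),
# with beta annealed upward over training to tighten the bottleneck.

def ib_loss(rl_loss, z_posterior, z_prior, step, total_steps, beta_max=1e-3):
    """Annealed KL-regularized objective on a stochastic encoding z."""
    beta = beta_max * min(1.0, step / (0.5 * total_steps))  # linear warm-up
    kl = torch.distributions.kl_divergence(z_posterior, z_prior).mean()
    return rl_loss + beta * kl
```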
arXiv Detail & Related papers (2020-08-03T02:24:20Z) - Intrinsic Exploration as Multi-Objective RL [29.124322674133]
Intrinsic motivation enables reinforcement learning (RL) agents to explore when rewards are very sparse.
We propose a framework based on multi-objective RL where exploration and exploitation are optimized as separate objectives.
This formulation brings the balance between exploration and exploitation to the policy level, yielding advantages over traditional methods.
arXiv Detail & Related papers (2020-04-06T02:37:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.