Exploration in Deep Reinforcement Learning: A Comprehensive Survey
- URL: http://arxiv.org/abs/2109.06668v2
- Date: Wed, 15 Sep 2021 17:25:20 GMT
- Title: Exploration in Deep Reinforcement Learning: A Comprehensive Survey
- Authors: Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Jianye Hao,
Zhaopeng Meng and Peng Liu
- Abstract summary: Deep Reinforcement Learning (DRL) and Deep Multi-agent Reinforcement Learning (MARL) have achieved significant success across a wide range of domains, such as game AI, autonomous vehicles, robotics and finance.
DRL and deep MARL agents are widely known to be sample-inefficient; millions of interactions are usually needed even for relatively simple game settings.
This paper provides a comprehensive survey on existing exploration methods in DRL and deep MARL.
- Score: 24.252352133705735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning (DRL) and Deep Multi-agent Reinforcement
Learning (MARL) have achieved significant success across a wide range of
domains, such as game AI, autonomous vehicles, robotics, and finance. However,
DRL and deep MARL agents are widely known to be sample-inefficient: millions of
interactions are usually needed even for relatively simple game settings, which
prevents their wide application in real-world industrial scenarios. One
bottleneck behind this inefficiency is the well-known exploration problem,
i.e., how to efficiently explore an unknown environment and collect the
informative experiences that benefit policy learning the most.
In this paper, we conduct a comprehensive survey of existing exploration
methods in DRL and deep MARL, with the goal of providing understanding of, and
insight into, the critical problems and their solutions. We first identify
several key challenges to achieving efficient exploration, which most
exploration methods aim to address. We then provide a systematic survey of
existing approaches, classifying them into two major categories:
uncertainty-oriented exploration and intrinsic motivation-oriented exploration.
The essence of uncertainty-oriented exploration is to leverage quantified
epistemic and aleatoric uncertainty to drive efficient exploration. By
contrast, intrinsic motivation-oriented exploration methods typically
incorporate various kinds of reward-agnostic information as intrinsic guidance
for exploration. Beyond these two main branches, we also review other
exploration methods that adopt sophisticated techniques but are difficult to
classify into the two categories above. In addition, we provide a comprehensive
empirical comparison of exploration methods for DRL on a set of commonly used
benchmarks. Finally, we summarize the open problems of exploration in DRL and
deep MARL and point out a few future directions.
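To make the survey's two categories concrete, the following is a minimal, self-contained sketch (an illustration under assumptions, not code from the paper): an ensemble-disagreement bonus stands in for uncertainty-oriented exploration, and a Random Network Distillation (RND)-style prediction-error bonus stands in for intrinsic motivation-oriented exploration. All function names, shapes, and hyperparameters here are hypothetical.

```python
# Illustrative only: two common exploration-bonus shapes from the survey's
# taxonomy, in toy linear form. Real methods use neural networks.
import numpy as np

rng = np.random.default_rng(0)

def ensemble_disagreement_bonus(state, ensemble):
    """Uncertainty-oriented: use the spread of an ensemble's predictions as an
    epistemic-uncertainty proxy; high disagreement suggests an unexplored state."""
    preds = np.array([f(state) for f in ensemble])
    return float(preds.std())

class RNDBonus:
    """Intrinsic-motivation-oriented: reward the error of a trained predictor
    against a fixed random target network (RND-style curiosity)."""

    def __init__(self, state_dim, feat_dim=8, lr=1e-2):
        self.target = rng.normal(size=(state_dim, feat_dim))  # fixed, never trained
        self.predictor = np.zeros((state_dim, feat_dim))      # trained online
        self.lr = lr

    def __call__(self, state):
        err = state @ self.predictor - state @ self.target
        bonus = float((err ** 2).mean())
        # One gradient step: frequently visited states become predictable,
        # so their bonus decays over time.
        self.predictor -= self.lr * (2.0 / err.size) * np.outer(state, err)
        return bonus

# Usage: either bonus is added to the task reward, r_total = r_env + beta * bonus.
ensemble = [lambda s, w=rng.normal(size=4): float(s @ w) for _ in range(5)]
state = rng.normal(size=4)
print(ensemble_disagreement_bonus(state, ensemble), RNDBonus(state_dim=4)(state))
```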
Related papers
- Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration [46.938186139700804]
This paper introduces LEMAE, which channels informative, task-relevant guidance from a knowledgeable Large Language Model for Efficient Multi-Agent Exploration.
Specifically, it grounds linguistic knowledge from the LLM into symbolic key states that are critical for task fulfillment, in a discriminative manner and at low inference cost.
By reducing redundant exploration, LEMAE outperforms existing state-of-the-art (SOTA) approaches on challenging benchmarks (e.g., SMAC and MPE) by a large margin, achieving a 10x acceleration in certain scenarios.
arXiv Detail & Related papers (2024-10-03T14:21:23Z)
- Random Latent Exploration for Deep Reinforcement Learning [71.88709402926415]
This paper introduces a new exploration technique called Random Latent Exploration (RLE).
RLE combines the strengths of bonus-based and noise-based exploration, two popular approaches for effective exploration in deep RL.
We evaluate it on the challenging Atari and IsaacGym benchmarks and show that RLE achieves higher overall scores across all tasks than other approaches.
arXiv Detail & Related papers (2024-07-18T17:55:22Z)
- WESE: Weak Exploration to Strong Exploitation for LLM Agents [95.6720931773781]
This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE), to enhance LLM agents in solving open-world interactive tasks.
WESE decouples exploration from exploitation, employing a cost-effective weak agent to perform exploration and acquire global knowledge.
A knowledge-graph-based strategy then stores the acquired knowledge and extracts the task-relevant parts, improving the stronger agent's success rate and efficiency on the exploitation task.
arXiv Detail & Related papers (2024-04-11T03:31:54Z)
- On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art performance on both Procgen and Crafter.
arXiv Detail & Related papers (2023-06-08T18:07:02Z)
- Strangeness-driven Exploration in Multi-Agent Reinforcement Learning [0.0]
We introduce a new exploration method based on a notion of strangeness that can be easily incorporated into any MARL algorithm following the centralized training with decentralized execution (CTDE) paradigm.
The exploration bonus is obtained from this strangeness, and the proposed method is not much affected by the stochastic transitions commonly observed in MARL tasks.
arXiv Detail & Related papers (2022-12-27T11:08:49Z) - Deep Intrinsically Motivated Exploration in Continuous Control [0.0]
In continuous systems, exploration is often performed through undirected strategies, in which the parameters of the networks or the selected actions are perturbed by random noise (a minimal sketch of such a perturbation appears after this list).
We adapt existing theories of animal motivational systems to the reinforcement learning paradigm and introduce a novel directed exploration strategy.
Our framework scales to larger and more diverse state spaces, dramatically improves on the baselines, and significantly outperforms undirected strategies.
arXiv Detail & Related papers (2022-10-01T14:52:16Z)
- Long-Term Exploration in Persistent MDPs [68.8204255655161]
In this paper, we propose an exploration method called Rollback-Explore (RbExplore), which utilizes the concept of the persistent Markov decision process.
We test our algorithm in the hard-exploration Prince of Persia game, without rewards or domain knowledge.
arXiv Detail & Related papers (2021-09-21T13:47:04Z)
- Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
In CMAE, agents share a common exploration goal, which is selected from multiple projected state spaces via a normalized entropy-based technique.
We demonstrate that CMAE consistently outperforms baselines on various tasks.
arXiv Detail & Related papers (2021-07-23T20:06:32Z)
- Batch Exploration with Examples for Scalable Robotic Reinforcement Learning [63.552788688544254]
Batch Exploration with Examples (BEE) explores relevant regions of the state space, guided by a modest number of human-provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
arXiv Detail & Related papers (2020-10-22T17:49:25Z)
- Intrinsic Exploration as Multi-Objective RL [29.124322674133]
Intrinsic motivation enables reinforcement learning (RL) agents to explore when rewards are very sparse.
We propose a framework based on multi-objective RL in which exploration and exploitation are optimized as separate objectives.
This formulation sets the balance between exploration and exploitation at the policy level, yielding advantages over traditional methods (a minimal sketch of this idea also follows the list).
arXiv Detail & Related papers (2020-04-06T02:37:29Z)
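As a concrete counterpart to the undirected strategies mentioned in the continuous-control entry above, here is a minimal sketch (a hypothetical illustration, not that paper's method) of Gaussian action-space noise, the kind of knowledge-free perturbation such strategies apply:

```python
# Undirected exploration: perturb the policy's action with random noise,
# independent of anything the agent knows about the state (illustrative).
import numpy as np

rng = np.random.default_rng(0)

def noisy_action(policy_action, sigma=0.1, low=-1.0, high=1.0):
    """Add Gaussian noise to an action and clip it to the valid range."""
    noise = rng.normal(scale=sigma, size=np.shape(policy_action))
    return np.clip(policy_action + noise, low, high)

print(noisy_action(np.array([0.3, -0.7])))
```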
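The multi-objective framing in the last entry can likewise be sketched in a few lines (again a hypothetical illustration, not the paper's algorithm): exploration and exploitation are kept as separate reward streams and scalarized at the policy level with a tunable weight.

```python
# Illustrative: treat exploration as its own objective and combine the two
# reward streams with an explicit, tunable trade-off weight.
import numpy as np

def scalarize(r_exploit, r_explore, weights=(1.0, 0.1)):
    """Linear scalarization of the two objectives; `weights` sets the
    exploration/exploitation balance at the policy level."""
    return float(np.dot(weights, (r_exploit, r_explore)))

# A training loop would update the agent on the scalarized reward, and
# `weights` can be annealed from exploration-heavy toward task-heavy.
print(scalarize(r_exploit=1.0, r_explore=0.5))  # 1.05
```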
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.