Information Content Exploration
- URL: http://arxiv.org/abs/2310.06777v1
- Date: Tue, 10 Oct 2023 16:51:32 GMT
- Title: Information Content Exploration
- Authors: Jacob Chmura, Hasham Burhani, Xiao Qi Shi
- Abstract summary: We propose a new intrinsic reward that systematically quantifies exploratory behavior and promotes state coverage.
We show that our information-theoretic reward induces efficient exploration and outperforms alternative methods in various games.
- Score: 1.7034813545878589
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sparse reward environments are known to be challenging for reinforcement
learning agents. In such environments, efficient and scalable exploration is
crucial. Exploration is a means by which an agent gains information about the
environment. We expand on this topic and propose a new intrinsic reward that
systematically quantifies exploratory behavior and promotes state coverage by
maximizing the information content of a trajectory taken by an agent. We
compare our method to alternative exploration-based intrinsic reward
techniques, namely Curiosity Driven Learning and Random Network Distillation.
We show that our information-theoretic reward induces efficient exploration
and outperforms these baselines in various games, including Montezuma's
Revenge, a known difficult
task for reinforcement learning. Finally, we propose an extension that
maximizes information content in a discretely compressed latent space which
boosts sample efficiency and generalizes to continuous state spaces.
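To make the abstract's mechanism concrete, here is a minimal Python sketch of an entropy-style coverage bonus. It assumes discretized (hashable) states and illustrates the general principle rather than the paper's exact formulation: each step is rewarded by the marginal gain in Shannon entropy of the trajectory's empirical state-visitation distribution, so trajectories with broad state coverage accumulate more intrinsic reward.

```python
# Illustrative sketch (not the paper's exact formulation): reward each step
# by the marginal increase in Shannon entropy of the trajectory's empirical
# state-visitation distribution, so high-coverage trajectories earn more.
from collections import Counter
import math

def entropy(counts: Counter) -> float:
    """Shannon entropy (nats) of the empirical distribution in `counts`."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def information_content_rewards(trajectory):
    """Intrinsic reward at step t = H(visitation up to t) - H(up to t-1)."""
    counts, prev_h, rewards = Counter(), 0.0, []
    for state in trajectory:           # states assumed hashable (discretized)
        counts[state] += 1
        h = entropy(counts)
        rewards.append(h - prev_h)     # marginal information gain of this visit
        prev_h = h
    return rewards

# A repetitive trajectory earns less intrinsic reward than a diverse one.
print(sum(information_content_rewards(["a", "a", "a", "b"])))   # low coverage
print(sum(information_content_rewards(["a", "b", "c", "d"])))   # high coverage
```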
Related papers
- Random Latent Exploration for Deep Reinforcement Learning [71.88709402926415]
This paper introduces a new exploration technique called Random Latent Exploration (RLE)
RLE combines the strengths of bonus-based and noise-based exploration strategies, two popular approaches for effective exploration in deep RL.
We evaluate it on the challenging Atari and IsaacGym benchmarks and show that RLE achieves higher overall scores across tasks than competing approaches.
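A rough sketch of the core idea as the summary describes it, where the feature map `phi_s` and the scale `beta` are illustrative assumptions rather than the paper's design: each episode draws a random latent direction, and the bonus pushes the policy toward states whose features align with it.

```python
# Hedged sketch of episode-wise random latent exploration: perturb the reward
# with a random latent direction each episode so different episodes are driven
# toward different parts of the state space.
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 8

def sample_episode_latent() -> np.ndarray:
    """Draw one random latent direction per episode (unit norm)."""
    z = rng.normal(size=LATENT_DIM)
    return z / np.linalg.norm(z)

def rle_reward(extrinsic: float, phi_s: np.ndarray, z: np.ndarray,
               beta: float = 0.1) -> float:
    """Total reward = task reward + bonus aligned with the episode's latent."""
    return extrinsic + beta * float(phi_s @ z)

z = sample_episode_latent()
phi_s = rng.normal(size=LATENT_DIM)    # stand-in for learned state features
print(rle_reward(1.0, phi_s, z))
```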
arXiv Detail & Related papers (2024-07-18T17:55:22Z)
- Exploration Unbound [26.27811928866858]
A sequential decision-making agent balances between exploring to gain new knowledge and exploiting current knowledge to maximize immediate reward.
We offer a simple, quintessential example of such an environment.
In this environment, rewards are unbounded and an agent can always increase the rate at which rewards accumulate by exploring to learn more.
arXiv Detail & Related papers (2024-07-16T21:14:43Z)
- WESE: Weak Exploration to Strong Exploitation for LLM Agents [95.6720931773781]
This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE) to enhance LLM agents in solving open-world interactive tasks.
WESE decouples the exploration and exploitation processes, employing a cost-effective weak agent to perform exploration and gather global knowledge.
A knowledge graph-based strategy is then introduced to store the acquired knowledge and extract task-relevant knowledge, improving the stronger agent's success rate and efficiency on the exploitation task.
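A very schematic sketch of this division of labor, with placeholder agent functions and a plain dict standing in for the knowledge graph; everything here is an illustrative assumption, not the paper's API.

```python
# Schematic WESE-style split: a cheap "weak" explorer records facts into a
# knowledge graph; a stronger agent queries only task-relevant entries.
knowledge_graph: dict[tuple[str, str], str] = {}

def weak_explore(observations):
    """Cheap agent: record (entity, relation, value) triples it encounters."""
    for entity, relation, value in observations:
        knowledge_graph[(entity, relation)] = value

def strong_exploit(task_entities):
    """Strong agent: retrieve only the facts relevant to the current task."""
    return {k: v for k, v in knowledge_graph.items() if k[0] in task_entities}

weak_explore([("door", "state", "locked"), ("key", "location", "drawer")])
print(strong_exploit({"door"}))
```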
arXiv Detail & Related papers (2024-04-11T03:31:54Z)
- On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
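A small sketch of the general ensemble-uncertainty recipe this summary points at, using an array of toy Q-tables as a stand-in for the paper's distributional networks: act greedily on mean value plus a disagreement bonus, so uncertain states are preferred.

```python
# Hedged sketch of uncertainty-directed action selection: optimism in the
# face of disagreement across an ensemble of Q-value estimates.
import numpy as np

def ucb_action(q_ensemble: np.ndarray, state: int, beta: float = 1.0) -> int:
    """q_ensemble: (n_members, n_states, n_actions) Q-value estimates."""
    q = q_ensemble[:, state, :]               # (n_members, n_actions)
    mean, std = q.mean(axis=0), q.std(axis=0)
    return int(np.argmax(mean + beta * std))  # prefer uncertain, promising actions

rng = np.random.default_rng(0)
ensemble = rng.normal(size=(5, 10, 4))        # 5 members, 10 states, 4 actions
print(ucb_action(ensemble, state=3))
```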
arXiv Detail & Related papers (2023-06-08T18:07:02Z)
- Successor-Predecessor Intrinsic Exploration [18.440869985362998]
We focus on exploration with intrinsic rewards, where the agent transiently augments the external rewards with self-generated intrinsic rewards.
We propose Successor-Predecessor Intrinsic Exploration (SPIE), an exploration algorithm based on a novel intrinsic reward combining prospective and retrospective information.
We show that SPIE yields more efficient and ethologically plausible exploratory behaviour in environments with sparse rewards and bottleneck states than competing methods.
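As a count-based caricature only (SPIE itself uses learned successor and predecessor representations), one might combine a backward-looking rarity term with a forward-looking unexplored-successors term; the counts, weights, and names below are illustrative assumptions.

```python
# Count-based caricature of a "prospective plus retrospective" bonus, loosely
# inspired by the SPIE summary above; not the actual SPIE algorithm.
from collections import Counter, defaultdict

visit_counts: Counter = Counter()
neighbors = defaultdict(set)               # crude record of observed transitions

def spie_like_reward(s, s_next, alpha=1.0, beta=1.0) -> float:
    visit_counts[s_next] += 1
    neighbors[s].add(s_next)
    # retrospective: how rarely the agent has arrived at s_next so far
    retrospective = 1.0 / visit_counts[s_next] ** 0.5
    # prospective: fraction of s_next's known successors still unvisited
    succ = neighbors[s_next]
    prospective = (sum(visit_counts[n] == 0 for n in succ) / len(succ)) if succ else 1.0
    return alpha * retrospective + beta * prospective

print(spie_like_reward("corridor", "bottleneck"))   # novel state: high bonus
print(spie_like_reward("corridor", "bottleneck"))   # bonus decays with revisits
```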
arXiv Detail & Related papers (2023-05-24T16:02:51Z)
- Self-supervised Sequential Information Bottleneck for Robust Exploration in Deep Reinforcement Learning [28.75574762244266]
In this work, we introduce the sequential information bottleneck objective for learning compressed and temporally coherent representations.
For efficient exploration in noisy environments, we further construct intrinsic rewards that capture task-relevant state novelty.
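A generic sketch of novelty measured in a compressed latent space, with a random projection standing in for the learned, temporally coherent encoder; this is an illustrative simplification, not the paper's sequential information bottleneck objective.

```python
# Toy latent-space novelty bonus: compress the observation, then reward by
# distance to the closest latent seen so far this episode.
import numpy as np

rng = np.random.default_rng(0)
ENCODER = rng.normal(size=(64, 8))      # toy "compression": random projection
memory: list[np.ndarray] = []           # latents of states seen this episode

def latent_novelty(obs: np.ndarray) -> float:
    z = obs @ ENCODER                   # compress the raw observation
    bonus = min((np.linalg.norm(z - m) for m in memory), default=1.0)
    memory.append(z)
    return float(bonus)                 # distance to the closest stored latent

print(latent_novelty(rng.normal(size=64)))   # first state: default bonus
print(latent_novelty(rng.normal(size=64)))
```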
arXiv Detail & Related papers (2022-09-12T15:41:10Z)
- GAN-based Intrinsic Exploration For Sample Efficient Reinforcement Learning [0.0]
We propose a Generative Adversarial Network-based Intrinsic Reward Module that learns the distribution of the observed states and produces an intrinsic reward that is high for states that are out of distribution.
We evaluate our approach in Super Mario Bros for a no-reward setting and in Montezuma's Revenge for a sparse-reward setting, and show that our approach is indeed capable of exploring efficiently.
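A minimal sketch of a discriminator-driven bonus in this spirit, with a logistic stub in place of a trained GAN discriminator; the class and its weights are purely illustrative.

```python
# Toy discriminator-driven novelty bonus: states the discriminator recognizes
# as "seen before" get little reward; out-of-distribution states get more.
import numpy as np

class Discriminator:
    """Stub scoring how 'in-distribution' a state looks (1 = familiar)."""
    def __init__(self, dim: int, seed: int = 0):
        self.w = np.random.default_rng(seed).normal(size=dim)

    def score(self, state: np.ndarray) -> float:
        return 1.0 / (1.0 + np.exp(-self.w @ state))   # sigmoid in [0, 1]

def intrinsic_reward(disc: Discriminator, state: np.ndarray) -> float:
    return 1.0 - disc.score(state)   # high when the state looks unfamiliar

disc = Discriminator(dim=4)
print(intrinsic_reward(disc, np.ones(4)))
```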
arXiv Detail & Related papers (2022-06-28T19:16:52Z)
- Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning [5.40729975786985]
This paper explores the idea of combining exploration with auxiliary task learning using General Value Functions (GVFs) and a directed exploration strategy.
We provide a simple way to learn options (sequences of actions) instead of having to handcraft them, and demonstrate the performance advantage in three navigation tasks.
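A tabular sketch of the idea, assuming a novelty cumulant and off-policy TD(0) updates; acting greedily on the learned GVF plays the role of a directed-exploration option. The tabular setting and cumulant choice are illustrative assumptions.

```python
# Toy GVF for directed exploration: predict discounted future novelty and
# follow the greedy policy on that prediction as an option.
import numpy as np

N_STATES, N_ACTIONS, GAMMA, ALPHA = 6, 2, 0.9, 0.1
gvf = np.zeros((N_STATES, N_ACTIONS))        # predicted discounted novelty
visits = np.zeros(N_STATES)

def cumulant(s: int) -> float:
    return 1.0 / np.sqrt(visits[s] + 1.0)    # novelty signal, decays with visits

def gvf_update(s: int, a: int, s_next: int) -> None:
    visits[s_next] += 1
    target = cumulant(s_next) + GAMMA * gvf[s_next].max()
    gvf[s, a] += ALPHA * (target - gvf[s, a])   # off-policy TD(0) update

def option_action(s: int) -> int:
    return int(gvf[s].argmax())              # follow the "nose" toward novelty

gvf_update(0, 1, 3)
print(option_action(0))
```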
arXiv Detail & Related papers (2022-03-02T05:14:11Z)
- Information is Power: Intrinsic Control via Information Capture [110.3143711650806]
We argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
This objective induces an agent to both gather information about its environment, corresponding to reducing uncertainty, and to gain control over its environment, corresponding to reducing the unpredictability of future world states.
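A hedged sketch of the sign-flipped bonus this objective implies: the agent is rewarded for visiting states its own visitation model already predicts well, which drives down visitation entropy. A running unit-variance Gaussian stands in for the paper's latent state-space model.

```python
# Toy entropy-minimization bonus: reward = log-density of the current state
# under a running model of the agent's own visitation distribution.
import numpy as np

class VisitationModel:
    """Running Gaussian (unit variance) estimate of visited states."""
    def __init__(self, dim: int):
        self.mean, self.n = np.zeros(dim), 0

    def update(self, z: np.ndarray) -> None:
        self.n += 1
        self.mean += (z - self.mean) / self.n

    def log_prob(self, z: np.ndarray) -> float:
        d = z - self.mean
        return float(-0.5 * (d @ d + len(d) * np.log(2 * np.pi)))

def intrinsic_reward(model: VisitationModel, z: np.ndarray) -> float:
    model.update(z)
    return model.log_prob(z)   # higher for familiar, controlled states

m = VisitationModel(dim=2)
print(intrinsic_reward(m, np.zeros(2)))   # -log(2*pi): the model's best case
```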
arXiv Detail & Related papers (2021-12-07T18:50:42Z)
- Long-Term Exploration in Persistent MDPs [68.8204255655161]
In this paper, we propose an exploration method called Rollback-Explore (RbExplore), which utilizes the concept of the persistent Markov decision process.
We test our algorithm in the hard-exploration Prince of Persia game, without rewards and domain knowledge.
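A schematic sketch of the rollback loop, assuming hypothetical `env.save_state()` / `env.restore_state()` snapshot hooks (a real persistent-MDP environment would supply its own mechanism): keep snapshots of visited states and periodically restore one instead of restarting from scratch.

```python
# Schematic rollback-style exploration: archive simulator snapshots and
# restore a previously visited state to keep exploring from deep in the game.
import random

def rollback_explore(env, policy, steps: int, rollback_every: int = 100):
    archive = [env.save_state()]            # snapshots of reachable states
    obs = env.reset()
    for t in range(1, steps + 1):
        obs, _, done, _ = env.step(policy(obs))
        archive.append(env.save_state())    # hypothetical snapshot hook
        if done or t % rollback_every == 0:
            env.restore_state(random.choice(archive))   # roll back, continue
```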
arXiv Detail & Related papers (2021-09-21T13:47:04Z)
- Focus on Impact: Indoor Exploration with Intrinsic Motivation [45.97756658635314]
In this work, we propose to train a model with a purely intrinsic reward signal to guide exploration.
We include a neural density model and replace the traditional count-based regularization with an estimated pseudo-count of previously visited states.
We also show that a robot equipped with the proposed approach seamlessly adapts to point-goal navigation and real-world deployment.
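A toy sketch of a pseudo-count bonus, with a hashed visit table standing in for the neural density model described above; the hashing scheme and bonus form are illustrative assumptions.

```python
# Toy pseudo-count bonus: the exploration reward decays as 1/sqrt(count)
# while a state grows familiar.
from collections import Counter

counts: Counter = Counter()

def pseudo_count_bonus(obs_hash: int) -> float:
    counts[obs_hash] += 1
    return 1.0 / counts[obs_hash] ** 0.5    # decays with repeated visits

print(pseudo_count_bonus(42))   # 1.0 on first visit
print(pseudo_count_bonus(42))   # ~0.707 on second visit
```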
arXiv Detail & Related papers (2021-09-14T18:00:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.