ELDEN: Exploration via Local Dependencies
- URL: http://arxiv.org/abs/2310.08702v1
- Date: Thu, 12 Oct 2023 20:20:21 GMT
- Title: ELDEN: Exploration via Local Dependencies
- Authors: Jiaheng Hu, Zizhao Wang, Peter Stone, Roberto Martin-Martin
- Abstract summary: We present ELDEN, Exploration via Local DepENdencies, a novel intrinsic reward that encourages the discovery of new interactions between entities.
We evaluate the performance of ELDEN on four different domains with complex dependencies, ranging from 2D grid worlds to 3D robotic tasks.
- Score: 37.44189774149647
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Tasks with large state space and sparse rewards present a longstanding
challenge to reinforcement learning. In these tasks, an agent needs to explore
the state space efficiently until it finds a reward. To deal with this problem,
the community has proposed to augment the reward function with intrinsic
reward, a bonus signal that encourages the agent to visit interesting states.
In this work, we propose a new way of defining interesting states for
environments with factored state spaces and complex chained dependencies, where
an agent's actions may change the value of one entity that, in turn, may
affect the value of another entity. Our insight is that, in these environments,
interesting states for exploration are states where the agent is uncertain
whether (as opposed to how) entities such as the agent or objects have some
influence on each other. We present ELDEN, Exploration via Local DepENdencies,
a novel intrinsic reward that encourages the discovery of new interactions
between entities. ELDEN utilizes a novel scheme -- the partial derivative of
the learned dynamics -- to model the local dependencies between entities
accurately and computationally efficiently. The uncertainty of the predicted
dependencies is then used as an intrinsic reward to encourage exploration
toward new interactions. We evaluate the performance of ELDEN on four different
domains with complex dependencies, ranging from 2D grid worlds to 3D robotic
tasks. In all domains, ELDEN correctly identifies local dependencies and learns
successful policies, significantly outperforming previous state-of-the-art
exploration methods.
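A minimal sketch of the reward described above, assuming an ensemble of small MLP dynamics models over a factored state vector, autograd Jacobians for the partial derivatives, and a fixed threshold for declaring a dependency (the paper's exact models, thresholding, and uncertainty estimate may differ):

```python
# Hedged sketch of an ELDEN-style intrinsic reward (not the authors' code).
# Assumptions: factored state vector, an ensemble of learned dynamics models,
# and a fixed threshold `eps` for calling a partial derivative a dependency.
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Predicts the next state from (state, action)."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def dependency_matrix(model, state, action, eps: float = 1e-2):
    """Binary matrix D[j, i] = 1 iff |d s'_j / d s_i| > eps (local dependency)."""
    jac = torch.autograd.functional.jacobian(lambda s: model(s, action), state)
    return (jac.abs() > eps).float()

def elden_intrinsic_reward(ensemble, state, action, eps: float = 1e-2):
    """Ensemble disagreement over the predicted dependency graph: high where
    the agent is uncertain WHETHER (not how) entities influence each other."""
    deps = torch.stack([dependency_matrix(m, state, action, eps) for m in ensemble])
    return deps.var(dim=0).mean().item()

if __name__ == "__main__":
    state_dim, action_dim = 8, 2
    ensemble = [DynamicsModel(state_dim, action_dim) for _ in range(5)]
    s, a = torch.randn(state_dim), torch.randn(action_dim)
    print(elden_intrinsic_reward(ensemble, s, a))
```

In the paper the models are trained on collected transitions and the bonus is computed per transition; here, ensemble disagreement on the thresholded Jacobian stands in for the uncertainty over local dependencies.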
Related papers
- Self-Localized Collaborative Perception [49.86110931859302]
We propose CoBEVGlue, a novel self-localized collaborative perception system.
At the core of CoBEVGlue is a novel spatial alignment module, which provides the relative poses between agents.
CoBEVGlue achieves state-of-the-art detection performance under arbitrary localization noises and attacks.
arXiv Detail & Related papers (2024-06-18T15:26:54Z)
- Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning [27.81925751697255]
We propose Imagine, Initialize, and Explore (IIE), a novel method for efficient multi-agent exploration in complex scenarios.
We formulate imagination as a sequence modeling problem, where the states, observations, prompts, actions, and rewards are predicted autoregressively.
By initializing agents at the critical states, IIE significantly increases the likelihood of discovering potentially important underexplored regions (see the sketch after this entry).
arXiv Detail & Related papers (2024-02-28T01:45:01Z)
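A hedged sketch of the sequence-modeling view of imagination above: a causal model predicts trajectory tokens autoregressively, and the most promising imagined step is chosen as an initialization point. The GRU backbone, token encoding, and novelty criterion are illustrative assumptions, not the paper's architecture:

```python
# Illustrative sketch of imagination as autoregressive sequence modeling.
import torch
import torch.nn as nn

class ImaginationModel(nn.Module):
    """Causal model over interleaved (state/observation/prompt/action/reward)
    tokens; each forward pass predicts the next token embedding."""
    def __init__(self, token_dim: int, hidden: int = 128):
        super().__init__()
        self.rnn = nn.GRU(token_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, token_dim)

    def forward(self, tokens):                  # tokens: (B, T, token_dim)
        out, _ = self.rnn(tokens)
        return self.head(out)

@torch.no_grad()
def imagine(model, prompt, horizon: int):
    """Autoregressive rollout: feed each prediction back in as the next input."""
    seq = prompt                                # (1, T0, token_dim)
    for _ in range(horizon):
        nxt = model(seq)[:, -1:, :]
        seq = torch.cat([seq, nxt], dim=1)
    return seq

def pick_critical_state(imagined, novelty_fn):
    """Pick the imagined step with the highest novelty score as the point to
    initialize real exploration from (hypothetical criterion)."""
    scores = torch.tensor([novelty_fn(imagined[0, t]) for t in range(imagined.shape[1])])
    return imagined[0, int(scores.argmax())]

if __name__ == "__main__":
    model = ImaginationModel(token_dim=16)
    prompt = torch.randn(1, 4, 16)              # encoded prompt/state tokens
    trajectory = imagine(model, prompt, horizon=10)
    critical = pick_critical_state(trajectory, novelty_fn=lambda x: float(x.norm()))
    print(critical.shape)
```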
- Successor-Predecessor Intrinsic Exploration [18.440869985362998]
We focus on exploration with intrinsic rewards, where the agent transiently augments the external rewards with self-generated intrinsic rewards.
We propose Successor-Predecessor Intrinsic Exploration (SPIE), an exploration algorithm based on a novel intrinsic reward combining prospective and retrospective information.
We show that SPIE yields more efficient and ethologically plausible exploratory behaviour in environments with sparse rewards and bottleneck states than competing methods (sketched below).
arXiv Detail & Related papers (2023-05-24T16:02:51Z)
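A minimal tabular sketch of combining prospective (successor) and retrospective (predecessor) information into one bonus, in the spirit of SPIE; the update rule and the exact combination used in the paper may differ:

```python
# Illustrative tabular successor/predecessor bonus (not the paper's algorithm).
import numpy as np

class SPIEBonus:
    def __init__(self, n_states: int, gamma: float = 0.95, lr: float = 0.1):
        self.sr = np.eye(n_states)   # successor representation M[s, s']
        self.pr = np.eye(n_states)   # predecessor analog on reversed transitions
        self.gamma, self.lr = gamma, lr

    def update(self, s: int, s_next: int):
        # TD update of the SR: M(s) <- M(s) + lr * (1_s + gamma * M(s') - M(s))
        onehot_s = np.eye(len(self.sr))[s]
        self.sr[s] += self.lr * (onehot_s + self.gamma * self.sr[s_next] - self.sr[s])
        # Same update on the time-reversed transition for the predecessor map.
        onehot_n = np.eye(len(self.pr))[s_next]
        self.pr[s_next] += self.lr * (onehot_n + self.gamma * self.pr[s] - self.pr[s_next])

    def bonus(self, s: int, s_next: int) -> float:
        # Prospective term: s_next is rarely reached (low successor mass).
        prospective = 1.0 / (1e-6 + self.sr[:, s_next].sum())
        # Retrospective term: s is rarely a point of origin (low predecessor mass).
        retrospective = 1.0 / (1e-6 + self.pr[:, s].sum())
        return prospective + retrospective

if __name__ == "__main__":
    spie = SPIEBonus(n_states=6)
    spie.update(0, 1)
    print(spie.bonus(0, 1))
```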
- Information is Power: Intrinsic Control via Information Capture [110.3143711650806]
We argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
This objective induces an agent both to gather information about its environment, corresponding to reducing uncertainty, and to gain control over its environment, corresponding to reducing the unpredictability of future world states (sketched below).
arXiv Detail & Related papers (2021-12-07T18:50:42Z)
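A hedged sketch of that objective: reward the log-density of the current latent state under a running model of the agent's own visitation distribution, so that maximizing return pushes visitation entropy down. A diagonal Gaussian stands in for the paper's latent state-space model:

```python
# Illustrative entropy-minimizing intrinsic reward (assumed simplification).
import numpy as np

class VisitationEntropyReward:
    def __init__(self, latent_dim: int):
        self.mean = np.zeros(latent_dim)
        self.var = np.ones(latent_dim)
        self.count = 1e-4

    def update(self, z: np.ndarray):
        # Running diagonal-Gaussian estimate of the visitation distribution.
        self.count += 1.0
        delta = z - self.mean
        self.mean += delta / self.count
        self.var += (delta * (z - self.mean) - self.var) / self.count

    def reward(self, z: np.ndarray) -> float:
        # log N(z; mean, var): larger in predictable, well-visited regions, so
        # the agent is driven to gather information and gain control.
        return float(-0.5 * np.sum(np.log(2 * np.pi * self.var)
                                   + (z - self.mean) ** 2 / self.var))

if __name__ == "__main__":
    bonus = VisitationEntropyReward(latent_dim=3)
    for _ in range(10):
        bonus.update(0.01 * np.random.randn(3))
    print(bonus.reward(np.zeros(3)))
```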
- Episodic Multi-agent Reinforcement Learning with Curiosity-Driven Exploration [40.87053312548429]
We introduce Episodic Multi-agent reinforcement learning with Curiosity-driven exploration (EMC).
We use prediction errors of individual Q-values as intrinsic rewards for coordinated exploration and utilize episodic memory to exploit explored informative experience to boost policy training (see the sketch below).
arXiv Detail & Related papers (2021-11-22T07:34:47Z)
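A hedged sketch of the two ingredients above: curiosity as the prediction error of individual Q-values, plus an episodic memory of the best return from each state used to lift training targets. The interfaces and the discretized state key are assumptions:

```python
# Illustrative EMC-style curiosity + episodic memory (not the authors' code).
import numpy as np

class EMCStyleBonus:
    def __init__(self):
        self.episodic_memory = {}   # state key -> best return observed

    def intrinsic_reward(self, predicted_q: np.ndarray, actual_q: np.ndarray) -> float:
        # Curiosity: error of a learned predictor of individual Q-values,
        # which spikes when values shift due to novel coordination.
        return float(np.mean((predicted_q - actual_q) ** 2))

    def remember(self, state_key: tuple, episodic_return: float):
        best = self.episodic_memory.get(state_key, -np.inf)
        self.episodic_memory[state_key] = max(best, episodic_return)

    def memory_target(self, state_key: tuple, td_target: float) -> float:
        # Exploit remembered informative experience: take the max of the TD
        # target and the best return seen from this state.
        return max(td_target, self.episodic_memory.get(state_key, -np.inf))

if __name__ == "__main__":
    emc = EMCStyleBonus()
    print(emc.intrinsic_reward(np.array([0.5, 0.2]), np.array([0.4, 0.1])))
    emc.remember(("cell", 3, 4), 1.5)
    print(emc.memory_target(("cell", 3, 4), td_target=1.2))
```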
- Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm (sketched below).
arXiv Detail & Related papers (2021-09-22T10:08:15Z)
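A hedged sketch of local-reward value decomposition in LOMAQ's spirit: the joint value is a monotonic sum of per-partition utilities, and each partition is trained against its own local reward instead of only the shared team reward. The partitioning, networks, and losses here are simplified assumptions:

```python
# Illustrative local-reward value decomposition (not the paper's algorithm).
import torch
import torch.nn as nn

class LocalQ(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs):
        return self.net(obs)

def joint_value(local_qs, observations, actions):
    # Q_tot = sum_k Q_k(o_k, a_k): monotone in each term, so per-partition
    # greedy actions remain consistent with the joint greedy action.
    vals = [q(o).gather(-1, a.unsqueeze(-1)).squeeze(-1)
            for q, o, a in zip(local_qs, observations, actions)]
    return torch.stack(vals).sum(dim=0)

def local_td_losses(local_qs, obs, acts, local_rewards, next_obs, gamma=0.99):
    # Each partition bootstraps from its OWN local reward, which is what
    # keeps learning tractable as the number of agents grows.
    losses = []
    for q, o, a, r, o2 in zip(local_qs, obs, acts, local_rewards, next_obs):
        with torch.no_grad():
            target = r + gamma * q(o2).max(dim=-1).values
        pred = q(o).gather(-1, a.unsqueeze(-1)).squeeze(-1)
        losses.append(((pred - target) ** 2).mean())
    return losses

if __name__ == "__main__":
    qs = [LocalQ(4, 3) for _ in range(2)]
    obs = [torch.randn(5, 4) for _ in range(2)]
    acts = [torch.randint(0, 3, (5,)) for _ in range(2)]
    rews = [torch.randn(5) for _ in range(2)]
    nobs = [torch.randn(5, 4) for _ in range(2)]
    print([round(l.item(), 3) for l in local_td_losses(qs, obs, acts, rews, nobs)])
```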
- Focus on Impact: Indoor Exploration with Intrinsic Motivation [45.97756658635314]
In this work, we propose to train a model with a purely intrinsic reward signal to guide exploration.
We include a neural-based density model and replace the traditional count-based regularization with an estimated pseudo-count of previously visited states (see the sketch below).
We also show that a robot equipped with the proposed approach seamlessly adapts to point-goal navigation and real-world deployment.
arXiv Detail & Related papers (2021-09-14T18:00:07Z)
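A hedged sketch of a pseudo-count-scaled impact bonus: the recoding-probability trick (due to Bellemare et al.) converts a density model's probability of a state before and after one update into an approximate visit count, which then discounts an observation-change ("impact") signal. The norm-based impact and the exact scaling are assumptions:

```python
# Illustrative pseudo-count impact reward (interfaces assumed).
import numpy as np

def pseudo_count(rho_before: float, rho_after: float) -> float:
    """n = rho * (1 - rho') / (rho' - rho), where rho / rho' are the density
    model's probabilities of the state before / after observing it once."""
    gain = max(rho_after - rho_before, 1e-12)   # guard against zero gain
    return rho_before * (1.0 - rho_after) / gain

def impact_reward(enc_t: np.ndarray, enc_t1: np.ndarray,
                  rho_before: float, rho_after: float) -> float:
    impact = float(np.linalg.norm(enc_t1 - enc_t))  # how much the view changed
    n_hat = pseudo_count(rho_before, rho_after)
    return impact / np.sqrt(n_hat + 1.0)            # fade in familiar states

if __name__ == "__main__":
    e0, e1 = np.zeros(8), np.ones(8)
    print(impact_reward(e0, e1, rho_before=0.010, rho_after=0.012))
```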
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward (reconstructed below).
This insight provides theoretical motivation for several fields using prediction rewards.
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
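The "exact error" has a clean reading via the standard cross-entropy decomposition; a hedged reconstruction in assumed notation (belief b over hidden states, learned predictor \hat{b}), which may differ from the paper's precise statement:

```latex
% Expected prediction reward vs. negative belief entropy (sketch).
\mathbb{E}_{s \sim b}\!\left[\log \hat{b}(s)\right]
  = -\,H(b) - \mathrm{KL}\!\left(b \,\|\, \hat{b}\right)
```

That is, the expected prediction reward falls short of the negative entropy by exactly the KL divergence between the belief and the predictor, which vanishes as the predictor converges to the belief.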
- InfoBot: Transfer and Exploration via the Information Bottleneck [105.28380750802019]
A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed.
We propose to learn about decision states from prior experience.
We find that this simple mechanism effectively identifies decision states, even in partially observed settings (sketched below).
arXiv Detail & Related papers (2019-01-30T15:33:58Z)
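A hedged sketch of the decision-state idea above: train a goal-conditioned policy with a bottleneck limiting how much it uses the goal, then flag states where its action distribution diverges from a goal-agnostic default policy; that divergence can serve as an exploration bonus. The architecture and default policy here are illustrative:

```python
# Illustrative decision-state detector (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GoalPolicy(nn.Module):
    """pi(a | s, g): goal-conditioned discrete policy head."""
    def __init__(self, s_dim: int, g_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(s_dim + g_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, s, g):
        return F.log_softmax(self.net(torch.cat([s, g], dim=-1)), dim=-1)

def decision_state_bonus(policy, default_logp, s, g):
    """KL( pi(.|s, g) || pi_0(.|s) ): large exactly where the goal changes the
    action, i.e. at decision states; usable as an exploration bonus."""
    logp = policy(s, g)
    return torch.sum(logp.exp() * (logp - default_logp), dim=-1)

if __name__ == "__main__":
    policy = GoalPolicy(s_dim=4, g_dim=2, n_actions=3)
    s, g = torch.randn(1, 4), torch.randn(1, 2)
    default_logp = torch.log(torch.ones(1, 3) / 3)  # uniform default policy
    print(decision_state_bonus(policy, default_logp, s, g))
```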
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.