Deep Intrinsically Motivated Exploration in Continuous Control
- URL: http://arxiv.org/abs/2210.00293v1
- Date: Sat, 1 Oct 2022 14:52:16 GMT
- Title: Deep Intrinsically Motivated Exploration in Continuous Control
- Authors: Baturay Saglam, Suleyman S. Kozat
- Abstract summary: In continuous systems, exploration is often performed through undirected strategies in which parameters of the networks or selected actions are perturbed by random noise.
We adapt existing theories on animal motivational systems into the reinforcement learning paradigm and introduce a novel directed exploration strategy.
Our framework extends to larger and more diverse state spaces, dramatically improves the baselines, and outperforms the undirected strategies significantly.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In continuous control, exploration is often performed through
undirected strategies in which network parameters or selected actions are
perturbed by random noise. Although deep undirected exploration has been shown
to improve the performance of on-policy methods, it introduces excessive
computational complexity and is known to fail in the off-policy setting.
Intrinsically motivated exploration is an effective alternative to undirected
strategies, but it is usually studied in discrete action domains. In this
paper, we investigate how intrinsic motivation can effectively be combined
with deep reinforcement learning in the control of continuous systems to
obtain directed exploratory behavior. We adapt existing theories on animal
motivational systems to the reinforcement learning paradigm and introduce a
novel and scalable directed exploration strategy. The introduced approach,
motivated by maximizing the value function's error, extracts useful
information from the collected set of experiences and unifies the intrinsic
exploration motivations in the literature under a single exploration
objective. An extensive set of empirical studies demonstrates that our
framework extends to larger and more diverse state spaces, dramatically
improves the baselines, and significantly outperforms the undirected
strategies.
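The abstract describes a directed exploration bonus driven by maximizing the value function's error in an off-policy, continuous-control setting. As a rough illustration only (not the paper's exact algorithm), the sketch below uses the absolute one-step TD error of a critic as the intrinsic term and adds it, scaled by a weight beta, to the extrinsic reward; the network sizes, the bonus weight, and the choice of |TD error| as the value-error proxy are all assumptions.

```python
# Minimal sketch (assumptions throughout): one plausible way to turn the
# value function's error into an intrinsic bonus for an off-policy
# continuous-control agent. Not the paper's exact method.
import torch
import torch.nn as nn

class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def intrinsic_bonus(critic, target_critic, batch, gamma=0.99):
    """Absolute one-step TD error, used here as a proxy for the value
    function's error: transitions where the critic is most wrong receive
    the largest exploration bonus."""
    s, a, r, s2, a2, done = batch
    with torch.no_grad():
        target_q = r + gamma * (1.0 - done) * target_critic(s2, a2)
        td_error = (target_q - critic(s, a)).abs()
    return td_error

def augmented_reward(r_extrinsic, bonus, beta=0.1):
    # The agent is trained on the extrinsic reward plus a scaled intrinsic
    # term, steering exploration toward high-value-error regions.
    return r_extrinsic + beta * bonus
```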
Related papers
- Action abstractions for amortized sampling [49.384037138511246]
We propose an approach to incorporate the discovery of action abstractions, or high-level actions, into the policy optimization process.
Our approach involves iteratively extracting action subsequences commonly used across many high-reward trajectories and 'chunking' them into a single action that is added to the action space.
arXiv Detail & Related papers (2024-10-19T19:22:50Z)
- State-Novelty Guided Action Persistence in Deep Reinforcement Learning [7.05832012052375]
We propose a novel method to dynamically adjust the action persistence based on the current exploration status of the state space.
Our method can be seamlessly integrated into various basic exploration strategies to incorporate temporal persistence.
arXiv Detail & Related papers (2024-09-09T08:34:22Z)
- Random Latent Exploration for Deep Reinforcement Learning [71.88709402926415]
This paper introduces a new exploration technique called Random Latent Exploration (RLE).
RLE combines the strengths of bonus-based and noise-based exploration strategies, two popular approaches for effective exploration in deep RL.
We evaluate it on the challenging Atari and IsaacGym benchmarks and show that RLE achieves higher overall scores across all tasks than the other approaches.
arXiv Detail & Related papers (2024-07-18T17:55:22Z)
- Variable-Agnostic Causal Exploration for Reinforcement Learning [56.52768265734155]
We introduce a novel framework, Variable-Agnostic Causal Exploration for Reinforcement Learning (VACERL).
Our approach automatically identifies crucial observation-action steps associated with key variables using attention mechanisms.
It constructs the causal graph connecting these steps, which guides the agent towards observation-action pairs with greater causal influence on task completion.
arXiv Detail & Related papers (2024-07-17T09:45:27Z)
- Never Explore Repeatedly in Multi-Agent Reinforcement Learning [40.35950679063337]
We propose a dynamic reward scaling approach to combat "revisitation".
We show enhanced performance in demanding environments like Google Research Football and StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2023-08-19T05:27:48Z)
- Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network (an illustrative sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-05-31T17:40:43Z)
- Intrinsic Motivation in Model-based Reinforcement Learning: A Brief Review [77.34726150561087]
This review considers the existing methods for determining intrinsic motivation based on the world model obtained by the agent.
The proposed unified framework describes the architecture of agents using a world model and intrinsic motivation to improve learning.
arXiv Detail & Related papers (2023-01-24T15:13:02Z)
- Reannealing of Decaying Exploration Based On Heuristic Measure in Deep Q-Network [82.20059754270302]
We propose an algorithm based on the idea of reannealing, that aims at encouraging exploration only when it is needed.
We perform an illustrative case study showing that it has potential to both accelerate training and obtain a better policy.
arXiv Detail & Related papers (2020-09-29T20:40:00Z)
- Intrinsic Exploration as Multi-Objective RL [29.124322674133]
Intrinsic motivation enables reinforcement learning (RL) agents to explore when rewards are very sparse.
We propose a framework based on multi-objective RL where both exploration and exploitation are being optimized as separate objectives.
This formulation brings the balance between exploration and exploitation at a policy level, resulting in advantages over traditional methods.
arXiv Detail & Related papers (2020-04-06T02:37:29Z)
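Several of the entries above inject structured noise rather than i.i.d. perturbations; the Lattice entry in particular proposes temporally correlated noise in the policy's latent state. The following is a minimal, illustrative sketch of that general idea, not code from any of the listed papers: an Ornstein-Uhlenbeck-style process (an assumed choice) perturbs the hidden activations of a toy two-layer policy, so consecutive actions share correlated exploration noise.

```python
# Minimal sketch (assumptions throughout): temporally correlated noise added
# to a hidden layer of a policy network. The OU-style noise process and the
# layer sizes are illustrative, not taken from any of the papers above.
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: samples are correlated over time, giving
    smoother, more persistent exploration than independent Gaussian noise."""
    def __init__(self, dim, theta=0.15, sigma=0.2, dt=1e-2):
        self.theta, self.sigma, self.dt = theta, sigma, dt
        self.x = np.zeros(dim)

    def sample(self):
        dx = (-self.theta * self.x * self.dt
              + self.sigma * np.sqrt(self.dt) * np.random.randn(*self.x.shape))
        self.x = self.x + dx
        return self.x

def policy_with_latent_noise(state, w1, b1, w2, b2, noise):
    """Two-layer policy; the exploration noise perturbs the latent state
    (hidden activations) rather than the final action."""
    latent = np.tanh(state @ w1 + b1) + noise.sample()
    return np.tanh(latent @ w2 + b2)

# Usage: one noise process per rollout, shared across time steps.
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(8, 64)) * 0.1, np.zeros(64)
w2, b2 = rng.normal(size=(64, 2)) * 0.1, np.zeros(2)
noise = OUNoise(dim=64)
action = policy_with_latent_noise(rng.normal(size=8), w1, b1, w2, b2, noise)
```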