ACDER: Augmented Curiosity-Driven Experience Replay
- URL: http://arxiv.org/abs/2011.08027v1
- Date: Mon, 16 Nov 2020 15:27:15 GMT
- Title: ACDER: Augmented Curiosity-Driven Experience Replay
- Authors: Boyao Li, Tao Lu, Jiayi Li, Ning Lu, Yinghao Cai, Shuo Wang
- Abstract summary: We propose a novel method called Augmented Curiosity-Driven Experience Replay (ACDER).
ACDER uses a new goal-oriented curiosity-driven exploration to encourage the agent to pursue novel and task-relevant states more purposefully.
Experiments are conducted on four challenging robotic manipulation tasks with binary rewards: Reach, Push, Pick&Place, and Multi-step Push.
- Score: 16.755555854030412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exploration in environments with sparse feedback remains a challenging
research problem in reinforcement learning (RL). When the RL agent explores the
environment randomly, it results in low exploration efficiency, especially in
robotic manipulation tasks with high dimensional continuous state and action
space. In this paper, we propose a novel method, called Augmented
Curiosity-Driven Experience Replay (ACDER), which leverages (i) a new
goal-oriented curiosity-driven exploration to encourage the agent to pursue
novel and task-relevant states more purposefully and (ii) dynamic initial
state selection as an automatic exploratory curriculum to further improve the
sample-efficiency. Our approach complements Hindsight Experience Replay (HER)
by introducing a new way to pursue valuable states. Experiments are conducted on
four challenging robotic manipulation tasks with binary rewards: Reach, Push,
Pick&Place, and Multi-step Push. The empirical results show that
our proposed method significantly outperforms existing methods in the first
three basic tasks and also achieves satisfactory performance in multi-step
robotic task learning.
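To make the abstract's two ingredients concrete, here is a minimal Python sketch of how a goal-oriented curiosity bonus and HER-style goal relabeling could sit together in a replay pipeline. The linear forward model, the goal-weighting term, and the names `ForwardModel`, `curiosity_bonus`, and `her_relabel` are illustrative assumptions rather than the paper's exact formulation; only the HER "future" relabeling strategy itself is standard.

```python
import numpy as np

class ForwardModel:
    """Toy linear forward-dynamics model; its prediction error acts as a curiosity signal."""
    def __init__(self, state_dim, action_dim, lr=1e-2):
        self.W = np.zeros((state_dim, state_dim + action_dim))
        self.lr = lr

    def predict(self, state, action):
        return self.W @ np.concatenate([state, action])

    def update(self, state, action, next_state):
        x = np.concatenate([state, action])
        err = self.W @ x - next_state
        self.W -= self.lr * np.outer(err, x)   # one SGD step on the squared error
        return float(np.sum(err ** 2))

def curiosity_bonus(model, state, action, next_state, goal, scale=1.0):
    """Goal-oriented curiosity (assumed form): novelty weighted by closeness to the goal."""
    novelty = float(np.sum((model.predict(state, action) - next_state) ** 2))
    task_relevance = float(np.exp(-np.linalg.norm(np.asarray(next_state) - np.asarray(goal))))
    return scale * novelty * task_relevance

def her_relabel(episode, k=4, tol=0.05):
    """Standard HER 'future' relabeling: swap the desired goal for an achieved state
    sampled from later in the same episode, then recompute the binary reward."""
    relabeled = []
    for t, (s, a, s_next, _goal) in enumerate(episode):
        for idx in np.random.randint(t, len(episode), size=k):
            new_goal = episode[idx][2]   # an achieved state reused as the goal
            reward = 0.0 if np.allclose(s_next, new_goal, atol=tol) else -1.0
            relabeled.append((s, a, s_next, new_goal, reward))
    return relabeled
```

In this spirit, transitions with a high curiosity bonus could be favored during exploration or replay alongside the relabeled HER goals, which is one plausible reading of "a new way to pursue valuable states".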
Related papers
- O1 Replication Journey: A Strategic Progress Report -- Part 1 [52.062216849476776]
This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey.
Our methodology addresses critical challenges in modern AI research, including the insularity of prolonged team-based projects.
We propose the journey learning paradigm, which encourages models to learn not just shortcuts, but the complete exploration process.
arXiv Detail & Related papers (2024-10-08T15:13:01Z) - Random Latent Exploration for Deep Reinforcement Learning [71.88709402926415]
This paper introduces a new exploration technique called Random Latent Exploration (RLE).
RLE combines the strengths of bonus-based and noise-based exploration strategies, two popular approaches to effective exploration in deep RL.
We evaluate it on the challenging Atari and IsaacGym benchmarks and show that RLE exhibits higher overall scores across all the tasks than other approaches.
arXiv Detail & Related papers (2024-07-18T17:55:22Z) - Contrastive Initial State Buffer for Reinforcement Learning [25.849626996870526]
In Reinforcement Learning, the trade-off between exploration and exploitation poses a complex challenge for achieving efficient learning from limited samples.
We introduce the concept of a Contrastive Initial State Buffer, which strategically selects states from past experiences and uses them to initialize the agent in the environment.
We validate our approach on two complex robotic tasks without relying on any prior information about the environment.
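Conceptually, such a buffer stores visited states together with a usefulness score and occasionally resets episodes from them instead of the default initial state. The sketch below assumes a generic score supplied by the caller; the paper's actual contrastive selection criterion is more specific than this placeholder.

```python
import numpy as np

class InitialStateBuffer:
    """Sketch of an initial-state buffer: keep visited states with a usefulness score
    and sometimes reset episodes from them. The score passed to `add` is a stand-in
    for the paper's contrastive selection criterion."""
    def __init__(self, capacity=1000, use_buffer_prob=0.5):
        self.capacity = capacity
        self.use_buffer_prob = use_buffer_prob
        self.states, self.scores = [], []

    def add(self, state, score):
        if len(self.states) >= self.capacity:   # evict the lowest-scoring entry
            worst = int(np.argmin(self.scores))
            self.states.pop(worst)
            self.scores.pop(worst)
        self.states.append(state)
        self.scores.append(float(score))

    def sample_initial_state(self, default_state):
        if not self.states or np.random.rand() > self.use_buffer_prob:
            return default_state   # fall back to the environment's usual reset state
        probs = np.exp(np.asarray(self.scores) - max(self.scores))
        probs /= probs.sum()       # softmax over scores
        return self.states[np.random.choice(len(self.states), p=probs)]
```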
arXiv Detail & Related papers (2023-09-18T13:26:40Z) - Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
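A hedged sketch of the action-interface idea: the policy outputs a primitive index plus continuous arguments, and each primitive expands into a short sequence of low-level actions. The primitive names, argument layouts, and the `env_step` callable are all illustrative assumptions, not the RAPS library itself.

```python
import numpy as np

# Hypothetical primitive library: each primitive expands its continuous arguments
# into a short sequence of low-level (dx, dy, dz, gripper) actions.
def move_delta(args):
    step = np.concatenate([np.clip(args[:3], -1.0, 1.0), [0.0]])
    return [step] * 5   # repeat a small Cartesian motion for a few control steps

def grasp(args):
    close = float(np.clip(args[0], 0.0, 1.0))
    return [np.array([0.0, 0.0, 0.0, close])] * 3   # close the gripper gradually

PRIMITIVES = [move_delta, grasp]

def execute_primitive(env_step, primitive_idx, primitive_args):
    """The policy chooses a primitive index and its continuous arguments; the primitive
    is unrolled as several low-level control steps via the (assumed) env_step callable."""
    obs, total_reward, done = None, 0.0, False
    for low_level_action in PRIMITIVES[primitive_idx](primitive_args):
        obs, reward, done = env_step(low_level_action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done
```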
arXiv Detail & Related papers (2021-10-28T17:59:30Z) - Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration [20.38772636693469]
We argue that merely using curiosity for fast environment exploration or as a bonus reward for a specific task does not harness the full potential of this technique.
We propose to shift the focus towards retaining the behaviours which emerge during curiosity-based learning.
arXiv Detail & Related papers (2021-09-17T15:28:25Z) - BeBold: Exploration Beyond the Boundary of Explored Regions [66.88415950549556]
In this paper, we propose the regulated difference of inverse visitation counts as a simple but effective criterion for intrinsic reward (IR).
The criterion helps the agent explore Beyond the Boundary of explored regions and mitigates common issues in count-based methods, such as short-sightedness and detachment.
The resulting method, BeBold, solves the 12 most challenging procedurally-generated tasks in MiniGrid with just 120M environment steps, without any curriculum learning.
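Below is a minimal sketch of a "regulated difference of inverse visitation counts" bonus, assuming discrete, hashable states for which exact counts can be kept; the paper instead uses pseudo-counts over learned embeddings for high-dimensional inputs, and the episodic first-visit restriction shown here is an illustrative addition in the same spirit as the mitigation of detachment mentioned above.

```python
from collections import defaultdict

class BeBoldBonus:
    """Sketch of a regulated difference of inverse visitation counts over
    hashable states (a simplification of the paper's pseudo-count setting)."""
    def __init__(self):
        self.visit_counts = defaultdict(int)   # lifelong visitation counts
        self.seen_this_episode = set()

    def reset_episode(self):
        self.seen_this_episode.clear()

    def intrinsic_reward(self, state, next_state):
        self.visit_counts[next_state] += 1           # count the state just entered
        n_prev = max(self.visit_counts[state], 1)
        # Clipped-at-zero ("regulated") difference of inverse counts:
        bonus = max(1.0 / self.visit_counts[next_state] - 1.0 / n_prev, 0.0)
        # Illustrative episodic restriction: only reward the first visit per episode.
        if next_state in self.seen_this_episode:
            return 0.0
        self.seen_this_episode.add(next_state)
        return bonus
```

Because the bonus is positive only when the agent steps from a better-visited state into a less-visited one, it keeps pushing exploration outward at the frontier rather than rewarding revisits deep inside already-explored regions.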
arXiv Detail & Related papers (2020-12-15T21:26:54Z) - Batch Exploration with Examples for Scalable Robotic Reinforcement Learning [63.552788688544254]
Batch Exploration with Examples (BEE) explores relevant regions of the state-space guided by a modest number of human-provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
arXiv Detail & Related papers (2020-10-22T17:49:25Z) - Planning to Explore via Self-Supervised World Models [120.31359262226758]
We present Plan2Explore, a self-supervised reinforcement learning agent, as a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
arXiv Detail & Related papers (2020-05-12T17:59:45Z) - RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments [15.736899098702972]
We propose a novel type of intrinsic reward which encourages the agent to take actions that lead to significant changes in its learned state representation.
We evaluate our method on multiple challenging procedurally-generated tasks in MiniGrid.
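A small sketch of an impact-driven bonus of this kind, assuming a learned embedding function and per-episode visit counts over hashable states; both the `embed` placeholder and the count-based discounting are illustrative stand-ins rather than the paper's exact formulation.

```python
import numpy as np

def ride_bonus(embed, state, next_state, episodic_counts):
    """Impact-driven bonus sketch: the L2 change in a learned state embedding,
    discounted by per-episode visits to the new state. `embed` stands in for a
    learned representation, and `episodic_counts` is a per-episode dict keyed
    by hashable states."""
    impact = float(np.linalg.norm(embed(next_state) - embed(state)))
    episodic_counts[next_state] = episodic_counts.get(next_state, 0) + 1
    return impact / np.sqrt(episodic_counts[next_state])
```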
arXiv Detail & Related papers (2020-02-27T18:03:16Z)