Batch Exploration with Examples for Scalable Robotic Reinforcement Learning
- URL: http://arxiv.org/abs/2010.11917v2
- Date: Fri, 23 Apr 2021 18:27:28 GMT
- Title: Batch Exploration with Examples for Scalable Robotic Reinforcement Learning
- Authors: Annie S. Chen, HyunJi Nam, Suraj Nair, Chelsea Finn
- Abstract summary: Batch Exploration with Examples (BEE) explores relevant regions of the state space, guided by a modest number of human-provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
- Score: 63.552788688544254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning from diverse offline datasets is a promising path towards learning general-purpose robotic agents. However, a core challenge in this paradigm lies in collecting large amounts of meaningful data without depending on a human in the loop for data collection. One way to address this challenge is through task-agnostic exploration, where an agent attempts to explore without a task-specific reward function and to collect data that can be useful for any downstream task. While these approaches have shown some promise in simple domains, they often struggle to explore the relevant regions of the state space in more challenging settings, such as vision-based robotic manipulation. This challenge stems from an objective that encourages exploring everything in a potentially vast state space. To mitigate this challenge, we propose to focus exploration on the important parts of the state space using weak human supervision. Concretely, we propose an exploration technique, Batch Exploration with Examples (BEE), that explores relevant regions of the state space, guided by a modest number of human-provided images of important states. These human-provided images need to be collected only once, at the beginning of data collection, and can be gathered in a matter of minutes, allowing us to scalably collect diverse datasets, which can then be combined with any batch RL algorithm. We find that BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot, and observe that, compared to task-agnostic and weakly supervised exploration techniques, it (1) interacts more than twice as often with relevant objects, and (2) improves downstream task performance when used in conjunction with offline RL.
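The abstract leaves the exploration objective at a high level. As one hypothetical instantiation (not the authors' implementation), the sketch below trains a small ensemble of classifiers to separate the human-provided example images from ordinary replay data and uses their mean predicted relevance as an intrinsic reward. The architecture, ensemble size, and 64x64 RGB inputs are all assumptions made for illustration.

```python
# Hypothetical sketch of relevance-guided exploration in the spirit of BEE:
# classifiers trained to tell human example images apart from replay data,
# with their mean prediction used as an exploration reward. Architecture,
# ensemble size, and 64x64 RGB inputs are assumptions, not the paper's code.
import torch
import torch.nn as nn

class RelevanceClassifier(nn.Module):
    """Scores how much an observation resembles the human example images."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 14 * 14, 1),  # logit, sized for 64x64 RGB inputs
        )

    def forward(self, obs):
        return self.net(obs).squeeze(-1)

def train_step(ensemble, optimizers, human_examples, replay_batch):
    """One step of binary classification: examples (1) vs. replay states (0)."""
    bce = nn.BCEWithLogitsLoss()
    for clf, opt in zip(ensemble, optimizers):
        pos, neg = clf(human_examples), clf(replay_batch)
        loss = bce(pos, torch.ones_like(pos)) + bce(neg, torch.zeros_like(neg))
        opt.zero_grad()
        loss.backward()
        opt.step()

def exploration_bonus(ensemble, obs):
    """Mean predicted relevance across the ensemble, used as intrinsic reward."""
    with torch.no_grad():
        return torch.stack(
            [torch.sigmoid(clf(obs)) for clf in ensemble]).mean(dim=0)

# Usage sketch: the example images are collected once, up front, in minutes.
ensemble = [RelevanceClassifier() for _ in range(5)]
optimizers = [torch.optim.Adam(c.parameters(), lr=1e-4) for c in ensemble]
human_examples = torch.rand(8, 3, 64, 64)   # stand-in for provided images
replay_batch = torch.rand(64, 3, 64, 64)    # stand-in for collected data
train_step(ensemble, optimizers, human_examples, replay_batch)
bonus = exploration_bonus(ensemble, replay_batch)  # guides the explorer
```

The ensemble is used here only to average out individual classifiers' errors; any relevance estimator trained on the example images could play the same role in this sketch.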
Related papers
- Label-Efficient 3D Object Detection For Road-Side Units [10.663986706501188]
Collaborative perception can enhance the perception of autonomous vehicles via deep information fusion with intelligent roadside units (RSU).
The data-hungry nature of these methods creates a major hurdle for their real-world deployment, particularly due to the need for annotated RSU data.
We devise a label-efficient object detection method for RSU based on unsupervised object discovery.
arXiv Detail & Related papers (2024-04-09T12:29:16Z)
- Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning [27.81925751697255]
We propose a novel method for efficient multi-agent exploration in complex scenarios.
We formulate the imagination as a sequence modeling problem, where the states, observations, prompts, actions, and rewards are predicted autoregressively.
By initializing agents at these critical states, IIE (Imagine, Initialize, and Explore) significantly increases the likelihood of discovering potentially important, underexplored regions (a hedged sketch of this sequence-modeling step appears after this list).
arXiv Detail & Related papers (2024-02-28T01:45:01Z)
- Transfer learning with generative models for object detection on limited datasets [1.4999444543328293]
In some fields, such as marine biology, it is necessary to have correctly labeled bounding boxes around each object.
We propose a transfer learning framework that is valid for a generic scenario.
Our results pave the way for new generative AI-based protocols for machine learning applications in various domains.
arXiv Detail & Related papers (2024-02-09T21:17:31Z)
- Accelerating Exploration with Unlabeled Prior Data [66.43995032226466]
We study how prior data without reward labels may be used to guide and accelerate exploration for an agent solving a new sparse reward task.
We propose a simple approach that learns a reward model from online experience, labels the unlabeled prior data with optimistic rewards, and then uses the relabeled prior data concurrently alongside the online data for downstream policy and critic optimization (a hedged sketch of this relabeling step appears after this list).
arXiv Detail & Related papers (2023-11-09T00:05:17Z)
- Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
- FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection [4.534713782093219]
A novel end-to-end transformer-based framework (FGAHOI) is proposed to address these challenges in human-object interaction detection.
FGAHOI comprises three dedicated components: multi-scale sampling (MSS), hierarchical spatial-aware merging (HSAM), and a task-aware merging mechanism (TAM).
arXiv Detail & Related papers (2023-01-08T03:53:50Z)
- Rapid Exploration for Open-World Navigation with Latent Goal Models [78.45339342966196]
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images.
We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration.
arXiv Detail & Related papers (2021-04-12T23:14:41Z)
- COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning [78.13740204156858]
We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
arXiv Detail & Related papers (2020-10-27T17:57:29Z)
- Planning to Explore via Self-Supervised World Models [120.31359262226758]
We present Plan2Explore, a self-supervised reinforcement learning agent, as a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
arXiv Detail & Related papers (2020-05-12T17:59:45Z)
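Two of the entries above describe mechanisms concrete enough to sketch. First, the Imagine, Initialize, and Explore (IIE) entry frames imagination as sequence modeling, with states, actions, and rewards predicted autoregressively. The toy causal transformer below is a hypothetical illustration of that framing only; the continuous tokens, interleaving scheme, and all dimensions are assumptions, not the paper's model.

```python
# Hypothetical sketch of "imagination as sequence modeling": states, actions,
# and rewards are interleaved into one sequence and predicted autoregressively.
# The tiny causal transformer and all dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class ImaginationModel(nn.Module):
    def __init__(self, state_dim, act_dim, width=64, layers=2, heads=4):
        super().__init__()
        # Project each modality into a shared token width.
        self.embed_state = nn.Linear(state_dim, width)
        self.embed_action = nn.Linear(act_dim, width)
        self.embed_reward = nn.Linear(1, width)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=width, nhead=heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.head_state = nn.Linear(width, state_dim)  # predicts next state

    def forward(self, states, actions, rewards):
        # Interleave tokens as (s_0, a_0, r_0, s_1, a_1, r_1, ...).
        B, T, _ = states.shape
        toks = torch.stack([
            self.embed_state(states),
            self.embed_action(actions),
            self.embed_reward(rewards.unsqueeze(-1)),
        ], dim=2).reshape(B, 3 * T, -1)
        # Causal mask so each token attends only to the past.
        mask = nn.Transformer.generate_square_subsequent_mask(3 * T)
        h = self.backbone(toks, mask=mask)
        # Read next-state predictions off the reward-token positions,
        # i.e. after the model has seen (s_t, a_t, r_t).
        return self.head_state(h[:, 2::3, :])

# Usage sketch: roll the model forward to "imagine" trajectories, then pick
# critical states from them to initialize real agents at.
model = ImaginationModel(state_dim=10, act_dim=4)
s, a, r = torch.randn(8, 5, 10), torch.randn(8, 5, 4), torch.randn(8, 5)
next_states = model(s, a, r)  # (8, 5, 10): one next-state prediction per step
```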
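Second, the Accelerating Exploration with Unlabeled Prior Data entry describes a concrete recipe: fit a reward model on online experience, relabel reward-free prior data optimistically, and train on the mixture. The sketch below illustrates just the relabeling step. The ensemble, the UCB-style label `mean + beta * std`, and all dimensions are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch of optimistic relabeling of reward-free prior data.
import torch
import torch.nn as nn

def make_reward_model(obs_dim, act_dim):
    # Small MLP mapping (state, action) to a scalar reward estimate.
    return nn.Sequential(nn.Linear(obs_dim + act_dim, 256), nn.ReLU(),
                         nn.Linear(256, 1))

def fit_reward_models(models, obs, act, rew, steps=200, lr=1e-3):
    """Regress each ensemble member onto rewards observed online."""
    opts = [torch.optim.Adam(m.parameters(), lr=lr) for m in models]
    x = torch.cat([obs, act], dim=-1)
    for _ in range(steps):
        for m, opt in zip(models, opts):
            loss = ((m(x).squeeze(-1) - rew) ** 2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()

def optimistic_labels(models, obs, act, beta=1.0):
    """Label reward-free prior transitions with an upper-confidence estimate."""
    x = torch.cat([obs, act], dim=-1)
    with torch.no_grad():
        preds = torch.stack([m(x).squeeze(-1) for m in models])
    # Optimism: ensemble disagreement raises the label, encouraging the
    # agent to revisit uncertain regions covered by the prior data.
    return preds.mean(dim=0) + beta * preds.std(dim=0)

# Usage sketch: relabeled prior data is then mixed with online data for
# ordinary policy and critic updates.
models = [make_reward_model(17, 6) for _ in range(4)]
online_obs, online_act = torch.randn(256, 17), torch.randn(256, 6)
online_rew = torch.randn(256)
fit_reward_models(models, online_obs, online_act, online_rew)
prior_obs, prior_act = torch.randn(1024, 17), torch.randn(1024, 6)
prior_rew = optimistic_labels(models, prior_obs, prior_act)
```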
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.