Automating Reinforcement Learning with Example-based Resets
- URL: http://arxiv.org/abs/2204.02041v2
- Date: Wed, 6 Apr 2022 02:21:58 GMT
- Title: Automating Reinforcement Learning with Example-based Resets
- Authors: Jigang Kim, J. hyeon Park, Daesol Cho and H. Jin Kim
- Abstract summary: Existing reinforcement learning algorithms assume an episodic setting in which the agent resets to a fixed initial state distribution at the end of each episode.
We propose an extension to conventional reinforcement learning towards greater autonomy by introducing an additional agent that learns to reset in a self-supervised manner.
We apply our method to learn from scratch on a suite of simulated and real-world continuous control tasks and demonstrate that the reset agent successfully learns to reduce manual resets.
- Score: 19.86233948960312
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep reinforcement learning has enabled robots to learn motor skills from
environmental interactions with minimal to no prior knowledge. However,
existing reinforcement learning algorithms assume an episodic setting, in which
the agent resets to a fixed initial state distribution at the end of each
episode, to successfully train the agents from repeated trials. Such a reset
mechanism, while trivial for simulated tasks, can be challenging to provide for
real-world robotics tasks. Resets in robotic systems often require extensive
human supervision and task-specific workarounds, which contradicts the goal of
autonomous robot learning. In this paper, we propose an extension to
conventional reinforcement learning towards greater autonomy by introducing an
additional agent that learns to reset in a self-supervised manner. The reset
agent preemptively triggers a reset to prevent manual resets and implicitly
imposes a curriculum for the forward agent. We apply our method to learn from
scratch on a suite of simulated and real-world continuous control tasks and
demonstrate that the reset agent successfully learns to reduce manual resets
whilst also allowing the forward policy to improve gradually over time.
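
To make the two-agent structure concrete, here is a minimal sketch of the training loop the abstract describes, assuming a value-threshold trigger for the preemptive reset; all method names (`predicted_reset_value`, `is_reset`, the phase lengths) are hypothetical stand-ins rather than the paper's actual interfaces.

```python
# Sketch only: the threshold rule and all interfaces are assumptions.

def train_with_learned_resets(env, forward_agent, reset_agent,
                              phase_len=200, num_phases=1000,
                              reset_value_threshold=0.5):
    """Alternate forward and reset phases; fall back to a manual
    reset only when the learned reset policy fails."""
    manual_resets = 0
    obs = env.reset()  # one initial manual reset
    for _ in range(num_phases):
        # Forward phase: pursue the task, but abort preemptively from
        # states the reset agent predicts it cannot recover from.
        for _ in range(phase_len):
            if reset_agent.predicted_reset_value(obs) < reset_value_threshold:
                break  # preemptive trigger, before the state is irreversible
            obs, reward, done = env.step(forward_agent.act(obs))
            forward_agent.update(obs, reward)
        # Reset phase: drive the system back to the initial state
        # distribution; here the reward is assumed to come from a
        # classifier over example reset states (hence "example-based").
        for _ in range(phase_len):
            obs, _, _ = env.step(reset_agent.act(obs))
            reset_agent.update(obs, reset_agent.reward(obs))
            if reset_agent.is_reset(obs):
                break
        else:  # reset budget exhausted: request human intervention
            obs = env.reset()
            manual_resets += 1
    return manual_resets
```

Under this reading, manual interventions shrink over training because the reset agent both gets better at resetting and learns to veto forward excursions that would strand the system.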
Related papers
- Self-Supervised Curriculum Generation for Autonomous Reinforcement Learning without Task-Specific Knowledge [25.168236693829783]
A significant bottleneck in applying current reinforcement learning algorithms to real-world scenarios is the need to reset the environment between every episode.
We propose a novel ARL algorithm that can generate a curriculum adaptive to the agent's learning progress without task-specific knowledge.
arXiv Detail & Related papers (2023-11-15T18:40:10Z)
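
The progress-adaptive curriculum in the entry above can be pictured with a small sketch that uses only observed success rates and no task-specific knowledge; the class, window, and thresholds below are illustrative assumptions, not the paper's algorithm.

```python
import random
from collections import defaultdict, deque

class AdaptiveCurriculum:
    """Propose start states of intermediate difficulty, judged only by
    the agent's own recent success rates (no task-specific knowledge)."""

    def __init__(self, window=20, low=0.2, high=0.8):
        self.outcomes = defaultdict(lambda: deque(maxlen=window))
        self.low, self.high = low, high

    def record(self, start, success):
        self.outcomes[start].append(success)

    def propose(self, candidates):
        def frontier(s):
            hist = self.outcomes[s]
            rate = sum(hist) / len(hist) if hist else 0.5
            return self.low <= rate <= self.high
        # Unseen starts default to 0.5, so they get explored too; if
        # nothing sits on the frontier, fall back to the full pool.
        pool = [s for s in candidates if frontier(s)] or list(candidates)
        return random.choice(pool)
```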
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
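
The recipe in the entry above follows a common offline-to-online pattern: pre-train on prior robot data, then fine-tune the same policy with live experience. A schematic sketch under that assumption; every name here is a placeholder, not RoboFuME's API.

```python
def pretrain_then_finetune(algo, offline_dataset, env, online_steps=10_000):
    # Phase 1: offline RL on prior robot data, no environment interaction.
    for batch in offline_dataset.batches():
        algo.update_offline(batch)
    # Phase 2: fine-tune online, mixing fresh experience with offline
    # data so early updates stay stable.
    obs = env.reset()
    for _ in range(online_steps):
        action = algo.act(obs)
        next_obs, reward, done = env.step(action)
        algo.buffer.add(obs, action, reward, next_obs, done)
        algo.update_online(algo.buffer.sample(), offline_dataset.sample())
        obs = env.reset() if done else next_obs
    return algo
```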
- Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning [70.70104870417784]
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
In practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment.
In this work, we study how these challenges can be tackled by effective utilization of diverse offline datasets collected from previously seen tasks.
arXiv Detail & Related papers (2022-07-11T08:31:22Z)
- Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states.
VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z)
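
One way to picture a curriculum of initial states in the spirit of VaPRL: start episodes from the easiest subgoal the agent has not yet mastered, and hand over to the task's true initial state once value estimates clear a competence threshold. The selection rule below is illustrative, not VaPRL's exact objective.

```python
def choose_start(subgoals, value_fn, task_start, competence=0.9):
    """subgoals: candidate start states ordered easy (near the goal)
    to hard (near task_start); value_fn: current value estimate."""
    for s in subgoals:
        if value_fn(s) < competence:
            return s  # easiest subgoal the agent has not yet mastered
    return task_start  # all subgoals mastered: use the real start state
```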
- Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention [67.1936055742498]
We show that multi-task learning can effectively scale reset-free learning schemes to much more complex problems.
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
arXiv Detail & Related papers (2021-04-22T17:38:27Z)
- Continual Learning of Control Primitives: Skill Discovery via Reset-Games [128.36174682118488]
We show how a single method can allow an agent to acquire skills with minimal supervision.
We do this by exploiting the insight that the need to "reset" an agent to a broad set of initial states for a learning task provides a natural setting to learn a diverse set of "reset-skills".
arXiv Detail & Related papers (2020-11-10T18:07:44Z)
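
The "reset-skills" insight above pairs naturally with a diversity objective. The sketch below substitutes a standard discriminator-based pseudo-reward (as in skill-discovery methods like DIAYN) for the paper's formulation, which may differ; all interfaces are assumptions.

```python
import random

def reset_skill_phase(env, skills, discriminator, obs, horizon=100):
    """Practice one of several reset skills, rewarded for reaching
    states the discriminator can attribute to that skill."""
    z = random.randrange(len(skills))  # pick a skill to practice
    for _ in range(horizon):
        obs, _, done = env.step(skills[z].act(obs))
        # Diversity pseudo-reward: how identifiable is skill z from
        # the state it reached?
        skills[z].update(obs, discriminator.log_prob(z, obs))
        discriminator.update(obs, z)
        if done:
            break
    return obs
```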
- Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
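
The relabeling idea in the entry above (a trial that fails task A may still demonstrate task B) reduces to routing each trajectory into the demonstration buffer of every task it happens to solve. A minimal sketch, with `solves` as a hypothetical success check:

```python
def relabel_trials(trajectories, tasks, solves, demo_buffers):
    """Route every trajectory into the demonstration buffer of each
    task it solves, even if it was collected for a different task."""
    for traj in trajectories:
        for task in tasks:
            if solves(traj, task):
                demo_buffers[task].append(traj)  # a free demonstration
    return demo_buffers
```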
- On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning [5.482532589225552]
We present a behaviour-based reinforcement learning approach, inspired by Brooks' subsumption architecture.
Our working assumption is that a pick-and-place robotic task can be simplified by leveraging the domain knowledge of a robotics developer.
Our approach learns the pick-and-place task in 8,000 episodes, which represents a drastic reduction in the number of training episodes required by an end-to-end approach.
arXiv Detail & Related papers (2020-01-22T11:49:52Z)
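
For reference, the subsumption architecture mentioned above composes simple reactive behaviours by priority, with higher layers overriding lower ones when their triggers fire. A minimal illustrative controller; the layer names are assumptions, not the paper's design.

```python
def subsumption_act(obs, layers):
    """layers: (trigger, behaviour) pairs ordered low to high priority,
    e.g. [wander, approach_object, grasp, place]; a firing higher
    layer subsumes (overrides) everything below it."""
    action = None  # give the lowest layer an always-true trigger
    for trigger, behaviour in layers:
        if trigger(obs):
            action = behaviour(obs)
    return action
```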
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.