Asynchronous Reinforcement Learning for Real-Time Control of Physical Robots
- URL: http://arxiv.org/abs/2203.12759v1
- Date: Wed, 23 Mar 2022 23:05:28 GMT
- Title: Asynchronous Reinforcement Learning for Real-Time Control of Physical Robots
- Authors: Yufeng Yuan, Rupam Mahmood
- Abstract summary: We show that when learning updates are expensive, the performance of sequential learning diminishes and is outperformed by asynchronous learning by a substantial margin.
Our system learns in real-time to reach and track visual targets from pixels within two hours of experience and does so directly using real robots.
- Score: 2.3061446605472558
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An oft-ignored challenge of real-world reinforcement learning is that the
real world does not pause when agents make learning updates. As standard
simulated environments do not address this real-time aspect of learning, most
available implementations of RL algorithms process environment interactions and
learning updates sequentially. As a consequence, when such implementations are
deployed in the real world, they may make decisions based on significantly
delayed observations and not act responsively. Asynchronous learning has been
proposed to solve this issue, but no systematic comparison between sequential
and asynchronous reinforcement learning has been conducted in real-world
environments. In this work, we set up two vision-based tasks with a robotic
arm, implement an asynchronous learning system that extends a previous
architecture, and compare sequential and asynchronous reinforcement learning
across different action cycle times, sensory data dimensions, and mini-batch
sizes. Our experiments show that when the time cost of learning updates
increases, the action cycle time in the sequential implementation can grow
excessively long, while the asynchronous implementation always maintains an
appropriate action cycle time. Consequently, when learning updates are
expensive, the performance of sequential learning diminishes and is
outperformed by asynchronous learning by a substantial margin. Our system
learns in real-time to reach and track visual targets from pixels within two
hours of experience and does so directly using real robots, learning completely
from scratch.
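The distinction the abstract draws is architectural: in the sequential setup, the agent cannot sense or act while a learning update runs, whereas in the asynchronous setup a separate learner thread updates from a shared buffer while the actor keeps its cycle. The Python sketch below is illustrative only, not the authors' system; the environment step, the update, and the timing constants are stand-in stubs chosen to show why a blocking update inflates the action cycle while a background learner leaves it fixed.

```python
import random
import threading
import time
from collections import deque

ACTION_CYCLE = 0.05   # target action cycle time in seconds (assumed value)
UPDATE_COST = 0.20    # simulated cost of one learning update (assumed value)

replay = deque(maxlen=10_000)           # shared experience buffer
replay_lock = threading.Lock()
stop = threading.Event()

def act_and_observe():
    """Stand-in for one real-time environment interaction from pixels."""
    time.sleep(0.005)                   # simulated sensing/actuation latency
    return random.random()              # stand-in transition

def learning_update():
    """Stand-in for one expensive mini-batch learning update."""
    time.sleep(UPDATE_COST)

def sequential_loop(steps):
    """Interaction and updates interleave: each update blocks the next
    action, so the cycle time grows with the update cost."""
    for _ in range(steps):
        start = time.monotonic()
        replay.append(act_and_observe())
        learning_update()
        print(f"sequential cycle: {time.monotonic() - start:.3f}s")

def learner():
    """Background thread: keep making updates while data is available."""
    while not stop.is_set():
        with replay_lock:
            have_data = len(replay) > 0
        if have_data:
            learning_update()

def asynchronous_loop(steps):
    """The actor holds a fixed cycle time; updates run concurrently."""
    thread = threading.Thread(target=learner, daemon=True)
    thread.start()
    for _ in range(steps):
        start = time.monotonic()
        transition = act_and_observe()
        with replay_lock:
            replay.append(transition)
        elapsed = time.monotonic() - start
        time.sleep(max(0.0, ACTION_CYCLE - elapsed))
        print(f"asynchronous cycle: {time.monotonic() - start:.3f}s")
    stop.set()
    thread.join()

if __name__ == "__main__":
    sequential_loop(3)    # cycle ~ UPDATE_COST, far above the target
    asynchronous_loop(3)  # cycle ~ ACTION_CYCLE, regardless of update cost
```

Running it prints sequential cycles near UPDATE_COST and asynchronous cycles near ACTION_CYCLE, the same qualitative gap the experiments report.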
Related papers
- Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference [22.106900089984318]
Realtime environments change even as agents perform action inference and learning.
Recent advances in machine learning involve larger neural networks with longer inference times.
We present an analysis of lower bounds on regret in realtime reinforcement learning.
arXiv Detail & Related papers (2024-12-18T21:43:40Z)
- Normalization and effective learning rates in reinforcement learning
Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature.
We show that normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network parameters and decay in the effective learning rate.
We propose to make the learning rate schedule explicit with a simple reparameterization, which we call Normalize-and-Project (see the sketch after this entry).
arXiv Detail & Related papers (2024-07-01T20:58:01Z)
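The claimed equivalence can be made concrete: for a scale-invariant layer, f(c*w) = f(w), the gradient is orthogonal to w and shrinks as 1/||w||, so a plain SGD step of size lr moves the direction w/||w|| by roughly lr/||w||^2; as ||w|| grows, the effective learning rate decays. The NumPy toy below is a sketch under those assumptions, not the paper's implementation; the stand-in gradient and constants are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
lr = 0.5  # deliberately large, so the effect is visible in a few steps

def grad(w):
    """Stand-in gradient of a scale-invariant loss f(c*w) = f(w): a unit
    direction orthogonal to w, scaled by 1/||w|| as such gradients are."""
    g = rng.normal(size=w.shape)
    g -= (g @ w) / (w @ w) * w          # remove the component along w
    g /= np.linalg.norm(g)              # unit direction
    return g / np.linalg.norm(w)        # 1/||w|| gradient scaling

# Plain SGD: orthogonal steps grow ||w||, so the effective step on the
# direction w/||w|| (about lr / ||w||^2) decays even though lr is fixed.
w = rng.normal(size=16)
w /= np.linalg.norm(w)                  # start at unit norm
for step in range(5):
    w -= lr * grad(w)
    print(f"step {step}: ||w|| = {np.linalg.norm(w):.3f}, "
          f"effective lr ~ {lr / (w @ w):.3f}")

# Normalize-and-Project-style step: project back to unit norm after each
# update, making the explicit lr the only schedule in effect.
w = rng.normal(size=16)
w /= np.linalg.norm(w)
for step in range(5):
    w -= lr * grad(w)
    w /= np.linalg.norm(w)              # projection keeps ||w|| = 1
```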
- NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research [96.53307645791179]
We introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks.
Despite being limited to classification, the resulting stream has a rich diversity of tasks, from OCR to texture analysis, scene recognition, and so forth.
Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks.
arXiv Detail & Related papers (2022-11-15T18:57:46Z)
- Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning [70.70104870417784]
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
In practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment.
In this work, we study how these challenges can be tackled by effective utilization of diverse offline datasets collected from previously seen tasks.
arXiv Detail & Related papers (2022-07-11T08:31:22Z)
- Continual Predictive Learning from Videos [100.27176974654559]
We study a new continual learning problem in the context of video prediction.
We propose the continual predictive learning (CPL) approach, which learns a mixture world model via predictive experience replay.
We construct two new benchmarks based on RoboNet and KTH, in which different tasks correspond to different physical robotic environments or human actions.
arXiv Detail & Related papers (2022-04-12T08:32:26Z)
- Learning Without a Global Clock: Asynchronous Learning in a Physics-Driven Learning Network [1.3124513975412255]
We show that desynchronizing the learning process does not degrade performance for a variety of tasks in an idealized simulation.
We draw an analogy between asynchronicity and mini-batching in gradient descent, and show that they have similar effects on the learning process.
arXiv Detail & Related papers (2022-01-10T05:38:01Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Human-in-the-Loop Methods for Data-Driven and Reinforcement Learning Systems [0.8223798883838329]
This research investigates how to integrate human interaction modalities into the reinforcement learning loop.
Results show that a reward signal learned from human interaction accelerates the learning of reinforcement learning algorithms.
arXiv Detail & Related papers (2020-08-30T17:28:18Z)
- DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics [44.62475518267084]
We present a developmental cognitive architecture that bootstraps this redescription process stage by stage, builds new state representations with appropriate motivations, and transfers the acquired knowledge across domains, tasks, or even robots.
arXiv Detail & Related papers (2020-05-13T09:29:40Z)
- Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.