R^3: On-device Real-Time Deep Reinforcement Learning for Autonomous
Robotics
- URL: http://arxiv.org/abs/2308.15039v2
- Date: Fri, 15 Sep 2023 04:00:38 GMT
- Title: R^3: On-device Real-Time Deep Reinforcement Learning for Autonomous
Robotics
- Authors: Zexin Li, Aritra Samanta, Yufei Li, Andrea Soltoggio, Hyoseung Kim and
Cong Liu
- Abstract summary: This paper presents R^3, a holistic solution for managing timing, memory, and algorithm performance in on-device real-time DRL training.
R^3 employs (i) a deadline-driven feedback loop with dynamic batch sizing for optimizing timing, (ii) efficient memory management to reduce memory footprint and allow larger replay buffer sizes, and (iii) a runtime coordinator guided by heuristic analysis and a runtime profiler for dynamically adjusting memory resource reservations.
- Score: 9.2327813168753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous robotic systems, like autonomous vehicles and robotic search and
rescue, require efficient on-device training for continuous adaptation of Deep
Reinforcement Learning (DRL) models in dynamic environments. This research is
fundamentally motivated by the need to understand and address the challenges of
on-device real-time DRL, which involves balancing timing and algorithm
performance under memory constraints, as exposed through our extensive
empirical studies. This intricate balance requires co-optimizing two pivotal
parameters of DRL training -- batch size and replay buffer size. Configuring
these parameters significantly affects timing and algorithm performance, while
both (unfortunately) require substantial memory allocation to achieve
near-optimal performance.
This paper presents R^3, a holistic solution for managing timing, memory, and
algorithm performance in on-device real-time DRL training. R^3 employs (i) a
deadline-driven feedback loop with dynamic batch sizing for optimizing timing,
(ii) efficient memory management to reduce memory footprint and allow larger
replay buffer sizes, and (iii) a runtime coordinator guided by heuristic
analysis and a runtime profiler for dynamically adjusting memory resource
reservations. These components collaboratively tackle the trade-offs in
on-device DRL training, improving timing and algorithm performance while
minimizing the risk of out-of-memory (OOM) errors.
We implemented and evaluated R^3 extensively across various DRL frameworks
and benchmarks on three hardware platforms commonly adopted by autonomous
robotic systems. Additionally, we integrated R^3 with a popular realistic
autonomous car simulator to demonstrate its real-world applicability.
Evaluation results show that R^3 achieves efficacy across diverse platforms,
ensuring consistent latency performance and timing predictability with minimal
overhead.
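
The abstract does not expose R^3's interfaces, but component (i) is concrete enough to sketch. Below is a minimal, hypothetical Python sketch of a deadline-driven feedback loop with dynamic batch sizing; all names (DeadlineDrivenBatcher, replay_buffer.sample, agent.learn) and the clamped proportional rule are illustrative assumptions, not the authors' implementation.

```python
import time

# Hypothetical sketch of R^3's component (i): a deadline-driven feedback
# loop that grows or shrinks the training batch size based on observed
# per-iteration latency. Names and the clamped proportional rule are
# illustrative assumptions, not the authors' implementation.

class DeadlineDrivenBatcher:
    def __init__(self, deadline_s, batch_size=256, min_batch=32, max_batch=1024):
        self.deadline_s = deadline_s  # per-iteration training deadline
        self.batch_size = batch_size
        self.min_batch, self.max_batch = min_batch, max_batch

    def update(self, observed_latency_s):
        # Scale the batch size by the slack (or overrun) relative to the
        # deadline, limit the per-step change, and clamp to the valid range.
        ratio = self.deadline_s / max(observed_latency_s, 1e-6)
        proposed = int(self.batch_size * min(max(ratio, 0.5), 2.0))
        self.batch_size = max(self.min_batch, min(self.max_batch, proposed))
        return self.batch_size

def train_step(agent, replay_buffer, batcher):
    start = time.monotonic()
    batch = replay_buffer.sample(batcher.batch_size)  # assumed buffer API
    agent.learn(batch)                                # assumed agent API
    batcher.update(time.monotonic() - start)
```

In this sketch, sustained deadline overruns shrink the batch to restore timing, while slack grows it back toward the algorithmically preferred size, mirroring the timing/algorithm-performance trade-off the paper describes.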
Related papers
- DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution [114.61347672265076]
Development of MLLMs for real-world robots is challenging due to the typically limited computation and memory capacities available on robotic platforms.
We propose a Dynamic Early-Exit Framework for Robotic Vision-Language-Action Model (DeeR) that automatically adjusts the size of the activated MLLM.
DeeR demonstrates significant reductions in the computational cost of the LLM (5.2-6.5x) and in its GPU memory use (2-6x) without compromising performance.
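The core mechanism here, early exit, is sketchable independently of DeeR's specifics. The following hypothetical sketch runs blocks one at a time and stops as soon as an exit head is confident, so easy inputs activate only part of the model; the exit heads, threshold, and max-softmax confidence rule are illustrative assumptions, not DeeR's actual criterion.

```python
import torch.nn as nn

# Hypothetical sketch of confidence-thresholded early exit. Heads, the
# 0.9 threshold, and the max-softmax rule are illustrative assumptions.

class EarlyExitStack(nn.Module):
    def __init__(self, blocks, exit_heads, threshold=0.9):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)          # shared trunk
        self.exit_heads = nn.ModuleList(exit_heads)  # one head per block
        self.threshold = threshold

    def forward(self, x):
        for block, head in zip(self.blocks, self.exit_heads):
            x = block(x)
            logits = head(x)
            confidence = logits.softmax(dim=-1).max(dim=-1).values
            if bool((confidence >= self.threshold).all()):
                return logits  # early exit: later blocks never execute
        return logits          # fell through: the full stack was used
```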
arXiv Detail & Related papers (2024-11-04T18:26:08Z)
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample-efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
- Scaling Studies for Efficient Parameter Search and Parallelism for Large Language Model Pre-training [2.875838666718042]
We focus on parallel and distributed machine learning algorithm development, specifically for optimizing the data processing and pre-training of a set of 5 encoder-decoder LLMs.
We performed a fine-grained study to quantify the relationships between three ML methods, specifically exploring Microsoft DeepSpeed's Zero Redundancy Optimizer (ZeRO) stages.
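For readers unfamiliar with the knob being varied, the sketch below shows a hypothetical DeepSpeed configuration fragment selecting a ZeRO stage; the surrounding values are illustrative, not the study's settings.

```python
# Hypothetical DeepSpeed configuration fragment selecting a ZeRO stage.
# Stage 1 shards optimizer states across workers, stage 2 additionally
# shards gradients, and stage 3 also shards the parameters themselves.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,   # illustrative value
    "zero_optimization": {"stage": 2},     # 0-3; higher = more sharding
    "bf16": {"enabled": True},             # illustrative precision choice
}
# Typically passed as the config argument to deepspeed.initialize(...).
```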
arXiv Detail & Related papers (2023-10-09T02:22:00Z)
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
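Of the three paradigms, dynamic layer skipping is the easiest to illustrate. Below is a minimal, hypothetical sketch of a gated residual block that can be skipped at inference time; the gate architecture and the 0.5 threshold are illustrative assumptions, not LAUDNet's design.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of dynamic layer skipping: a tiny gate decides
# whether a residual block executes (averaged over the batch here for
# simplicity). Gate architecture and threshold are illustrative.

class SkippableBlock(nn.Module):
    def __init__(self, block, channels):
        super().__init__()
        self.block = block
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        if self.training:
            # Soft gate keeps the decision differentiable during training.
            g = self.gate(x).view(-1, 1, 1, 1)
            return x + g * self.block(x)
        if float(self.gate(x).mean()) < 0.5:
            return x               # skip entirely: this is the latency win
        return x + self.block(x)
```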
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
- Just Round: Quantized Observation Spaces Enable Memory Efficient Learning of Dynamic Locomotion [0.7106986689736827]
Training deep reinforcement learning models is compute and memory intensive.
Observation space quantization reduces overall memory costs by as much as 4.2x without impacting learning performance.
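The underlying trick is simple enough to sketch. Below is a hypothetical example of storing observations as uint8 instead of float32 in a replay buffer; the known [low, high] bounds and the 8-bit choice are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

# Hypothetical sketch of observation-space quantization: store float32
# observations as uint8 in the replay buffer and dequantize on sampling,
# cutting per-observation memory 4x.

def quantize(obs, low, high):
    scaled = (obs - low) / (high - low)          # map [low, high] -> [0, 1]
    return np.clip(scaled * 255.0, 0.0, 255.0).astype(np.uint8)

def dequantize(q, low, high):
    return q.astype(np.float32) / 255.0 * (high - low) + low
```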
arXiv Detail & Related papers (2022-10-14T19:14:47Z)
- Effective Multi-User Delay-Constrained Scheduling with Deep Recurrent Reinforcement Learning [28.35473469490186]
Multi-user delay-constrained scheduling is important in many real-world applications including wireless communication, live streaming, and cloud computing.
We propose a deep reinforcement learning (DRL) algorithm, named Recurrent Softmax Delayed Deep Double Deterministic Policy Gradient (RSD4).
RSD4 guarantees resource and delay constraints by Lagrangian dual and delay-sensitive queues, respectively.
It also efficiently tackles partial observability with a memory mechanism enabled by the recurrent neural network (RNN) and introduces user-level decomposition and node-level merging to enable scalability.
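The Lagrangian-dual handling of the resource constraint can be sketched generically. The update rule and learning rate below are illustrative assumptions, not RSD4's exact formulation.

```python
# Hypothetical sketch of a Lagrangian dual for a resource constraint: the
# agent maximizes reward minus priced resource usage, while dual ascent
# adjusts the price to enforce the budget on average.

def lagrangian_reward(reward, resource_cost, lam):
    # Unconstrained surrogate objective seen by the RL agent.
    return reward - lam * resource_cost

def dual_ascent(lam, avg_resource_cost, budget, lr=1e-3):
    # Raise the price when the budget is exceeded, lower it otherwise;
    # the multiplier must remain non-negative.
    return max(0.0, lam + lr * (avg_resource_cost - budget))
```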
arXiv Detail & Related papers (2022-08-30T08:44:15Z)
- GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction [50.248694764703714]
Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction.
These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network based regularization.
We propose Greedy LEarning for Accelerated MRI reconstruction (GLEAM), an efficient training strategy for high-dimensional imaging settings.
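Greedy, module-wise training is sketchable in a few lines. The following hypothetical sketch trains each unrolled stage against its own local loss and detaches activations between stages; the per-stage loss and optimizer setup are illustrative assumptions, not GLEAM's exact procedure.

```python
# Hypothetical sketch of greedy, module-wise training: each unrolled stage
# is optimized against its own local objective, and activations are
# detached between stages, so no end-to-end backward pass (or its memory
# footprint) is ever needed.

def greedy_train_step(stages, optimizers, x, target, loss_fn):
    for stage, opt in zip(stages, optimizers):
        y = stage(x)               # forward through this stage only
        loss = loss_fn(y, target)  # local objective for this stage
        opt.zero_grad()
        loss.backward()            # gradients stop at the detach below
        opt.step()
        x = y.detach()             # block gradient flow to earlier stages
    return float(loss)
```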
arXiv Detail & Related papers (2022-07-18T06:01:29Z)
- Online Convolutional Re-parameterization [51.97831675242173]
We present Online Convolutional Re-parameterization (OREPA), a two-stage pipeline aiming to reduce the huge training overhead by squeezing the complex training-time block into a single convolution.
Compared with the state-of-the-art re-param models, OREPA is able to save the training-time memory cost by about 70% and accelerate the training speed by around 2x.
We also conduct experiments on object detection and semantic segmentation and show consistent improvements on the downstream tasks.
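The core re-parameterization identity, that parallel linear branches collapse into one convolution, can be sketched directly. The helper below merges two shape-compatible parallel convolutions; it is a hypothetical illustration of the general idea, not OREPA's pipeline.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of structural re-parameterization: two parallel
# convolutions with identical shapes collapse into one, since convolution
# is linear in its weights. OREPA's actual two-stage pipeline handles
# richer block structures than this.

def merge_parallel_convs(conv_a: nn.Conv2d, conv_b: nn.Conv2d) -> nn.Conv2d:
    merged = nn.Conv2d(conv_a.in_channels, conv_a.out_channels,
                       conv_a.kernel_size, padding=conv_a.padding, bias=True)
    with torch.no_grad():
        merged.weight.copy_(conv_a.weight + conv_b.weight)
        bias = torch.zeros(conv_a.out_channels)
        for conv in (conv_a, conv_b):
            if conv.bias is not None:
                bias += conv.bias
        merged.bias.copy_(bias)
    return merged
```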
arXiv Detail & Related papers (2022-04-02T09:50:19Z) - Dynamic Network-Assisted D2D-Aided Coded Distributed Learning [59.29409589861241]
We propose a novel device-to-device (D2D)-aided coded federated learning method (D2D-CFL) for load balancing across devices.
We derive an optimal compression rate for achieving minimum processing time and establish its connection with the convergence time.
Our proposed method is beneficial for real-time collaborative applications, where the users continuously generate training data.
arXiv Detail & Related papers (2021-11-26T18:44:59Z) - Dynamic Scheduling for Stochastic Edge-Cloud Computing Environments
using A3C learning and Residual Recurrent Neural Networks [30.61220416710614]
Asynchronous Advantage Actor-Critic (A3C) learning is known to quickly adapt to dynamic scenarios with less data, and the Residual Recurrent Neural Network (R2N2) to quickly update model parameters.
We use the R2N2 architecture to capture a large number of host and task parameters together with temporal patterns to provide efficient scheduling decisions.
Experiments conducted on a real-world data set show a significant improvement in terms of energy consumption, response time, Service Level Agreement violations, and running cost by 14.4%, 7.74%, 31.9%, and 4.64%, respectively.
arXiv Detail & Related papers (2020-09-01T13:36:34Z) - Stacked Auto Encoder Based Deep Reinforcement Learning for Online
Resource Scheduling in Large-Scale MEC Networks [44.40722828581203]
An online resource scheduling framework is proposed for minimizing the sum of weighted task latency for all the Internet of Things (IoT) users.
A deep reinforcement learning (DRL) based solution is proposed, which includes the following components.
A preserved and prioritized experience replay (2p-ER) is introduced to assist the DRL in training the policy network and finding the optimal offloading policy.
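The blurb does not detail the "preserved" part of 2p-ER, but plain prioritized replay can be sketched. The class below is a hypothetical illustration with illustrative hyperparameters, not the paper's 2p-ER.

```python
import random

# Hypothetical sketch of plain prioritized experience replay: transitions
# are drawn with probability proportional to their (TD-error-derived)
# priority, so informative samples are revisited more often.

class PrioritizedReplay:
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition, td_error):
        if len(self.data) >= self.capacity:  # evict the oldest transition
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, k):
        # Draw k transitions with probability proportional to priority.
        return random.choices(self.data, weights=self.priorities, k=k)
```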
arXiv Detail & Related papers (2020-01-24T23:01:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.