Related papers: Delay-Aware Model-Based Reinforcement Learning for Continuous Control

Delay-Aware Model-Based Reinforcement Learning for Continuous Control

URL: http://arxiv.org/abs/2005.05440v1
Date: Mon, 11 May 2020 21:13:37 GMT
Title: Delay-Aware Model-Based Reinforcement Learning for Continuous Control
Authors: Baiming Chen, Mengdi Xu, Liang Li, Ding Zhao
Abstract summary: Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of delay-aware Markov Decision Process. We develop a delay-aware model-based reinforcement learning framework.
Score: 22.92068095246967
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of delay-aware Markov Decision Process and proves it can be transformed into standard MDP with augmented states using the Markov reward process. We develop a delay-aware model-based reinforcement learning framework that can incorporate the multi-step delay into the learned system models without learning effort. Experiments with the Gym and MuJoCo platforms show that the proposed delay-aware model-based algorithm is more efficient in training and transferable between systems with various durations of delay compared with off-policy model-free reinforcement learning methods. Codes available at: https://github.com/baimingc/dambrl.

Related papers

Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models. They rely on sequential denoising steps during sample generation. We propose a novel method that integrates denoising phases directly into the model's architecture.
arXiv Detail & Related papers (2024-05-31T08:19:44Z)
BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images. Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few. We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
Delay-Aware Hierarchical Federated Learning [7.292078085289465]
The paper introduces delay-aware hierarchical federated learning (DFL) to improve the efficiency of distributed machine learning (ML) model training. During global synchronization, the cloud server consolidates local models with an outdated global model using a convex control algorithm. Numerical evaluations show DFL's superior performance in terms of faster global model, reduced convergence resource, and evaluations against communication delays.
arXiv Detail & Related papers (2023-03-22T09:23:29Z)
Scheduling and Aggregation Design for Asynchronous Federated Learning over Wireless Networks [56.91063444859008]
Federated Learning (FL) is a collaborative machine learning framework that combines on-device training and server-based aggregation. We propose an asynchronous FL design with periodic aggregation to tackle the straggler issue in FL systems. We show that an age-aware'' aggregation weighting design can significantly improve the learning performance in an asynchronous FL setting.
arXiv Detail & Related papers (2022-12-14T17:33:01Z)
Gated Recurrent Neural Networks with Weighted Time-Delay Feedback [59.125047512495456]
We introduce a novel gated recurrent unit (GRU) with a weighted time-delay feedback mechanism. We show that $tau$-GRU can converge faster and generalize better than state-of-the-art recurrent units and gated recurrent architectures.
arXiv Detail & Related papers (2022-12-01T02:26:34Z)
Improving the Performance of Robust Control through Event-Triggered Learning [74.57758188038375]
We propose an event-triggered learning algorithm that decides when to learn in the face of uncertainty in the LQR problem. We demonstrate improved performance over a robust controller baseline in a numerical example.
arXiv Detail & Related papers (2022-07-28T17:36:37Z)
Unit-Modulus Wireless Federated Learning Via Penalty Alternating Minimization [64.76619508293966]
Wireless federated learning (FL) is an emerging machine learning paradigm that trains a global parametric model from distributed datasets via wireless communications. This paper proposes a wireless FL framework, which uploads local model parameters and computes global model parameters via wireless communications.
arXiv Detail & Related papers (2021-08-31T08:19:54Z)
Revisiting State Augmentation methods for Reinforcement Learning with Stochastic Delays [10.484851004093919]
This paper formally describes the notion of Markov Decision Processes (MDPs) with delays. We show that delayed MDPs can be transformed into equivalent standard MDPs (without delays) with significantly simplified cost structure. We employ this equivalence to derive a model-free Delay-Resolved RL framework and show that even a simple RL algorithm built upon this framework achieves near-optimal rewards in environments with delays in actions and observations.
arXiv Detail & Related papers (2021-08-17T10:45:55Z)
Continuous-Time Model-Based Reinforcement Learning [4.427447378048202]
We propose a continuous-time MBRL framework based on a novel actor-critic method. We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems.
arXiv Detail & Related papers (2021-02-09T11:30:19Z)
Fast-Convergent Federated Learning [82.32029953209542]
Federated learning is a promising solution for distributing machine learning tasks through modern networks of mobile devices. We propose a fast-convergent federated learning algorithm, called FOLB, which performs intelligent sampling of devices in each round of model training.
arXiv Detail & Related papers (2020-07-26T14:37:51Z)
Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives. It is desirable to prevent the time-driven dHDP from updating due to insignificant system event such as noise. We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
arXiv Detail & Related papers (2020-06-16T05:51:25Z)
Delay-Aware Multi-Agent Reinforcement Learning for Cooperative and Competitive Environments [23.301322095357808]
Action and observation delays exist prevalently in the real-world cyber-physical systems. This paper proposes a novel framework to deal with delays as well as the non-stationary training issue of multi-agent tasks. Experiments are conducted in multi-agent particle environments including cooperative communication, cooperative navigation, and competitive experiments.
arXiv Detail & Related papers (2020-05-11T21:21:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.