Delay-Aware Model-Based Reinforcement Learning for Continuous Control
- URL: http://arxiv.org/abs/2005.05440v1
- Date: Mon, 11 May 2020 21:13:37 GMT
- Title: Delay-Aware Model-Based Reinforcement Learning for Continuous Control
- Authors: Baiming Chen, Mengdi Xu, Liang Li, Ding Zhao
- Abstract summary: Action delays degrade the performance of reinforcement learning in many real-world systems.
This paper proposes a formal definition of delay-aware Markov Decision Process.
We develop a delay-aware model-based reinforcement learning framework.
- Score: 22.92068095246967
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Action delays degrade the performance of reinforcement learning in many
real-world systems. This paper proposes a formal definition of delay-aware
Markov Decision Process and proves it can be transformed into standard MDP with
augmented states using the Markov reward process. We develop a delay-aware
model-based reinforcement learning framework that can incorporate the
multi-step delay into the learned system models without learning effort.
Experiments with the Gym and MuJoCo platforms show that the proposed
delay-aware model-based algorithm is more efficient in training and
transferable between systems with various durations of delay compared with
off-policy model-free reinforcement learning methods. Codes available at:
https://github.com/baimingc/dambrl.
Related papers
- Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models.
They rely on sequential denoising steps during sample generation.
We propose a novel method that integrates denoising phases directly into the model's architecture.
arXiv Detail & Related papers (2024-05-31T08:19:44Z) - BOOT: Data-free Distillation of Denoising Diffusion Models with
Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z) - Delay-Aware Hierarchical Federated Learning [7.292078085289465]
The paper introduces delay-aware hierarchical federated learning (DFL) to improve the efficiency of distributed machine learning (ML) model training.
During global synchronization, the cloud server consolidates local models with an outdated global model using a convex control algorithm.
Numerical evaluations show DFL's superior performance in terms of faster global model, reduced convergence resource, and evaluations against communication delays.
arXiv Detail & Related papers (2023-03-22T09:23:29Z) - Scheduling and Aggregation Design for Asynchronous Federated Learning
over Wireless Networks [56.91063444859008]
Federated Learning (FL) is a collaborative machine learning framework that combines on-device training and server-based aggregation.
We propose an asynchronous FL design with periodic aggregation to tackle the straggler issue in FL systems.
We show that an age-aware'' aggregation weighting design can significantly improve the learning performance in an asynchronous FL setting.
arXiv Detail & Related papers (2022-12-14T17:33:01Z) - Gated Recurrent Neural Networks with Weighted Time-Delay Feedback [59.125047512495456]
We introduce a novel gated recurrent unit (GRU) with a weighted time-delay feedback mechanism.
We show that $tau$-GRU can converge faster and generalize better than state-of-the-art recurrent units and gated recurrent architectures.
arXiv Detail & Related papers (2022-12-01T02:26:34Z) - Improving the Performance of Robust Control through Event-Triggered
Learning [74.57758188038375]
We propose an event-triggered learning algorithm that decides when to learn in the face of uncertainty in the LQR problem.
We demonstrate improved performance over a robust controller baseline in a numerical example.
arXiv Detail & Related papers (2022-07-28T17:36:37Z) - Unit-Modulus Wireless Federated Learning Via Penalty Alternating
Minimization [64.76619508293966]
Wireless federated learning (FL) is an emerging machine learning paradigm that trains a global parametric model from distributed datasets via wireless communications.
This paper proposes a wireless FL framework, which uploads local model parameters and computes global model parameters via wireless communications.
arXiv Detail & Related papers (2021-08-31T08:19:54Z) - Revisiting State Augmentation methods for Reinforcement Learning with
Stochastic Delays [10.484851004093919]
This paper formally describes the notion of Markov Decision Processes (MDPs) with delays.
We show that delayed MDPs can be transformed into equivalent standard MDPs (without delays) with significantly simplified cost structure.
We employ this equivalence to derive a model-free Delay-Resolved RL framework and show that even a simple RL algorithm built upon this framework achieves near-optimal rewards in environments with delays in actions and observations.
arXiv Detail & Related papers (2021-08-17T10:45:55Z) - Continuous-Time Model-Based Reinforcement Learning [4.427447378048202]
We propose a continuous-time MBRL framework based on a novel actor-critic method.
We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems.
arXiv Detail & Related papers (2021-02-09T11:30:19Z) - Fast-Convergent Federated Learning [82.32029953209542]
Federated learning is a promising solution for distributing machine learning tasks through modern networks of mobile devices.
We propose a fast-convergent federated learning algorithm, called FOLB, which performs intelligent sampling of devices in each round of model training.
arXiv Detail & Related papers (2020-07-26T14:37:51Z) - Delay-Aware Multi-Agent Reinforcement Learning for Cooperative and
Competitive Environments [23.301322095357808]
Action and observation delays exist prevalently in the real-world cyber-physical systems.
This paper proposes a novel framework to deal with delays as well as the non-stationary training issue of multi-agent tasks.
Experiments are conducted in multi-agent particle environments including cooperative communication, cooperative navigation, and competitive experiments.
arXiv Detail & Related papers (2020-05-11T21:21:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.