Lyapunov-Based Reinforcement Learning for Decentralized Multi-Agent
Control
- URL: http://arxiv.org/abs/2009.09361v1
- Date: Sun, 20 Sep 2020 06:11:42 GMT
- Title: Lyapunov-Based Reinforcement Learning for Decentralized Multi-Agent
Control
- Authors: Qingrui Zhang, Hao Dong, Wei Pan
- Abstract summary: In decentralized multi-agent control, systems are complex with unknown or highly uncertain dynamics.
Deep reinforcement learning (DRL) is promising for learning the controller/policy from data without knowing the system dynamics.
Existing multi-agent reinforcement learning (MARL) algorithms cannot ensure the closed-loop stability of a multi-agent system.
We propose a new MARL algorithm for decentralized multi-agent control with a stability guarantee.
- Score: 3.3788926259119645
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Decentralized multi-agent control has broad applications, ranging from
multi-robot cooperation to distributed sensor networks. In decentralized
multi-agent control, systems are complex with unknown or highly uncertain
dynamics, where traditional model-based control methods can hardly be applied.
Compared with model-based control in control theory, deep reinforcement
learning (DRL) is promising for learning the controller/policy from data
without knowing the system dynamics. However, directly applying DRL to
decentralized multi-agent control is challenging, as interactions among
agents make the
learning environment non-stationary. More importantly, the existing multi-agent
reinforcement learning (MARL) algorithms cannot ensure the closed-loop
stability of a multi-agent system from a control-theoretic perspective, so the
learned control policies are highly likely to generate abnormal or dangerous
behaviors in real applications. Hence, without a stability guarantee, applying
the existing MARL algorithms to real multi-agent systems such as UAVs, robots,
and power systems raises serious concerns. In this paper, we propose a new
MARL algorithm for decentralized multi-agent control with a stability
guarantee. The new MARL algorithm, termed multi-agent soft actor-critic
(MASAC), is proposed under the well-known framework of
"centralized-training-with-decentralized-execution". The closed-loop stability
is guaranteed by the introduction of a stability constraint during the policy
improvement in our MASAC algorithm. The stability constraint is designed based
on Lyapunov's method in control theory. A multi-agent navigation example is
presented to demonstrate the effectiveness and efficiency of the proposed
MASAC algorithm.
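The abstract gives no implementation details, so the following is only an illustrative sketch of how a Lyapunov-based stability constraint could enter the per-agent policy-improvement step of a soft actor-critic trained under centralized-training-with-decentralized-execution. The network shapes, the learned Lyapunov candidate lyap_net, the penalty form, and the fixed multiplier lam are assumptions for illustration, not the authors' actual MASAC implementation; the SAC entropy term and multiplier updates are omitted for brevity.

```python
# Illustrative sketch only: a Lyapunov-constrained policy-improvement step
# for one agent in a CTDE soft actor-critic setting. All names and the
# penalty form are hypothetical, not the paper's implementation.
import torch
import torch.nn as nn

obs_dim, act_dim, n_agents, batch = 8, 2, 3, 32
joint_dim = n_agents * (obs_dim + act_dim)

policy_i = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))
q_net = nn.Sequential(nn.Linear(joint_dim, 64), nn.ReLU(), nn.Linear(64, 1))     # centralized critic
lyap_net = nn.Sequential(nn.Linear(joint_dim, 64), nn.ReLU(), nn.Linear(64, 1))  # Lyapunov candidate
opt = torch.optim.Adam(policy_i.parameters(), lr=3e-4)
lam = 1.0  # weight on the stability penalty (kept fixed here for simplicity)

def policy_improvement_step(obs_i, joint_obs, other_actions, lyap_prev):
    """Raise the centralized Q-value while penalizing any increase of the
    Lyapunov candidate relative to the previous state-action pair."""
    act_i = torch.tanh(policy_i(obs_i))  # decentralized action from the local observation only
    joint_in = torch.cat([joint_obs, other_actions, act_i], dim=-1)  # centralized critics see everything
    q_val = q_net(joint_in)
    lyap_now = lyap_net(joint_in)
    # Lyapunov decrease condition L(s', a') - L(s, a) <= 0: penalize violations.
    stability_penalty = torch.relu(lyap_now - lyap_prev).mean()
    loss = -q_val.mean() + lam * stability_penalty
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy call with random tensors, just to show the expected shapes.
policy_improvement_step(
    torch.randn(batch, obs_dim),
    torch.randn(batch, n_agents * obs_dim),
    torch.randn(batch, (n_agents - 1) * act_dim),
    torch.randn(batch, 1),
)
```

Only the actor is updated in this sketch; at execution time each agent still acts from its local observation alone, which is what keeps the resulting controller decentralized.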
Related papers
- Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach [51.63921041249406]
Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band and can be combined with a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS); however, deploying STAR-RIS indoors presents challenges in interference mitigation, power consumption, and real-time configuration.
A novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication.
arXiv Detail & Related papers (2024-06-19T07:17:04Z) - Decentralized Event-Triggered Online Learning for Safe Consensus of
Multi-Agent Systems with Gaussian Process Regression [3.405252606286664]
This paper presents a novel learning-based distributed control law augmented by auxiliary dynamics.
For continuous enhancement in predictive performance, a data-efficient online learning strategy with a decentralized event-triggered mechanism is proposed.
To demonstrate the efficacy of the proposed learning-based controller, a comparative analysis is conducted, contrasting it with both conventional distributed control laws and offline learning methodologies.
arXiv Detail & Related papers (2024-02-05T16:41:17Z) - Effective Multi-Agent Deep Reinforcement Learning Control with Relative
Entropy Regularization [6.441951360534903]
Multi-Agent Continuous Dynamic Policy Gradient (MACDPP) was proposed to tackle the issues of limited capability and sample efficiency in various scenarios controlled by multiple agents.
It alleviates the inconsistency of multiple agents' policy updates by introducing relative entropy regularization into the Centralized Training with Decentralized Execution (CTDE) framework with an Actor-Critic (AC) structure (a minimal sketch of this kind of regularization appears after this list).
arXiv Detail & Related papers (2023-09-26T07:38:19Z) - Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z) - A Regret Minimization Approach to Multi-Agent Control [24.20403443262127]
We study the problem of multi-agent control of a dynamical system with known dynamics and adversarial disturbances.
We give a reduction from any (standard) regret minimizing control method to a distributed algorithm.
We show that the distributed method is robust to failure and to adversarial perturbations in the dynamics.
arXiv Detail & Related papers (2022-01-28T14:57:59Z) - Relative Distributed Formation and Obstacle Avoidance with Multi-agent
Reinforcement Learning [20.401609420707867]
We propose a distributed formation and obstacle avoidance method based on multi-agent reinforcement learning (MARL).
Our method achieves better performance in formation error and formation convergence rate, and an on-par obstacle-avoidance success rate, compared with baselines.
arXiv Detail & Related papers (2021-11-14T13:02:45Z) - Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in
Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes.
We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks.
Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z) - Enforcing robust control guarantees within neural network policies [76.00287474159973]
We propose a generic nonlinear control policy class, parameterized by neural networks, that enforces the same provable robustness criteria as robust control.
We demonstrate the power of this approach on several domains, improving in average-case performance over existing robust control methods and in worst-case stability over (non-robust) deep RL methods.
arXiv Detail & Related papers (2020-11-16T17:14:59Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent setting.
Our framework can achieve scalability and stability in large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
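As a companion to the relative-entropy-regularized update mentioned in the MACDPP entry above, here is a minimal, purely illustrative sketch of an actor step with a KL penalty toward a frozen copy of the previous policy. The Gaussian policy with fixed standard deviation and the weight beta are assumptions for illustration, not the MACDPP algorithm itself.

```python
# Illustrative sketch: an actor update with a relative-entropy (KL) penalty
# toward the previous policy, which damps abrupt per-agent policy changes.
# The Gaussian policy with fixed std and the weight beta are assumptions.
import torch
import torch.nn as nn

obs_dim, act_dim, batch = 8, 2, 32

def make_policy():
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))

policy_new, policy_old = make_policy(), make_policy()
policy_old.load_state_dict(policy_new.state_dict())  # frozen snapshot from the previous iteration
opt = torch.optim.Adam(policy_new.parameters(), lr=3e-4)
beta, sigma = 0.1, 0.5  # KL weight and fixed policy standard deviation

def actor_step(obs, advantage):
    """Advantage-weighted policy-gradient surrogate minus a KL penalty."""
    dist_new = torch.distributions.Normal(policy_new(obs), sigma)
    with torch.no_grad():
        dist_old = torch.distributions.Normal(policy_old(obs), sigma)
    act = dist_new.sample()  # sampled action, no gradient through the sample itself
    kl = torch.distributions.kl_divergence(dist_new, dist_old).sum(-1).mean()
    loss = -(dist_new.log_prob(act).sum(-1) * advantage).mean() + beta * kl
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item(), kl.item()

actor_step(torch.randn(batch, obs_dim), torch.randn(batch))
```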