Learning Cooperation and Online Planning Through Simulation and Graph
Convolutional Network
- URL: http://arxiv.org/abs/2110.08480v1
- Date: Sat, 16 Oct 2021 05:54:32 GMT
- Title: Learning Cooperation and Online Planning Through Simulation and Graph
Convolutional Network
- Authors: Rafid Ameer Mahmud, Fahim Faisal, Saaduddin Mahmud, Md. Mosaddek Khan
- Abstract summary: We introduce a simulation-based online planning algorithm, which we call SiCLOP, for multi-agent cooperative environments.
Specifically, SiCLOP tailors Monte Carlo Tree Search (MCTS) and uses a Coordination Graph (CG) and a Graph Convolutional Network (GCN) to learn cooperation.
It also improves scalability through effective pruning of the action space.
- Score: 5.505634045241288
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The Multi-agent Markov Decision Process (MMDP) has been an effective way of
modelling sequential decision-making problems in multi-agent cooperative
environments. A number of algorithms based on centralized and decentralized
planning have been developed in this domain. However, dynamically changing
environments, coupled with the exponential size of the state and joint action spaces,
make it difficult for these algorithms to provide both efficiency and
scalability. Recently, the centralized planning algorithm FV-MCTS-MP and the
decentralized planning algorithm Alternate maximization with
Behavioural Cloning (ABC) have achieved notable performance in solving MMDPs.
However, they are not capable of adapting to dynamically changing environments
and accounting for the lack of communication among agents, respectively.
Against this background, we introduce a simulation-based online planning
algorithm, which we call SiCLOP, for multi-agent cooperative environments.
Specifically, SiCLOP tailors Monte Carlo Tree Search (MCTS) and uses a
Coordination Graph (CG) and a Graph Convolutional Network (GCN) to learn cooperation and
provide a real-time solution to an MMDP problem. It also improves scalability
through effective pruning of the action space. Additionally, unlike FV-MCTS-MP
and ABC, SiCLOP supports transfer learning, which enables learned agents to
operate in different environments. We also provide a theoretical discussion of
the convergence property of our algorithm within the context of multi-agent
settings. Finally, our extensive empirical results show that SiCLOP
significantly outperforms the state-of-the-art online planning algorithms.
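To make the scalability point in the abstract concrete, here is a minimal Python sketch of why the joint action space grows exponentially with the number of agents and how per-agent pruning shrinks it. It is not SiCLOP's actual pruning rule, which the abstract does not specify. The function names, the action set, the score values, and the top-k rule are all assumptions made for illustration.

```python
from itertools import product


def joint_actions(per_agent_actions):
    """Enumerate every joint action: one individual action per agent."""
    return list(product(*per_agent_actions))


def prune_top_k(per_agent_actions, scores, k):
    """Keep each agent's k highest-scoring actions.

    `scores` is a hypothetical per-agent mapping from action to a prior
    score (for example, from a learned policy); it is an assumption of
    this sketch, not something taken from the paper.
    """
    pruned = []
    for actions, agent_scores in zip(per_agent_actions, scores):
        ranked = sorted(actions, key=lambda a: agent_scores[a], reverse=True)
        pruned.append(ranked[:k])
    return pruned


# Four agents, five individual actions each (illustrative values only).
actions = [["up", "down", "left", "right", "stay"]] * 4
scores = [{"up": 0.4, "down": 0.1, "left": 0.3, "right": 0.15, "stay": 0.05}] * 4

full = joint_actions(actions)                              # 5**4 joint actions
small = joint_actions(prune_top_k(actions, scores, k=2))   # 2**4 joint actions
print(len(full), len(small))  # 625 16
```

With n agents and |A| actions each, a planner that branches over joint actions faces |A|^n children per tree node. Keeping k actions per agent reduces this to k^n, which is still exponential in n but far smaller at search time.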
Related papers
- Online Parallel Multi-Task Relationship Learning via Alternating Direction Method of Multipliers [37.859185005986056]
Online multi-task learning (OMTL) enhances streaming data processing by leveraging the inherent relations among multiple tasks.
This study proposes a novel OMTL framework based on the alternating direction method of multipliers (ADMM), a recent breakthrough in optimization suitable for the distributed computing environment.
arXiv Detail & Related papers (2024-11-09T10:20:13Z) - Performance-Aware Self-Configurable Multi-Agent Networks: A Distributed Submodular Approach for Simultaneous Coordination and Network Design [3.5527561584422465]
We present AlterNAting COordination and Network-Design Algorithm (Anaconda).
Anaconda is a scalable algorithm that also enjoys near-optimality guarantees.
We demonstrate Anaconda in simulated scenarios of area monitoring and compare it with a state-of-the-art algorithm.
arXiv Detail & Related papers (2024-09-02T18:11:33Z) - Intelligent Hybrid Resource Allocation in MEC-assisted RAN Slicing Network [72.2456220035229]
We aim to maximize the SSR for heterogeneous service demands in the cooperative MEC-assisted RAN slicing system.
We propose a recurrent graph reinforcement learning (RGRL) algorithm to intelligently learn the optimal hybrid RA policy.
arXiv Detail & Related papers (2024-05-02T01:36:13Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - DESTRESS: Computation-Optimal and Communication-Efficient Decentralized
Nonconvex Finite-Sum Optimization [43.31016937305845]
Internet-of-things, networked sensing, autonomous systems and federated learning call for decentralized algorithms for finite-sum optimizations.
We develop the DEcentralized STochastic REcurSive method (DESTRESS) for nonconvex finite-sum optimization.
Detailed theoretical and numerical comparisons show that DESTRESS improves upon prior decentralized algorithms.
arXiv Detail & Related papers (2021-10-04T03:17:41Z) - Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in
Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes.
We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks.
Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z) - Scalable Anytime Planning for Multi-Agent MDPs [37.69939216970677]
We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration.
Our algorithm comprises three elements: online planning with Monte Carlo Tree Search (MCTS), factored representations of local agent interactions with coordination graphs, and the iterative Max-Plus method for joint action selection.
arXiv Detail & Related papers (2021-01-12T22:50:17Z) - Deep Multi-Task Learning for Cooperative NOMA: System Design and
Principles [52.79089414630366]
We develop a novel deep cooperative NOMA scheme, drawing upon the recent advances in deep learning (DL).
We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner.
arXiv Detail & Related papers (2020-07-27T12:38:37Z) - Iterative Algorithm Induced Deep-Unfolding Neural Networks: Precoding
Design for Multiuser MIMO Systems [59.804810122136345]
We propose a framework for deep-unfolding, where a general form of iterative algorithm induced deep-unfolding neural network (IAIDNN) is developed.
An efficient IAIDNN based on the structure of the classic weighted minimum mean-square error (WMMSE) iterative algorithm is developed.
We show that the proposed IAIDNN efficiently achieves the performance of the iterative WMMSE algorithm with reduced computational complexity.
arXiv Detail & Related papers (2020-06-15T02:57:57Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.