Related papers: Cooperative Actor-Critic via TD Error Aggregation

URL: http://arxiv.org/abs/2207.12533v1
Date: Mon, 25 Jul 2022 21:10:39 GMT
Title: Cooperative Actor-Critic via TD Error Aggregation
Authors: Martin Figura, Yixuan Lin, Ji Liu, Vijay Gupta
Abstract summary: We introduce a decentralized actor-critic algorithm with TD error aggregation that does not compromise the agents' privacy. We provide a convergence analysis under diminishing step size to verify that the agents maximize the team-average objective function.
Score: 12.211031907519827
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In decentralized cooperative multi-agent reinforcement learning, agents can aggregate information from one another to learn policies that maximize a team-average objective function. Despite the willingness to cooperate with others, the individual agents may find direct sharing of information about their local state, reward, and value function undesirable due to privacy issues. In this work, we introduce a decentralized actor-critic algorithm with TD error aggregation that does not violate privacy issues and assumes that communication channels are subject to time delays and packet dropouts. The cost we pay for making such weak assumptions is an increased communication burden for every agent as measured by the dimension of the transmitted data. Interestingly, the communication burden is only quadratic in the graph size, which renders the algorithm applicable in large networks. We provide a convergence analysis under diminishing step size to verify that the agents maximize the team-average objective function.
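The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a fixed ring graph, equal neighbor weights, and perfectly reliable links (no delays or packet dropouts), and the names `local_td_error`, `ring_weights`, and `aggregate` are hypothetical. Each agent computes a TD error from its own reward and value estimates, then exchanges only that scalar with its neighbors, so rewards and value functions are never shared directly.

```python
import numpy as np

def local_td_error(reward, value_s, value_next, gamma=0.95):
    """One-step TD error computed from an agent's private quantities."""
    return reward + gamma * value_next - value_s

def ring_weights(n):
    """Doubly stochastic averaging weights on a ring (assumes n >= 3)."""
    W = np.zeros((n, n))
    for i in range(n):
        # Each agent weights itself and its two ring neighbors equally.
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % n] = 1.0 / 3.0
        W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def aggregate(td_errors, W, rounds=200):
    """Repeated neighbor averaging; every entry approaches the team mean."""
    x = np.asarray(td_errors, dtype=float).copy()
    for _ in range(rounds):
        x = W @ x  # each agent mixes its estimate with its neighbors'
    return x

# Illustrative random data standing in for one transition per agent.
rng = np.random.default_rng(0)
n = 5
rewards = rng.normal(size=n)
v_s = rng.normal(size=n)
v_next = rng.normal(size=n)

deltas = local_td_error(rewards, v_s, v_next)
estimates = aggregate(deltas, ring_weights(n))
print("team-average TD error:", deltas.mean())
print("per-agent estimates:  ", estimates)
```

Under these assumptions every agent ends up with approximately the team-average TD error, which is the quantity an actor update toward the team-average objective would use. Handling delayed or dropped messages, as the abstract describes, would require a different aggregation scheme than this plain averaging.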

Related papers

Networked Communication for Mean-Field Games with Function Approximation and Empirical Mean-Field Estimation [59.01527054553122]
Decentralised agents can learn equilibria in Mean-Field Games from a single, non-episodic run of the empirical system. We introduce function approximation to the existing setting, drawing on the Munchausen Online Mirror Descent method. We additionally provide new algorithms that allow agents to estimate the global empirical distribution based on a local neighbourhood.
arXiv Detail & Related papers (2024-08-21T13:32:46Z)
Compressed Regression over Adaptive Networks [58.79251288443156]
We derive the performance achievable by a network of distributed agents that solve, adaptively and in the presence of communication constraints, a regression problem. We devise an optimized allocation strategy where the parameters necessary for the optimization can be learned online by the agents.
arXiv Detail & Related papers (2023-04-07T13:41:08Z)
Towards True Lossless Sparse Communication in Multi-Agent Systems [1.911678487931003]
Communication enables agents to cooperate to achieve their goals. Recent work in learning sparse individualized communication suffers from high variance during training. We use the information bottleneck to reframe sparsity as a representation learning problem.
arXiv Detail & Related papers (2022-11-30T20:43:34Z)
Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure of Costs [65.23158435596518]
We interpret solving the multi-vehicle routing problem as a team Markov game with partially observable costs. Our multi-agent reinforcement learning approach, the so-called multi-agent Neural Rewriter, builds on the single-agent Neural Rewriter to solve the problem by iteratively rewriting solutions.
arXiv Detail & Related papers (2022-06-13T09:17:40Z)
Private and Byzantine-Proof Cooperative Decision-Making [15.609414012418043]
The cooperative bandit problem is a multi-agent decision problem involving a group of agents that interact simultaneously with a multi-armed bandit. In this paper, we investigate the bandit problem under two settings - (a) when the agents wish to make their communication private with respect to the action sequence, and (b) when the agents can be byzantine. We provide upper-confidence bound algorithms that obtain optimal regret while being (a) differentially-private and (b) tolerant to byzantine agents. Our decentralized algorithms require no information about the network of connectivity between agents, making them scalable to large dynamic systems.
arXiv Detail & Related papers (2022-05-27T18:03:54Z)
Secure Distributed/Federated Learning: Prediction-Privacy Trade-Off for Multi-Agent System [4.190359509901197]
In the big data era, when inference is performed within distributed and federated learning (DL and FL) frameworks, the central server needs to process a large amount of data. Given the decentralized computing topology, privacy has become a first-class concern. We study the privacy-aware server to multi-agent assignment problem subject to information processing constraints associated with each agent.
arXiv Detail & Related papers (2022-04-24T19:19:20Z)
Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints. We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
arXiv Detail & Related papers (2021-12-03T19:23:48Z)
Learning Connectivity for Data Distribution in Robot Teams [96.39864514115136]
We propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNN). Our approach enables multi-agent algorithms based on global state information to function by ensuring it is available at each robot. We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function and show that it improves training stability compared to task-specific reward functions.
arXiv Detail & Related papers (2021-03-08T21:48:55Z)
BayGo: Joint Bayesian Learning and Information-Aware Graph Optimization [48.30183416069897]
BayGo is a novel fully decentralized joint Bayesian learning and graph optimization framework. We show that our framework achieves faster convergence and higher accuracy compared to fully-connected and star topology graphs.
arXiv Detail & Related papers (2020-11-09T11:16:55Z)
Multi-Agent Decentralized Belief Propagation on Graphs [0.0]
We consider the problem of interactive partially observable Markov decision processes (I-POMDPs). We propose a decentralized belief propagation algorithm for the problem. Our work appears to be the first study of a decentralized belief propagation algorithm for networked multi-agent I-POMDPs.
arXiv Detail & Related papers (2020-11-06T18:16:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.