Resilient Consensus-based Multi-agent Reinforcement Learning
- URL: http://arxiv.org/abs/2111.06776v1
- Date: Fri, 12 Nov 2021 15:38:01 GMT
- Title: Resilient Consensus-based Multi-agent Reinforcement Learning
- Authors: Martin Figura, Yixuan Lin, Ji Liu, Vijay Gupta
- Abstract summary: We consider a fully decentralized network, where each agent receives a local reward and observes the global state and action.
We show that in the presence of Byzantine agents, whose estimation and communication strategies are completely arbitrary, the estimates of the cooperative agents converge to a bounded consensus value with probability one.
We prove that the policy of the cooperative agents converges with probability one to a bounded neighborhood around a local maximizer of their team-average objective function.
- Score: 22.774403531759592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial attacks during training can strongly influence the performance of
multi-agent reinforcement learning algorithms. It is, thus, highly desirable to
augment existing algorithms such that the impact of adversarial attacks on
cooperative networks is eliminated, or at least bounded. In this work, we
consider a fully decentralized network, where each agent receives a local
reward and observes the global state and action. We propose a resilient
consensus-based actor-critic algorithm, whereby each agent estimates the
team-average reward and value function, and communicates the associated
parameter vectors to its immediate neighbors. We show that in the presence of
Byzantine agents, whose estimation and communication strategies are completely
arbitrary, the estimates of the cooperative agents converge to a bounded
consensus value with probability one, provided that there are at most $H$
Byzantine agents in the neighborhood of each cooperative agent and the network
is $(2H+1)$-robust. Furthermore, we prove that the policy of the cooperative
agents converges with probability one to a bounded neighborhood around a local
maximizer of their team-average objective function under the assumption that
the policies of the adversarial agents asymptotically become stationary.
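To make the resilience mechanism concrete, the following is a minimal sketch of a coordinate-wise trimming rule in the spirit of W-MSR-type resilient consensus, applied to the parameter vectors the agents exchange. The function name, the equal-weight averaging, and the exact trimming rule are illustrative assumptions and need not match the paper's update.

```python
import numpy as np

def trimmed_consensus_step(own, neighbors, H):
    """One coordinate-wise trimmed consensus step (illustrative sketch).

    own:       (d,) this agent's current parameter vector (e.g. the critic or
               team-average reward estimate)
    neighbors: (N, d) parameter vectors received from immediate neighbors
    H:         assumed bound on Byzantine agents per neighborhood

    For each coordinate, the H largest neighbor values above the agent's own
    value and the H smallest below it are discarded before averaging, so up to
    H arbitrarily-behaving neighbors cannot pull the estimate outside the range
    spanned by the cooperative agents, provided the network is (2H+1)-robust.
    """
    d = own.shape[0]
    updated = np.empty(d)
    for k in range(d):
        vals = np.sort(neighbors[:, k])
        below = vals[vals < own[k]]
        above = vals[vals > own[k]]
        equal = vals[vals == own[k]]
        kept_below = below[min(H, below.size):]       # drop up to H smallest
        kept_above = above[:max(above.size - H, 0)]   # drop up to H largest
        kept = np.concatenate([kept_below, equal, kept_above, own[k:k + 1]])
        updated[k] = kept.mean()                      # equal weights for simplicity
    return updated
```

In a full agent loop, a step of this kind would be applied to both the communicated team-average reward and value-function parameter vectors after each local update, before the agent performs its own actor (policy-gradient) step.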
Related papers
- On the Hardness of Decentralized Multi-Agent Policy Evaluation under Byzantine Attacks [12.696705862929337]
We study a fully-decentralized multi-agent policy evaluation problem in the presence of up to $f$ faulty agents.
In particular, we focus on the so-called Byzantine fault model with model poisoning.
arXiv Detail & Related papers (2024-09-19T16:27:08Z) - Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z) - Byzantine-Resilient Decentralized Multi-Armed Bandits [25.499420566469098]
We develop an algorithm that fuses an information mixing step among agents with a truncation of inconsistent and extreme values.
This framework can be used to model attackers in computer networks, instigators of offensive content into recommender systems, or manipulators of financial markets.
arXiv Detail & Related papers (2023-10-11T09:09:50Z) - Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server.
An $\alpha$-fraction of the agents are adversarial and can report arbitrary fake information.
We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
arXiv Detail & Related papers (2022-06-01T00:44:53Z) - Depthwise Convolution for Multi-Agent Communication with Enhanced Mean-Field Approximation [9.854975702211165]
We propose a new method based on local communication learning to tackle the multi-agent RL (MARL) challenge.
First, we design a new communication protocol that exploits the ability of depthwise convolution to efficiently extract local relations.
Second, we introduce the mean-field approximation into our method to reduce the scale of agent interactions.
arXiv Detail & Related papers (2022-03-06T07:42:43Z) - Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
arXiv Detail & Related papers (2021-12-03T19:23:48Z) - Modeling the Interaction between Agents in Cooperative Multi-Agent Reinforcement Learning [2.9360071145551068]
We propose a novel cooperative MARL algorithm named interactive actor-critic (IAC).
IAC models the interaction of agents from perspectives of policy and value function.
We extend the value decomposition methods to continuous control tasks and evaluate IAC on benchmark tasks including classic control and multi-agent particle environments.
arXiv Detail & Related papers (2021-02-10T01:58:28Z) - Cooperative and Competitive Biases for Multi-Agent Reinforcement Learning [12.676356746752893]
Training a multi-agent reinforcement learning (MARL) algorithm is more challenging than training a single-agent reinforcement learning algorithm.
We propose an algorithm that boosts MARL training using the biased action information of other agents based on a friend-or-foe concept.
We empirically demonstrate that our algorithm outperforms existing algorithms in various mixed cooperative-competitive environments.
arXiv Detail & Related papers (2021-01-18T05:52:22Z) - Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation [55.96893934962757]
In a multi-agent system, the policies of different agents need to be evaluated jointly.
In current methods, value functions or advantage functions use counterfactual joint actions, which are evaluated asynchronously.
In this work, we propose the approximatively synchronous advantage estimation.
arXiv Detail & Related papers (2020-12-07T07:29:19Z) - Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking the question: "What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?"
arXiv Detail & Related papers (2020-06-07T18:28:41Z) - Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)
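The last entry above (QMIX) refers to monotonic value-function factorisation. Below is a minimal, hedged sketch of the usual structure of such a mixing network: per-agent utilities are combined with non-negative weights produced by state-conditioned hypernetworks, which enforces monotonicity of the joint value in each agent's utility. Module names and layer sizes here are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class MonotonicMixer(nn.Module):
    """QMIX-style monotonic mixing network (illustrative sketch).

    Mixing weights are passed through abs() so the joint value Q_tot is
    non-decreasing in every agent's individual Q-value; the weight and bias
    generators (hypernetworks) are conditioned on the global state.
    """

    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)  # layer-1 weights
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)             # layer-1 bias
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)             # layer-2 weights
        self.hyper_b2 = nn.Sequential(                              # layer-2 bias
            nn.Linear(state_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1))

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        w1 = torch.abs(self.hyper_w1(state)).view(-1, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(-1, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(-1, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(-1, 1, 1)
        return (torch.bmm(hidden, w2) + b2).view(-1)                # Q_tot: (batch,)
```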