Ubiquitous Distributed Deep Reinforcement Learning at the Edge:
Analyzing Byzantine Agents in Discrete Action Spaces
- URL: http://arxiv.org/abs/2008.07863v1
- Date: Tue, 18 Aug 2020 11:25:39 GMT
- Title: Ubiquitous Distributed Deep Reinforcement Learning at the Edge:
Analyzing Byzantine Agents in Discrete Action Spaces
- Authors: Wenshuai Zhao, Jorge Peña Queralta, Li Qingqing, Tomi Westerlund
- Abstract summary: This paper discusses some of the challenges in multi-agent distributed deep reinforcement learning that can occur in the presence of byzantine or malfunctioning agents.
We show how wrong discrete actions can significantly affect the collaborative learning effort.
Experiments are carried out in a simulation environment using the Atari testbed for the discrete action spaces, and advantage actor-critic (A2C) for the distributed multi-agent training.
- Score: 0.06554326244334865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The integration of edge computing in next-generation mobile networks is
bringing low-latency and high-bandwidth ubiquitous connectivity to a myriad of
cyber-physical systems. This will further boost the increasing intelligence
that is being embedded at the edge in various types of autonomous systems,
where collaborative machine learning has the potential to play a significant
role. This paper discusses some of the challenges in multi-agent distributed
deep reinforcement learning that can occur in the presence of byzantine or
malfunctioning agents. As the simulation-to-reality gap gets bridged, the
probability of malfunctions or errors must be taken into account. We show how
wrong discrete actions can significantly affect the collaborative learning
effort. In particular, we analyze the effect of having a fraction of agents
that might perform the wrong action with a given probability. We study the
ability of the system to converge towards a common working policy through the
collaborative learning process based on the number of experiences from each of
the agents to be aggregated for each policy update, together with the fraction
of wrong actions from agents experiencing malfunctions. Our experiments are
carried out in a simulation environment using the Atari testbed for the
discrete action spaces, and advantage actor-critic (A2C) for the distributed
multi-agent training.
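To make the experimental setup concrete, the following is a minimal Python sketch of how a byzantine agent in a discrete action space could be simulated: with a given probability, the agent's intended action is replaced by a different, uniformly sampled action before it reaches the environment. The class and function names, the `policies[i].act` interface, and the Gym-style `env.step` call are illustrative assumptions rather than the authors' code.

```python
import random


class ByzantineActionWrapper:
    """Illustrative sketch (not the authors' implementation): with probability
    wrong_action_prob, the agent's intended discrete action is replaced by a
    different action sampled uniformly at random, emulating a byzantine or
    malfunctioning agent."""

    def __init__(self, num_actions, wrong_action_prob, seed=None):
        self.num_actions = num_actions
        self.wrong_action_prob = wrong_action_prob
        self.rng = random.Random(seed)

    def corrupt(self, intended_action):
        if self.rng.random() < self.wrong_action_prob:
            # Sample uniformly from the num_actions - 1 wrong actions.
            wrong = self.rng.randrange(self.num_actions - 1)
            return wrong if wrong < intended_action else wrong + 1
        return intended_action


def collect_step(envs, policies, observations, byzantine):
    """One synchronous step across all agents. The collected experiences would
    then be aggregated into a single shared A2C policy update; the number of
    experiences per update and the fraction of byzantine agents are the
    quantities varied in the paper. The policies[i].act and Gym-style
    env.step interfaces are assumed for illustration."""
    experiences = []
    for i, (env, obs) in enumerate(zip(envs, observations)):
        action = policies[i].act(obs)               # intended discrete action
        if i in byzantine:                          # only a fraction of agents
            action = byzantine[i].corrupt(action)   # malfunction injected here
        next_obs, reward, done, _ = env.step(action)
        experiences.append((obs, action, reward, next_obs, done))
    return experiences


# Example configuration: 2 of 8 agents take a wrong action 30% of the time.
byzantine_agents = {i: ByzantineActionWrapper(num_actions=6,
                                              wrong_action_prob=0.3)
                    for i in (0, 1)}
```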
Related papers
- Staged Reinforcement Learning for Complex Tasks through Decomposed
Environments [4.883558259729863]
We discuss two methods that bring RL problems closer to real-world problems.
In the context of traffic junction simulations, we demonstrate that, if we can decompose a complex task into multiple sub-tasks, solving these tasks first can be advantageous.
From a multi-agent perspective, we introduce a training structuring mechanism that exploits experience learned under the popular Centralised Training Decentralised Execution (CTDE) paradigm.
arXiv Detail & Related papers (2023-11-05T19:43:23Z) - Inverse Factorized Q-Learning for Cooperative Multi-agent Imitation
Learning [13.060023718506917]
Imitation learning (IL) is the problem of learning to mimic expert behaviors from demonstrations in cooperative multi-agent systems.
We introduce a novel multi-agent IL algorithm designed to address these challenges.
Our approach enables centralized learning by leveraging mixing networks to aggregate decentralized Q-functions.
arXiv Detail & Related papers (2023-10-10T17:11:20Z) - Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network.
arXiv Detail & Related papers (2023-05-31T17:40:43Z) - Decentralized Adversarial Training over Graphs [55.28669771020857]
The vulnerability of machine learning models to adversarial attacks has been attracting considerable attention in recent years.
This work studies adversarial training over graphs, where individual agents are subjected to perturbations of varying strength.
arXiv Detail & Related papers (2023-03-23T15:05:16Z) - Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from the qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z) - Scalable Task-Driven Robotic Swarm Control via Collision Avoidance and
Learning Mean-Field Control [23.494528616672024]
We use state-of-the-art mean-field control techniques to convert many-agent swarm control into classical single-agent control of distributions.
Here, we combine collision avoidance and learning of mean-field control into a unified framework for tractably designing intelligent robotic swarm behavior.
arXiv Detail & Related papers (2022-09-15T16:15:04Z) - Coach-assisted Multi-Agent Reinforcement Learning Framework for
Unexpected Crashed Agents [120.91291581594773]
We present a formal formulation of a cooperative multi-agent reinforcement learning system with unexpected crashes.
We propose a coach-assisted multi-agent reinforcement learning framework, which introduces a virtual coach agent to adjust the crash rate during training.
To the best of our knowledge, this work is the first to study unexpected crashes in multi-agent systems.
arXiv Detail & Related papers (2022-03-16T08:22:45Z) - ROMAX: Certifiably Robust Deep Multiagent Reinforcement Learning via
Convex Relaxation [32.091346776897744]
Cyber-physical attacks can challenge the robustness of multiagent reinforcement learning.
We propose a minimax MARL approach to infer the worst-case policy update of other agents.
arXiv Detail & Related papers (2021-09-14T16:18:35Z) - Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z) - Towards Closing the Sim-to-Real Gap in Collaborative Multi-Robot Deep
Reinforcement Learning [0.06554326244334865]
We analyze how multi-agent reinforcement learning can bridge the gap to reality in distributed multi-robot systems.
We introduce the effect of sensing, calibration, and accuracy mismatches in distributed reinforcement learning.
We discuss how both the different types of perturbations and the number of agents experiencing them affect the collaborative learning effort.
arXiv Detail & Related papers (2020-08-18T11:57:33Z) - Multi-Agent Interactions Modeling with Correlated Policies [53.38338964628494]
In this paper, we cast the multi-agent interactions modeling problem into a multi-agent imitation learning framework.
We develop a Decentralized Adversarial Imitation Learning algorithm with Correlated policies (CoDAIL).
Various experiments demonstrate that CoDAIL can better regenerate complex interactions close to the demonstrators.
arXiv Detail & Related papers (2020-01-04T17:31:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.