Ubiquitous Distributed Deep Reinforcement Learning at the Edge:
Analyzing Byzantine Agents in Discrete Action Spaces
- URL: http://arxiv.org/abs/2008.07863v1
- Date: Tue, 18 Aug 2020 11:25:39 GMT
- Title: Ubiquitous Distributed Deep Reinforcement Learning at the Edge:
Analyzing Byzantine Agents in Discrete Action Spaces
- Authors: Wenshuai Zhao, Jorge Peña Queralta, Li Qingqing, Tomi Westerlund
- Abstract summary: This paper discusses some of the challenges in multi-agent distributed deep reinforcement learning that can occur in the presence of Byzantine or malfunctioning agents.
We show how wrong discrete actions can significantly degrade the collaborative learning effort.
Experiments are carried out in a simulation environment using the Atari testbed for discrete action spaces, and advantage actor-critic (A2C) for the distributed multi-agent training.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The integration of edge computing in next-generation mobile networks is
bringing low-latency and high-bandwidth ubiquitous connectivity to a myriad of
cyber-physical systems. This will further boost the increasing intelligence
that is being embedded at the edge in various types of autonomous systems,
where collaborative machine learning has the potential to play a significant
role. This paper discusses some of the challenges in multi-agent distributed
deep reinforcement learning that can occur in the presence of Byzantine or
malfunctioning agents. As the simulation-to-reality gap is bridged, the
probability of malfunctions or errors must be taken into account. We show how
wrong discrete actions can significantly degrade the collaborative learning
effort. In particular, we analyze the effect of having a fraction of agents
that might perform the wrong action with a given probability. We study the
system's ability to converge towards a common working policy as a function of
the number of experiences aggregated from each agent for each policy update,
together with the fraction of wrong actions taken by malfunctioning agents.
Our experiments are carried out in a simulation environment using the Atari
testbed for discrete action spaces, and advantage actor-critic (A2C) for the
distributed multi-agent training.
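The failure model described in the abstract, an agent that with some fixed probability executes a different discrete action than the one its policy selected, is straightforward to reproduce. Below is a minimal sketch of such a Byzantine agent as a Gym-style action wrapper for Atari-like discrete action spaces. The class name ByzantineActionWrapper and the flip_prob parameter are illustrative assumptions, not the paper's published code.

```python
# Minimal sketch (assumed API: OpenAI Gym's ActionWrapper): with probability
# flip_prob, the wrapper replaces the agent's chosen discrete action with a
# different, uniformly chosen wrong action before it reaches the environment.
import random

import gym


class ByzantineActionWrapper(gym.ActionWrapper):
    """Simulates a Byzantine/malfunctioning agent in a discrete action space."""

    def __init__(self, env, flip_prob=0.1):
        super().__init__(env)
        assert isinstance(env.action_space, gym.spaces.Discrete)
        self.flip_prob = flip_prob

    def action(self, action):
        if random.random() < self.flip_prob:
            # Substitute a uniformly random action among the *wrong* ones only.
            n = self.env.action_space.n
            return random.choice([a for a in range(n) if a != action])
        return action


# Illustrative usage: wrap one environment of a distributed A2C setup, e.g.
# env = ByzantineActionWrapper(gym.make("BreakoutNoFrameskip-v4"), flip_prob=0.2)
```

In a distributed A2C setting, wrapping the environments of a chosen fraction of the parallel workers this way approximates the paper's scenario of a fraction of agents performing wrong actions with a given probability, while the remaining workers contribute clean experiences to each aggregated policy update.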