Accelerating Distributed Online Meta-Learning via Multi-Agent
Collaboration under Limited Communication
- URL: http://arxiv.org/abs/2012.08660v2
- Date: Sat, 19 Dec 2020 19:26:22 GMT
- Title: Accelerating Distributed Online Meta-Learning via Multi-Agent
Collaboration under Limited Communication
- Authors: Sen Lin, Mehmet Dedeoglu and Junshan Zhang
- Abstract summary: We propose a multi-agent online meta-learning framework and cast it as an equivalent two-level nested online convex optimization (OCO) problem.
By characterizing the upper bound of the agent-task-averaged regret, we show that the performance of multi-agent online meta-learning depends heavily on how much an agent can benefit from the distributed network-level OCO for meta-model updates via limited communication.
We show that a factor of $\sqrt{1/N}$ speedup over the optimal single-agent regret $O(\sqrt{T})$ can be achieved after $T$ iterations, where $N$ is the number of agents.
- Score: 24.647993999787992
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online meta-learning is emerging as an enabling technique for achieving edge
intelligence in the IoT ecosystem. Nevertheless, to learn a good meta-model for
within-task fast adaptation, a single agent alone has to learn over many tasks,
and this is the so-called 'cold-start' problem. Observing that in a multi-agent
network the learning tasks across different agents often share some model
similarity, we ask the following fundamental question: "Is it possible to
accelerate the online meta-learning across agents via limited communication and
if yes, how much benefit can be achieved?" To answer this question, we propose
a multi-agent online meta-learning framework and cast it as an equivalent
two-level nested online convex optimization (OCO) problem. By characterizing
the upper bound of the agent-task-averaged regret, we show that the performance
of multi-agent online meta-learning depends heavily on how much an agent can
benefit from the distributed network-level OCO for meta-model updates via
limited communication, which however is not well understood. To tackle this
challenge, we devise a distributed online gradient descent algorithm with
gradient tracking where each agent tracks the global gradient using only one
communication step with its neighbors per iteration, which results in an
average regret of $O(\sqrt{T/N})$ per agent, i.e., a $\sqrt{1/N}$-factor
speedup over the optimal single-agent regret $O(\sqrt{T})$ after $T$
iterations, where $N$ is the number of agents. Building on this sharp
performance speedup, we next develop a multi-agent online meta-learning
algorithm and show that it can achieve the optimal task-average regret at a
faster rate of $O(1/\sqrt{NT})$ via limited communication, compared to
single-agent online meta-learning. Extensive experiments corroborate the
theoretical results.
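The abstract's key primitive, decentralized online gradient descent with gradient tracking, admits a compact sketch: each agent mixes its model and its running estimate of the network-average gradient with its neighbors exactly once per round. The snippet below is a minimal illustration under assumed ingredients (ring topology, quadratic streaming losses, fixed step size), not the authors' implementation.

```python
import numpy as np

# Minimal sketch of decentralized online gradient descent with gradient
# tracking. The ring topology, the quadratic per-round losses, and all
# parameter values are illustrative assumptions, not the paper's setup.

rng = np.random.default_rng(0)
N, T, d = 8, 500, 5   # agents, rounds, model dimension (assumed)
eta = 0.05            # step size (assumed)

# Doubly stochastic mixing matrix for a ring graph: one communication
# step per iteration reaches only the two immediate neighbors.
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = 0.5
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.25

# Streaming losses f_i^t(x) = 0.5 * ||x - b[t, i]||^2 with a shared
# drift, so the agents' tasks are similar but not identical.
common = 0.1 * rng.standard_normal((T, d))
b = common[:, None, :] + 0.3 * rng.standard_normal((T, N, d))

x = np.zeros((N, d))   # local models, one row per agent
g = x - b[0]           # local gradients at round 0
y = g.copy()           # gradient-tracking variables

for t in range(T - 1):
    x = W @ x - eta * y          # mix with neighbors, descend on tracked gradient
    g_new = x - b[t + 1]         # fresh local gradients on the new losses
    y = W @ y + g_new - g        # track the network-average gradient
    g = g_new

print("mean distance to final targets:",
      float(np.mean(np.linalg.norm(x - b[T - 1], axis=1))))
```

Each tracking variable $y_i$ estimates the average of all agents' current gradients, so every local step descends along an approximation of the global gradient while paying only one neighbor exchange per iteration; this is the mechanism behind the $O(\sqrt{T/N})$ per-agent regret claimed above.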
Related papers
- Attention Graph for Multi-Robot Social Navigation with Deep
Reinforcement Learning [0.0]
We present MultiSoc, a new method for learning multi-agent socially aware navigation strategies using deep reinforcement learning (RL).
Inspired by recent works on multi-agent deep RL, our method leverages a graph-based representation of agent interactions, combining the positions and fields of view of entities (pedestrians and agents).
Our method learns faster than social navigation deep RL mono-agent techniques, and enables efficient multi-agent implicit coordination in challenging crowd navigation with multiple heterogeneous humans.
arXiv Detail & Related papers (2024-01-31T15:24:13Z)
- Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization [59.39113350538332]
Large language model (LLM) agents have been shown effective on a wide range of tasks, and by ensembling multiple LLM agents, their performances could be further improved.
Existing approaches employ a fixed set of agents to interact with each other in a static architecture.
We build a framework named Dynamic LLM-Agent Network ($\textbf{DyLAN}$) for LLM-agent collaboration on complicated tasks like reasoning and code generation.
arXiv Detail & Related papers (2023-10-03T16:05:48Z)
- MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
Diffusion models (DMs) have recently achieved huge success in various scenarios, including offline reinforcement learning.
We propose MADiff, a novel generative multi-agent learning framework to tackle this problem.
Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z)
- Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets.
An agent learned by offline MARL can inherit such a random policy from the dataset, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
- Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification [74.10976684469435]
Offline reinforcement learning (RL) algorithms can be transferred to multi-agent settings directly.
We propose a simple yet effective method, Offline Multi-Agent RL with Actor Rectification (OMAR), to tackle this critical challenge.
OMAR significantly outperforms strong baselines with state-of-the-art performance in multi-agent continuous control benchmarks.
arXiv Detail & Related papers (2021-11-22T13:27:42Z)
- Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication [3.5450828190071655]
Distributed exploration reduces sampling complexity in reinforcement learning.
We show that group performance can be significantly improved when each agent uses a decentralized message-passing protocol.
We show that incorporating more agents and more information sharing into the group learning scheme speeds up convergence to the optimal policy.
arXiv Detail & Related papers (2021-10-14T14:27:27Z)
- Decentralized Cooperative Multi-Agent Reinforcement Learning with Exploration [35.75029940279768]
We study multi-agent reinforcement learning in the most basic cooperative setting -- Markov teams.
We propose an algorithm in which each agent independently runs a stage-based V-learning style algorithm.
We show that the agents can learn an $\epsilon$-approximate Nash equilibrium policy in at most $\widetilde{O}(1/\epsilon^{4})$ episodes.
arXiv Detail & Related papers (2021-10-12T02:45:12Z)
- Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems [3.9599054392856488]
Regret analysis is challenging in Multi-Agent Reinforcement Learning (MARL).
We propose a MARL algorithm based on the construction of an auxiliary single-agent LQ problem.
We show that our algorithm provides a $\tilde{O}(\sqrt{T})$ regret bound.
arXiv Detail & Related papers (2020-01-27T23:37:41Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)