Convergence Rates for Localized Actor-Critic in Networked Markov
Potential Games
- URL: http://arxiv.org/abs/2303.04865v2
- Date: Sat, 8 Jul 2023 23:39:34 GMT
- Title: Convergence Rates for Localized Actor-Critic in Networked Markov
Potential Games
- Authors: Zhaoyi Zhou, Zaiwei Chen, Yiheng Lin, and Adam Wierman
- Abstract summary: We introduce a class of networked Markov potential games in which agents are associated with nodes in a network.
Each agent has its own local potential function, and the reward of each agent depends only on the states and actions of the agents within a neighborhood.
This is the first finite-sample bound for multi-agent competitive games that does not depend on the number of agents.
- Score: 12.704529528199062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a class of networked Markov potential games in which agents are
associated with nodes in a network. Each agent has its own local potential
function, and the reward of each agent depends only on the states and actions
of the agents within a neighborhood. In this context, we propose a localized
actor-critic algorithm. The algorithm is scalable since each agent uses only
local information and does not need access to the global state. Further, the
algorithm overcomes the curse of dimensionality through the use of function
approximation. Our main results provide finite-sample guarantees up to a
localization error and a function approximation error. Specifically, we achieve
an $\tilde{\mathcal{O}}(\tilde{\epsilon}^{-4})$ sample complexity measured by
the averaged Nash regret. This is the first finite-sample bound for multi-agent
competitive games that does not depend on the number of agents.
Related papers
- Refined Sample Complexity for Markov Games with Independent Linear Function Approximation [49.5660193419984]
Markov Games (MG) is an important model for Multi-Agent Reinforcement Learning (MARL)
This paper first refines the AVLPR framework by Wang et al. (2023), with an insight of designing pessimistic estimation of the sub-optimality gap.
We give the first algorithm that tackles the curse of multi-agents, attains the optimal $O(T-1/2) convergence rate, and avoids $textpoly(A_max)$ dependency simultaneously.
arXiv Detail & Related papers (2024-02-11T01:51:15Z) - Decentralized Monte Carlo Tree Search for Partially Observable
Multi-agent Pathfinding [49.730902939565986]
Multi-Agent Pathfinding problem involves finding a set of conflict-free paths for a group of agents confined to a graph.
In this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally.
We propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks.
arXiv Detail & Related papers (2023-12-26T06:57:22Z) - Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs [49.71319907864573]
In this paper, we propose multi-agent skill discovery which enables the ease of decomposition.
Our key idea is to approximate the joint state space as a Kronecker graph, based on which we can directly estimate its Fiedler vector.
Considering that directly computing the Laplacian spectrum is intractable for tasks with infinite-scale state spaces, we further propose a deep learning extension of our method.
arXiv Detail & Related papers (2023-07-21T14:53:12Z) - Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with
General Utilities [12.104551746465932]
We investigate safe multi-agent reinforcement learning, where agents seek to collectively maximize an aggregate sum of local objectives while satisfying their own safety constraints.
Our algorithm converges to a first-order stationary point (FOSP) at the rate of $mathcalOleft(T-2/3right)$.
In the sample-based setting, we demonstrate that, with high probability, our algorithm requires $widetildemathcalOleft(epsilon-3.5right)$ samples to achieve an $epsilon$-FOSP.
arXiv Detail & Related papers (2023-05-27T20:08:35Z) - Multi-Agent congestion cost minimization with linear function
approximation [0.0]
This work considers multiple agents traversing a network from a source node to the goal node.
Agents' objective is to find a path to the goal node with minimum overall cost in a decentralized way.
We propose a novel multi-agent congestion cost minimization algorithm.
arXiv Detail & Related papers (2023-01-26T08:45:44Z) - Representation Learning for General-sum Low-rank Markov Games [63.119870889883224]
We study multi-agent general-sum Markov games with nonlinear function approximation.
We focus on low-rank Markov games whose transition matrix admits a hidden low-rank structure on top of an unknown non-linear representation.
arXiv Detail & Related papers (2022-10-30T22:58:22Z) - Asymptotic Convergence of Deep Multi-Agent Actor-Critic Algorithms [0.6961253535504979]
We present sufficient conditions that ensure convergence of the multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm.
It is an example of one of the most popular paradigms of Deep Reinforcement Learning (DeepRL) for tackling continuous action spaces: the actor-critic paradigm.
arXiv Detail & Related papers (2022-01-03T10:33:52Z) - Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $varepsilon$-solution is achieved in $tildemathcalrhoObig(sqrtfracbeta/mu (1-)log1/varepsilonbig)$ number of communications steps.
This rate matches (up to poly-log factors) for the first time lower complexity communication bounds of distributed gossip-algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z) - Towards General Function Approximation in Zero-Sum Markov Games [126.58493169301012]
This paper considers two-player zero-sum finite-horizon Markov games with simultaneous moves.
Provably efficient algorithms for both decoupled and coordinated settings are developed.
arXiv Detail & Related papers (2021-07-30T15:25:13Z) - Communication-efficient Decentralized Local SGD over Undirected Networks [2.3572498744567123]
We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$.
We analyze the trade-off between the number of communication rounds and the computational effort of each agent.
Our results show that by using only $R=Omega(n)$ communication rounds, one can achieve an error that scales as $O(1/nT)$.
arXiv Detail & Related papers (2020-11-06T09:34:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.