Convergence Rates for Localized Actor-Critic in Networked Markov
Potential Games
- URL: http://arxiv.org/abs/2303.04865v2
- Date: Sat, 8 Jul 2023 23:39:34 GMT
- Title: Convergence Rates for Localized Actor-Critic in Networked Markov
Potential Games
- Authors: Zhaoyi Zhou, Zaiwei Chen, Yiheng Lin, and Adam Wierman
- Abstract summary: We introduce a class of networked Markov potential games in which agents are associated with nodes in a network.
Each agent has its own local potential function, and the reward of each agent depends only on the states and actions of the agents within a neighborhood.
This is the first finite-sample bound for multi-agent competitive games that does not depend on the number of agents.
- Score: 12.704529528199062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a class of networked Markov potential games in which agents are
associated with nodes in a network. Each agent has its own local potential
function, and the reward of each agent depends only on the states and actions
of the agents within a neighborhood. In this context, we propose a localized
actor-critic algorithm. The algorithm is scalable since each agent uses only
local information and does not need access to the global state. Further, the
algorithm overcomes the curse of dimensionality through the use of function
approximation. Our main results provide finite-sample guarantees up to a
localization error and a function approximation error. Specifically, we achieve
an $\tilde{\mathcal{O}}(\tilde{\epsilon}^{-4})$ sample complexity measured by
the averaged Nash regret. This is the first finite-sample bound for multi-agent
competitive games that does not depend on the number of agents.
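To make the local-information idea concrete, the following is a minimal sketch of one localized actor-critic update, assuming a softmax local policy, a linear critic, and a hypothetical one-hot feature map standing in for neighborhood features; it is an illustration of the general technique, not the paper's algorithm.

```python
# Illustrative sketch of a localized actor-critic update: each agent keeps a
# softmax policy over its own state and a linear critic over (hypothetical)
# local-neighborhood features. Names and the feature map are assumptions of
# this example, not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 3, 2
feat_dim = n_states * n_actions  # dimension of the toy local features

class LocalAgent:
    def __init__(self):
        self.theta = np.zeros((n_states, n_actions))  # local policy logits
        self.w = np.zeros(feat_dim)                   # linear critic weights

    def policy(self, s):
        logits = self.theta[s]
        p = np.exp(logits - logits.max())
        return p / p.sum()

    def act(self, s):
        return rng.choice(n_actions, p=self.policy(s))

    def update(self, phi, phi_next, r, s, a,
               alpha=0.05, beta=0.01, gamma=0.95):
        # Critic: one-step TD (SARSA-style) on local state-action features.
        delta = r + gamma * (self.w @ phi_next) - self.w @ phi
        self.w += alpha * delta * phi
        # Actor: softmax policy gradient driven by the local TD error.
        grad = -self.policy(s)
        grad[a] += 1.0
        self.theta[s] += beta * delta * grad

def local_features(s, a):
    # Hypothetical stand-in for neighborhood features: a one-hot encoding
    # of the agent's own (state, action) pair.
    phi = np.zeros(feat_dim)
    phi[s * n_actions + a] = 1.0
    return phi

# Toy rollout: the agent learns from purely local observations.
agent, s = LocalAgent(), 0
for _ in range(100):
    a = agent.act(s)
    s_next = rng.integers(n_states)   # toy local transition
    r = float(s_next == 0)            # toy local reward
    a_next = agent.act(s_next)
    agent.update(local_features(s, a), local_features(s_next, a_next), r, s, a)
    s = s_next
```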
Related papers
- Cooperative Multi-Agent Constrained Stochastic Linear Bandits [2.099922236065961]
A network of $N$ agents communicates locally to minimize their collective regret while keeping their expected cost under a specified threshold $\tau$.
We propose a safe distributed upper confidence bound algorithm, called MA-OPLB, and establish a high-probability bound on its $T$-round regret.
We show that our regret bound is of order $\mathcal{O}\left(\frac{d}{\tau-c_0}\,\frac{\log(NT)^2}{\sqrt{N}}\,\sqrt{T}\,\log(1/|\lambda|)\right)$.
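For intuition, a generic "optimistic for reward, pessimistic for cost" selection step for a linear bandit with a cost budget could look like the sketch below; the function name, shared confidence width, and fallback behavior are assumptions, and MA-OPLB's actual estimators and communication scheme are more involved.

```python
# Generic sketch of a "safe" linear-UCB selection step: optimism for reward,
# pessimism for cost. Illustrative only; not MA-OPLB's actual algorithm.
import numpy as np

def safe_ucb_action(arms, theta_r, theta_c, V_inv, beta, tau):
    """arms: (K, d) features; theta_r/theta_c: reward/cost estimates;
    V_inv: inverse Gram matrix; beta: confidence width; tau: cost budget."""
    widths = np.sqrt(np.einsum("kd,de,ke->k", arms, V_inv, arms))
    ucb_reward = arms @ theta_r + beta * widths  # optimistic reward estimate
    ucb_cost = arms @ theta_c + beta * widths    # pessimistic cost estimate
    feasible = ucb_cost <= tau                   # keep only provably safe arms
    if not feasible.any():
        return None  # no provably safe arm; caller falls back to a safe default
    idx = np.flatnonzero(feasible)
    return idx[np.argmax(ucb_reward[idx])]
```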
arXiv Detail & Related papers (2024-10-22T19:34:53Z) - Classification with a Network of Partially Informative Agents: Enabling Wise Crowds from Individually Myopic Classifiers [1.3417382097912094]
We consider the problem of classification with a (peer-to-peer) network of heterogeneous and partially informative agents.
We propose an iterative algorithm that uses the posterior probabilities of the local classifiers to update each agent's local belief over all possible classes.
We show that, under certain assumptions, the beliefs on the true class converge to one asymptotically almost surely.
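A standard log-linear pooling rule conveys the flavor of such updates: each agent geometrically averages its neighbors' beliefs and reweights by its own local posterior. This is a generic non-Bayesian social-learning step used purely for illustration; the paper's exact update may differ.

```python
# Generic log-linear belief-pooling step for networked classification.
import numpy as np

def belief_update(beliefs, local_posteriors, W):
    """beliefs, local_posteriors: (n_agents, n_classes);
    W: row-stochastic mixing matrix over the communication graph."""
    log_pooled = W @ np.log(np.clip(beliefs, 1e-12, None))  # geometric mixing
    new = np.exp(log_pooled) * local_posteriors             # local reweighting
    return new / new.sum(axis=1, keepdims=True)             # normalize per agent
```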
arXiv Detail & Related papers (2024-09-30T04:59:51Z) - Networked Communication for Mean-Field Games with Function Approximation and Empirical Mean-Field Estimation [59.01527054553122]
Decentralised agents can learn equilibria in Mean-Field Games from a single, non-episodic run of the empirical system.
We introduce function approximation to the existing setting, drawing on the Munchausen Online Mirror Descent method.
We additionally provide new algorithms that allow agents to estimate the global empirical distribution based on a local neighbourhood.
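A minimal sketch of the local-estimation idea, under the simplifying assumption that an agent histograms the states it observes in its neighbourhood and treats the normalized counts as its mean-field estimate:

```python
# Sketch: estimate the global empirical state distribution from a local
# neighbourhood. An assumption of this example, not the paper's estimator.
import numpy as np

def local_mean_field_estimate(observed_states, n_states):
    """observed_states: integer states of agents visible in the neighbourhood."""
    counts = np.bincount(observed_states, minlength=n_states)
    return counts / counts.sum()
```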
arXiv Detail & Related papers (2024-08-21T13:32:46Z) - Refined Sample Complexity for Markov Games with Independent Linear Function Approximation [49.5660193419984]
Markov Games (MGs) are an important model for Multi-Agent Reinforcement Learning (MARL).
This paper first refines the AVLPR framework of Wang et al. (2023) with the insight of designing a pessimistic estimate of the sub-optimality gap.
We give the first algorithm that tackles the curse of multi-agents, attains the optimal $\mathcal{O}(T^{-1/2})$ convergence rate, and avoids $\text{poly}(A_{\max})$ dependency simultaneously.
arXiv Detail & Related papers (2024-02-11T01:51:15Z) - Decentralized Monte Carlo Tree Search for Partially Observable
Multi-agent Pathfinding [49.730902939565986]
The Multi-Agent Pathfinding (MAPF) problem involves finding a set of conflict-free paths for a group of agents confined to a graph.
In this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally.
We propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks.
arXiv Detail & Related papers (2023-12-26T06:57:22Z) - Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs [49.71319907864573]
In this paper, we propose a multi-agent skill discovery method that eases the decomposition of the joint state space.
Our key idea is to approximate the joint state space as a Kronecker graph, based on which we can directly estimate its Fiedler vector.
Considering that directly computing the Laplacian spectrum is intractable for tasks with infinite-scale state spaces, we further propose a deep learning extension of our method.
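The Kronecker idea is easiest to see on adjacency matrices, where every eigenvector of $A_1 \otimes A_2$ is the Kronecker product of factor eigenvectors, so spectral quantities of the large joint graph can be assembled from the small factors. The toy check below verifies this identity; how the paper lifts it to Laplacian/Fiedler estimates is simplified away here.

```python
# Eigenpairs of a Kronecker-product graph from its factors: if A1 v = l v and
# A2 u = m u, then (A1 kron A2)(v kron u) = (l * m)(v kron u).
import numpy as np

A1 = np.array([[0., 1.], [1., 0.]])                       # path on 2 nodes
A2 = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]]) # path on 3 nodes

l1, V1 = np.linalg.eigh(A1)
l2, V2 = np.linalg.eigh(A2)

joint = np.kron(A1, A2)            # formed only to verify; avoidable at scale
v = np.kron(V1[:, 0], V2[:, 0])    # joint eigenvector from small factors
assert np.allclose(joint @ v, l1[0] * l2[0] * v)
```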
arXiv Detail & Related papers (2023-07-21T14:53:12Z) - Multi-Agent congestion cost minimization with linear function
approximation [0.0]
This work considers multiple agents traversing a network from a source node to a goal node.
The agents' objective is to find a path to the goal node with minimum overall cost in a decentralized way.
We propose a novel multi-agent congestion cost minimization algorithm.
arXiv Detail & Related papers (2023-01-26T08:45:44Z) - Representation Learning for General-sum Low-rank Markov Games [63.119870889883224]
We study multi-agent general-sum Markov games with nonlinear function approximation.
We focus on low-rank Markov games whose transition matrix admits a hidden low-rank structure on top of an unknown non-linear representation.
arXiv Detail & Related papers (2022-10-30T22:58:22Z) - Asymptotic Convergence of Deep Multi-Agent Actor-Critic Algorithms [0.6961253535504979]
We present sufficient conditions that ensure convergence of the multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm.
It is an example of one of the most popular paradigms of Deep Reinforcement Learning (DeepRL) for tackling continuous action spaces: the actor-critic paradigm.
arXiv Detail & Related papers (2022-01-03T10:33:52Z) - Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{1-\rho}}\,\log(1/\varepsilon)\big)$ communication steps.
This rate matches, for the first time and up to poly-log factors, lower communication-complexity bounds of distributed gossip algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z) - Communication-efficient Decentralized Local SGD over Undirected Networks [2.3572498744567123]
We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$.
We analyze the trade-off between the number of communication rounds and the computational effort of each agent.
Our results show that by using only $R = \Omega(n)$ communication rounds, one can achieve an error that scales as $\mathcal{O}(1/nT)$.
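The trade-off can be seen in a toy local-SGD loop in which each of $n$ agents takes $H$ local steps between communication rounds; gossip over an actual graph is simplified here to exact averaging, and the quadratic objective is an assumption of this sketch.

```python
# Toy local SGD: n agents, R communication rounds, H local steps per round,
# on quadratic objectives 0.5 * ||x - t_i||^2 with noisy gradients.
import numpy as np

rng = np.random.default_rng(1)
n, d, R, H, lr = 8, 5, 20, 10, 0.05
targets = rng.normal(size=(n, d))      # agent i's local optimum t_i
x = np.zeros((n, d))                   # one model per agent

for _ in range(R):                     # R communication rounds
    for _ in range(H):                 # H local steps between rounds
        noise = 0.1 * rng.normal(size=(n, d))
        grad = (x - targets) + noise   # noisy local gradient
        x -= lr * grad
    x[:] = x.mean(axis=0)              # communication: exact model averaging

# The global optimum of the averaged objective is the mean of the targets.
print("distance to global optimum:",
      np.linalg.norm(x[0] - targets.mean(axis=0)))
```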
arXiv Detail & Related papers (2020-11-06T09:34:00Z)