Sample and Communication-Efficient Decentralized Actor-Critic Algorithms
with Finite-Time Analysis
- URL: http://arxiv.org/abs/2109.03699v1
- Date: Wed, 8 Sep 2021 15:02:21 GMT
- Title: Sample and Communication-Efficient Decentralized Actor-Critic Algorithms
with Finite-Time Analysis
- Authors: Ziyi Chen, Yi Zhou, Rongrong Chen, Shaofeng Zou
- Abstract summary: Actor-critic (AC) algorithms have been widely adopted in decentralized multi-agent systems.
We develop two decentralized AC and natural AC (NAC) algorithms that are private, and sample and communication-efficient.
- Score: 27.21581944906418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Actor-critic (AC) algorithms have been widely adopted in decentralized
multi-agent systems to learn the optimal joint control policy. However,
existing decentralized AC algorithms either do not preserve the privacy of
agents or are not sample and communication-efficient. In this work, we develop
two decentralized AC and natural AC (NAC) algorithms that are private, and
sample and communication-efficient. In both algorithms, agents share noisy
information to preserve privacy and adopt mini-batch updates to improve sample
and communication efficiency. Particularly for decentralized NAC, we develop a
decentralized Markovian SGD algorithm with an adaptive mini-batch size to
efficiently compute the natural policy gradient. Under Markovian sampling and
linear function approximation, we prove the proposed decentralized AC and NAC
algorithms achieve the state-of-the-art sample complexities
$\mathcal{O}\big(\epsilon^{-2}\ln(\epsilon^{-1})\big)$ and
$\mathcal{O}\big(\epsilon^{-3}\ln(\epsilon^{-1})\big)$, respectively, and the
same small communication complexity
$\mathcal{O}\big(\epsilon^{-1}\ln(\epsilon^{-1})\big)$. Numerical experiments
demonstrate that the proposed algorithms achieve lower sample and communication
complexities than the existing decentralized AC algorithm.
Related papers
- Fully First-Order Methods for Decentralized Bilevel Optimization [17.20330936572045]
This paper focuses on decentralized bilevel optimization (DSBO) where agents only communicate with their neighbors.
We propose Decentralized Gradient Descent and Ascent with Gradient Tracking (DSGDA-GT), a novel algorithm that only requires first-order oracles that are much cheaper than second-order oracles widely adopted in existing works.
arXiv Detail & Related papers (2024-10-25T06:11:43Z) - Breaking the Curse of Multiagency: Provably Efficient Decentralized
Multi-Agent RL with Function Approximation [44.051717720483595]
This paper presents the first line of MARL algorithms that provably resolve the curse of multiagency approximation.
In exchange for learning a weaker version of CCEs, this algorithm applies to a wider range of problems under generic function approximation.
Our algorithm always outputs Markov CCEs, and an optimal rate of $widetildemathcalO(epsilon-2)$ for finding $epsilon$-optimal solutions.
arXiv Detail & Related papers (2023-02-13T18:59:25Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive minimax optimization algorithm (i.e., AdaFGDA) to solve these minimax problems.
It builds our momentum-based reduced and localSGD techniques, and it flexibly incorporate various adaptive learning rates.
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (emphi.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(frac1sqrtnT + frac1(k/d)2 T)$ with the communication cost of $O(k log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Finite-Time Analysis of Fully Decentralized Single-Timescale
Actor-Critic [4.94128206910124]
We introduce a fully decentralized Actor-Critic (AC) algorithm, where actor, critic, and global reward estimator are updated in an alternating manner.
We show that our algorithm has sample complexity of $tildemathcalO(epsilon-2)$ under Markovian sampling.
We also provide a local action privacy-preserving version of our algorithm and its analysis.
arXiv Detail & Related papers (2022-06-12T13:14:14Z) - Learning to Coordinate in Multi-Agent Systems: A Coordinated
Actor-Critic Algorithm and Finite-Time Guarantees [43.10380224532313]
We study the emergence of coordinated behavior by autonomous agents using an actor-critic (AC) algorithm.
We propose and analyze a class of coordinated actor-critic algorithms (CAC) in which individually parametrized policies have a it shared part and a it personalized part.
This work provides the first finite-sample guarantee for decentralized AC algorithm with partially personalized policies.
arXiv Detail & Related papers (2021-10-11T20:26:16Z) - Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex
Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We design two optimal algorithms that attain these lower bounds.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z) - Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with
Near-Optimal Sample Complexity and Communication Complexity [13.100926925535578]
We develop two decentralized TD with correction (TDC) algorithms for multi-agent off-policy TD learning.
Our algorithms preserve full privacy of the actions, policies and rewards of the agents, and adopt mini-batch sampling to reduce the sampling variance and communication frequency.
arXiv Detail & Related papers (2021-03-24T12:48:08Z) - Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge
Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity
to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z) - Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms [58.57004511121862]
This paper characterizes the convergence rate and sample complexity of AC and NAC under Markovian sampling.
We show that AC and NAC attain orderwise performance improvement over PG and NPG under infinite horizon due to the incorporation of critic.
arXiv Detail & Related papers (2020-04-27T17:11:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.