Towards Understanding Cooperative Multi-Agent Q-Learning with Value
Factorization
- URL: http://arxiv.org/abs/2006.00587v5
- Date: Sun, 31 Oct 2021 06:21:48 GMT
- Title: Towards Understanding Cooperative Multi-Agent Q-Learning with Value
Factorization
- Authors: Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang
- Abstract summary: We formalize a multi-agent fitted Q-iteration framework for analyzing factorized multi-agent Q-learning.
Through further analysis, we find that on-policy training or richer joint value function classes can improve its local or global convergence properties.
- Score: 28.89692989420673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Value factorization is a popular and promising approach to scaling up
multi-agent reinforcement learning in cooperative settings, which balances the
learning scalability and the representational capacity of value functions.
However, the theoretical understanding of such methods is limited. In this
paper, we formalize a multi-agent fitted Q-iteration framework for analyzing
factorized multi-agent Q-learning. Based on this framework, we investigate
linear value factorization and reveal that multi-agent Q-learning with this
simple decomposition implicitly realizes a powerful counterfactual credit
assignment, but may not converge in some settings. Through further analysis, we
find that on-policy training or richer joint value function classes can improve
its local or global convergence properties, respectively. Finally, to support
our theoretical implications in practical realization, we conduct an empirical
analysis of state-of-the-art deep multi-agent Q-learning algorithms on didactic
examples and a broad set of StarCraft II unit micromanagement tasks.
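To make the linear factorization concrete, below is a minimal tabular sketch, assuming a VDN-style additive decomposition with illustrative shapes and a uniform split of the shared TD error (not the paper's implementation):

```python
import numpy as np

# Minimal sketch of linear (additive) value factorization:
#   Q_tot(s, a_1, ..., a_n) = sum_i Q_i(s, a_i)
# Tabular utilities and the uniform TD-error split are illustrative
# assumptions, not the paper's construction.

n_agents, n_states, n_actions = 2, 4, 3
Q = [np.zeros((n_states, n_actions)) for _ in range(n_agents)]

def q_tot(s, joint_action):
    """Linearly factorized joint value: the sum of individual utilities."""
    return sum(Q[i][s, a] for i, a in enumerate(joint_action))

def greedy_joint_action(s):
    """Decentralized greedy selection; it recovers the joint greedy action
    because the argmax of a sum of per-agent terms decomposes per agent."""
    return tuple(int(np.argmax(Q[i][s])) for i in range(n_agents))

def td_update(s, joint_action, r, s_next, alpha=0.1, gamma=0.99):
    """One TD step on the factorized value: a single shared TD error is
    spread across the agents' utilities."""
    target = r + gamma * q_tot(s_next, greedy_joint_action(s_next))
    td_error = target - q_tot(s, joint_action)
    for i, a in enumerate(joint_action):
        Q[i][s, a] += alpha * td_error / n_agents

td_update(s=0, joint_action=(1, 2), r=1.0, s_next=1)
```

Because every agent updates from the same joint TD error, each agent's utility absorbs credit relative to its teammates' contributions, which is one way to see the implicit counterfactual credit assignment the paper makes precise.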
Related papers
- An Empirical Investigation of Value-Based Multi-objective Reinforcement
Learning for Stochastic Environments [1.26404863283601]
This paper examines the factors influencing the frequency with which value-based MORL Q-learning algorithms learn the SER-optimal (scalarised-expected-returns-optimal) policy.
We highlight the critical impact of the noisy Q-value estimates issue on the stability and convergence of these algorithms.
arXiv Detail & Related papers (2024-01-06T08:43:08Z)
- Inverse Factorized Q-Learning for Cooperative Multi-agent Imitation
Learning [13.060023718506917]
Imitation learning (IL) is the problem of learning to mimic expert behaviors from demonstrations in cooperative multi-agent systems.
We introduce a novel multi-agent IL algorithm designed to address these challenges.
Our approach enables centralized learning by leveraging mixing networks to aggregate decentralized Q-functions.
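A minimal sketch of such a mixing step, assuming a QMIX-style monotonic mixer in which fixed random weights stand in for the state-conditioned hypernetwork outputs of a full implementation:

```python
import numpy as np

# Illustrative monotonic mixing: per-agent Q-values are combined through
# non-negative weights, so increasing any individual Q_i can only increase
# Q_tot. Fixed random weights are a stand-in for hypernetwork outputs.

rng = np.random.default_rng(0)

def mix(agent_qs, w1, b1, w2, b2):
    w1, w2 = np.abs(w1), np.abs(w2)             # enforce monotonicity
    hidden = np.maximum(agent_qs @ w1 + b1, 0)  # ReLU hidden layer
    return float(hidden @ w2 + b2)              # centralized joint value

agent_qs = np.array([1.2, -0.3])                # chosen per-agent Q-values
w1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0
print(mix(agent_qs, w1, b1, w2, b2))
```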
arXiv Detail & Related papers (2023-10-10T17:11:20Z)
- A Unifying Perspective on Multi-Calibration: Game Dynamics for
Multi-Objective Learning [63.20009081099896]
We provide a unifying framework for the design and analysis of multicalibrated predictors.
We exploit connections to game dynamics to achieve state-of-the-art guarantees for a diverse set of multicalibration learning problems.
arXiv Detail & Related papers (2023-02-21T18:24:17Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
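For context, a bare-bones semi-gradient Q-learning update with linear function approximation looks like the sketch below; the feature map and constants are placeholder assumptions, and the paper's exploration variant is not reproduced:

```python
import numpy as np

# Semi-gradient Q-learning with a linear model Q(s, a) = theta . phi(s, a).
# Features and constants are placeholders; no exploration bonus is modeled.

def q_learning_step(theta, phi_sa, r, phi_next_all, alpha=0.05, gamma=0.99):
    """phi_sa: (d,) features of the taken (s, a);
    phi_next_all: (n_actions, d) features of every next-state action."""
    target = r + gamma * max(float(theta @ p) for p in phi_next_all)
    td_error = target - float(theta @ phi_sa)
    return theta + alpha * td_error * phi_sa    # semi-gradient step

theta = q_learning_step(np.zeros(4), np.array([1.0, 0.0, 0.0, 0.0]),
                        r=1.0, phi_next_all=np.eye(4))
```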
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- Residual Q-Networks for Value Function Factorizing in Multi-Agent
Reinforcement Learning [0.0]
We propose a novel concept of Residual Q-Networks (RQNs) for Multi-Agent Reinforcement Learning (MARL).
The RQN learns to transform the individual Q-value trajectories in a way that preserves the Individual-Global-Max (IGM) criterion, recalled below.
The proposed method converges faster, with increased stability, and shows robust performance across a wider family of environments.
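For reference, the IGM criterion requires the joint greedy action to decompose into per-agent greedy actions (with $\boldsymbol{\tau}$ the joint action-observation history):

$$\arg\max_{\mathbf{u}} Q_{\mathrm{tot}}(\boldsymbol{\tau}, \mathbf{u}) = \Big(\arg\max_{u_1} Q_1(\tau_1, u_1),\ \ldots,\ \arg\max_{u_n} Q_n(\tau_n, u_n)\Big)$$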
arXiv Detail & Related papers (2022-05-30T16:56:06Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single-policy MORL, which learns an optimal policy given a preference over objectives.
Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process.
We propose a new algorithm called model-based envelope value iteration (EVI), which generalizes the enveloped multi-objective $Q$-learning algorithm.
arXiv Detail & Related papers (2020-11-19T22:35:31Z)
- QTRAN++: Improved Value Transformation for Cooperative Multi-Agent
Reinforcement Learning [70.382101956278]
QTRAN is a reinforcement learning algorithm capable of learning the largest class of joint-action value functions.
Despite its strong theoretical guarantee, it has shown poor empirical performance in complex environments.
We propose a substantially improved version, coined QTRAN++.
arXiv Detail & Related papers (2020-06-22T05:08:36Z)
- Randomized Entity-wise Factorization for Multi-Agent Reinforcement
Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking the question: "What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?"
arXiv Detail & Related papers (2020-06-07T18:28:41Z)
- Multi-Agent Determinantal Q-Learning [39.79718674655209]
We propose multi-agent determinantal Q-learning (Q-DPP), which promotes agents to acquire diverse behavioral models.
We demonstrate that Q-DPP generalizes major solutions including VDN, QMIX, and QTRAN on decentralizable cooperative tasks.
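For intuition about the determinantal component, here is a generic DPP scoring sketch (not the paper's Q-DPP algorithm): a set of behaviors is scored by the determinant of its similarity kernel, so near-duplicate behaviors shrink the score and diverse sets are preferred.

```python
import numpy as np

# Generic DPP intuition: score a selected set by det(L_S), the determinant
# of the Gram matrix of its feature vectors. Near-parallel (similar)
# features make the determinant small; diversity is rewarded.

def dpp_score(features):
    """features: (k, d) array, one feature vector per selected behavior."""
    L = features @ features.T
    return float(np.linalg.det(L))

diverse = np.array([[1.0, 0.0], [0.0, 1.0]])
similar = np.array([[1.0, 0.0], [0.99, 0.14]])
print(dpp_score(diverse), dpp_score(similar))  # ~1.0 vs ~0.02
```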
arXiv Detail & Related papers (2020-06-02T09:32:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.