Variational Stochastic Games
- URL: http://arxiv.org/abs/2503.06037v1
- Date: Sat, 08 Mar 2025 03:21:23 GMT
- Title: Variational Stochastic Games
- Authors: Zhiyu Zhao, Haifeng Zhang
- Abstract summary: We propose a novel variational inference framework tailored to decentralized multi-agent systems. Our framework addresses the challenges posed by non-stationarity and unaligned agent objectives. We demonstrate theoretical convergence guarantees for the proposed decentralized algorithms.
- Score: 1.6703448188585752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Control as Inference (CAI) framework has successfully transformed single-agent reinforcement learning (RL) by reframing control tasks as probabilistic inference problems. However, the extension of CAI to multi-agent, general-sum stochastic games (SGs) remains underexplored, particularly in decentralized settings where agents operate independently without centralized coordination. In this paper, we propose a novel variational inference framework tailored to decentralized multi-agent systems. Our framework addresses the challenges posed by non-stationarity and unaligned agent objectives, proving that the resulting policies form an $\epsilon$-Nash equilibrium. Additionally, we demonstrate theoretical convergence guarantees for the proposed decentralized algorithms. Leveraging this framework, we instantiate multiple algorithms to solve for Nash equilibrium, mean-field Nash equilibrium, and correlated equilibrium, with rigorous theoretical convergence analysis.
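For readers unfamiliar with the solution concept the abstract invokes, the following is a standard textbook statement of the $\epsilon$-Nash condition; the notation ($J_i$, $\pi_{-i}$) is assumed here for illustration rather than taken from the paper:

```latex
% A joint policy \pi = (\pi_1, \ldots, \pi_N) is an \epsilon-Nash equilibrium
% if no agent can improve its expected return J_i by more than \epsilon
% through a unilateral deviation:
\[
  J_i(\pi_i, \pi_{-i}) \;\ge\; \sup_{\pi_i'} J_i(\pi_i', \pi_{-i}) - \epsilon
  \qquad \text{for all agents } i,
\]
% where \pi_{-i} denotes the fixed policies of all agents other than i.
% Setting \epsilon = 0 recovers an exact Nash equilibrium.
```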
Related papers
- Efficient and Scalable Deep Reinforcement Learning for Mean Field Control Games [16.62770187749295]
Mean Field Control Games (MFCGs) provide a powerful theoretical framework for analyzing systems of infinitely many interacting agents.
This paper presents a scalable deep Reinforcement Learning (RL) approach to approximate equilibrium solutions of MFCGs.
arXiv Detail & Related papers (2024-12-28T02:04:53Z)
- Convergence of Decentralized Actor-Critic Algorithm in General-sum Markov Games [3.8779763612314633]
We study the properties of learning algorithms in general-sum Markov games. In particular, we focus on a decentralized algorithm where each agent adopts an actor-critic learning dynamic.
arXiv Detail & Related papers (2024-09-06T20:49:11Z)
- Robust Adversarial Reinforcement Learning via Bounded Rationality Curricula [23.80052541774509]
Adversarial Reinforcement Learning trains a protagonist against destabilizing forces exercised by an adversary in a competitive zero-sum Markov game.
Finding Nash equilibria requires facing complex saddle point optimization problems, which can be prohibitive to solve.
We propose a novel approach for adversarial RL based on entropy regularization to ease the complexity of the saddle point optimization problem.
arXiv Detail & Related papers (2023-11-03T00:00:32Z)
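As background for the entry above: entropy regularization typically smooths the zero-sum saddle-point problem as shown below. This generic form, with temperature $\tau$ and policies $\pi$ (protagonist) and $\nu$ (adversary), is our own sketch, not the paper's exact objective:

```latex
% Entropy-regularised zero-sum objective: the temperature \tau > 0 smooths
% the saddle-point landscape and bounds the rationality of both players.
\[
  \max_{\pi} \min_{\nu}\;
  \mathbb{E}_{\pi,\nu}\!\left[\sum_{t} \gamma^{t} r(s_t, a_t, b_t)\right]
  \;+\; \tau \,\mathcal{H}(\pi) \;-\; \tau \,\mathcal{H}(\nu),
\]
% where \mathcal{H} denotes policy entropy; letting \tau \to 0 recovers
% the unregularised zero-sum Markov game.
```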
- Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm [80.94861441583275]
We investigate the generalization bound of the decentralized stochastic gradient descent ascent (D-SGDA) algorithm.
Our results analyze the impact of different factors on the generalization of D-SGDA.
We also balance the optimization error against the generalization bound to obtain the optimal trade-off in the convex-concave setting.
arXiv Detail & Related papers (2023-10-31T11:27:01Z)
- Networked Communication for Decentralised Agents in Mean-Field Games [59.01527054553122]
We introduce networked communication to the mean-field game framework.
We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases.
arXiv Detail & Related papers (2023-06-05T10:45:39Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games [63.60117916422867]
This paper focuses on the most basic setting of competitive multi-agent RL, namely two-player zero-sum Markov games.
We propose a single-loop policy optimization method with symmetric updates from both agents, where the policy is updated via the entropy-regularized optimistic multiplicative weights update (OMWU) method.
Our convergence results improve upon the best known complexities, and lead to a better understanding of policy optimization in competitive Markov games.
arXiv Detail & Related papers (2022-10-03T16:05:43Z)
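To make the OMWU step in the entry above concrete, here is a minimal sketch of optimistic multiplicative weights on a two-player zero-sum matrix game. The function name, step size, and matching-pennies example are our own illustration, not code from the paper (which works with entropy-regularized Markov games):

```python
import numpy as np

def omwu_matrix_game(A, eta=0.1, steps=5000):
    """Optimistic multiplicative weights update (OMWU) on a two-player
    zero-sum matrix game: the row player maximises x^T A y, the column
    player minimises it. Generic sketch under our own assumptions."""
    m, n = A.shape
    x = np.ones(m) / m                      # row player's mixed strategy
    y = np.ones(n) / n                      # column player's mixed strategy
    gx_prev, gy_prev = A @ y, A.T @ x       # previous-round payoff gradients
    for _ in range(steps):
        gx, gy = A @ y, A.T @ x             # simultaneous gradient evaluation
        # Optimistic step: extrapolate with (2 * current - previous) gradient.
        x = x * np.exp(eta * (2 * gx - gx_prev))
        y = y * np.exp(-eta * (2 * gy - gy_prev))
        x /= x.sum()                        # renormalise to the simplex
        y /= y.sum()
        gx_prev, gy_prev = gx, gy
    return x, y

# Matching pennies: the unique Nash equilibrium is uniform play.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = omwu_matrix_game(A)
print(x, y)  # both strategies should approach [0.5, 0.5]
```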
- Pareto Actor-Critic for Equilibrium Selection in Multi-Agent Reinforcement Learning [18.20664209675016]
This work focuses on equilibrium selection in no-conflict multi-agent games.
Pareto Actor-Critic (Pareto-AC) is an actor-critic algorithm that maximises the returns of all agents.
arXiv Detail & Related papers (2022-09-28T18:14:34Z)
- Monotonic Improvement Guarantees under Non-stationarity for Decentralized PPO [66.5384483339413]
We present a new monotonic improvement guarantee for optimizing decentralized policies in cooperative Multi-Agent Reinforcement Learning (MARL).
We show that a trust region constraint can be effectively enforced in a principled way by bounding independent ratios based on the number of agents in training; a generic sketch of such per-agent ratio clipping appears after this list.
arXiv Detail & Related papers (2022-01-31T20:39:48Z)
- Finite-Sample Analysis of Decentralized Q-Learning for Stochastic Games [3.441021278275805]
Learning in games is arguably the most standard and fundamental setting in multi-agent reinforcement learning (MARL).
We establish the finite-sample complexity of fully decentralized Q-learning algorithms in a significant class of general-sum stochastic games (SGs).
We focus on the practical while challenging setting of fully decentralized MARL, where neither the rewards nor the actions of other agents can be observed by each agent.
arXiv Detail & Related papers (2021-12-15T03:33:39Z)
- Calibration of Shared Equilibria in General Sum Partially Observable Markov Games [15.572157454411533]
We consider a general sum partially observable Markov game where agents of different types share a single policy network.
This paper aims at i) formally understanding equilibria reached by such agents, and ii) matching emergent phenomena of such equilibria to real-world targets.
arXiv Detail & Related papers (2020-06-23T15:14:20Z)
- Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
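As referenced in the Decentralized PPO entry above, the following is a minimal sketch of clipping each agent's independent policy ratio, with the clip range tightened as the number of agents grows. The 1/n_agents shrinkage rule and all names here are our own illustration, not the paper's exact bound:

```python
import torch

def decentralized_ppo_loss(log_probs_new, log_probs_old, advantages,
                           n_agents, base_eps=0.2):
    """Clipped surrogate loss for one agent's independent PPO update.

    Each agent clips its own ratio pi_new / pi_old; the clip range is
    tightened as 1 / n_agents so that the product of independent ratios
    stays bounded (illustrative rule, not the paper's exact bound).
    """
    eps = base_eps / n_agents                      # tighter range with more agents
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()   # negate to maximise the surrogate

# Toy usage with dummy per-timestep tensors for one agent out of four.
lp_new = torch.randn(64, requires_grad=True)
lp_old = lp_new.detach() + 0.01 * torch.randn(64)
adv = torch.randn(64)
loss = decentralized_ppo_loss(lp_new, lp_old, adv, n_agents=4)
loss.backward()
print(loss.item())
```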
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.