Approximate Nash Equilibrium Learning for n-Player Markov Games in
Dynamic Pricing
- URL: http://arxiv.org/abs/2207.06492v3
- Date: Sat, 2 Mar 2024 14:54:10 GMT
- Title: Approximate Nash Equilibrium Learning for n-Player Markov Games in
Dynamic Pricing
- Authors: Larkin Liu
- Abstract summary: We investigate Nash equilibrium learning in a competitive Markov Game (MG) environment.
We develop a new model-free method to find approximate Nash equilibria.
We demonstrate that an approximate Nash equilibrium can be learned, particularly in the dynamic pricing domain.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate Nash equilibrium learning in a competitive Markov Game (MG)
environment, where multiple agents compete, and multiple Nash equilibria can
exist. In particular, for an oligopolistic dynamic pricing environment, exact
Nash equilibria are difficult to obtain due to the curse of dimensionality. We
develop a new model-free method to find approximate Nash equilibria.
Gradient-free black box optimization is then applied to estimate $\epsilon$,
the maximum reward advantage of an agent unilaterally deviating from any joint
policy, and to also estimate the $\epsilon$-minimizing policy for any given
state. The policy-to-$\epsilon$ correspondence and the mapping from state to the
$\epsilon$-minimizing policy are represented by neural networks, the latter being
the Nash Policy Net. During batch updates, we perform Nash Q-learning on the
system by adjusting the action probabilities using the Nash Policy Net. We
demonstrate that an approximate Nash equilibrium can be learned, particularly
in the dynamic pricing domain where exact solutions are often intractable.
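For intuition only, here is a minimal sketch (not the paper's implementation) of the quantity being estimated: for a toy two-player one-shot game it uses gradient-free random search to approximate $\epsilon$, the largest reward gain any single agent can obtain by unilaterally deviating from a fixed joint policy. The payoff tensors, sample counts, and Dirichlet sampling are assumptions for illustration; the paper works with n-player Markov games, value functions, and a neural Nash Policy Net.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 3
# Hypothetical payoff tensors: payoff[i][a0, a1] is agent i's reward for joint action (a0, a1).
payoff = [rng.uniform(size=(n_actions, n_actions)) for _ in range(2)]

def expected_reward(i, policies):
    """Expected reward of agent i under a joint mixed policy (p0, p1)."""
    p0, p1 = policies
    return float(p0 @ payoff[i] @ p1)

def estimate_epsilon(policies, n_samples=2000):
    """Gradient-free (black-box) estimate of the maximum unilateral deviation advantage."""
    eps = 0.0
    for i in range(2):
        base = expected_reward(i, policies)
        for _ in range(n_samples):
            deviation = rng.dirichlet(np.ones(n_actions))   # random candidate policy for agent i
            trial = list(policies)
            trial[i] = deviation
            eps = max(eps, expected_reward(i, trial) - base)
    return eps

uniform = np.ones(n_actions) / n_actions
print("estimated epsilon of the uniform joint policy:", estimate_epsilon([uniform, uniform]))
```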
Related papers
- On Tractable $\Phi$-Equilibria in Non-Concave Games [53.212133025684224]
Non-concave games introduce significant game-theoretic and optimization challenges.
We show that when $\Phi$ is finite, there exists an efficient uncoupled learning algorithm that converges to the corresponding $\Phi$-equilibria.
We also show that Online Gradient Descent can efficiently approximate $\Phi$-equilibria in non-trivial regimes.
arXiv Detail & Related papers (2024-03-13T01:51:30Z)
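For reference, a minimal sketch of the uncoupled Online Gradient Descent dynamic mentioned in this entry, with each player ascending its own utility; the utilities below are made-up non-concave stand-ins, and none of the $\Phi$-equilibrium machinery is reproduced.

```python
import numpy as np

def u1(x, y):
    return np.sin(3 * x) * y - 0.5 * x ** 2      # player 1's utility, non-concave in x

def u2(x, y):
    return np.cos(2 * y) * x - 0.5 * y ** 2      # player 2's utility, non-concave in y

x, y, eta, h = 0.2, 0.7, 0.05, 1e-5
for t in range(500):
    gx = (u1(x + h, y) - u1(x - h, y)) / (2 * h) # each player uses only its own gradient feedback
    gy = (u2(x, y + h) - u2(x, y - h)) / (2 * h)
    x = np.clip(x + eta * gx, 0.0, 1.0)          # simultaneous projected OGD steps on [0, 1]
    y = np.clip(y + eta * gy, 0.0, 1.0)
print("strategy profile after OGD:", x, y)
```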
- Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence Beyond the Minty Property [89.96815099996132]
We develop a new framework to characterize optimistic policy gradient methods in multi-player games with a single controller.
Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.
arXiv Detail & Related papers (2023-12-19T11:34:10Z)
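A minimal sketch of the optimistic gradient update itself, shown on a toy bilinear saddle point rather than the paper's single-controller Markov game: the extrapolated gradient $2 g_t - g_{t-1}$ is what separates it from plain gradient descent-ascent, which cycles on this example.

```python
import numpy as np

# Optimistic gradient descent-ascent on f(x, y) = x * y (x minimizes, y maximizes).
# Plain simultaneous gradient play spirals away from the saddle point (0, 0);
# the optimistic correction 2*g_t - g_{t-1} makes the iterates converge to it.
x, y = 1.0, 1.0
gx_prev, gy_prev = 0.0, 0.0
eta = 0.1
for t in range(500):
    gx, gy = y, x                       # gradients of f with respect to x and y
    x -= eta * (2 * gx - gx_prev)       # x descends
    y += eta * (2 * gy - gy_prev)       # y ascends
    gx_prev, gy_prev = gx, gy
print("near the saddle point (0, 0):", round(x, 4), round(y, 4))
```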
- Learning Nash Equilibria in Zero-Sum Markov Games: A Single Time-scale Algorithm Under Weak Reachability [11.793922711718645]
We consider decentralized learning for zero-sum games, where players only see their own information and are oblivious to the actions and payoffs of the opponent.
Previous works demonstrated convergence to a Nash equilibrium in this setting using double time-scale algorithms under strong reachability assumptions.
Our contribution is a rational and convergent algorithm that uses Tsallis-entropy regularization within a value-iteration-based scheme.
arXiv Detail & Related papers (2023-12-13T09:31:30Z)
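For reference, a small sketch of the Tsallis entropy regularizer named in this entry; the entropic index q, the temperature, and the crude sampling-based maximization are illustrative assumptions, and the paper's value-iteration algorithm itself is not reproduced.

```python
import numpy as np

def tsallis_entropy(p, q=2.0):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i**q) / (q - 1); recovers Shannon entropy as q -> 1."""
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def regularized_value(payoffs, tau=1.0, q=2.0, n_samples=20000, seed=0):
    """Crude black-box estimate of max_p <p, payoffs> + tau * S_q(p) over the simplex."""
    rng = np.random.default_rng(seed)
    best = -np.inf
    for _ in range(n_samples):
        p = rng.dirichlet(np.ones(len(payoffs)))
        best = max(best, float(p @ payoffs) + tau * tsallis_entropy(p, q))
    return best

print(tsallis_entropy([0.25, 0.25, 0.25, 0.25]))     # maximized by the uniform policy
# The regularizer pulls the optimal policy toward the interior of the simplex.
print(regularized_value(np.array([1.0, 0.5, 0.0])))
```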
- Smooth Nash Equilibria: Algorithms and Complexity [38.08108978808664]
A fundamental shortcoming of the concept of Nash equilibrium is its computational intractability.
In a $\sigma$-smooth Nash equilibrium, players only need to achieve utility at least as high as their best deviation to a $\sigma$-smooth strategy.
We show that both weak and strong $\sigma$-smooth Nash equilibria have superior computational properties to Nash equilibria.
arXiv Detail & Related papers (2023-09-21T16:22:07Z)
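As a worked illustration, assume (this is one common formalization, not necessarily the paper's exact definition) that a $\sigma$-smooth strategy over $m$ actions places at most $1/(\sigma m)$ probability on any single action; the best $\sigma$-smooth deviation is then computed greedily by stacking that cap on the highest-payoff actions, and its value is the benchmark a player's utility is compared against.

```python
import numpy as np

def best_sigma_smooth_deviation_value(payoffs, sigma):
    """Value of the best deviation restricted to sigma-smooth strategies, under the
    illustrative assumption that such a strategy puts at most 1/(sigma*m) mass per action."""
    m = len(payoffs)
    cap = 1.0 / (sigma * m)
    if cap * m < 1.0:
        raise ValueError("need sigma <= 1 for a feasible sigma-smooth strategy")
    value, mass_left = 0.0, 1.0
    for a in np.argsort(payoffs)[::-1]:      # fill the cap on the best actions first
        take = min(cap, mass_left)
        value += take * float(payoffs[a])
        mass_left -= take
        if mass_left <= 1e-12:
            break
    return value

payoffs = np.array([1.0, 0.6, 0.2, 0.0])
print("unrestricted best deviation:", payoffs.max())                                    # 1.0
print("best 0.5-smooth deviation:  ", best_sigma_smooth_deviation_value(payoffs, 0.5))  # 0.8
```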
- A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning [53.83345471268163]
We investigate learning the equilibria in non-stationary multi-agent systems.
We show how to test for various types of equilibria by a black-box reduction to single-agent learning.
arXiv Detail & Related papers (2023-06-12T23:48:24Z)
- Differentiable Arbitrating in Zero-sum Markov Games [59.62061049680365]
We study how to perturb the reward in a zero-sum Markov game with two players to induce a desirable Nash equilibrium, namely arbitrating.
The lower level requires solving the Nash equilibrium under a given reward function, which makes the overall problem challenging to optimize in an end-to-end way.
We propose a backpropagation scheme that differentiates through the Nash equilibrium, which provides the gradient feedback for the upper level.
arXiv Detail & Related papers (2023-02-20T16:05:04Z)
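A minimal sketch of the bilevel structure described in this entry (assumptions: a one-shot zero-sum matrix game instead of a Markov game, an entropy-smoothed inner solver, and naive unrolled autodiff rather than the paper's backpropagation scheme): the upper level adjusts a reward perturbation $\theta$ and receives gradients that flow through the inner equilibrium computation.

```python
import torch

torch.manual_seed(0)
A = torch.randn(3, 3)                               # base zero-sum payoffs (row maximizes, column minimizes)
theta = torch.zeros(3, 3, requires_grad=True)       # reward perturbation chosen at the upper level
target = torch.tensor([0.0, 0.0, 1.0])              # desired row-player strategy (made up for the demo)
opt = torch.optim.Adam([theta], lr=0.05)

for step in range(200):
    M = A + theta
    x = torch.full((3,), 1.0 / 3)                   # row player's mixed strategy
    y = torch.full((3,), 1.0 / 3)                   # column player's mixed strategy
    for _ in range(50):                             # lower level: smoothed best-response sweeps
        x = torch.softmax(M @ y / 0.5, dim=0)
        y = torch.softmax(-M.t() @ x / 0.5, dim=0)
    loss = ((x - target) ** 2).sum()                # upper level: push the induced play toward target
    opt.zero_grad()
    loss.backward()                                 # gradient reaches theta through the unrolled solver
    opt.step()

print(x.detach())                                   # row strategy induced by the learned perturbation
```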
- Game-Theoretical Perspectives on Active Equilibria: A Preferred Solution Concept over Nash Equilibria [61.093297204685264]
An effective approach in multiagent reinforcement learning is to consider the learning process of agents and influence their future policies.
The resulting solution concept, the active equilibrium, is general: standard solution concepts such as the Nash equilibrium arise as special cases.
We analyze active equilibria from a game-theoretic perspective by closely studying examples where Nash equilibria are known.
arXiv Detail & Related papers (2022-10-28T14:45:39Z)
- Operator Splitting for Learning to Predict Equilibria in Convex Games [26.92001486095397]
We introduce Nash Fixed Point Networks (N-FPNs), a class of neural networks that naturally output equilibria.
N-FPNs are compatible with the recently developed Jacobian-Free Backpropagation technique for training implicit networks.
Our experiments show N-FPNs are capable of scaling to problems orders of magnitude larger than existing learned game solvers.
arXiv Detail & Related papers (2021-06-02T02:55:46Z)
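A minimal sketch of the Jacobian-Free Backpropagation idea referenced in this entry (the layer below is a made-up toy, not an N-FPN): iterate the layer to an approximate fixed point without tracking gradients, then apply one final tracked step, so the backward pass never differentiates through the whole solve.

```python
import torch

class ToyFixedPointNet(torch.nn.Module):
    """Toy layer T(z, d) = 0.5 * tanh(W z + U d), treated as contractive; its fixed point is the output."""
    def __init__(self, dim):
        super().__init__()
        self.W = torch.nn.Linear(dim, dim, bias=False)
        self.U = torch.nn.Linear(dim, dim)

    def step(self, z, d):
        return 0.5 * torch.tanh(self.W(z) + self.U(d))

    def forward(self, d, iters=50):
        z = torch.zeros_like(d)
        with torch.no_grad():                    # solve for the fixed point without building a graph
            for _ in range(iters):
                z = self.step(z, d)
        return self.step(z, d)                   # single tracked step: Jacobian-free backpropagation

net = ToyFixedPointNet(4)
d = torch.randn(8, 4)
loss = net(d).pow(2).mean()
loss.backward()                                  # gradients reach W and U via the last step only
```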
- No-regret learning and mixed Nash equilibria: They do not mix [64.37511607254115]
We study the dynamics of "follow-the-regularized-leader" (FTRL), a broad class of no-regret learning dynamics.
We show that any Nash equilibrium which is not strict cannot be stable and attracting under FTRL.
This result has significant implications for predicting the outcome of a learning process.
arXiv Detail & Related papers (2020-10-19T13:49:06Z)
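A minimal simulation of the phenomenon in this entry (entropic FTRL, i.e. multiplicative weights, in Matching Pennies, whose only Nash equilibrium is the fully mixed $(1/2, 1/2)$ profile; the step size and horizon are arbitrary): the iterates drift away from the mixed equilibrium rather than being attracted to it.

```python
import numpy as np

A = np.array([[1.0, -1.0], [-1.0, 1.0]])        # row player's payoffs; the column player receives -A
eta = 0.1
Gx, Gy = np.array([0.3, 0.0]), np.zeros(2)      # tiny asymmetry so play does not start exactly at equilibrium

def kl_from_equilibrium(p):
    """KL(p* || p) where p* = (1/2, 1/2) is the mixed equilibrium."""
    return float(np.sum(0.5 * np.log(0.5 / p)))

for t in range(3001):
    x = np.exp(eta * (Gx - Gx.max())); x /= x.sum()   # entropic FTRL = softmax of cumulative payoffs
    y = np.exp(eta * (Gy - Gy.max())); y /= y.sum()
    if t % 1000 == 0:
        print(f"t={t:4d}  x={x.round(3)}  y={y.round(3)}  "
              f"divergence={kl_from_equilibrium(x) + kl_from_equilibrium(y):.4f}")
    Gx += A @ y                                        # each player observes its own payoff vector
    Gy += -A.T @ x
```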
- Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation [18.35524179586723]
We explore the use of policy approximations to reduce the computational cost of learning Nash equilibria in zero-sum games.
We propose a new Q-learning type algorithm that uses a sequence of entropy-regularized soft policies to approximate the Nash policy.
We prove that under certain conditions, by updating the regularized Q-function, the algorithm converges to a Nash equilibrium.
arXiv Detail & Related papers (2020-09-01T01:03:44Z)
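As a reference for the entropy-regularized soft policies used in the last entry (a generic sketch under an assumed temperature, not the paper's algorithm): the hard max and argmax of standard Q-learning are replaced by a log-sum-exp value and a softmax policy.

```python
import numpy as np

def soft_policy(q, tau=0.1):
    """Entropy-regularized soft policy: pi(a) proportional to exp(q(a) / tau)."""
    z = np.exp((q - q.max()) / tau)
    return z / z.sum()

def soft_value(q, tau=0.1):
    """Regularized value tau * logsumexp(q / tau); replaces the hard max and exceeds it by at most tau * log(num_actions)."""
    return float(q.max() + tau * np.log(np.sum(np.exp((q - q.max()) / tau))))

q = np.array([1.0, 0.8, -0.2])
print(soft_policy(q), soft_value(q))
# In a zero-sum Nash-Q variant, the stage value would instead come from both players'
# soft policies evaluated against the stage-game payoff matrix.
```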