Computing Pure-Strategy Nash Equilibria in a Two-Party Policy Competition: Existence and Algorithmic Approaches
- URL: http://arxiv.org/abs/2512.22552v1
- Date: Sat, 27 Dec 2025 10:44:32 GMT
- Title: Computing Pure-Strategy Nash Equilibria in a Two-Party Policy Competition: Existence and Algorithmic Approaches
- Authors: Chuang-Chieh Lin, Chi-Jen Lu, Po-An Chen, Chih-Chieh Hung
- Abstract summary: We formulate two-party policy competition as a two-player non-cooperative game. A policy's winning probability increases monotonically with its total utility across all voters. A player's payoff is defined as the expected utility received by its supporters.
- Score: 1.2609533046091317
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We formulate two-party policy competition as a two-player non-cooperative game, generalizing Lin et al.'s work (2021). Each party selects a real-valued policy vector as its strategy from a compact subset of Euclidean space, and a voter's utility for a policy is given by the inner product with their preference vector. To capture the uncertainty in the competition, we assume that a policy's winning probability increases monotonically with its total utility across all voters, and we formalize this via an affine isotonic function. A player's payoff is defined as the expected utility received by its supporters. In this work, we first test and validate the isotonicity hypothesis through voting simulations. Next, we prove the existence of a pure-strategy Nash equilibrium (PSNE) in both one- and multi-dimensional settings. Although we construct a counterexample demonstrating the game's non-monotonicity, our experiments show that a decentralized gradient-based algorithm typically converges rapidly to an approximate PSNE. Finally, we present a grid-based search algorithm that finds an $\epsilon$-approximate PSNE of the game in time polynomial in the input size and $1/\epsilon$.
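For readers who want to experiment with the model, a minimal sketch of the payoff structure and the decentralized gradient dynamics might look as follows. The supporter rule (voters who weakly prefer a policy), the affine parameters `a` and `b`, the box constraint, and all variable names are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_voters, dim = 100, 3
pref = rng.normal(size=(n_voters, dim))  # voter preference vectors v_i

def total_utility(x):
    """Total utility of policy x across all voters: sum_i <v_i, x>."""
    return float((pref @ x).sum())

def win_prob(x, y, a=0.01, b=0.5):
    """Affine isotonic winning probability for the party playing x, clipped
    to [0, 1]; tying it to the utility gap is an assumption of this sketch."""
    return float(np.clip(a * (total_utility(x) - total_utility(y)) + b, 0.0, 1.0))

def payoff(x, y):
    """Expected utility received by the x-party's supporters, where a
    'supporter' is taken to be a voter who weakly prefers x to y."""
    support = (pref @ x) >= (pref @ y)
    return win_prob(x, y) * float((pref[support] @ x).sum())

def num_grad(f, x, h=1e-5):
    """Central-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# Decentralized gradient ascent: each party updates its own policy using only
# the gradient of its own payoff, then projects back onto the compact box.
x, y = rng.uniform(-1, 1, dim), rng.uniform(-1, 1, dim)
for _ in range(200):
    gx = num_grad(lambda z: payoff(z, y), x)
    gy = num_grad(lambda z: payoff(z, x), y)
    x = np.clip(x + 0.01 * gx, -1.0, 1.0)
    y = np.clip(y + 0.01 * gy, -1.0, 1.0)
```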
Related papers
- Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback [50.89125374999765]
We provide the first convergence guarantee for Optimistic Multiplicative Weights Update ($\mathtt{OMWU}$) in NLHF. Our analysis identifies a novel marginal convergence behavior, where the probability of rarely played actions grows exponentially from exponentially small values.
arXiv Detail & Related papers (2025-12-31T12:08:29Z)
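As a rough illustration of the OMWU dynamics the entry above refers to, here is a minimal sketch on a zero-sum matrix game; the fixed payoff matrix `A` is a simplifying stand-in for the preference feedback of the NLHF setting.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))          # row player's payoff matrix (assumed)
x = np.ones(4) / 4                   # row player's mixed strategy
y = np.ones(4) / 4                   # column player's mixed strategy
gx_prev, gy_prev = A @ y, -A.T @ x   # previous payoff gradients
eta = 0.1
for _ in range(500):
    gx, gy = A @ y, -A.T @ x
    # Optimistic step: multiplicative weights on the extrapolated
    # gradient 2*g_t - g_{t-1}, then renormalize onto the simplex.
    x = x * np.exp(eta * (2 * gx - gx_prev)); x /= x.sum()
    y = y * np.exp(eta * (2 * gy - gy_prev)); y /= y.sum()
    gx_prev, gy_prev = gx, gy
```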
- Multiplayer Nash Preference Optimization [79.15013211640566]
Reinforcement learning from human feedback (RLHF) has emerged as the standard paradigm for aligning large language models (LLMs) with human preferences. Recent studies have reframed alignment as a two-player Nash game, giving rise to Nash learning from human feedback (NLHF). We introduce Multiplayer Nash Preference Optimization (MNPO), a novel framework that generalizes NLHF to the multiplayer regime.
arXiv Detail & Related papers (2025-09-27T04:18:33Z)
- COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences [57.70902561665269]
We propose a meta-algorithm, the Convergent Meta Alignment Algorithm (COMAL), for language model alignment with general preferences. We provide a theoretical analysis showing that our meta-algorithm converges to an exact Nash policy in the last iterate, and we demonstrate its effectiveness on a range of synthetic and preference optimization datasets.
arXiv Detail & Related papers (2024-10-30T17:13:02Z)
- Scalable and Independent Learning of Nash Equilibrium Policies in $n$-Player Stochastic Games with Unknown Independent Chains [1.0878040851638]
We study games with independent chains and unknown transition matrices.
In this class of games, players control their own internal Markov chains whose transitions do not depend on the states/actions of other players.
We propose a fully decentralized mirror descent algorithm to learn an $\epsilon$-NE policy.
arXiv Detail & Related papers (2023-12-04T03:04:09Z)
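The core of such a decentralized scheme is a per-player mirror descent step; a minimal sketch under the standard entropic mirror map (an assumption here — the paper's algorithm also handles each player's internal Markov chain) is:

```python
import numpy as np

def mirror_descent_step(policy, grad, eta=0.05):
    """One entropic mirror descent (exponentiated gradient) step on the
    probability simplex. Each player applies this to its own policy using
    only its locally observed payoff gradient, which is what makes the
    scheme fully decentralized."""
    z = policy * np.exp(eta * grad)
    return z / z.sum()
```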
- Global Nash Equilibrium in Non-convex Multi-player Game: Theory and Algorithms [66.8634598612777]
We show that a Nash equilibrium (NE) is acceptable to all players in a multi-player game, in the sense that no player can benefit from deviating unilaterally, and we develop the general theory step by step.
arXiv Detail & Related papers (2023-01-19T11:36:50Z)
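For reference, the property paraphrased above is the standard Nash equilibrium condition: a strategy profile $(s_1^{*}, \dots, s_n^{*})$ is an NE when
$$u_i(s_i^{*}, s_{-i}^{*}) \;\ge\; u_i(s_i, s_{-i}^{*}) \qquad \text{for all } s_i \in S_i \text{ and all players } i,$$
i.e., no player can improve its payoff by a unilateral deviation.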
- Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions [145.54544979467872]
We propose and analyze new fictitious play policy optimization algorithms for zero-sum Markov games with structured but unknown transitions.
We prove tight $\widetilde{\mathcal{O}}(\sqrt{K})$ regret bounds after $K$ episodes in a two-agent competitive game scenario.
Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization.
arXiv Detail & Related papers (2022-07-25T18:29:16Z)
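As a toy illustration of combining UCB-type optimism with fictitious play, consider a single-state matrix game with noisy bandit feedback; the single state and the specific bonus are simplifying assumptions relative to the paper's Markov game setting.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.uniform(size=(3, 3))            # unknown payoff matrix (assumed)
counts = np.ones((3, 3))                # visit counts per action pair
means = np.zeros((3, 3))                # empirical payoff estimates
opp_counts = np.ones(3)                 # empirical opponent action frequencies
for t in range(1, 1001):
    opp_mix = opp_counts / opp_counts.sum()       # fictitious-play belief
    bonus = np.sqrt(2.0 * np.log(t) / counts)     # UCB exploration bonus
    a = int(np.argmax((means + bonus) @ opp_mix)) # optimistic best response
    b = int(rng.integers(3))            # stand-in for the opponent's action
    r = A[a, b] + 0.1 * rng.normal()    # noisy payoff observation
    counts[a, b] += 1.0
    means[a, b] += (r - means[a, b]) / counts[a, b]
    opp_counts[b] += 1.0
```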
- Learning Stationary Nash Equilibrium Policies in $n$-Player Stochastic Games with Independent Chains [2.132096006921048]
We consider a class of $n$-player games in which players have their own internal state/action spaces while they are coupled through their payoff functions.
For this class of games, we first show that finding a stationary Nash equilibrium (NE) policy without any assumption on the reward functions is intractable.
We develop algorithms based on dual averaging and dual mirror descent, which converge to the set of $\epsilon$-NE policies.
arXiv Detail & Related papers (2022-01-28T16:27:21Z)
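A sketch of the dual averaging side (dual mirror descent is analogous): each player maintains a running sum of its own payoff gradients and maps it through an entropic mirror map. The softmax map and the $1/\sqrt{t}$ scaling are assumptions for this sketch.

```python
import numpy as np

def dual_averaging_policy(grad_sum, t, eta=1.0):
    """Map the running sum of a player's payoff gradients through the
    entropic mirror map (a softmax) with a decaying effective step size.
    Each player only needs its own gradient sum, so the scheme is
    fully decentralized."""
    s = eta * grad_sum / np.sqrt(t)
    z = np.exp(s - s.max())   # stabilized softmax
    return z / z.sum()
```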
- Towards convergence to Nash equilibria in two-team zero-sum games [17.4461045395989]
Two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents.
We focus on the solution concept of Nash equilibria (NE).
We show that computing NE for this class of games is $\textit{hard}$ for the complexity class $\mathrm{CLS}$.
arXiv Detail & Related papers (2021-11-07T21:15:35Z)
- Independent Policy Gradient Methods for Competitive Reinforcement Learning [62.91197073795261]
We obtain global, non-asymptotic convergence guarantees for independent learning algorithms in competitive reinforcement learning settings with two agents.
We show that if both players run policy gradient methods in tandem, their policies will converge to a min-max equilibrium of the game, as long as their learning rates follow a two-timescale rule.
arXiv Detail & Related papers (2021-01-11T23:20:42Z)
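A minimal sketch of independent policy gradient with a two-timescale step-size rule, on a zero-sum matrix game with softmax policies; the matrix game, the softmax parameterization, and the specific rates are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3))             # row player's payoff matrix (assumed)
th1, th2 = np.zeros(3), np.zeros(3)     # softmax logits for each player

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for t in range(1, 5001):
    p1, p2 = softmax(th1), softmax(th2)
    g1 = A @ p2                          # player 1's payoff gradient in p1
    g2 = -A.T @ p1                       # player 2 maximizes the negated game
    # Softmax policy gradient of J = <p, g>: dJ/dtheta_i = p_i * (g_i - <p, g>)
    th1 += t ** -0.5 * (p1 * (g1 - p1 @ g1))    # faster timescale
    th2 += t ** -0.75 * (p2 * (g2 - p2 @ g2))   # slower timescale
```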
- Provable Fictitious Play for General Mean-Field Games [111.44976345867005]
We propose a reinforcement learning algorithm for stationary mean-field games.
The goal is to learn a pair of mean-field state and stationary policy that constitutes the Nash equilibrium.
arXiv Detail & Related papers (2020-10-08T18:46:48Z)