Specification-Guided Learning of Nash Equilibria with High Social
Welfare
- URL: http://arxiv.org/abs/2206.03348v1
- Date: Mon, 6 Jun 2022 16:06:31 GMT
- Title: Specification-Guided Learning of Nash Equilibria with High Social
Welfare
- Authors: Kishor Jothimurugan, Suguman Bansal, Osbert Bastani and Rajeev Alur
- Abstract summary: We propose a novel reinforcement learning framework for training joint policies that form a Nash equilibrium.
We show that our algorithm computes equilibrium policies with high social welfare, whereas state-of-the-art baselines either fail to compute Nash equilibria or compute ones with comparatively lower social welfare.
- Score: 21.573746897846114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning has been shown to be an effective strategy for
automatically training policies for challenging control problems. Focusing on
non-cooperative multi-agent systems, we propose a novel reinforcement learning
framework for training joint policies that form a Nash equilibrium. In our
approach, rather than providing low-level reward functions, the user provides
high-level specifications that encode the objective of each agent. Then, guided
by the structure of the specifications, our algorithm searches over policies to
identify one that provably forms an $\epsilon$-Nash equilibrium (with high
probability). Importantly, it prioritizes policies in a way that maximizes
social welfare across all agents. Our empirical evaluation demonstrates that
our algorithm computes equilibrium policies with high social welfare, whereas
state-of-the-art baselines either fail to compute Nash equilibria or compute
ones with comparatively lower social welfare.
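The two notions the abstract combines can be made concrete in a small normal-form game: a joint strategy is an ε-Nash equilibrium if no agent can gain more than ε by deviating unilaterally, and among such equilibria one prefers the highest social welfare. A minimal sketch (the payoff matrices and ε are illustrative, not from the paper):

```python
import numpy as np

def is_epsilon_nash(payoffs, joint, epsilon):
    """Check whether a pure joint strategy is an epsilon-Nash equilibrium.

    payoffs: list of per-agent payoff arrays, each indexed by the joint action.
    joint:   tuple of each agent's chosen action index.
    """
    for i, table in enumerate(payoffs):
        current = table[joint]
        # Best unilateral deviation for agent i, all other agents held fixed.
        for a in range(table.shape[i]):
            deviation = list(joint)
            deviation[i] = a
            if table[tuple(deviation)] > current + epsilon:
                return False
    return True

def social_welfare(payoffs, joint):
    # Utilitarian social welfare: sum of agent payoffs under the joint action.
    return sum(table[joint] for table in payoffs)

# Two-player coordination game (rows: agent 0, columns: agent 1).
p0 = np.array([[2.0, 0.0], [0.0, 1.0]])
p1 = np.array([[2.0, 0.0], [0.0, 1.0]])
payoffs = [p0, p1]

# Both (0,0) and (1,1) are Nash equilibria, but (0,0) has higher welfare,
# which is the kind of equilibrium the paper's algorithm prioritizes.
assert is_epsilon_nash(payoffs, (0, 0), epsilon=0.0)
assert is_epsilon_nash(payoffs, (1, 1), epsilon=0.0)
assert social_welfare(payoffs, (0, 0)) > social_welfare(payoffs, (1, 1))
```

This only illustrates the solution concept; the paper's contribution is searching over learned policies, guided by the agents' specifications, to find such an equilibrium with high probability.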
Related papers
- Actions Speak What You Want: Provably Sample-Efficient Reinforcement
Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks [94.07688076435818]
We study reinforcement learning for learning a Quantal Stackelberg Equilibrium (QSE) in an episodic Markov game with a leader-follower structure.
Our algorithms are based on (i) learning the quantal response model via maximum likelihood estimation and (ii) model-free or model-based RL for solving the leader's decision making problem.
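The quantal response model estimated in step (i) is typically a logit (softmax) response: the follower plays each action with probability proportional to the exponential of its payoff. A minimal sketch (the temperature and payoff values are illustrative):

```python
import math

def quantal_response(payoffs, temperature=1.0):
    """Logit quantal response: probabilities proportional to
    exp(payoff / temperature). As temperature -> 0 this approaches a
    best response; a large temperature approaches uniform play."""
    scaled = [p / temperature for p in payoffs]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

probs = quantal_response([3.0, 1.0, 0.0], temperature=1.0)
assert abs(sum(probs) - 1.0) < 1e-12
assert probs[0] > probs[1] > probs[2]  # higher payoff, higher probability
```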
arXiv Detail & Related papers (2023-07-26T10:24:17Z)
- PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm [28.48626438603237]
PACER consists of a distributional critic, an actor and a sample-based encourager.
The push-forward operator is leveraged in both the critic and the actor to model return distributions and policies, respectively.
A sample-based utility value policy gradient is established for the push-forward policy update.
arXiv Detail & Related papers (2023-06-11T09:45:31Z)
- Balancing policy constraint and ensemble size in uncertainty-based offline reinforcement learning [7.462336024223669]
We study the role of policy constraints as a mechanism for regulating uncertainty.
By incorporating behavioural cloning into policy updates, we show that sufficient penalisation can be achieved with a much smaller ensemble size.
We show how such an approach can facilitate stable online fine-tuning, allowing for continued policy improvement while avoiding severe performance drops.
arXiv Detail & Related papers (2023-03-26T13:03:11Z) - Welfare and Fairness in Multi-objective Reinforcement Learning [1.5763562007908967]
We study fair multi-objective reinforcement learning in which an agent must learn a policy that simultaneously achieves high reward on multiple dimensions.
We show that our algorithm is provably convergent, and we demonstrate experimentally that our approach outperforms techniques based on linear scalarization.
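The contrast with linear scalarization can be made concrete: a weighted sum of per-objective returns can be indifferent to an unbalanced policy that a welfare function such as Nash social welfare (the sum of logs, equivalent to the product) penalizes. A toy illustration (the return values and weights are hypothetical):

```python
import math

def linear_scalarization(returns, weights):
    # Weighted sum of per-objective returns.
    return sum(w * r for w, r in zip(weights, returns))

def nash_welfare(returns):
    # Nash social welfare: sum of logs, i.e. maximizing the product.
    return sum(math.log(r) for r in returns)

balanced   = [5.0, 5.0]   # policy A: equal return on both objectives
unbalanced = [9.0, 1.0]   # policy B: same total, but starves objective 2

weights = [0.5, 0.5]
# Linear scalarization cannot distinguish the two policies...
assert linear_scalarization(balanced, weights) == linear_scalarization(unbalanced, weights)
# ...while Nash welfare prefers the balanced one.
assert nash_welfare(balanced) > nash_welfare(unbalanced)
```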
arXiv Detail & Related papers (2022-11-30T01:40:59Z) - Game-Theoretical Perspectives on Active Equilibria: A Preferred Solution
Concept over Nash Equilibria [61.093297204685264]
An effective approach in multiagent reinforcement learning is to consider the learning process of agents and influence their future policies.
The resulting solution concept, the active equilibrium, is general: standard solution concepts, such as the Nash equilibrium, arise as special cases of active equilibria.
We analyze active equilibria from a game-theoretic perspective by closely studying examples where Nash equilibria are known.
arXiv Detail & Related papers (2022-10-28T14:45:39Z) - Learning Stabilizing Policies in Stochastic Control Systems [20.045860624444494]
We study the effectiveness of jointly learning a policy together with a martingale certificate that proves its stability using a single learning algorithm.
Our results suggest that some form of pre-training of the policy is required for the joint optimization to repair and verify the policy successfully.
arXiv Detail & Related papers (2022-05-24T11:38:22Z) - Constructing a Good Behavior Basis for Transfer using Generalized Policy
Updates [63.58053355357644]
We study the problem of learning a good set of policies, so that when combined together, they can solve a wide variety of unseen reinforcement learning tasks.
We show theoretically that having access to a specific set of diverse policies, which we call a set of independent policies, can allow for instantaneously achieving high-level performance.
arXiv Detail & Related papers (2021-12-30T12:20:46Z) - Building a Foundation for Data-Driven, Interpretable, and Robust Policy
Design using the AI Economist [67.08543240320756]
We show that the AI Economist framework enables effective, flexible, and interpretable policy design using two-level reinforcement learning and data-driven simulations.
We find that log-linear policies trained using RL significantly improve social welfare, based on both public health and economic outcomes, compared to past outcomes.
arXiv Detail & Related papers (2021-08-06T01:30:41Z) - On Information Asymmetry in Competitive Multi-Agent Reinforcement
Learning: Convergence and Optimality [78.76529463321374]
We study a system of two interacting non-cooperative Q-learning agents.
We show that this information asymmetry can lead to a stable outcome of population learning.
arXiv Detail & Related papers (2020-10-21T11:19:53Z) - Decentralized Reinforcement Learning: Global Decision-Making via Local
Economic Transactions [80.49176924360499]
We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
arXiv Detail & Related papers (2020-07-05T16:41:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.