On the Equilibrium Elicitation of Markov Games Through Information
Design
- URL: http://arxiv.org/abs/2102.07152v1
- Date: Sun, 14 Feb 2021 13:30:06 GMT
- Title: On the Equilibrium Elicitation of Markov Games Through Information
Design
- Authors: Tao Zhang, Quanyan Zhu
- Abstract summary: We study how the crafting of payoff-relevant environmental signals alone can influence the behaviors of intelligent agents.
An obedient principle is established which states that it is without loss of generality to focus on direct information design.
A new framework for information design is proposed based on maximizing the optimal slack variables.
- Score: 32.37168850559519
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work considers a novel information design problem and studies how the
crafting of payoff-relevant environmental signals alone can influence the behaviors of
intelligent agents. The agents' strategic interactions are captured by an
incomplete-information Markov game, in which each agent first selects one environmental
signal from multiple signal sources as additional payoff-relevant information and then
takes an action. A rational information designer (designer) possesses one of the signal
sources and aims to control the equilibrium behaviors of the agents by designing the
information structure of the signals she sends to them. An obedient principle is
established which states that it is without loss of generality to focus on direct
information design whenever the design incentivizes each agent to select the signal sent
by the designer, so that the design process avoids having to predict the agents'
strategic signal-selection behaviors. We then introduce a design protocol for a given
designer goal, referred to as obedient implementability (OIL), and characterize OIL in
terms of a class of obedient perfect Bayesian Markov Nash equilibria (O-PBME). A new
framework for information design is proposed based on maximizing the optimal slack
variables. Finally, we formulate the designer's goal-selection problem and characterize
it in terms of information design by establishing a relationship between the O-PBME and
Bayesian Markov correlated equilibria, building on the revelation principle from classic
information design in economics. The proposed approach can be used to elicit desired
behaviors of multi-agent systems in both competitive and cooperative settings, and
extends to heterogeneous stochastic games in complete- and incomplete-information
environments.
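To make the obedience idea concrete, here is a minimal sketch in a static, one-shot analogue of the model, assuming standard Bayes-correlated-equilibrium notation (a prior \(\mu\) over states \(\omega\), a direct signaling scheme \(\pi(a \mid \omega)\) that recommends an action profile, and stage payoffs \(u_i\)); the paper itself works with dynamic Markov strategies and O-PBME, so this is illustrative rather than the authors' exact formulation. Obedience requires that no agent gains by deviating from her recommendation:

\[
\sum_{\omega,\, a_{-i}} \mu(\omega)\, \pi(a_i, a_{-i} \mid \omega)\, \big[ u_i(a_i, a_{-i}, \omega) - u_i(a_i', a_{-i}, \omega) \big] \;\ge\; 0
\qquad \forall i,\ \forall a_i,\ \forall a_i'.
\]

One plausible static reading of "maximizing the optimal slack variables" introduces a common slack \(z\) and searches for the scheme that satisfies the obedience constraints with the largest margin:

\[
\max_{\pi,\, z} \; z
\quad \text{s.t.} \quad
\sum_{\omega,\, a_{-i}} \mu(\omega)\, \pi(a \mid \omega)\, \big[ u_i(a_i, a_{-i}, \omega) - u_i(a_i', a_{-i}, \omega) \big] \;\ge\; z
\qquad \forall i,\ a_i,\ a_i'.
\]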
Related papers
- Communication Learning in Multi-Agent Systems from Graph Modeling Perspective [62.13508281188895] (2024-11-01)
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph.
We introduce a temporal gating mechanism for each agent, enabling dynamic decisions on whether to receive shared information at a given time.
- Information Design in Multi-Agent Reinforcement Learning [61.140924904755266] (2023-05-08)
Reinforcement learning (RL) is inspired by the way human infants and animals learn from the environment.
Research in computational economics distills two ways to influence others directly: by providing tangible goods (mechanism design) and by providing information (information design).
- Sequential Bayesian Optimization for Adaptive Informative Path Planning with Multimodal Sensing [34.86734745942814] (2022-09-16)
We consider the problem of an agent equipped with multiple sensors, each with different sensing accuracy and energy costs.
The agent's goal is to explore the environment and gather information subject to its resource constraints in unknown, partially observable environments.
We formulate the AIPPMS problem as a belief Markov decision process with Gaussian process beliefs and solve it using a sequential Bayesian optimization approach with online planning.
- Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896] (2022-05-05)
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
- Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning [156.5667417159582] (2022-02-22)
This paper proposes a novel model of sequential information design, namely Markov persuasion processes (MPPs).
Planning in MPPs faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces optimal long-term cumulative utilities for the sender.
We design a provably efficient no-regret learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of both optimism and pessimism principles.
- The Value of Information When Deciding What to Learn [21.945359614094503] (2021-10-26)
This work builds upon the seminal design principle of information-directed sampling (Russo & Van Roy, 2014).
We offer new insights into learning targets from the literature on rate-distortion theory before turning to empirical results that confirm the value of information when deciding what to learn.
- A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning [104.3643447579578] (2021-06-03)
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state.
The design allows agents to learn to plan effectively, by attending to the relevant objects, leading to better out-of-distribution generalization.
- Informational Design of Dynamic Multi-Agent System [32.37168850559519] (2021-05-07)
We study how the crafting of payoff-relevant environmental signals alone can influence the behaviors of intelligent agents.
An obedient principle is established which states that it is without loss of generality to focus on direct information design.
A framework is proposed based on an approach we refer to as fixed-point alignment, which incentivizes the agents to choose the signal sent by the principal; a toy numerical sketch of this obedience incentive follows this list.