On the Equilibrium Elicitation of Markov Games Through Information
Design
- URL: http://arxiv.org/abs/2102.07152v1
- Date: Sun, 14 Feb 2021 13:30:06 GMT
- Title: On the Equilibrium Elicitation of Markov Games Through Information Design
- Authors: Tao Zhang, Quanyan Zhu
- Abstract summary: We study how the crafting of payoff-relevant environmental signals alone can influence the behaviors of intelligent agents.
An obedient principle is established which states that it is without loss of generality to focus on the direct information design.
A new framework for information design is proposed based on an approach of maximizing the optimal slack variables.
- Score: 32.37168850559519
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work considers a novel information design problem and studies how the
craft of payoff-relevant environmental signals solely can influence the
behaviors of intelligent agents. The agents' strategic interactions are
captured by an incomplete-information Markov game, in which each agent first
selects one environmental signal from multiple signal sources as additional
payoff-relevant information and then takes an action. There is a rational
information designer (designer) who possesses one signal source and aims to
control the equilibrium behaviors of the agents by designing the information
structure of her signals sent to the agents. An obedient principle is
established which states that it is without loss of generality to focus on the
direct information design when the information design incentivizes each agent
to select the signal sent by the designer, such that the design process avoids
the predictions of the agents' strategic selection behaviors. We then introduce
the design protocol given a goal of the designer referred to as obedient
implementability (OIL) and characterize the OIL in a class of obedient perfect
Bayesian Markov Nash equilibria (O-PBME). A new framework for information
design is proposed based on an approach of maximizing the optimal slack
variables. Finally, we formulate the designer's goal selection problem and
characterize it in terms of information design by establishing a relationship
between the O-PBME and the Bayesian Markov correlated equilibria, in which we
build upon the revelation principle in classic information design in economics.
The proposed approach can be applied to elicit desired behaviors of multi-agent
systems in competing as well as cooperating settings and be extended to
heterogeneous stochastic games in the complete- and the incomplete-information
environments.
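To make the obedience idea concrete, here is a minimal sketch that is not the paper's method. It assumes a one-shot, single-agent setting with a direct scheme in which the designer's signal is an action recommendation, and it checks whether following each recommendation is a best response. The Markov-game dynamics, the agents' selection among multiple signal sources, and the O-PBME concept from the abstract are not modeled. The function names `obedience_slacks` and `is_obedient`, and the prosecutor-style numbers, are illustrative assumptions.

```python
from fractions import Fraction as F


def obedience_slacks(prior, scheme, utility):
    """Return {(a, a_dev): slack} for a direct (action-recommendation) scheme.

    prior[s]      : probability of state s
    scheme[s][a]  : probability of recommending action a in state s
    utility[s][a] : agent's payoff from taking action a in state s

    slack(a, a_dev) = sum_s prior[s] * scheme[s][a]
                      * (utility[s][a] - utility[s][a_dev])
    A non-negative slack means deviating from a to a_dev does not pay.
    """
    actions = list(next(iter(utility.values())))
    slacks = {}
    for a in actions:
        for a_dev in actions:
            if a_dev == a:
                continue
            slacks[(a, a_dev)] = sum(
                prior[s] * scheme[s][a] * (utility[s][a] - utility[s][a_dev])
                for s in prior
            )
    return slacks


def is_obedient(prior, scheme, utility):
    """True if every recommendation is (weakly) optimal to follow."""
    return all(v >= 0 for v in obedience_slacks(prior, scheme, utility).values())


# Hypothetical toy numbers: the agent wants its action to match the state.
prior = {"innocent": F(7, 10), "guilty": F(3, 10)}
utility = {
    "innocent": {"acquit": 1, "convict": 0},
    "guilty": {"acquit": 0, "convict": 1},
}

# Scheme that recommends "convict" as often as obedience allows.
scheme = {
    "innocent": {"acquit": F(4, 7), "convict": F(3, 7)},
    "guilty": {"acquit": F(0), "convict": F(1)},
}

# Scheme that always recommends "convict"; the agent would ignore it.
always_convict = {
    "innocent": {"acquit": F(0), "convict": F(1)},
    "guilty": {"acquit": F(0), "convict": F(1)},
}

print(is_obedient(prior, scheme, utility))          # True
print(is_obedient(prior, always_convict, utility))  # False
```

In the first scheme the smallest slack is exactly zero, so the "convict" recommendation is just barely worth following. A designer who instead maximizes the smallest slack would obtain a scheme with a strict incentive to obey, which is only loosely analogous to the slack-variable framing mentioned in the abstract.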
Related papers
- Information Design in Multi-Agent Reinforcement Learning [61.140924904755266]
Reinforcement learning (RL) is inspired by the way human infants and animals learn from the environment.
Research in computational economics distills two ways to influence others directly: by providing tangible goods (mechanism design) and by providing information (information design).
arXiv Detail & Related papers (2023-05-08T07:52:15Z)
- Semantic Information Marketing in The Metaverse: A Learning-Based Contract Theory Framework [68.8725783112254]
We address the problem of designing incentive mechanisms by a virtual service provider (VSP) to hire sensing IoT devices to sell their sensing data.
Due to the limited bandwidth, we propose to use semantic extraction algorithms to reduce the data delivered by the sensing IoT devices.
We propose a novel iterative contract design and use a new variant of multi-agent reinforcement learning (MARL) to solve the modelled multi-dimensional contract problem.
arXiv Detail & Related papers (2023-02-22T15:52:37Z)
- Sequential Bayesian Optimization for Adaptive Informative Path Planning with Multimodal Sensing [34.86734745942814]
We consider the problem of an agent equipped with multiple sensors, each with different sensing accuracy and energy costs.
The agent's goal is to explore the environment and gather information subject to its resource constraints in unknown, partially observable environments.
We formulate the AIPPMS problem as a belief Markov decision process with Gaussian process beliefs and solve it using a sequential Bayesian optimization approach with online planning.
arXiv Detail & Related papers (2022-09-16T00:50:36Z)
- Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z)
- Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning [156.5667417159582]
This paper proposes a novel model of sequential information design, namely the Markov persuasion processes (MPPs).
Planning in MPPs faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces the optimal long-term cumulative utilities of the sender.
We design a provably efficient no-regret learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of both optimism and pessimism principles.
arXiv Detail & Related papers (2022-02-22T05:41:43Z)
- The Value of Information When Deciding What to Learn [21.945359614094503]
This work builds upon the seminal design principle of information-directed sampling (Russo & Van Roy, 2014).
We offer new insights into learning targets from the literature on rate-distortion theory before turning to empirical results that confirm the value of information when deciding what to learn.
arXiv Detail & Related papers (2021-10-26T19:23:12Z)
- A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning [104.3643447579578]
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state.
The design allows agents to learn to plan effectively, by attending to the relevant objects, leading to better out-of-distribution generalization.
arXiv Detail & Related papers (2021-06-03T19:35:19Z)
- Informational Design of Dynamic Multi-Agent System [32.37168850559519]
We study how the crafting of payoff-relevant environmental signals alone can influence the behaviors of intelligent agents.
An obedient principle is established which states that it is without loss of generality to focus on the direct information design.
A framework is proposed based on an approach which we refer to as the fixed-point alignment that incentivizes the agents to choose the signal sent by the principal.
arXiv Detail & Related papers (2021-05-07T03:46:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.