Towards a more efficient computation of individual attribute and policy
contribution for post-hoc explanation of cooperative multi-agent systems
using Myerson values
- URL: http://arxiv.org/abs/2212.03041v1
- Date: Tue, 6 Dec 2022 15:15:00 GMT
- Title: Towards a more efficient computation of individual attribute and policy contribution for post-hoc explanation of cooperative multi-agent systems using Myerson values
- Authors: Giorgio Angelotti and Natalia Díaz-Rodríguez
- Abstract summary: A quantitative assessment of the global importance of an agent in a team is as valuable as gold for strategists, decision-makers, and sports coaches.
We propose a method to determine a Hierarchical Knowledge Graph of agents' policies and features in a Multi-Agent System.
We test the proposed approach in a proof-of-case environment deploying both hardcoded policies and policies obtained via Deep Reinforcement Learning.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A quantitative assessment of the global importance of an agent in a team is
as valuable as gold for strategists, decision-makers, and sports coaches. Yet,
retrieving this information is not trivial since, in a cooperative task, it is
hard to isolate the performance of an individual from that of the whole
team. Moreover, the relationship between an agent's role and its personal
attributes is not always clear. In this work we conceive an application of
the Shapley analysis for studying the contribution of both agent policies and
attributes, putting them on equal footing. Since the exact computation of
Shapley values is NP-hard and scales exponentially with the number of
participants in a transferable utility coalitional game, we resort to
exploiting a priori knowledge about the rules of the game to constrain the
relations between the participants over a graph. We hence propose a method to determine a
Hierarchical Knowledge Graph of agents' policies and features in a Multi-Agent
System. Assuming a simulator of the system is available, the graph structure
allows us to exploit dynamic programming to assess the importances much
faster. We test the proposed approach in a proof-of-case environment
deploying both hardcoded policies and policies obtained via Deep Reinforcement
Learning. The proposed paradigm is less computationally demanding than
trivially computing the Shapley values and provides great insight not only into
the importance of an agent in a team but also into the attributes needed to
deploy the policy at its best.
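The abstract contrasts the exponential cost of exact Shapley values with the graph-restricted Myerson values that give the paper its title. As a rough illustration of the two solution concepts (not the authors' code, and without the dynamic-programming speed-up they describe), the sketch below enumerates coalitions to compute exact Shapley values and obtains Myerson values by evaluating a toy characteristic function only on the connected components of a hypothetical knowledge graph; the player names, edge list, and characteristic function `v` are invented for the example.

```python
# Minimal, self-contained sketch (not the paper's implementation): exact
# Shapley values by coalition enumeration, and Myerson values obtained by
# restricting the characteristic function to connected components of a
# hypothetical knowledge graph. `v` is a toy stand-in for the team-performance
# simulator assumed in the abstract.
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley values: enumerates all 2^(n-1) coalitions per player."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (v(frozenset(S) | {i}) - v(frozenset(S)))
        phi[i] = total
    return phi


def connected_components(S, edges):
    """Connected components of the subgraph induced by coalition S."""
    S, comps, seen = set(S), [], set()
    for start in S:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(w for a, b in edges if {a, b} <= S and u in (a, b)
                         for w in (a, b) if w not in comp)
        seen |= comp
        comps.append(frozenset(comp))
    return comps


def myerson_values(players, v, edges):
    """Myerson values = Shapley values of the graph-restricted game v^G,
    where v^G(S) sums v over the connected components of S in the graph."""
    def v_graph(S):
        return sum(v(C) for C in connected_components(S, edges))
    return shapley_values(players, v_graph)


if __name__ == "__main__":
    # Hypothetical nodes: two agent policies and one agent attribute.
    players = ["agent_A", "agent_B", "attr_speed"]
    # Hypothetical graph: a priori knowledge links agent_A only to its attribute.
    edges = [("agent_A", "attr_speed")]

    def v(S):  # toy team value standing in for a simulator rollout
        score = 1.0 * ("agent_A" in S) + 0.5 * ("agent_B" in S)
        score += 0.5 if {"agent_A", "attr_speed"} <= S else 0.0  # kept by the graph
        score += 0.25 if {"agent_A", "agent_B"} <= S else 0.0    # ignored by the graph
        return score

    print("Shapley:", shapley_values(players, v))
    print("Myerson:", myerson_values(players, v, edges))
```

Even in this toy game the graph restriction changes the attribution: the synergy between agent_A and agent_B is ignored because the hypothetical graph does not connect them. The efficiency gain reported in the abstract comes from dynamic programming over the Hierarchical Knowledge Graph, not from the brute-force enumeration used here.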
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks [94.07688076435818]
We study reinforcement learning for learning a Quantal Stackelberg Equilibrium (QSE) in an episodic Markov game with a leader-follower structure.
Our algorithms are based on (i) learning the quantal response model via maximum likelihood estimation and (ii) model-free or model-based RL for solving the leader's decision making problem.
arXiv Detail & Related papers (2023-07-26T10:24:17Z)
- Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model [64.94131130042275]
We study the incentivized information acquisition problem, where a principal hires an agent to gather information on her behalf.
We design a provably sample efficient algorithm that tailors the UCB algorithm to our model.
Our algorithm features a delicate estimation procedure for the optimal profit of the principal, and a conservative correction scheme that ensures the desired agent's actions are incentivized.
arXiv Detail & Related papers (2023-03-15T13:40:16Z)
- Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets.
An agent learned by offline MARL often inherits random behavior present in the dataset, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
- Iterated Reasoning with Mutual Information in Cooperative and Byzantine Decentralized Teaming [0.0]
We show that reformulating an agent's policy to be conditional on the policies of its teammates inherently maximizes a Mutual Information (MI) lower bound when optimizing under Policy Gradient (PG).
Our approach, InfoPG, outperforms baselines in learning emergent collaborative behaviors and sets the state-of-the-art in decentralized cooperative MARL tasks.
arXiv Detail & Related papers (2022-01-20T22:54:32Z)
- Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values [68.8204255655161]
This study proposes a novel approach to explain cooperative strategies in multiagent RL using Shapley values.
Results could have implications for non-discriminatory decision making, ethical and responsible AI-derived decisions or policy making under fairness constraints.
arXiv Detail & Related papers (2021-10-04T10:28:57Z)
- Influence-based Reinforcement Learning for Intrinsically-motivated Agents [0.0]
We present an algorithmic framework of two reinforcement learning agents, each with a different objective.
We introduce a novel function approximation approach to assess the influence $F$ of a certain policy on others.
Our method was evaluated on the suite of OpenAI gym tasks as well as cooperative and mixed scenarios.
arXiv Detail & Related papers (2021-08-28T05:36:10Z)
- Stateful Strategic Regression [20.7177095411398]
We describe the Stackelberg equilibrium of the resulting game and provide novel algorithms for computing it.
Our analysis reveals several intriguing insights about the role of multiple interactions in shaping the game's outcome.
Most importantly, we show that with multiple rounds of interaction at her disposal, the principal is more effective at incentivizing the agent to accumulate effort in her desired direction.
arXiv Detail & Related papers (2021-06-07T17:46:29Z)
- Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent State [35.69801203107371]
We design a simple reinforcement learning agent that can operate in any environment.
The agent maintains only visitation counts and value estimates for each agent-state-action pair.
There is no further dependence on the number of environment states or mixing times associated with other policies or statistics of history.
arXiv Detail & Related papers (2021-02-10T04:53:12Z) - Multi-agent Policy Optimization with Approximatively Synchronous
Advantage Estimation [55.96893934962757]
In multi-agent systems, the policies of different agents need to be evaluated jointly.
In current methods, value functions or advantage functions use counterfactual joint actions, which are evaluated asynchronously.
In this work, we propose the approximatively synchronous advantage estimation.
arXiv Detail & Related papers (2020-12-07T07:29:19Z)