Solving Graph-based Public Good Games with Tree Search and Imitation
Learning
- URL: http://arxiv.org/abs/2106.06762v1
- Date: Sat, 12 Jun 2021 12:46:44 GMT
- Title: Solving Graph-based Public Good Games with Tree Search and Imitation
Learning
- Authors: Victor-Alexandru Darvariu, Stephen Hailes, Mirco Musolesi
- Abstract summary: We adopt the perspective of a central planner with a global view of a network of self-interested agents and the goal of maximizing some desired property in the context of a best-shot public goods game.
Existing algorithms for this known NP-complete problem find solutions that are sub-optimal and cannot optimize for criteria other than social welfare.
Our proposed method directly exploits the correspondence between equilibria and the Maximal Independent Set (mIS) structural property of graphs.
- Score: 4.499055111059408
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Public goods games represent insightful settings for studying incentives for
individual agents to make contributions that, while costly for each of them,
benefit the wider society. In this work, we adopt the perspective of a central
planner with a global view of a network of self-interested agents and the goal
of maximizing some desired property in the context of a best-shot public goods
game. Existing algorithms for this known NP-complete problem find solutions
that are sub-optimal and cannot optimize for criteria other than social
welfare.
In order to efficiently solve public goods games, our proposed method
directly exploits the correspondence between equilibria and the Maximal
Independent Set (mIS) structural property of graphs. In particular, we define a
Markov Decision Process, which incrementally generates an mIS, and adopt a
planning method to search for equilibria, outperforming existing methods.
Furthermore, we devise an imitation learning technique that uses demonstrations
of the search to obtain a graph neural network parametrized policy which
quickly generalizes to unseen game instances. Our evaluation results show that
this policy is able to reach 99.5% of the performance of the planning method
while being approximately three orders of magnitude faster to evaluate on the
largest graphs tested. The methods presented in this work can be applied to a
large class of public goods games of potentially high societal impact.
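The core construction the abstract describes — incrementally generating a maximal independent set, with a search procedure or learned policy choosing which vertex to add next — can be illustrated with a minimal sketch. The graph, the function name, and the trivial `choose` tie-break are illustrative stand-ins (the paper uses tree search and a GNN policy, not this greedy rule); this is not the authors' implementation:

```python
def build_mis(adj, choose=min):
    """Incrementally construct a maximal independent set (MIS).

    Mirrors the MDP view described in the abstract: the state is the
    partial independent set, an action adds one still-eligible vertex,
    and the episode terminates when no eligible vertices remain.
    `choose` stands in for the policy (here a deterministic tie-break,
    not the learned GNN policy).
    """
    independent = set()
    eligible = set(adj)           # vertices that can still be added
    while eligible:
        v = choose(eligible)      # the policy/search picks the action
        independent.add(v)
        eligible.discard(v)
        eligible -= adj[v]        # neighbors become ineligible
    return independent

# Toy 5-cycle: picking a vertex excludes both of its neighbors.
cycle = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}}
mis = build_mis(cycle)
```

By construction the loop can only terminate when every remaining vertex has a neighbor in the set, so the result is maximal (no vertex can be added), which is exactly the structural property the paper ties to game equilibria.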
Related papers
- Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning [55.65738319966385]
We propose a novel online algorithm, iterative Nash policy optimization (INPO).
Unlike previous methods, INPO bypasses the need for estimating the expected win rate for individual responses.
With an LLaMA-3-8B-based SFT model, INPO achieves a 42.6% length-controlled win rate on AlpacaEval 2.0 and a 37.8% win rate on Arena-Hard.
arXiv Detail & Related papers (2024-06-30T08:00:34Z) - Can Graph Learning Improve Planning in LLM-based Agents? [61.47027387839096]
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs)
In this paper, we explore graph learning-based methods for task planning, a direction orthogonal to the prevalent focus on prompt design.
Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs.
arXiv Detail & Related papers (2024-05-29T14:26:24Z) - Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via
Planning and Learning [46.354187895184154]
The Multi-agent Pathfinding (MAPF) problem generally asks for a set of conflict-free paths for a set of agents confined to a graph.
In this work, we investigate the decentralized MAPF setting, in which the central controller that possesses all the information on the agents' locations and goals is absent.
We focus on the practically important lifelong variant of MAPF, which involves continuously assigning new goals to the agents upon their arrival at the previous ones.
arXiv Detail & Related papers (2023-10-02T13:51:32Z) - The Update-Equivalence Framework for Decision-Time Planning [78.44953498421854]
We introduce an alternative framework for decision-time planning that is not based on solving subgames, but rather on update equivalence.
We derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.
arXiv Detail & Related papers (2023-04-25T20:28:55Z) - Learning to Incentivize Information Acquisition: Proper Scoring Rules
Meet Principal-Agent Model [64.94131130042275]
We study the incentivized information acquisition problem, where a principal hires an agent to gather information on her behalf.
We design a provably sample efficient algorithm that tailors the UCB algorithm to our model.
Our algorithm features a delicate estimation procedure for the optimal profit of the principal, and a conservative correction scheme that ensures the desired agent's actions are incentivized.
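The summary above says the algorithm tailors UCB to the principal-agent model. For readers unfamiliar with it, here is the classic UCB1 rule in its generic multi-armed-bandit form; this is a textbook sketch, not the paper's tailored variant, and the reward samplers are hypothetical:

```python
import math
import random

def ucb1(arms, steps, seed=0):
    """Generic UCB1 bandit loop: play each arm once, then repeatedly
    pull the arm with the highest empirical mean plus exploration
    bonus sqrt(2 * ln(t) / n_a). `arms` is a list of reward samplers.
    """
    rng = random.Random(seed)
    n = len(arms)
    counts = [0] * n
    sums = [0.0] * n
    for t in range(1, steps + 1):
        if t <= n:
            arm = t - 1  # initialization: play each arm once
        else:
            arm = max(range(n), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        sums[arm] += arms[arm](rng)
        counts[arm] += 1
    return counts

# Two Bernoulli arms with success probabilities 0.9 and 0.1.
bandit = [lambda rng: 1.0 if rng.random() < 0.9 else 0.0,
          lambda rng: 1.0 if rng.random() < 0.1 else 0.0]
counts = ucb1(bandit, 2000)
```

The optimism principle is the bonus term, which shrinks as an arm is sampled; the paper's contribution lies in adapting this style of estimate to the principal's profit and the agent's incentives.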
arXiv Detail & Related papers (2023-03-15T13:40:16Z) - Using Knowledge Graphs for Performance Prediction of Modular
Optimization Algorithms [4.060078409841919]
We build a performance prediction model using a knowledge graph embedding-based methodology.
We show that a triple classification approach can correctly predict whether a given algorithm instance will be able to achieve a certain target precision.
arXiv Detail & Related papers (2023-01-24T09:28:57Z) - Sequential Information Design: Markov Persuasion Process and Its
Efficient Reinforcement Learning [156.5667417159582]
This paper proposes a novel model of sequential information design, namely the Markov persuasion processes (MPPs).
Planning in MPPs faces the unique challenge in finding a signaling policy that is simultaneously persuasive to the myopic receivers and inducing the optimal long-term cumulative utilities of the sender.
We design a provably efficient no-regret learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of both optimism and pessimism principles.
arXiv Detail & Related papers (2022-02-22T05:41:43Z) - Probabilistic Case-based Reasoning for Open-World Knowledge Graph
Completion [59.549664231655726]
A case-based reasoning (CBR) system solves a new problem by retrieving 'cases' that are similar to the given problem.
In this paper, we demonstrate that such a system is achievable for reasoning in knowledge bases (KBs).
Our approach predicts attributes for an entity by gathering reasoning paths from similar entities in the KB.
arXiv Detail & Related papers (2020-10-07T17:48:12Z) - Beyond Individualized Recourse: Interpretable and Interactive Summaries
of Actionable Recourses [14.626432428431594]
We propose a novel model framework called Actionable Recourse agnostic (AReS) to construct global counterfactual explanations.
We formulate a novel objective which simultaneously optimizes for the correctness of the recourses and the interpretability of the explanations.
Our framework can provide decision makers with a comprehensive overview of recourses corresponding to any black box model.
arXiv Detail & Related papers (2020-09-15T15:14:08Z) - A General Large Neighborhood Search Framework for Solving Integer Linear
Programs [46.62993477453986]
We focus on solving integer programs and ground our approach in the large neighborhood search (LNS) paradigm.
We show that our LNS framework can significantly outperform compared to state-of-the-art commercial solvers such as Gurobi.
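The destroy-and-repair loop behind large neighborhood search can be sketched on a toy 0/1 knapsack. This is illustrative only: a real LNS framework for integer programs would fix most variables and hand the resulting sub-problem to a MIP solver such as Gurobi, rather than brute-forcing the freed block as done here:

```python
import random
from itertools import product

def lns_knapsack(values, weights, capacity, steps=200, k=3, seed=0):
    """Toy large neighborhood search (LNS) for 0/1 knapsack.

    Each iteration "destroys" the incumbent by freeing k variables,
    then "repairs" it by exactly optimizing the freed block while the
    rest stays fixed. The incumbent never gets worse.
    """
    rng = random.Random(seed)
    n = len(values)
    x = [0] * n  # trivially feasible incumbent
    for _ in range(steps):
        free = rng.sample(range(n), k)  # destroy: unfix k variables
        fixed_w = sum(weights[i] * x[i] for i in range(n) if i not in free)
        best = [x[i] for i in free]
        best_v = sum(values[i] * x[i] for i in free)
        for assign in product((0, 1), repeat=k):  # repair: exact sub-solve
            w = fixed_w + sum(weights[i] * a for i, a in zip(free, assign))
            v = sum(values[i] * a for i, a in zip(free, assign))
            if w <= capacity and v > best_v:
                best, best_v = list(assign), v
        for i, a in zip(free, best):
            x[i] = a
    return x, sum(values[i] * x[i] for i in range(n))

# Small instance: 4 items, capacity 10.
x, total = lns_knapsack([10, 40, 30, 50], [5, 4, 6, 3], 10)
```

The same skeleton scales by replacing the brute-force inner loop with a solver call and by learning, rather than sampling, which variables to free.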
arXiv Detail & Related papers (2020-03-29T23:08:14Z) - Goal-directed graph construction using reinforcement learning [3.291429094499946]
We formulate the construction of a graph as a decision-making process in which a central agent creates topologies by trial and error.
We propose an algorithm based on reinforcement learning and graph neural networks to learn graph construction and improvement strategies.
arXiv Detail & Related papers (2020-01-30T12:11:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.