Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs
- URL: http://arxiv.org/abs/2505.06319v1
- Date: Thu, 08 May 2025 21:12:34 GMT
- Title: Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs
- Authors: Zijian An, Lifeng Zhou
- Abstract summary: Game-theoretic resource allocation on graphs (GRAG) is modeled as a multi-step Colonel Blotto Game (MCBG). We formulate the MCBG as a Markov Decision Process (MDP) and apply Reinforcement Learning (RL) methods, specifically Deep Q-Network (DQN) and Proximal Policy Optimization (PPO). We evaluate RL performance across a variety of graph structures and initial resource distributions, comparing against random, greedy, and learned RL policies.
- Score: 9.369330148791201
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Game-theoretic resource allocation on graphs (GRAG) involves two players competing over multiple steps to control nodes of interest on a graph, a problem modeled as a multi-step Colonel Blotto Game (MCBG). Finding optimal strategies is challenging due to the dynamic action space and structural constraints imposed by the graph. To address this, we formulate the MCBG as a Markov Decision Process (MDP) and apply Reinforcement Learning (RL) methods, specifically Deep Q-Network (DQN) and Proximal Policy Optimization (PPO). To enforce graph constraints, we introduce an action-displacement adjacency matrix that dynamically generates valid action sets at each step. We evaluate RL performance across a variety of graph structures and initial resource distributions, comparing against random, greedy, and learned RL policies. Experimental results show that both DQN and PPO consistently outperform baseline strategies and converge to a balanced $50\%$ win rate when competing against the learned RL policy. In particular, on asymmetric graphs, RL agents successfully exploit structural advantages and adapt their allocation strategies, even under disadvantageous initial resource distributions.
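The core mechanism described above is the action-displacement adjacency matrix, which restricts each step's moves to the structure of the graph. As a minimal sketch of that idea (not the authors' code; the NumPy encoding, function names, and the one-unit-per-move simplification are assumptions made for illustration), the snippet below derives a mask of currently valid moves from an adjacency matrix and an allocation vector, then uses the mask to filter a stand-in for a DQN's action scores:

```python
import numpy as np

def valid_action_mask(adjacency: np.ndarray, allocation: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: mask[i, j] is True when moving one resource unit
    from node i to node j is currently allowed, i.e. node i holds at least one
    unit and j is i itself or a graph neighbour of i. The paper's
    action-displacement adjacency matrix plays a similar structural role, but
    its exact construction may differ."""
    n = adjacency.shape[0]
    reachable = adjacency.astype(bool) | np.eye(n, dtype=bool)  # edges plus "stay" moves
    has_resource = allocation > 0                                # nodes able to send a unit
    return reachable & has_resource[:, None]

def apply_move(allocation: np.ndarray, source: int, target: int) -> np.ndarray:
    """One per-step allocation update: shift a single unit along a valid move."""
    new_alloc = allocation.copy()
    new_alloc[source] -= 1
    new_alloc[target] += 1
    return new_alloc

if __name__ == "__main__":
    # Toy 4-node path graph 0-1-2-3 with an arbitrary starting allocation.
    A = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])
    allocation = np.array([3, 0, 1, 0])

    mask = valid_action_mask(A, allocation)
    print("valid (source, target) moves:", np.argwhere(mask).tolist())

    # Masking invalid actions is one standard way to confine a DQN or PPO
    # policy to the dynamically valid action set at each step.
    q_values = np.random.randn(A.shape[0], A.shape[0])  # stand-in for network output
    q_values[~mask] = -np.inf
    src, dst = np.unravel_index(np.argmax(q_values), q_values.shape)
    allocation = apply_move(allocation, int(src), int(dst))
    print("allocation after one greedy masked move:", allocation.tolist())
```

The sketch only illustrates the masking idea; the paper's full game dynamics, reward design, and DQN/PPO training procedure are not reproduced here.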
Related papers
- GraphRAG-R1: Graph Retrieval-Augmented Generation with Process-Constrained Reinforcement Learning [33.57411612551111]
We propose GraphRAG-R1, an adaptive GraphRAG framework that trains LLMs with process-constrained, outcome-based reinforcement learning (RL). Our method can decompose complex problems, autonomously invoke retrieval tools, and perform effective reasoning. Our framework can be flexibly integrated with various existing retrieval methods, consistently delivering performance improvements.
arXiv Detail & Related papers (2025-07-31T14:11:16Z) - Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains. Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning. Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering. We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
arXiv Detail & Related papers (2025-06-11T12:03:52Z) - HGFormer: A Hierarchical Graph Transformer Framework for Two-Stage Colonel Blotto Games via Reinforcement Learning [4.144893164317513]
The two-stage Colonel Blotto game represents a typical adversarial resource allocation problem. We propose a hierarchical graph Transformer framework called HGformer. Our approach enables efficient policy generation in large-scale adversarial environments.
arXiv Detail & Related papers (2025-06-10T08:51:18Z) - G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning [58.73279333365234]
Reinforcement Learning (RL) on synthetic graph-theoretic tasks can significantly scale graph reasoning abilities. With RL on Erdos, G1 obtains substantial improvements in graph reasoning, where our finetuned 3B model even outperforms Qwen2.5-72B-Instruct (24x its size). Our findings offer an efficient, scalable path for building strong graph reasoners by finetuning LLMs with RL on graph-theoretic tasks.
arXiv Detail & Related papers (2025-05-24T04:33:41Z) - Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO [68.44918104224818]
Autoregressive image generation presents unique challenges distinct from Chain-of-Thought (CoT) reasoning. This study provides the first comprehensive investigation of the GRPO and DPO algorithms in autoregressive image generation. Our findings reveal that GRPO and DPO exhibit distinct advantages and, crucially, that reward models with stronger intrinsic generalization capabilities can potentially enhance the generalization of the applied RL algorithms.
arXiv Detail & Related papers (2025-05-22T17:59:49Z) - Compile Scene Graphs with Reinforcement Learning [69.36723767339001]
Next-token prediction is the fundamental principle for training large language models (LLMs). We introduce R1-SGG, a multimodal LLM (M-LLM) initially trained via supervised fine-tuning (SFT) on the scene graph dataset. We design a set of graph-centric rewards, including three recall-based variants: Hard Recall, Hard Recall+Relax, and Soft Recall.
arXiv Detail & Related papers (2025-04-18T10:46:22Z) - Paths to Equilibrium in Games [6.812247730094933]
We study sequences of strategies satisfying a pairwise constraint inspired by policy updating in reinforcement learning.
Our analysis reveals a counterintuitive insight: reward-deteriorating strategic updates are key to driving play to equilibrium along a satisficing path.
arXiv Detail & Related papers (2024-03-26T19:58:39Z) - Optimal coordination of resources: A solution from reinforcement learning [6.0413802011767705]
The Minority Game (MG) is perhaps the simplest toy model of this resource-coordination problem. We introduce the reinforcement learning paradigm to the MG, where individuals adjust decisions based on accumulated experience. We find that this RL framework achieves optimal resource coordination when individuals balance the exploitation of experience with random exploration.
arXiv Detail & Related papers (2023-12-20T00:47:45Z) - Decoupled Prioritized Resampling for Offline RL [114.73666323173204]
We propose Offline Prioritized Experience Replay (OPER) for offline reinforcement learning. OPER features a class of priority functions designed to prioritize highly rewarding transitions, making them more frequently visited during training. We show that this class of priority functions induces an improved behavior policy, and that, when constrained to this improved policy, a policy-constrained offline RL algorithm is likely to yield a better solution.
arXiv Detail & Related papers (2023-06-08T17:56:46Z) - Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z) - Scalable Deep Reinforcement Learning Algorithms for Mean Field Games [60.550128966505625]
Mean Field Games (MFGs) have been introduced to efficiently approximate games with very large populations of strategic agents.
Recently, the question of learning equilibria in MFGs has gained momentum, particularly using model-free reinforcement learning (RL) methods.
Existing algorithms to solve MFGs require the mixing of approximated quantities such as strategies or $q$-values.
We propose two methods to address this shortcoming. The first one learns a mixed strategy from distillation of historical data into a neural network and is applied to the Fictitious Play algorithm.
The second one is an online mixing method based on
arXiv Detail & Related papers (2022-03-22T18:10:32Z) - Designing Heterogeneous GNNs with Desired Permutation Properties for Wireless Resource Allocation [11.66835230109271]
Graph neural networks (GNNs) have been designed for learning a variety of wireless policies. In this paper, we propose a systematic approach for designing GNNs that satisfy the desired permutation property.
arXiv Detail & Related papers (2022-03-08T08:02:54Z) - Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions [7.6860514640178]
We propose a general, computationally efficient distributed framework for cooperative multi-agent reinforcement learning (MARL).
We introduce three coupling graphs describing three types of inter-agent couplings in MARL.
We propose two distributed RL approaches based on local value-functions derived from the coupling graphs.
arXiv Detail & Related papers (2022-02-26T03:01:51Z) - Learning Two-Step Hybrid Policy for Graph-Based Interpretable Reinforcement Learning [7.656272344163665]
We present a two-step hybrid reinforcement learning (RL) policy designed to generate interpretable and robust hierarchical policies for RL problems with graph-based input.
This two-step hybrid policy presents human-friendly interpretations and achieves better performance in terms of generalization and robustness.
arXiv Detail & Related papers (2022-01-21T03:06:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.