Bayesian Optimization for Non-Cooperative Game-Based Radio Resource Management
- URL: http://arxiv.org/abs/2512.01245v1
- Date: Mon, 01 Dec 2025 03:44:43 GMT
- Title: Bayesian Optimization for Non-Cooperative Game-Based Radio Resource Management
- Authors: Yunchuan Zhang, Jiechen Chen, Junshuo Liu, Robert C. Qiu
- Abstract summary: This paper considers formulating the resource allocation among spectrum sharing BSs as a non-cooperative game. We propose PPR-UCB, a novel Bayesian optimization strategy that learns from sequential decision-evaluation pairs. Experiments on downlink transmission power allocation in a multi-cell multi-antenna system demonstrate the efficiency of PPR-UCB.
- Score: 3.652142532307204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Radio resource management in modern cellular networks often calls for the optimization of complex utility functions that are potentially conflicting between different base stations (BSs). Coordinating the resource allocation strategies efficiently across BSs to ensure stable network service poses significant challenges, especially when each utility is accessible only via costly, black-box evaluations. This paper considers formulating the resource allocation among spectrum-sharing BSs as a non-cooperative game, with the goal of aligning their allocation incentives toward a stable outcome. To address this challenge, we propose PPR-UCB, a novel Bayesian optimization (BO) strategy that learns from sequential decision-evaluation pairs to approximate pure Nash equilibrium (PNE) solutions. PPR-UCB applies martingale techniques to Gaussian process (GP) surrogates and constructs high-probability confidence bounds to quantify the uncertainty in the utilities. Experiments on downlink transmission power allocation in a multi-cell multi-antenna system demonstrate the efficiency of PPR-UCB in identifying effective equilibrium solutions within a few data samples.
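The abstract mentions GP surrogates and confidence bounds built from sequential decision-evaluation pairs. As a rough illustration of that general building block only, the sketch below runs a plain single-agent GP-UCB loop on a one-dimensional power level. It is not the PPR-UCB algorithm: the game-theoretic equilibrium search and the martingale-based bounds are not reproduced. The kernel, the lengthscale, the fixed `beta`, and the function `toy_utility` are all assumptions made up for this example.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-3):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    # Prior variance is 1 for this kernel; subtract the explained part.
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def ucb_step(x_obs, y_obs, grid, beta=2.0):
    """Pick the grid point with the largest upper confidence bound."""
    mean, std = gp_posterior(x_obs, y_obs, grid)
    lcb, ucb = mean - beta * std, mean + beta * std
    return grid[np.argmax(ucb)], lcb, ucb

def toy_utility(p):
    """Hypothetical black-box utility: a rate term minus a power cost."""
    return np.log2(1.0 + 4.0 * p) - 1.5 * p

grid = np.linspace(0.0, 1.0, 101)      # candidate normalized power levels
x_obs = np.array([0.1, 0.9])           # two initial evaluations
y_obs = toy_utility(x_obs)

for _ in range(8):                     # small evaluation budget
    x_next, lcb, ucb = ucb_step(x_obs, y_obs, grid)
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, toy_utility(x_next))

best = x_obs[np.argmax(y_obs)]
print("best evaluated power level:", best)
```

In a multi-BS game setting, each BS would presumably keep its own surrogate of its own utility, and the confidence bounds would feed an equilibrium criterion rather than a plain argmax; that part is left out here.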
Related papers
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources [113.33902847941941]
Variance-Aware Sampling (VAS) is a data selection strategy guided by Variance Promotion Score (VPS).
We release large-scale, carefully curated resources containing 1.6M long CoT cold-start data and 15k RL QA pairs.
Experiments across mathematical reasoning benchmarks demonstrate the effectiveness of both the curated data and the proposed VAS.
arXiv Detail & Related papers (2025-09-25T14:58:29Z)
- Heterogeneous Resource Allocation for Ensuring End-to-End Quality of Service in Multi-hop Integrated Access and Backhaul Network [0.0]
Multi-hop integrated access and backhaul (IAB) architectures have emerged as a cost-effective solution for network densification.
Dynamic time division duplex (D-TDD) is a promising solution to adapt to highly dynamic scenarios with asymmetric uplink and downlink traffic.
We decompose the integrated optimization problem (IOP) into sub-problems to reduce the solution space.
To achieve the system-wide solution, we propose a single-leader heterogeneous multi-follower Stackelberg-game-based resource allocation scheme.
arXiv Detail & Related papers (2025-04-04T16:29:08Z)
- Efficient and Scalable Deep Reinforcement Learning for Mean Field Control Games [16.62770187749295]
Mean Field Control Games (MFCGs) provide a powerful theoretical framework for analyzing systems of infinitely many interacting agents.
This paper presents a scalable deep Reinforcement Learning (RL) approach to approximate equilibrium solutions of MFCGs.
arXiv Detail & Related papers (2024-12-28T02:04:53Z)
- Cluster-Based Multi-Agent Task Scheduling for Space-Air-Ground Integrated Networks [60.085771314013044]
Low-altitude economy holds significant potential for development in areas such as communication and sensing.
We propose a Clustering-based Multi-agent Deep Deterministic Policy Gradient (CMADDPG) algorithm to address the multi-UAV cooperative task scheduling challenges in SAGIN.
arXiv Detail & Related papers (2024-12-14T06:17:33Z)
- Generative AI for O-RAN Slicing: A Semi-Supervised Approach with VAE and Contrastive Learning [5.1435595246496595]
This paper introduces a novel generative AI (GAI)-driven, unified semi-supervised learning architecture for optimizing resource allocation and network slicing in O-RAN.
Termed Generative Semi-Supervised VAE-Contrastive Learning, our approach maximizes the weighted user equipment (UE) throughput and allocates physical resource blocks (PRBs) to enhance the quality of service for eMBB and URLLC services.
arXiv Detail & Related papers (2024-01-16T22:23:27Z)
- Joint User Association, Interference Cancellation and Power Control for Multi-IRS Assisted UAV Communications [80.35959154762381]
Intelligent reflecting surface (IRS)-assisted unmanned aerial vehicle (UAV) communications are expected to alleviate the load of ground base stations in a cost-effective way.
Existing studies mainly focus on the deployment and resource allocation of a single IRS instead of multiple IRSs.
We propose a new optimization algorithm for joint IRS-user association, trajectory optimization of UAVs, successive interference cancellation (SIC) decoding order scheduling and power allocation.
arXiv Detail & Related papers (2023-12-08T01:57:10Z)
- Distributed Optimization via Kernelized Multi-armed Bandits [6.04275169308491]
We model a distributed optimization problem as a multi-agent kernelized multi-armed bandit problem with a heterogeneous reward setting.
We present a fully decentralized algorithm, Multi-agent IGP-UCB (MA-IGP-UCB), which achieves a sub-linear regret bound for popular classes of kernels.
We also propose an extension, Multi-agent Delayed IGP-UCB (MAD-IGP-UCB) algorithm, which reduces the dependence of the regret bound on the number of agents in the network.
arXiv Detail & Related papers (2023-12-07T21:57:48Z) - Coverage and Capacity Optimization in STAR-RISs Assisted Networks: A
Machine Learning Approach [102.00221938474344]
A novel model is proposed for the coverage and capacity optimization of simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) assisted networks.
A loss function-based update strategy is the core point, which is able to calculate weights for both loss functions of coverage and capacity by a min-norm solver at each update.
The numerical results demonstrate that the investigated update strategy outperforms the fixed weight-based MO algorithms.
arXiv Detail & Related papers (2022-04-13T13:52:22Z) - Learning Resilient Radio Resource Management Policies with Graph Neural
Networks [124.89036526192268]
We formulate a resilient radio resource management problem with per-user minimum-capacity constraints.
We show that we can parameterize the user selection and power control policies using a finite set of parameters.
Thanks to such adaptation, our proposed method achieves a superior tradeoff between the average rate and the 5th percentile rate.
arXiv Detail & Related papers (2022-03-07T19:40:39Z) - A Q-Learning-based Approach for Distributed Beam Scheduling in mmWave
Networks [18.22250038264899]
We consider the problem of distributed downlink beam scheduling and power allocation for millimeter-Wave (mmWave) cellular networks.
Multiple base stations belonging to different service operators share the same unlicensed spectrum with no central coordination or cooperation among them.
We propose a distributed scheduling approach to power allocation and adaptation for efficient interference management over the shared spectrum by modeling each BS as an independent Q-learning agent.
arXiv Detail & Related papers (2021-10-17T02:58:13Z) - Deep Reinforcement Learning Based Multidimensional Resource Management
for Energy Harvesting Cognitive NOMA Communications [64.1076645382049]
Combination of energy harvesting (EH), cognitive radio (CR), and non-orthogonal multiple access (NOMA) is a promising solution to improve energy efficiency.
In this paper, we study the spectrum, energy, and time resource management for deterministic-CR-NOMA IoT systems.
arXiv Detail & Related papers (2021-09-17T08:55:48Z)
- Cooperative Multi-Agent Reinforcement Learning Based Distributed Dynamic Spectrum Access in Cognitive Radio Networks [46.723006378363785]
Dynamic spectrum access (DSA) is a promising paradigm to remedy the problem of inefficient spectrum utilization.
In this paper, we investigate the distributed DSA problem for multiple users in a typical cognitive radio network.
We employ the deep recurrent Q-network (DRQN) to address the partial observability of the state for each cognitive user.
arXiv Detail & Related papers (2021-06-17T06:52:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.