Distributed Proximal Policy Optimization for Contention-Based Spectrum
Access
- URL: http://arxiv.org/abs/2111.09420v1
- Date: Thu, 7 Oct 2021 00:54:03 GMT
- Title: Distributed Proximal Policy Optimization for Contention-Based Spectrum
Access
- Authors: Akash Doshi and Jeffrey G. Andrews
- Abstract summary: We develop a novel distributed implementation of a policy gradient method known as Proximal Policy Optimization.
In each time slot, a base station uses information from spectrum sensing and reception quality to autonomously decide whether or not to transmit on a given resource.
We find the proportional fairness reward accumulated by the policy gradient approach to be significantly higher than even a genie-aided adaptive energy detection threshold.
- Score: 40.99534735484468
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing number of wireless devices operating in unlicensed spectrum
motivates the development of intelligent adaptive approaches to spectrum access
that go beyond traditional carrier sensing. We develop a novel distributed
implementation of a policy gradient method known as Proximal Policy
Optimization modelled on a two-stage Markov decision process that enables such
an intelligent approach, and still achieves decentralized contention-based
medium access. In each time slot, a base station (BS) uses information from
spectrum sensing and reception quality to autonomously decide whether or not to
transmit on a given resource, with the goal of maximizing proportional fairness
network-wide. Empirically, we find the proportional fairness reward accumulated
by the policy gradient approach to be significantly higher than even a
genie-aided adaptive energy detection threshold. This is further validated by
the improved sum and maximum user throughputs achieved by our approach.
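As a rough illustration of the three ingredients named in the abstract, the sketch below shows a proportional fairness reward, a fixed energy detection baseline, and the standard PPO clipped surrogate objective. It is not the authors' implementation: the function names, the `eps` guard against `log(0)`, the `-72.0` dBm default threshold, and the `clip_eps=0.2` value are all assumptions chosen for illustration, and the paper's distributed two-stage formulation is not reproduced here.

```python
import math


def proportional_fairness(throughputs, eps=1e-9):
    """Sum of log user throughputs.

    eps is an illustrative guard (an assumption) so that a user with
    zero throughput yields a large negative value instead of an error.
    """
    return sum(math.log(t + eps) for t in throughputs)


def energy_detect_transmit(sensed_energy_dbm, threshold_dbm=-72.0):
    """Carrier-sensing baseline: transmit only when the sensed energy
    is below the threshold. The default threshold is illustrative."""
    return sensed_energy_dbm < threshold_dbm


def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate.

    For each sample: min(r * A, clip(r, 1 - clip_eps, 1 + clip_eps) * A),
    where r is the new-to-old policy probability ratio and A the advantage.
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        total += min(r * a, clipped * a)
    return total / len(ratios)
```

The logarithm in the reward is what penalizes starving any single user: an even split of throughput scores higher than a lopsided split with the same total. The clipping in the surrogate limits how far one update can move the policy away from the one that collected the data.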
Related papers
- Collaborative Ground-Space Communications via Evolutionary Multi-objective Deep Reinforcement Learning [113.48727062141764]
We propose a distributed collaborative beamforming (DCB)-based uplink communication paradigm for enabling ground-space direct communications.
DCB treats the terminals that are unable to establish efficient direct connections with the low Earth orbit (LEO) satellites as distributed antennas.
We propose an evolutionary multi-objective deep reinforcement learning algorithm to obtain the desirable policies.
arXiv Detail & Related papers (2024-04-11T03:13:02Z)
- Provable Offline Preference-Based Reinforcement Learning [95.00042541409901]
We investigate the problem of offline Preference-based Reinforcement Learning (PbRL) with human feedback.
We consider the general reward setting where the reward can be defined over the whole trajectory.
We introduce a new single-policy concentrability coefficient, which can be upper bounded by the per-trajectory concentrability.
arXiv Detail & Related papers (2023-05-24T07:11:26Z)
- Joint Power Allocation and Beamformer for mmW-NOMA Downlink Systems by Deep Reinforcement Learning [0.0]
Joint power allocation and beamforming of mmW-NOMA systems is mandatory.
We exploit a Deep Reinforcement Learning (DRL) approach, whose policy generation leads to an optimized sum-rate for the users.
arXiv Detail & Related papers (2022-05-13T07:42:03Z)
- Learning Resilient Radio Resource Management Policies with Graph Neural Networks [124.89036526192268]
We formulate a resilient radio resource management problem with per-user minimum-capacity constraints.
We show that we can parameterize the user selection and power control policies using a finite set of parameters.
Thanks to such adaptation, our proposed method achieves a superior tradeoff between the average rate and the 5th percentile rate.
arXiv Detail & Related papers (2022-03-07T19:40:39Z)
- A Q-Learning-based Approach for Distributed Beam Scheduling in mmWave Networks [18.22250038264899]
We consider the problem of distributed downlink beam scheduling and power allocation for millimeter-Wave (mmWave) cellular networks.
Multiple base stations belonging to different service operators share the same unlicensed spectrum with no central coordination or cooperation among them.
We propose a distributed scheduling approach to power allocation and adaptation for efficient interference management over the shared spectrum by modeling each BS as an independent Q-learning agent.
arXiv Detail & Related papers (2021-10-17T02:58:13Z)
- A Deep Reinforcement Learning Framework for Contention-Based Spectrum Sharing [31.640828282666245]
We consider decentralized contention-based medium access for base stations operating on unlicensed shared spectrum.
We introduce a two-stage Markov decision process in each time slot that uses information from spectrum sensing and reception quality to make a medium access decision.
Our formulation provides decentralized inference, online adaptability and also caters to partial observability of the environment.
arXiv Detail & Related papers (2021-10-05T03:00:33Z)
- Distributed Deep Reinforcement Learning for Adaptive Medium Access and Modulation in Shared Spectrum [42.54329256803276]
We study decentralized contention-based medium access for base stations operating on unlicensed shared spectrum.
We devise a learning-based algorithm for both contention and adaptive modulation that attempts to maximize a network-wide downlink throughput objective.
Empirically, we find the (proportional fairness) reward accumulated by the policy gradient approach to be significantly higher than even a genie-aided adaptive energy detection threshold.
arXiv Detail & Related papers (2021-09-24T03:33:45Z)
- Model-Free Learning of Optimal Deterministic Resource Allocations in Wireless Systems via Action-Space Exploration [4.721069729610892]
We propose a technically grounded and scalable deterministic-dual gradient policy method for efficiently learning optimal parameterized resource allocation policies.
Our method not only efficiently exploits gradient availability of popular universal representations such as deep networks, but is also truly model-free, as it relies on consistent zeroth-order gradient approximations of associated random network services constructed via low-dimensional perturbations in action space.
arXiv Detail & Related papers (2021-08-23T18:26:16Z)
- Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)