Model Based Residual Policy Learning with Applications to Antenna Control
- URL: http://arxiv.org/abs/2211.08796v3
- Date: Mon, 11 Sep 2023 15:34:43 GMT
- Title: Model Based Residual Policy Learning with Applications to Antenna Control
- Authors: Viktor Eriksson Möllerstedt, Alessio Russo, Maxime Bouton
- Abstract summary: Non-differentiable controllers and rule-based policies are widely used for controlling real systems such as telecommunication networks and robots.
Motivated by the antenna tilt control problem, we introduce Model-Based Residual Policy Learning (MBRPL), a practical reinforcement learning (RL) method.
- Score: 5.01069065110753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-differentiable controllers and rule-based policies are widely used for
controlling real systems such as telecommunication networks and robots.
Specifically, parameters of mobile network base station antennas can be
dynamically configured by these policies to improve user coverage and quality
of service. Motivated by the antenna tilt control problem, we introduce
Model-Based Residual Policy Learning (MBRPL), a practical reinforcement
learning (RL) method. MBRPL enhances existing policies through a model-based
approach, leading to improved sample efficiency and a decreased number of
interactions with the actual environment when compared to off-the-shelf RL
methods. To the best of our knowledge, this is the first paper that examines a
model-based approach for antenna control. Experimental results reveal that our
method delivers strong initial performance while improving sample efficiency
over previous RL methods, which is one step towards deploying these algorithms
in real networks.
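The core pattern behind MBRPL can be sketched as follows: the agent learns a residual correction on top of an existing rule-based controller, so performance starts at the base policy's level, while a learned dynamics model supplies cheap synthetic transitions. The sketch below is a minimal illustration of that pattern under toy linear assumptions; base_policy, the model class, and all dimensions are hypothetical, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def base_policy(state):
    # Stand-in for an existing rule-based controller (e.g. a fixed
    # tilt heuristic); hypothetical, not the paper's controller.
    return -0.1 * state

class LinearDynamicsModel:
    """One-step model s' = [s, a] @ W, fit by least squares on real data."""
    def fit(self, states, actions, next_states):
        X = np.hstack([states, actions])
        self.W, *_ = np.linalg.lstsq(X, next_states, rcond=None)

    def predict(self, state, action):
        return np.concatenate([state, action]) @ self.W

def residual_action(theta, state):
    # Final action = rule-based action + learned residual, so theta = 0
    # recovers the base controller's behaviour exactly.
    return base_policy(state) + theta @ state

# Collect a small batch of real transitions with the base controller ...
S = rng.normal(size=(64, 2))
A = np.array([base_policy(s) for s in S])
S_next = S + 0.1 * A + 0.01 * rng.normal(size=S.shape)  # toy ground truth

model = LinearDynamicsModel()
model.fit(S, A, S_next)

# ... then improve the residual on imagined rollouts instead of the
# real environment, which is where the sample efficiency comes from.
theta, s = np.zeros((2, 2)), S[0]
for _ in range(5):
    a = residual_action(theta, s)
    s = model.predict(s, a)  # synthetic transition, no real interaction
```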
Related papers
- A Deep Q-Network Based on Radial Basis Functions for Multi-Echelon Inventory Management [6.149034764951798]
This paper addresses a multi-echelon inventory management problem with a complex network topology.
It develops a DRL model whose Q-network is based on radial basis functions.
It produces a better policy in the multi-echelon system and competitive performance in the serial system where the base-stock policy is optimal.
arXiv Detail & Related papers (2024-01-29T04:11:56Z)
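As a hedged sketch of the idea named in this entry, the snippet below parameterizes Q-values as a linear readout over Gaussian radial basis features of the state; the centers, widths, and update rule are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

class RBFQNetwork:
    """Q(s, a) = (W @ phi(s))[a], with Gaussian radial basis features phi.
    A generic sketch; the paper's exact architecture may differ."""
    def __init__(self, centers, sigma, n_actions, lr=0.1):
        self.centers = centers                      # (K, state_dim) centers
        self.sigma = sigma
        self.W = np.zeros((n_actions, len(centers)))
        self.lr = lr

    def features(self, s):
        d2 = ((self.centers - s) ** 2).sum(axis=1)
        return np.exp(-d2 / (2 * self.sigma ** 2))  # (K,)

    def q_values(self, s):
        return self.W @ self.features(s)            # (n_actions,)

    def td_update(self, s, a, r, s_next, gamma=0.99):
        # One semi-gradient Q-learning step on the linear RBF readout.
        target = r + gamma * self.q_values(s_next).max()
        td_err = target - self.q_values(s)[a]
        self.W[a] += self.lr * td_err * self.features(s)

# Toy usage: 1-D state, 5 evenly spaced centers, 3 discrete actions.
qnet = RBFQNetwork(np.linspace(0, 1, 5)[:, None], sigma=0.25, n_actions=3)
qnet.td_update(s=np.array([0.3]), a=1, r=1.0, s_next=np.array([0.4]))
```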
- MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning [52.101643259906915]
We study the problem of offline pre-training and online fine-tuning for reinforcement learning from high-dimensional observations.
Existing model-based offline RL methods are not suitable for offline-to-online fine-tuning in high-dimensional domains.
We propose an on-policy model-based method that can efficiently reuse prior data through model-based value expansion and policy regularization.
arXiv Detail & Related papers (2024-01-06T21:04:31Z)
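Model-based value expansion, the mechanism named in this entry, rolls a learned model forward a few steps before bootstrapping with a value function. A minimal sketch, assuming hypothetical policy/model/reward/value interfaces:

```python
import numpy as np

def h_step_value_target(s, policy, model, reward_fn, value_fn, H=3, gamma=0.99):
    """H-step model-based value expansion target:
    sum_{k=0..H-1} gamma^k r_hat_k + gamma^H V(s_H),
    where r_hat and the next states come from a learned model, not the
    real environment. All functions passed in are assumed interfaces."""
    target, discount = 0.0, 1.0
    for _ in range(H):
        a = policy(s)
        r_hat = reward_fn(s, a)   # learned/known reward model
        s = model(s, a)           # learned dynamics model step
        target += discount * r_hat
        discount *= gamma
    return target + discount * value_fn(s)

# Toy usage with linear stand-ins for the learned components.
tgt = h_step_value_target(
    s=np.array([1.0, 0.0]),
    policy=lambda s: -0.5 * s,
    model=lambda s, a: s + 0.1 * a,
    reward_fn=lambda s, a: -float(s @ s),
    value_fn=lambda s: -float(s @ s) / (1 - 0.99),
)
```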
- Reinforcement Learning with Model Predictive Control for Highway Ramp Metering [14.389086937116582]
This work explores the synergy between model-based and learning-based strategies to enhance traffic flow management.
The control problem is formulated as an RL task by crafting a suitable stage cost function.
An MPC-based RL approach, which leverages the MPC optimization problem as a function approximator for the RL algorithm, is proposed to learn to efficiently control an on-ramp.
arXiv Detail & Related papers (2023-11-15T09:50:54Z)
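The idea of using the MPC problem as a function approximator can be sketched as a short-horizon search over action sequences under a parameterized stage cost, where the cost parameters theta are what the RL algorithm would tune. The dynamics, horizon, and action set below are toy assumptions:

```python
import numpy as np
from itertools import product

def mpc_policy(s0, theta, dynamics, horizon=3, actions=(-1.0, 0.0, 1.0)):
    """Return the first action of the sequence minimizing a parameterized
    stage cost over a short horizon; theta is what RL would tune."""
    best_cost, best_a0 = np.inf, actions[0]
    for seq in product(actions, repeat=horizon):
        s, cost = np.array(s0, dtype=float), 0.0
        for a in seq:
            cost += theta[0] * float(s @ s) + theta[1] * a * a  # stage cost
            s = dynamics(s, a)
        if cost < best_cost:
            best_cost, best_a0 = cost, seq[0]
    return best_a0

# Toy double-integrator-style dynamics; the theta weights are what RL
# would learn so that the MPC plan approximates the optimal policy.
dyn = lambda s, a: np.array([s[0] + 0.1 * s[1], s[1] + 0.1 * a])
a0 = mpc_policy([1.0, 0.0], theta=np.array([1.0, 0.1]), dynamics=dyn)
```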
- Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning [53.97273491846883]
We propose DPE: an RL algorithm that blends offline sequence modeling and offline reinforcement learning with Double Policy Estimation.
We validate our method in multiple tasks of OpenAI Gym with D4RL benchmarks.
arXiv Detail & Related papers (2023-08-28T20:46:07Z)
- Model-based adaptation for sample efficient transfer in reinforcement learning control of parameter-varying systems [1.8799681615947088]
We leverage ideas from model-based control to address the sample efficiency problem of reinforcement learning algorithms.
We demonstrate that our approach is more sample-efficient than fine-tuning with reinforcement learning alone.
arXiv Detail & Related papers (2023-05-20T10:11:09Z)
- Efficient Domain Coverage for Vehicles with Second-Order Dynamics via Multi-Agent Reinforcement Learning [9.939081691797858]
We present a reinforcement learning (RL) approach for the multi-agent efficient domain coverage problem involving agents with second-order dynamics.
Our proposed network architecture incorporates an LSTM and self-attention, which allows the trained policy to adapt to a variable number of agents.
arXiv Detail & Related papers (2022-11-11T01:59:12Z)
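A minimal sketch of the architectural idea in this entry, combining a per-agent LSTM with self-attention across agents so that one trained network handles a variable number of agents; all sizes are illustrative assumptions (PyTorch):

```python
import torch
import torch.nn as nn

class CoveragePolicy(nn.Module):
    """Sketch: an LSTM provides per-agent memory over observation
    histories, and self-attention mixes information across agents,
    so the same network works for any number of agents."""
    def __init__(self, obs_dim=8, hidden=64, act_dim=2, heads=4):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs):                 # obs: (n_agents, T, obs_dim)
        h, _ = self.lstm(obs)               # (n_agents, T, hidden)
        h = h[:, -1, :].unsqueeze(0)        # (1, n_agents, hidden)
        h, _ = self.attn(h, h, h)           # attend across agents
        return self.head(h.squeeze(0))      # (n_agents, act_dim)

policy = CoveragePolicy()
actions = policy(torch.randn(5, 10, 8))     # works for any agent count
```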
- Fully Decentralized Model-based Policy Optimization for Networked Systems [23.46407780093797]
This work aims to improve data efficiency of multi-agent control by model-based learning.
We consider networked systems where agents are cooperative and communicate only locally with their neighbors.
In our method, each agent learns a dynamics model to predict future states and broadcasts its predictions to its neighbors; the policies are then trained on the model rollouts.
arXiv Detail & Related papers (2022-07-13T23:52:14Z)
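The communication pattern described here can be sketched schematically: each agent keeps a local model, broadcasts its state prediction to neighbors, and the system rolls forward under those predictions instead of real transitions. The topology and model form below are toy assumptions:

```python
import numpy as np

class Agent:
    """Each agent keeps a local dynamics model over its own and its
    neighbors' states; predictions are exchanged by communication.
    A schematic sketch of the pattern, not the paper's method."""
    def __init__(self, state, neighbors):
        self.state = state
        self.neighbors = neighbors          # indices of communicating agents
        self.W = np.eye(len(state)) * 0.9   # stand-in for a learned model

    def predict_next(self, all_states):
        # Local prediction from own state plus the mean of neighbor states.
        neigh = np.mean([all_states[j] for j in self.neighbors], axis=0)
        return self.W @ (self.state + 0.1 * neigh)

# One round of "broadcast predictions, then roll the models forward".
agents = [Agent(np.ones(2) * i, neighbors=[(i - 1) % 3, (i + 1) % 3])
          for i in range(3)]
states = [a.state for a in agents]
predictions = [a.predict_next(states) for a in agents]  # broadcast step
for a, p in zip(agents, predictions):
    a.state = p        # model rollout step used to train the policies
```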
- Learning Optimal Antenna Tilt Control Policies: A Contextual Linear Bandit Approach [65.27783264330711]
Controlling antenna tilts in cellular networks is imperative to reach an efficient trade-off between network coverage and capacity.
We devise algorithms learning optimal tilt control policies from existing data.
We show that they can produce an optimal tilt update policy using far fewer data samples than naive or existing rule-based learning algorithms.
arXiv Detail & Related papers (2022-01-06T18:24:30Z)
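A contextual linear bandit for tilt control can be illustrated with a LinUCB-style learner: one ridge-regression model per tilt action, selected optimistically via an upper confidence bound. The context features and reward below are hypothetical stand-ins for cell KPIs:

```python
import numpy as np

class LinUCB:
    """LinUCB-style contextual bandit: one ridge-regression model per arm
    (e.g. per tilt adjustment), acting optimistically via an upper bound."""
    def __init__(self, n_arms, d, alpha=1.0):
        self.A = [np.eye(d) for _ in range(n_arms)]   # ridge Gram matrices
        self.b = [np.zeros(d) for _ in range(n_arms)]
        self.alpha = alpha

    def choose(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

# Toy usage: 3 tilt actions (down/keep/up), 4-dim cell-context features.
bandit = LinUCB(n_arms=3, d=4)
x = np.array([0.2, 0.5, 0.1, 1.0])    # hypothetical KPI-based context
arm = bandit.choose(x)
bandit.update(arm, x, reward=0.7)      # e.g. coverage/capacity score
```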
- Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
To date, no such learning-based algorithms have shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z)
- COMBO: Conservative Offline Model-Based Policy Optimization [120.55713363569845]
Uncertainty estimation with complex models, such as deep neural networks, can be difficult and unreliable.
We develop a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-actions.
We find that COMBO consistently performs as well as or better than prior offline model-free and model-based methods.
arXiv Detail & Related papers (2021-02-16T18:50:32Z)
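The regularization named here can be sketched as a penalty that lowers Q-values on model-generated (potentially out-of-support) state-action pairs relative to dataset ones, added to a standard TD loss. A PyTorch sketch with assumed interfaces, not COMBO's exact objective:

```python
import torch

def conservative_critic_loss(q_net, batch, model_batch, beta=1.0, gamma=0.99):
    """Standard TD loss plus a conservatism penalty that pushes Q down on
    model rollouts and up on dataset state-actions. q_net is assumed to map
    states to per-action Q-values; all interfaces here are hypothetical."""
    s, a, r, s_next = batch                        # real offline data
    with torch.no_grad():
        target = r + gamma * q_net(s_next).max(dim=1).values
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    td_loss = ((q_sa - target) ** 2).mean()

    ms, ma = model_batch                            # synthetic model rollouts
    q_model = q_net(ms).gather(1, ma.unsqueeze(1)).squeeze(1)
    penalty = q_model.mean() - q_sa.mean()          # conservatism regularizer
    return td_loss + beta * penalty

# Toy usage: a small Q-network over 4-dim states and 2 discrete actions.
q = torch.nn.Sequential(torch.nn.Linear(4, 32), torch.nn.ReLU(),
                        torch.nn.Linear(32, 2))
real = (torch.randn(8, 4), torch.randint(2, (8,)),
        torch.randn(8), torch.randn(8, 4))
fake = (torch.randn(8, 4), torch.randint(2, (8,)))
loss = conservative_critic_loss(q, real, fake)
```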
- Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
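The joint beamforming problem can be illustrated with a toy MISO model: maximize the effective channel gain over the IRS phase shifts (here by finite-difference ascent, standing in for the paper's DRL agent), then apply maximum-ratio transmission at the AP. Channels and sizes below are random toy realizations:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 4, 16                      # AP antennas, IRS elements (toy sizes)
G = (rng.normal(size=(N, M)) + 1j * rng.normal(size=(N, M))) / np.sqrt(2)
h_r = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)  # IRS-user
h_d = (rng.normal(size=M) + 1j * rng.normal(size=M)) / np.sqrt(2)  # AP-user

def effective_gain(theta):
    # h_eff^H = h_r^H diag(e^{j theta}) G + h_d^H; a larger ||h_eff||
    # means less transmit power is needed for a target SNR under MRT.
    h_eff = (np.conj(h_r) * np.exp(1j * theta)) @ G + np.conj(h_d)
    return float(np.linalg.norm(h_eff) ** 2)

# Tune the passive phase shifts by finite-difference gradient ascent;
# the paper's DRL agent would instead map channel observations to theta.
theta, lr, eps = np.zeros(N), 0.05, 1e-4
for _ in range(100):
    grad = np.array([(effective_gain(theta + eps * e)
                      - effective_gain(theta - eps * e)) / (2 * eps)
                     for e in np.eye(N)])
    theta += lr * grad

h_eff = (np.conj(h_r) * np.exp(1j * theta)) @ G + np.conj(h_d)
w = np.conj(h_eff) / np.linalg.norm(h_eff)   # MRT active beamformer
```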