Learning Power Control from a Fixed Batch of Data
- URL: http://arxiv.org/abs/2008.02669v1
- Date: Wed, 5 Aug 2020 01:00:21 GMT
- Title: Learning Power Control from a Fixed Batch of Data
- Authors: Mohammad G. Khoshkholgh and Halim Yanikomeroglu
- Abstract summary: We exploit power control data, gathered from a monitored environment, for performing power control in an unexplored environment.
We adopt offline deep reinforcement learning, whereby the agent learns the policy to produce the transmission powers solely by using the data.
- Score: 28.618312473850974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address how to exploit power control data, gathered from a monitored
environment, for performing power control in an unexplored environment. We
adopt offline deep reinforcement learning, whereby the agent learns the policy
to produce the transmission powers solely by using the data. Experiments
demonstrate that despite discrepancies between the monitored and unexplored
environments, the agent successfully learns the power control very quickly,
even if the objective functions in the monitored and unexplored environments
are dissimilar. It suffices for about one third of the collected data to be
high-quality; the rest can come from any sub-optimal algorithm.
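The abstract describes learning a transmit-power policy purely from a fixed batch of logged interactions, without further interaction with the environment. As a minimal sketch of that offline setting (not the authors' implementation), the toy fitted Q-iteration below learns discrete power levels from a batch logged by a sub-optimal (random) behavior policy; the reward model, power discretization, and linear feature map are all illustrative assumptions:

```python
# Minimal sketch of offline power control via fitted Q-iteration.
# All names (power levels, reward model, feature map) are illustrative
# assumptions, not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

N_POWERS = 4      # discretized transmit-power levels (assumed)
STATE_DIM = 3     # e.g. channel gain, interference, queue state (assumed)
GAMMA = 0.9

def toy_reward(s, a):
    # Hypothetical spectral-efficiency-style reward: log(1 + p*g / (I + 1)).
    p = (a + 1) / N_POWERS
    return np.log1p(p * s[0] / (s[1] + 1.0))

# Fixed batch of (state, action, reward, next_state) tuples gathered by a
# sub-optimal behavior policy (uniformly random power selection).
batch = []
for _ in range(2000):
    s = rng.random(STATE_DIM)
    a = int(rng.integers(N_POWERS))
    batch.append((s, a, toy_reward(s, a), rng.random(STATE_DIM)))

def features(s, a):
    # One linear block of state features per discrete power level.
    phi = np.zeros(STATE_DIM * N_POWERS)
    phi[a * STATE_DIM:(a + 1) * STATE_DIM] = s
    return phi

# Fitted Q-iteration: repeatedly regress bootstrapped targets on features,
# using only the fixed batch (no environment interaction).
w = np.zeros(STATE_DIM * N_POWERS)
for _ in range(30):
    X, y = [], []
    for s, a, r, s_next in batch:
        q_next = max(features(s_next, b) @ w for b in range(N_POWERS))
        X.append(features(s, a))
        y.append(r + GAMMA * q_next)
    w, *_ = np.linalg.lstsq(np.asarray(X), np.asarray(y), rcond=None)

def policy(s):
    # Greedy transmit-power selection from the learned Q-function.
    return int(np.argmax([features(s, a) @ w for a in range(N_POWERS)]))
```

Since the toy reward grows monotonically with transmit power, the learned greedy policy should value higher power levels more at favorable channel states; in a realistic model, interference coupling between links would penalize always transmitting at full power.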
Related papers
- Improving the Resilience of Quadrotors in Underground Environments by Combining Learning-based and Safety Controllers [22.566692834880396]
We train a normalizing flow-based prior over the environment, which provides a measure of how far out-of-distribution the quadrotor is at any given time.
We use this measure as a runtime monitor, allowing us to switch between a learning-based controller and a safe controller when we are sufficiently out-of-distribution.
arXiv Detail & Related papers (2025-09-02T20:22:54Z)
- Automatic Reward Shaping from Confounded Offline Data [69.11672390876763]
Building on the well-celebrated Deep Q-Network (DQN), we propose a novel deep reinforcement learning algorithm robust to confounding biases in observed data.
We apply our method to twelve confounded Atari games, and find that it consistently dominates the standard DQN in all games where the observed input to the behavioral and target policies mismatch and unobserved confounders exist.
arXiv Detail & Related papers (2025-05-16T17:40:01Z)
- DMC-VB: A Benchmark for Representation Learning for Control with Visual Distractors [13.700885996266457]
Learning from previously collected data via behavioral cloning or offline reinforcement learning (RL) is a powerful recipe for scaling generalist agents.
We present the DeepMind Control Visual Benchmark (DMC-VB), a dataset collected in the DeepMind Control Suite to evaluate the robustness of offline RL agents.
Accompanying our dataset, we propose three benchmarks to evaluate representation learning methods for pretraining, and carry out experiments on several recently proposed methods.
arXiv Detail & Related papers (2024-09-26T23:07:01Z) - PerLDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models [55.080748327139176]
We introduce PerLDiff, a method for effective street view image generation that fully leverages perspective 3D geometric information.
Our results justify that our PerLDiff markedly enhances the precision of generation on the NuScenes and KITTI datasets.
arXiv Detail & Related papers (2024-07-08T16:46:47Z) - A Decentralized and Self-Adaptive Approach for Monitoring Volatile Edge Environments [40.96858640950632]
We propose DEMon, a decentralized, self-adaptive monitoring system for edge environments.
We implement the proposed system as a lightweight and portable container-based system and evaluate it through experiments.
The results show that DEMon efficiently disseminates and retrieves the monitoring information, addressing the challenges of edge monitoring.
arXiv Detail & Related papers (2024-05-13T14:47:34Z) - CUDC: A Curiosity-Driven Unsupervised Data Collection Method with
Adaptive Temporal Distances for Offline Reinforcement Learning [62.58375643251612]
We propose a Curiosity-driven Unsupervised Data Collection (CUDC) method to expand feature space using adaptive temporal distances for task-agnostic data collection.
With this adaptive reachability mechanism in place, the feature representation can be diversified, and the agent can navigate itself to collect higher-quality data with curiosity.
Empirically, CUDC surpasses existing unsupervised methods in efficiency and learning performance in various downstream offline RL tasks of the DeepMind control suite.
arXiv Detail & Related papers (2023-12-19T14:26:23Z) - PID-Inspired Inductive Biases for Deep Reinforcement Learning in
Partially Observable Control Tasks [9.915787487970187]
The success of the PID controller shows that, for many control tasks, summing and differencing are the only operations needed to accumulate information over time.
We propose two architectures for encoding history: one that directly uses PID features and another that extends these core ideas and can be used in arbitrary control tasks.
Going beyond tracking tasks, our policies achieve 1.7x better performance on average over previous state-of-the-art methods.
arXiv Detail & Related papers (2023-07-12T03:42:24Z) - Efficient Deep Reinforcement Learning Requires Regulating Overfitting [91.88004732618381]
We show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms.
We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks.
arXiv Detail & Related papers (2023-04-20T17:11:05Z) - Distributed-Training-and-Execution Multi-Agent Reinforcement Learning
for Power Control in HetNet [48.96004919910818]
We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
arXiv Detail & Related papers (2022-12-15T17:01:56Z) - Denoised MDPs: Learning World Models Better Than the World Itself [94.74665254213588]
This work categorizes information out in the wild into four types based on controllability and relation with reward, and formulates useful information as that which is both controllable and reward-relevant.
Experiments on variants of DeepMind Control Suite and RoboDesk demonstrate superior performance of our denoised world model over using raw observations alone.
arXiv Detail & Related papers (2022-06-30T17:59:49Z) - Is Disentanglement enough? On Latent Representations for Controllable
Music Generation [78.8942067357231]
In the absence of a strong generative decoder, disentanglement does not necessarily imply controllability.
The structure of the latent space with respect to the VAE-decoder plays an important role in boosting the ability of a generative model to manipulate different attributes.
arXiv Detail & Related papers (2021-08-01T18:37:43Z) - Curious Representation Learning for Embodied Intelligence [81.21764276106924]
Self-supervised representation learning has achieved remarkable success in recent years.
Yet to build truly intelligent agents, we must construct representation learning algorithms that can learn from environments.
We propose a framework, curious representation learning, which jointly learns a reinforcement learning policy and a visual representation model.
arXiv Detail & Related papers (2021-05-03T17:59:20Z) - Deep Actor-Critic Learning for Distributed Power Control in Wireless
Mobile Networks [5.930707872313038]
Deep reinforcement learning offers a model-free alternative to supervised deep learning and classical optimization.
We present a distributively executed continuous power control algorithm with the help of deep actor-critic learning.
We integrate the proposed power control algorithm to a time-slotted system where devices are mobile and channel conditions change rapidly.
arXiv Detail & Related papers (2020-09-14T18:29:12Z) - Defending Against Adversarial Attacks in Transmission- and
Distribution-level PMU Data [2.5365237338254816]
Phasor measurement units (PMUs) provide high-fidelity data that improve situation awareness of electric power grid operations.
As PMU data become more available and increasingly reliable, these devices are found in new roles within control systems.
We present a comprehensive analysis of multiple machine learning techniques to detect malicious data injection within PMU data streams.
arXiv Detail & Related papers (2020-08-20T18:44:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.