Near-optimal Deep Reinforcement Learning Policies from Data for Zone
Temperature Control
- URL: http://arxiv.org/abs/2203.05434v1
- Date: Thu, 10 Mar 2022 15:41:29 GMT
- Title: Near-optimal Deep Reinforcement Learning Policies from Data for Zone
Temperature Control
- Authors: Loris Di Natale, Bratislav Svetozarevic, Philipp Heer, and Colin N.
Jones
- Abstract summary: We investigate the performance of DRL agents compared to the theoretically optimal solution.
Our results hint that DRL agents not only clearly outperform conventional rule-based controllers but also attain near-optimal performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Replacing poorly performing existing controllers with smarter solutions will
decrease the energy intensity of the building sector. Recently, controllers
based on Deep Reinforcement Learning (DRL) have been shown to be more effective
than conventional baselines. However, since the optimal solution is usually
unknown, it is still unclear if DRL agents are attaining near-optimal
performance in general or if there is still a large gap to bridge.
In this paper, we investigate the performance of DRL agents compared to the
theoretically optimal solution. To that end, we leverage Physically Consistent
Neural Networks (PCNNs) as simulation environments, for which optimal control
inputs are easy to compute. Furthermore, PCNNs rely solely on data for
training, avoiding the difficult physics-based modeling phase while retaining
physical consistency. Our results hint that DRL agents not only clearly
outperform conventional rule-based controllers but also attain near-optimal
performance.
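The recipe described in the abstract (train a control policy purely in a data-driven simulation of the zone, then measure its gap to a rule-based baseline and to the optimal cost) can be illustrated with a toy example. The following is a minimal sketch, not the paper's code: a simple linear zone model stands in for a PCNN, tabular Q-learning stands in for deep RL, and brute-force enumeration stands in for the optimal-control computation; every constant, name, and cost weight below is an illustrative assumption.

```python
"""Toy three-way comparison: rule-based controller vs. learned policy vs. optimum.
All models, parameters, and names are illustrative stand-ins, not from the paper."""
import itertools
import random

# Toy zone dynamics: T_{k+1} = T_k + A*(T_out - T_k) + B*u, binary heating u in {0, 1}.
A, B = 0.1, 2.0              # heat loss to outside, heating gain (assumed values)
T_OUT = 5.0                  # constant outdoor temperature [degC]
T_MIN, T_MAX = 21.0, 23.0    # comfort band [degC]
HORIZON = 12                 # number of control steps

def step(T, u):
    """One-step dynamics plus stage cost (energy use + weighted comfort violation)."""
    T_next = T + A * (T_OUT - T) + B * u
    cost = u + 10.0 * max(0.0, T_MIN - T_next) + 10.0 * max(0.0, T_next - T_MAX)
    return T_next, cost

def rollout(policy, T0=22.0):
    """Total cost of running a policy(T, t) over the horizon."""
    T, total = T0, 0.0
    for t in range(HORIZON):
        T, c = step(T, policy(T, t))
        total += c
    return total

# Rule-based baseline: heat whenever the zone drops below the lower comfort bound.
rule_based = lambda T, t: 1 if T < T_MIN else 0

def optimal_cost(T0=22.0):
    """Optimal cost by brute force over all input sequences (cheap for this toy horizon)."""
    best = float("inf")
    for seq in itertools.product([0, 1], repeat=HORIZON):
        best = min(best, rollout(lambda T, t, s=seq: s[t], T0))
    return best

def train_q(episodes=3000, eps=0.2, alpha=0.1, gamma=0.99):
    """Stand-in for a DRL agent: tabular Q-learning on a coarse temperature grid."""
    q = {}
    bucket = lambda T: round(T * 2) / 2  # 0.5 degC discretization
    greedy = lambda s: min((0, 1), key=lambda a: q[s][a])
    for _ in range(episodes):
        T = 22.0
        for t in range(HORIZON):
            s = (bucket(T), t)
            q.setdefault(s, [0.0, 0.0])
            u = random.randrange(2) if random.random() < eps else greedy(s)
            T2, c = step(T, u)
            s2 = (bucket(T2), t + 1)
            q.setdefault(s2, [0.0, 0.0])
            q[s][u] += alpha * (c + gamma * min(q[s2]) - q[s][u])
            T = T2
    return lambda T, t: min((0, 1), key=lambda a: q.get((bucket(T), t), [0.0, 0.0])[a])

learned = train_q()
print("rule-based controller cost:", round(rollout(rule_based), 2))
print("learned policy cost       :", round(rollout(learned), 2))
print("optimal cost (brute force):", round(optimal_cost(), 2))
```

This reproduces, in miniature, the three-way comparison (rule-based vs. learned vs. optimal) that the paper carries out with PCNN simulation environments and DRL agents at full scale.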
Related papers
- Adaptive Data Exploitation in Deep Reinforcement Learning [50.53705050673944]
We introduce ADEPT, a powerful framework to enhance **data efficiency** and **generalization** in deep reinforcement learning (RL).
Specifically, ADEPT adaptively manages the use of sampled data across different learning stages via multi-armed bandit (MAB) algorithms; a generic bandit sketch follows after this list.
We test ADEPT on benchmarks including Procgen, MiniGrid, and PyBullet.
arXiv Detail & Related papers (2025-01-22T04:01:17Z)
- Safe Load Balancing in Software-Defined Networking [1.2521494095948067]
A Control Barrier Function (CBF) is designed on top of Deep Reinforcement Learning (DRL) algorithms for load balancing.
We show that our DRL-CBF approach is capable of meeting safety requirements during training and testing.
arXiv Detail & Related papers (2024-10-22T09:34:22Z)
- Active Reinforcement Learning for Robust Building Control [0.0]
Reinforcement learning (RL) is a powerful tool for optimal control that has found great success in Atari games, the game of Go, robotic control, and building optimization.
Unsupervised environment design (UED) has been proposed as a solution to this problem, in which the agent trains in environments that have been specially selected to help it learn.
We show that ActivePLR is able to outperform state-of-the-art UED algorithms in minimizing energy usage while maximizing occupant comfort in the setting of building control.
arXiv Detail & Related papers (2023-12-16T02:18:45Z)
- Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z)
- Communication-Efficient Orchestrations for URLLC Service via Hierarchical Reinforcement Learning [14.604814002402588]
We propose a multi-agent Hierarchical RL (HRL) framework that enables the implementation of multi-level policies with different control loop timescales.
Using our HRL framework on a use case from the prior art, we optimized the maximum number of retransmissions and the transmission power of industrial devices.
arXiv Detail & Related papers (2023-07-25T11:23:38Z)
- Efficient Deep Reinforcement Learning Requires Regulating Overfitting [91.88004732618381]
We show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms.
We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks.
arXiv Detail & Related papers (2023-04-20T17:11:05Z)
- Single-Shot Pruning for Offline Reinforcement Learning [47.886329599997474]
Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems.
One way to tackle this problem is to prune neural networks leaving only the necessary parameters.
We close the gap between RL and single-shot pruning techniques and present a general pruning approach for offline RL.
arXiv Detail & Related papers (2021-12-31T18:10:02Z)
- Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks [16.12495409295754]
Next Generation (NextG) networks are expected to support demanding tactile internet applications such as augmented reality and connected autonomous vehicles.
Data-driven approaches can improve the ability of the network to adapt to the current operating conditions.
Deep RL (DRL) has been shown to achieve good performance even in complex environments.
arXiv Detail & Related papers (2021-12-07T03:13:20Z)
- OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation [59.469401906712555]
We present an offline reinforcement learning algorithm that prevents overestimation in a more principled way.
Our algorithm, OptiDICE, directly estimates the stationary distribution corrections of the optimal policy.
We show that OptiDICE performs competitively with the state-of-the-art methods.
arXiv Detail & Related papers (2021-06-21T00:43:30Z)
- Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL [82.93243616342275]
We introduce Offline Model-based RL with Adaptive Behavioral Priors (MABE).
MABE is based on the finding that dynamics models, which support within-domain generalization, and behavioral priors, which support cross-domain generalization, are complementary.
In experiments that require cross-domain generalization, we find that MABE outperforms prior methods.
arXiv Detail & Related papers (2021-06-16T20:48:49Z)
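The ADEPT entry above mentions multi-armed bandit (MAB) algorithms for deciding how to use sampled data across learning stages. The following is a generic UCB1 bandit sketch, not that paper's implementation: the arm names and the reward model are made up purely to show the mechanism of allocating trials across competing options.

```python
"""Generic UCB1 multi-armed bandit sketch (illustrative only; arm names and
payoffs are hypothetical, not from any of the papers listed above)."""
import math
import random

class UCB1:
    """Standard UCB1: pick the arm with the highest optimistic value estimate."""
    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select(self):
        total = sum(self.counts)
        for arm, c in enumerate(self.counts):
            if c == 0:                # play every arm once before using the bound
                return arm
        return max(
            range(len(self.counts)),
            key=lambda a: self.values[a]
            + math.sqrt(2 * math.log(total) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n  # incremental mean

# Hypothetical "arms": different ways a learner might exploit its sampled data.
arms = ["replay_recent", "replay_uniform", "relabel_goals"]
true_payoff = [0.3, 0.5, 0.7]  # unknown to the bandit; made-up success rates

bandit = UCB1(len(arms))
for _ in range(2000):
    a = bandit.select()
    reward = 1.0 if random.random() < true_payoff[a] else 0.0
    bandit.update(a, reward)

# The best arm should end up with the most pulls.
print({name: c for name, c in zip(arms, bandit.counts)})
```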
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.