Go Beyond Black-box Policies: Rethinking the Design of Learning Agent for Interpretable and Verifiable HVAC Control
- URL: http://arxiv.org/abs/2403.00172v1
- Date: Thu, 29 Feb 2024 22:42:23 GMT
- Title: Go Beyond Black-box Policies: Rethinking the Design of Learning Agent for Interpretable and Verifiable HVAC Control
- Authors: Zhiyu An, Xianzhong Ding, Wan Du
- Abstract summary: We overcome the bottleneck by redesigning HVAC controllers using decision trees extracted from thermal dynamics models and historical data.
Our method saves 68.4% more energy and increases human comfort gain by 14.8% compared to the state-of-the-art method.
- Score: 3.326392645107372
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research has shown the potential of Model-based Reinforcement Learning
(MBRL) to enhance energy efficiency of Heating, Ventilation, and Air
Conditioning (HVAC) systems. However, existing methods rely on black-box
thermal dynamics models and stochastic optimizers, lacking reliability
guarantees and posing risks to occupant health. In this work, we overcome the
reliability bottleneck by redesigning HVAC controllers using decision trees
extracted from existing thermal dynamics models and historical data. Our
decision tree-based policies are deterministic, verifiable, interpretable, and
more energy-efficient than current MBRL methods. First, we introduce a novel
verification criterion for RL agents in HVAC control based on domain knowledge.
Second, we develop a policy extraction procedure that produces a verifiable
decision tree policy. We found that the high dimensionality of the thermal
dynamics model input hinders the efficiency of policy extraction. To tackle the
dimensionality challenge, we leverage importance sampling conditioned on
historical data distributions, significantly improving policy extraction
efficiency. Lastly, we present an offline verification algorithm that
guarantees the reliability of a control policy. Extensive experiments show that
our method saves 68.4% more energy and increases human comfort gain by 14.8%
compared to the state-of-the-art method, in addition to a 1127x reduction in
computation overhead. Our code and data are available at
https://github.com/ryeii/Veri_HVAC
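The method combines three ingredients: a domain-knowledge verification criterion, decision-tree policy extraction guided by importance sampling over the historical state distribution, and an offline verification pass. Below is a minimal illustrative sketch of the extraction-plus-checking idea, assuming scikit-learn; it is not the authors' released code (see the repository above), and `query_mbrl_controller`, the state features, and the comfort bounds are hypothetical stand-ins.

```python
# Illustrative sketch only (not the authors' released code): distilling a
# decision-tree HVAC policy from an existing controller and running a simple
# offline check. `query_mbrl_controller` and the comfort bounds are hypothetical.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def query_mbrl_controller(state):
    """Stand-in for the existing (black-box) MBRL controller: returns a setpoint."""
    zone_temp, outdoor_temp, occupancy = state
    return np.clip(22.0 + 0.5 * (22.0 - zone_temp) - 0.1 * occupancy, 16.0, 30.0)

# 1) Sample states from the *historical* state distribution rather than uniformly,
#    so the extracted tree is accurate where the building actually operates.
historical_states = np.column_stack([
    rng.normal(22.0, 1.5, 5000),   # zone temperature [deg C]
    rng.normal(10.0, 8.0, 5000),   # outdoor temperature [deg C]
    rng.integers(0, 2, 5000),      # occupancy flag
])
actions = np.array([query_mbrl_controller(s) for s in historical_states])

# 2) Distill a deterministic, interpretable decision-tree policy.
tree_policy = DecisionTreeRegressor(max_depth=6).fit(historical_states, actions)

# 3) Offline check: on a grid of plausible states, the tree's setpoints
#    must stay inside the (hypothetical) comfort/safety bounds.
grid = np.column_stack([
    rng.uniform(18.0, 28.0, 2000),
    rng.uniform(-10.0, 35.0, 2000),
    rng.integers(0, 2, 2000),
])
setpoints = tree_policy.predict(grid)
assert setpoints.min() >= 16.0 and setpoints.max() <= 30.0, "policy violates bounds"
print("tree depth:", tree_policy.get_depth(), "| all checked setpoints within bounds")
```

The point of the tree form is that every leaf is an explicit, human-readable rule, which is what makes an exhaustive offline check (here approximated by a state grid) tractable.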
Related papers
- Experimental evaluation of offline reinforcement learning for HVAC control in buildings [12.542463083734614]
Reinforcement learning (RL) techniques have been increasingly investigated for dynamic HVAC control in buildings.
This paper comprehensively evaluates the strengths and limitations of state-of-the-art offline RL algorithms.
arXiv Detail & Related papers (2024-08-15T07:25:52Z)
- Improving Building Temperature Forecasting: A Data-driven Approach with System Scenario Clustering [3.2114754609864695]
Heating, Ventilation and Air Conditioning systems account for approximately 40% of primary energy usage in the building sector.
For large-scale HVAC system management, it is difficult to construct a detailed model for each subsystem.
A new data-driven room temperature prediction model is proposed based on the k-means clustering method.
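As a rough illustration of the clustering idea summarized above, the sketch below groups operating scenarios with k-means and fits one simple temperature predictor per cluster; the features, cluster count, and synthetic data are assumptions rather than details from the paper.

```python
# Illustrative sketch: cluster operating scenarios with k-means, then fit one
# simple room-temperature predictor per cluster. Features and k are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
# Hypothetical features: [outdoor temp, supply airflow, occupancy]; target: next room temp.
X = rng.normal(size=(3000, 3))
y = 22.0 + 0.6 * X[:, 0] - 0.4 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(0, 0.1, 3000)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
models = {c: LinearRegression().fit(X[kmeans.labels_ == c], y[kmeans.labels_ == c])
          for c in range(kmeans.n_clusters)}

def predict_room_temp(x):
    """Route a new sample to its scenario cluster and use that cluster's model."""
    c = int(kmeans.predict(x.reshape(1, -1))[0])
    return float(models[c].predict(x.reshape(1, -1))[0])

print(predict_room_temp(np.array([1.0, 0.0, 1.0])))
```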
arXiv Detail & Related papers (2024-02-21T09:04:45Z)
- Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z)
- CCE: Sample Efficient Sparse Reward Policy Learning for Robotic Navigation via Confidence-Controlled Exploration [72.24964965882783]
Confidence-Controlled Exploration (CCE) is designed to enhance the training sample efficiency of reinforcement learning algorithms for sparse reward settings such as robot navigation.
CCE is based on a novel relationship we provide between gradient estimation and policy entropy.
We demonstrate through simulated and real-world experiments that CCE outperforms conventional methods that employ constant trajectory lengths and entropy regularization.
arXiv Detail & Related papers (2023-06-09T18:45:15Z)
- Data-driven HVAC Control Using Symbolic Regression: Design and Implementation [0.0]
This study proposes a design and implementation methodology for data-driven heating, ventilation, and air conditioning (HVAC) control.
Building thermodynamics is modeled using a symbolic regression model (SRM) built from the collected data.
The proposed framework reduces the peak power by 16.1% compared to the widely used thermostat controller.
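For flavor, a symbolic regression model (SRM) of zone thermodynamics can be fitted with an off-the-shelf library; the sketch below assumes gplearn and synthetic data, neither of which is stated in the paper.

```python
# Illustrative sketch (assumed library and features): fit a symbolic regression
# model (SRM) that predicts the next zone temperature from current conditions.
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(2)
# Hypothetical inputs: [zone temp, outdoor temp, HVAC power]; target: next zone temp.
X = np.column_stack([rng.uniform(18, 28, 2000),
                     rng.uniform(-5, 35, 2000),
                     rng.uniform(0, 5, 2000)])
y = X[:, 0] + 0.05 * (X[:, 1] - X[:, 0]) - 0.3 * X[:, 2] + rng.normal(0, 0.05, 2000)

srm = SymbolicRegressor(population_size=1000, generations=10,
                        function_set=("add", "sub", "mul"),
                        parsimony_coefficient=0.001, random_state=0)
srm.fit(X, y)
print(srm._program)   # human-readable expression for the learned thermodynamics
```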
arXiv Detail & Related papers (2023-04-06T13:57:50Z)
- Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees [0.0]
Variational Markov Decision Processes (VAE-MDPs) are discrete latent space models that provide a reliable framework for distilling verifiable controllers from any RL policy.
We introduce the Wasserstein auto-encoded MDP (WAE-MDP), a latent space model that addresses the shortcomings of VAE-MDPs by minimizing a penalized form of the optimal transport between the behaviors of the agent executing the original policy and the distilled policy.
Our experiments show that, besides distilling policies up to 10 times faster, the latent model quality is indeed better in general.
arXiv Detail & Related papers (2023-03-22T13:41:42Z)
- Data-Driven Stochastic AC-OPF using Gaussian Processes [54.94701604030199]
Integrating a significant amount of renewables into a power grid is probably one of the most effective ways to reduce carbon emissions from power grids and slow down climate change.
This paper presents an alternative data-driven approach based on the AC power flow equations that can incorporate uncertain inputs.
The Gaussian process (GP) approach learns a simple yet unconstrained data-driven model that closes the gap to the AC power flow equations.
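As a generic illustration of the Gaussian-process idea (a data-driven mapping from uncertain inputs to a power-flow quantity, with predictive uncertainty), the sketch below uses scikit-learn on synthetic data; it is not the paper's formulation.

```python
# Illustrative sketch only: learn a GP mapping from uncertain injections to a
# power-flow output (e.g., a line flow), with predictive uncertainty.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)
# Hypothetical data: renewable injections at two buses -> flow on one line (per unit).
X = rng.uniform(0.0, 1.0, size=(300, 2))
y = 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.1 * np.sin(4 * X[:, 0]) + rng.normal(0, 0.01, 300)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5) + WhiteKernel(1e-4),
                              normalize_y=True).fit(X, y)
mean, std = gp.predict(np.array([[0.6, 0.3]]), return_std=True)
print(f"predicted flow: {mean[0]:.3f} p.u. +/- {2 * std[0]:.3f}")  # mean with ~95% band
```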
arXiv Detail & Related papers (2022-07-21T23:02:35Z)
- Development of a Soft Actor Critic Deep Reinforcement Learning Approach for Harnessing Energy Flexibility in a Large Office Building [0.0]
This research is concerned with the novel application and investigation of 'Soft Actor Critic' (SAC) based Deep Reinforcement Learning (DRL).
SAC is a model-free DRL technique that is able to handle continuous action spaces.
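To make the SAC entry concrete, the snippet below trains Stable-Baselines3's SAC on a stand-in continuous-control environment; the building environment, library, and hyperparameters are assumptions, not details from the paper.

```python
# Illustrative sketch: SAC is model-free and handles continuous action spaces.
# Pendulum-v1 stands in for a building energy-flexibility environment.
import gymnasium as gym
from stable_baselines3 import SAC

env = gym.make("Pendulum-v1")                 # placeholder for a building env
model = SAC("MlpPolicy", env, learning_rate=3e-4, verbose=0)
model.learn(total_timesteps=10_000)           # short demo run

obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
print("continuous action from the trained SAC policy:", action)
```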
arXiv Detail & Related papers (2021-04-25T10:33:35Z)
- SS-SFDA: Self-Supervised Source-Free Domain Adaptation for Road Segmentation in Hazardous Environments [54.22535063244038]
We present a novel approach for unsupervised road segmentation in adverse weather conditions such as rain or fog.
This includes a new algorithm for source-free domain adaptation (SFDA) using self-supervised learning.
We have evaluated the performance on 6 datasets corresponding to real and synthetic adverse weather conditions.
arXiv Detail & Related papers (2020-11-27T09:19:03Z)
- Controlling Rayleigh-Bénard convection via Reinforcement Learning [62.997667081978825]
The identification of effective control strategies to suppress or enhance the convective heat exchange under fixed external thermal gradients is an outstanding fundamental and technological issue.
In this work, we explore a novel approach, based on a state-of-the-art Reinforcement Learning (RL) algorithm.
We show that our RL-based control is able to stabilize the conductive regime and raise the onset of convection to a higher Rayleigh number.
arXiv Detail & Related papers (2020-03-31T16:39:25Z)
- NeurOpt: Neural network based optimization for building energy management and climate control [58.06411999767069]
We propose a data-driven control algorithm based on neural networks to reduce the cost of model identification.
We validate our learning and control algorithms on a two-story building with ten independently controlled zones, located in Italy.
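In the spirit of the summary above (a neural network stands in for costly model identification, and control inputs are then optimized against it), here is a small PyTorch sketch; the architecture, data, and cost function are illustrative assumptions.

```python
# Illustrative sketch: fit a small neural dynamics model from logged building data,
# then pick the control input that minimizes a simple energy+comfort objective.
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical logs: [zone temp, outdoor temp, control input] -> next zone temp.
X = torch.rand(2000, 3) * torch.tensor([10.0, 40.0, 5.0]) + torch.tensor([16.0, -5.0, 0.0])
y = X[:, 0:1] + 0.05 * (X[:, 1:2] - X[:, 0:1]) - 0.3 * X[:, 2:3]

model = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(500):                      # quick model-identification loop
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), y)
    loss.backward()
    opt.step()

# Choose the control input (0..5) minimizing energy use plus comfort deviation.
state = torch.tensor([26.0, 30.0])        # current zone and outdoor temperature
u = torch.linspace(0.0, 5.0, 51).unsqueeze(1)
pred = model(torch.cat([state.repeat(len(u), 1), u], dim=1))
cost = 0.1 * u + (pred - 22.0) ** 2       # weighted energy + comfort objective
print("chosen control input:", float(u[cost.argmin()]))
```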
arXiv Detail & Related papers (2020-01-22T00:51:03Z)