Related papers: Demand response for residential building heating: Effective Monte Carlo Tree Search control based on physics-informed neural networks

URL: http://arxiv.org/abs/2312.03365v4
Date: Tue, 21 May 2024 14:56:38 GMT
Title: Demand response for residential building heating: Effective Monte Carlo Tree Search control based on physics-informed neural networks
Authors: Fabio Pavirani, Gargya Gokhale, Bert Claessens, Chris Develder
Abstract summary: This paper focuses on using a demand response (DR) algorithm to limit the energy consumption of a residential building's heating system. One reinforcement learning (RL) method suited to this task is Monte Carlo Tree Search (MCTS), which has achieved impressive success in playing board games (Go, chess).
Score: 4.1860949813005375
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To reduce global carbon emissions and limit climate change, controlling energy consumption in buildings is an important piece of the puzzle. Here, we specifically focus on using a demand response (DR) algorithm to limit the energy consumption of a residential building's heating system while respecting the user's thermal comfort. In that domain, reinforcement learning (RL) methods have been shown to be quite effective. One such RL method is Monte Carlo Tree Search (MCTS), which has achieved impressive success in playing board games (Go, chess). A particular advantage of MCTS is that its decision tree structure naturally allows exogenous constraints to be integrated (e.g., by trimming branches that violate them), while conventional RL solutions need more elaborate techniques (e.g., indirectly by adding penalties in the cost/reward function, or through a backup controller that corrects constraint-violating actions). The main aim of this paper is to study the adoption of MCTS for building control, since this (to the best of our knowledge) has remained largely unexplored. A specific property of MCTS is that it needs a simulator component that can predict subsequent system states, based on actions taken. A straightforward data-driven solution is to use black-box neural networks (NNs). We will, however, extend a Physics-informed Neural Network (PiNN) model to deliver multi-timestep predictions, and show the benefit it offers in terms of lower prediction errors (-32% MAE) as well as better MCTS performance (-4% energy cost, +7% thermal comfort) compared to a black-box NN. A second contribution will be to extend a vanilla MCTS version to adopt the ideas applied in AlphaZero (i.e., using learned prior and value functions and an action selection heuristic) to obtain lower computational costs while maintaining control performance.
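The two MCTS ideas mentioned in the abstract (trimming constraint-violating branches and an AlphaZero-style action selection heuristic) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name puct_select, the statistics layout, the is_feasible callback, and the constant c_puct are all hypothetical, and the score is the commonly cited PUCT form Q + c_puct * P * sqrt(sum of visits) / (1 + N), applied here to feasible actions only.

```python
import math


def puct_select(children, c_puct=1.0, is_feasible=lambda action: True):
    """Return the feasible action with the highest PUCT score.

    children maps each action to a dict with the (hypothetical) keys
    'prior' (learned prior P), 'visits' (visit count N) and
    'value_sum' (sum of backed-up values W).
    """
    # Constraint handling: infeasible branches are trimmed before scoring.
    feasible = {a: s for a, s in children.items() if is_feasible(a)}
    if not feasible:
        raise ValueError("no feasible action left after pruning")

    total_visits = sum(s["visits"] for s in feasible.values())

    def score(stats):
        # Mean value Q; unvisited children default to 0 (an assumption).
        q = stats["value_sum"] / stats["visits"] if stats["visits"] else 0.0
        # Exploration bonus U, weighted by the prior.
        u = c_puct * stats["prior"] * math.sqrt(total_visits) / (1 + stats["visits"])
        return q + u

    return max(feasible, key=lambda a: score(feasible[a]))
```

In a heating setting, is_feasible could, for example, reject heating actions whose predicted indoor temperature leaves the comfort band, so those branches are never expanded instead of being penalised in the reward.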

Related papers

Can Large Language Models Play Games? A Case Study of A Self-Play Approach [61.15761840203145]
Large Language Models (LLMs) harness extensive data from the Internet, storing a broad spectrum of prior knowledge. Monte-Carlo Tree Search (MCTS) is a search algorithm that provides reliable decision-making solutions. This work introduces an innovative approach that bolsters LLMs with MCTS self-play to efficiently resolve turn-based zero-sum games.
arXiv Detail & Related papers (2024-03-08T19:16:29Z)
Efficient Data-Driven MPC for Demand Response of Commercial Buildings [0.0]
We propose a data-driven and mixed-integer bidding strategy for energy management in small commercial buildings. We consider rooftop unit heating, air conditioning systems with discrete controls to accurately model the operation of most commercial buildings. We apply our approach in several demand response (DR) settings, including time-of-use and critical rebate bidding.
arXiv Detail & Related papers (2024-01-28T20:01:44Z)
REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world. Recent methods aim to mitigate misalignment by learning reward functions from human preferences. We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
Model-based Causal Bayesian Optimization [74.78486244786083]
We introduce the first algorithm for Causal Bayesian Optimization with Multiplicative Weights (CBO-MW). We derive regret bounds for CBO-MW that naturally depend on graph-related quantities. Our experiments include a realistic demonstration of how CBO-MW can be used to learn users' demand patterns in a shared mobility system.
arXiv Detail & Related papers (2023-07-31T13:02:36Z)
Movement Penalized Bayesian Optimization with Application to Wind Energy Systems [84.7485307269572]
Contextual Bayesian optimization (CBO) is a powerful framework for sequential decision-making given side information. In this setting, the learner receives context (e.g., weather conditions) at each round, and has to choose an action (e.g., turbine parameters). Standard algorithms assume no cost for switching their decisions at every round, but in many practical applications, there is a cost associated with such changes, which should be minimized.
arXiv Detail & Related papers (2022-10-14T20:19:32Z)
Monte Carlo Augmented Actor-Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations [17.08814685657957]
Monte Carlo Augmented Actor-Critic (MCAC) is a parameter-free modification to standard actor-critic algorithms. MCAC computes a modified Q-value by taking the maximum of the standard temporal difference (TD) target and a Monte Carlo estimate of the reward-to-go. Experiments across 5 continuous control domains suggest that MCAC can be used to significantly increase learning efficiency across 6 commonly used RL and RL-from-demonstrations algorithms.
arXiv Detail & Related papers (2022-10-14T00:23:37Z)
Low Emission Building Control with Zero-Shot Reinforcement Learning [70.70479436076238]
Control via Reinforcement Learning (RL) has been shown to significantly improve building energy efficiency. We show it is possible to obtain emission-reducing policies without a priori knowledge, a paradigm we call zero-shot building control.
arXiv Detail & Related papers (2022-08-12T17:13:25Z)
Input Convex Neural Networks for Building MPC [3.7597202216941783]
We introduce additional constraints to Input Convex Neural Networks to achieve a convex input-output relationship for multistep ahead predictions. In two five-day cooling experiments, MPC with Input Convex Neural Networks is able to keep room temperatures within comfort constraints while minimizing cooling energy consumption.
arXiv Detail & Related papers (2020-11-26T10:51:50Z)
Controlling Rayleigh-Bénard convection via Reinforcement Learning [62.997667081978825]
The identification of effective control strategies to suppress or enhance the convective heat exchange under fixed external thermal gradients is an outstanding fundamental and technological issue. In this work, we explore a novel approach, based on a state-of-the-art Reinforcement Learning (RL) algorithm. We show that our RL-based control is able to stabilize the conductive regime and bring the onset of convection up to a Rayleigh number.
arXiv Detail & Related papers (2020-03-31T16:39:25Z)
NeurOpt: Neural network based optimization for building energy management and climate control [58.06411999767069]
We propose a data-driven control algorithm based on neural networks to reduce the cost of model identification. We validate our learning and control algorithms on a two-story building with ten independently controlled zones, located in Italy.
arXiv Detail & Related papers (2020-01-22T00:51:03Z)
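
The MCAC entry in the list above describes its target as the maximum of the standard TD target and a Monte Carlo estimate of the reward-to-go. A minimal sketch of that computation for a single finished episode follows. The function names, the list-based interface, and the assumption that the episode terminates after the last reward are illustrative and not taken from the paper.

```python
def mc_returns(rewards, gamma=0.99):
    """Discounted reward-to-go for every step of one finished episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return returns[::-1]


def mcac_targets(rewards, next_q, gamma=0.99):
    """Element-wise max of the one-step TD target and the Monte Carlo return.

    next_q[t] is the critic's estimate Q(s_{t+1}, a_{t+1}); it should be 0
    at the terminal step (an assumption of this sketch).
    """
    td = [r + gamma * q for r, q in zip(rewards, next_q)]
    mc = mc_returns(rewards, gamma)
    return [max(t, m) for t, m in zip(td, mc)]
```

With a sparse reward at the end of the episode and an untrained critic that outputs zeros, the Monte Carlo term carries the signal back to early steps, which is the intuition the summary points to.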

This list is automatically generated from the titles and abstracts of the papers on this site.