GVFs in the Real World: Making Predictions Online for Water Treatment
- URL: http://arxiv.org/abs/2312.01624v1
- Date: Mon, 4 Dec 2023 04:49:10 GMT
- Title: GVFs in the Real World: Making Predictions Online for Water Treatment
- Authors: Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White
- Abstract summary: We investigate the use of reinforcement-learning based prediction approaches for a real drinking-water treatment plant.
We first describe this dataset and highlight challenges with seasonality, nonstationarity, and partial observability.
We show the importance of learning in deployment, by comparing a TD agent trained purely offline with no online updating to a TD agent that learns online.
- Score: 23.651798878534635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper we investigate the use of reinforcement-learning based
prediction approaches for a real drinking-water treatment plant. Developing
such a prediction system is a critical step on the path to optimizing and
automating water treatment. Before that, there are many questions to answer
about the predictability of the data, suitable neural network architectures,
how to overcome partial observability and more. We first describe this dataset,
and highlight challenges with seasonality, nonstationarity, partial
observability, and heterogeneity across sensors and operation modes of the
plant. We then describe General Value Function (GVF) predictions -- discounted
cumulative sums of observations -- and highlight why they might be preferable
to classical n-step predictions common in time series prediction. We discuss
how to use offline data to appropriately pre-train our temporal difference
learning (TD) agents that learn these GVF predictions, including how to select
hyperparameters for online fine-tuning in deployment. We find that the
TD-prediction agent obtains an overall lower normalized mean-squared error than
the n-step prediction agent. Finally, we show the importance of learning in
deployment, by comparing a TD agent trained purely offline with no online
updating to a TD agent that learns online. This final result is one of the
first to motivate the importance of adapting predictions in real-time, for
non-stationary high-volume systems in the real world.
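The two prediction targets contrasted in the abstract, and the online TD update that learns the GVF target, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the linear function approximator, the constant discount `gamma`, the step size `alpha`, and the convention that the return at time t starts from the observation at t+1 are all assumptions made for the example (the paper itself discusses neural network architectures).

```python
import numpy as np


def gvf_returns(obs, gamma):
    """GVF target: discounted cumulative sum of future observations.

    Assumed convention: G_t = obs[t+1] + gamma * obs[t+2] + ...
    The sum is truncated at the end of the sequence.
    """
    obs = np.asarray(obs, dtype=float)
    returns = np.zeros(len(obs))
    acc = 0.0
    # Work backwards so each return reuses the one after it.
    for t in range(len(obs) - 2, -1, -1):
        acc = obs[t + 1] + gamma * acc
        returns[t] = acc
    return returns


def n_step_targets(obs, n):
    """Classical n-step target: the raw observation n steps ahead."""
    obs = np.asarray(obs, dtype=float)
    return obs[n:]


def td0_online(features, cumulants, gamma, alpha):
    """Linear TD(0), updating the weights after every time step.

    features:  array of shape (T, d), one feature vector per step
    cumulants: array of shape (T,), the observed signal to predict
    Returns the final weights and the prediction made at each step.
    """
    features = np.asarray(features, dtype=float)
    cumulants = np.asarray(cumulants, dtype=float)
    w = np.zeros(features.shape[1])
    preds = np.zeros(len(features))
    for t in range(len(features) - 1):
        preds[t] = features[t] @ w
        # TD error: bootstrapped target minus the current prediction.
        target = cumulants[t + 1] + gamma * (features[t + 1] @ w)
        delta = target - preds[t]
        w += alpha * delta * features[t]
    return w, preds


if __name__ == "__main__":
    signal = np.ones(5000)            # hypothetical constant sensor reading
    bias_feature = np.ones((5000, 1))
    w, _ = td0_online(bias_feature, signal, gamma=0.9, alpha=0.1)
    print("learned GVF prediction:", w[0])
    print("true return at t=0:", gvf_returns(signal, 0.9)[0])
```

Because the weights change at every step, the same loop can keep running in deployment after offline pre-training. Freezing `w` after pre-training would correspond to the offline-only agent the abstract compares against.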
Related papers
- Online Residual Learning from Offline Experts for Pedestrian Tracking [5.047136039782827]
We propose Online Residual Learning (ORL), a method that combines online adaptation with offline-trained predictions.
At a lower level, we employ multiple offline predictions generated before or at the beginning of the prediction horizon.
At a higher level, we treat the augmented lower-level predictors as experts, adopting the Prediction with Expert Advice framework.
arXiv Detail & Related papers (2024-09-06T07:20:45Z)
- Physics-guided Active Sample Reweighting for Urban Flow Prediction [75.24539704456791]
Urban flow prediction is a spatio-temporal modeling task that estimates the throughput of transportation services like buses, taxis, and ride-sharing.
Some recent prediction solutions bring remedies with the notion of physics-guided machine learning (PGML).
We develop a discretized physics-guided network (PN), and propose a data-aware framework Physics-guided Active Sample Reweighting (P-GASR).
arXiv Detail & Related papers (2024-07-18T15:44:23Z)
- SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations [76.45009891152178]
The pretraining-finetuning approach can alleviate the labeling burden by fine-tuning a pre-trained backbone across various downstream datasets as well as tasks.
We show, for the first time, that general representation learning can be achieved through the task of occupancy prediction.
Our findings will facilitate the understanding of LiDAR points and pave the way for future advancements in LiDAR pre-training.
arXiv Detail & Related papers (2023-09-19T11:13:01Z)
- ADAPT: Efficient Multi-Agent Trajectory Prediction with Adaptation [0.0]
ADAPT is a novel approach for jointly predicting the trajectories of all agents in the scene with dynamic weight learning.
Our approach outperforms state-of-the-art methods in both single-agent and multi-agent settings.
arXiv Detail & Related papers (2023-07-26T13:41:51Z)
- LMD: Light-weight Prediction Quality Estimation for Object Detection in Lidar Point Clouds [3.927702899922668]
Object detection on Lidar point cloud data is a promising technology for autonomous driving and robotics.
Uncertainty estimation is a crucial component for down-stream tasks and deep neural networks remain error-prone even for predictions with high confidence.
We propose LidarMetaDetect, a light-weight post-processing scheme for prediction quality estimation.
Our experiments show a significant increase of statistical reliability in separating true from false predictions.
arXiv Detail & Related papers (2023-06-13T15:13:29Z)
- Interpretable Self-Aware Neural Networks for Robust Trajectory Prediction [50.79827516897913]
We introduce an interpretable paradigm for trajectory prediction that distributes the uncertainty among semantic concepts.
We validate our approach on real-world autonomous driving data, demonstrating superior performance over state-of-the-art baselines.
arXiv Detail & Related papers (2022-11-16T06:28:20Z)
- What Should I Know? Using Meta-gradient Descent for Predictive Feature Discovery in a Single Stream of Experience [63.75363908696257]
Computational reinforcement learning seeks to construct an agent's perception of the world through predictions of future sensations.
An open challenge in this line of work is determining from the infinitely many predictions that the agent could possibly make which predictions might best support decision-making.
We introduce a meta-gradient descent process by which an agent learns 1) what predictions to make, 2) the estimates for its chosen predictions, and 3) how to use those estimates to generate policies that maximize future reward.
arXiv Detail & Related papers (2022-06-13T21:31:06Z)
- Taming Overconfident Prediction on Unlabeled Data from Hindsight [50.9088560433925]
Minimizing prediction uncertainty on unlabeled data is a key factor to achieve good performance in semi-supervised learning.
This paper proposes a dual mechanism, named ADaptive Sharpening (ADS), which first applies a soft-threshold to adaptively mask out determinate and negligible predictions.
ADS significantly improves the state-of-the-art SSL methods by making it a plug-in.
arXiv Detail & Related papers (2021-12-15T15:17:02Z)
- Learning to Predict Trustworthiness with Steep Slope Loss [69.40817968905495]
We study the problem of predicting trustworthiness on real-world large-scale datasets.
We observe that the trustworthiness predictors trained with prior-art loss functions are prone to view both correct predictions and incorrect predictions to be trustworthy.
We propose a novel steep slope loss to separate the features w.r.t. correct predictions from the ones w.r.t. incorrect predictions by two slide-like curves that oppose each other.
arXiv Detail & Related papers (2021-09-30T19:19:09Z)
- Random vector functional link neural network based ensemble deep learning for short-term load forecasting [14.184042046855884]
This paper proposes a novel ensemble deep Random Vector Functional Link (edRVFL) network for electricity load forecasting.
The hidden layers are stacked to enforce deep representation learning.
The model generates the forecasts by ensembling the outputs of each layer.
arXiv Detail & Related papers (2021-07-30T01:20:48Z)
- Long-Short Term Spatiotemporal Tensor Prediction for Passenger Flow Profile [15.875569404476495]
We focus on a tensor-based prediction and propose several practical techniques to improve prediction.
For long-term prediction specifically, we propose the "Tensor Decomposition + 2-Dimensional Auto-Regressive Moving Average (2D-ARMA)" model.
For short-term prediction, we propose to conduct tensor completion based on tensor clustering to avoid oversimplifying and ensure accuracy.
arXiv Detail & Related papers (2020-04-23T08:30:00Z)