A Visual Diagnostics Framework for District Heating Data: Enhancing Data Quality for AI-Driven Heat Consumption Prediction
- URL: http://arxiv.org/abs/2510.00872v1
- Date: Wed, 01 Oct 2025 13:21:55 GMT
- Title: A Visual Diagnostics Framework for District Heating Data: Enhancing Data Quality for AI-Driven Heat Consumption Prediction
- Authors: Kristoffer Christensen, Bo Nørregaard Jørgensen, Zheng Grace Ma
- Abstract summary: This paper presents a systematic approach for evaluating and improving data quality using visual diagnostics, implemented through an interactive web-based dashboard. The dashboard employs Python-based visualization techniques, including time series plots, heatmaps, box plots, histograms, correlation matrices, and anomaly-sensitive KPIs such as skewness and anomaly detection based on modified z-scores. The study contributes to a scalable, generalizable framework for visual data inspection and underlines the critical role of data quality in AI-driven energy management systems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: High-quality data is a prerequisite for training reliable Artificial Intelligence (AI) models in the energy domain. In district heating networks, sensor and metering data often suffer from noise, missing values, and temporal inconsistencies, which can significantly degrade model performance. This paper presents a systematic approach for evaluating and improving data quality using visual diagnostics, implemented through an interactive web-based dashboard. The dashboard employs Python-based visualization techniques, including time series plots, heatmaps, box plots, histograms, correlation matrices, and anomaly-sensitive KPIs such as skewness and anomaly detection based on modified z-scores. These tools allow human experts to inspect and interpret data anomalies, enabling a human-in-the-loop strategy for data quality assessment. The methodology is demonstrated on a real-world dataset from a Danish district heating provider, covering over four years of hourly data from nearly 7000 meters. The findings show how visual analytics can uncover systemic data issues and, in the future, guide data cleaning strategies that enhance the accuracy, stability, and generalizability of Long Short-Term Memory and Gated Recurrent Unit models for heat demand forecasting. The study contributes to a scalable, generalizable framework for visual data inspection and underlines the critical role of data quality in AI-driven energy management systems.
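The abstract names anomaly detection based on modified z-scores but does not give the formula or threshold used. The following is a minimal sketch of one common form of the technique, assuming the median/MAD definition with the conventional 0.6745 scaling constant and a cutoff of 3.5. The function names, the cutoff, and the sample readings are illustrative assumptions, not taken from the paper.

```python
from statistics import median


def modified_z_scores(values):
    """Return the modified z-score of each value (median/MAD form)."""
    med = median(values)
    # Median absolute deviation (MAD) around the median.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case: at least half of the values equal the median,
        # so the score is undefined. Return zeros so nothing is flagged.
        return [0.0] * len(values)
    # 0.6745 rescales the MAD so scores are comparable to ordinary
    # z-scores for normally distributed data (assumed convention).
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(values, threshold=3.5):
    """Return indices whose absolute modified z-score exceeds threshold."""
    scores = modified_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Hypothetical hourly heat readings with a single spike at index 5.
readings = [2.1, 2.3, 2.0, 2.2, 2.4, 9.8, 2.1, 2.2]
print(flag_anomalies(readings))
```

Because the score is built on the median and MAD rather than the mean and standard deviation, a single extreme reading has little influence on the baseline it is compared against, which is why this variant is often preferred for noisy meter data.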
Related papers
- RelMap: Reliable Spatiotemporal Sensor Data Visualization via Imputative Spatial Interpolation [18.947107160943595]
This paper introduces a novel spatiotemporal data pipeline that achieves reliable results and produces a novel heatmap representation with uncertainty information. We leverage imputation from Graph Neural Networks (GNNs) to enhance visualization reliability and temporal resolution.
arXiv Detail & Related papers (2025-08-02T07:25:23Z)
- RoHOI: Robustness Benchmark for Human-Object Interaction Detection [78.18946529195254]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z)
- SensorQA: A Question Answering Benchmark for Daily-Life Monitoring [1.925154869666529]
SensorQA is the first human-created question-answering dataset for long-term time-series sensor data for daily life monitoring. We establish benchmarks for state-of-the-art AI models on this dataset and evaluate their performance on typical edge devices. Our results reveal a gap between current models and optimal QA performance and efficiency, highlighting the need for new contributions.
arXiv Detail & Related papers (2025-01-09T05:06:44Z)
- AI-Powered Dynamic Fault Detection and Performance Assessment in Photovoltaic Systems [44.99833362998488]
The intermittent nature of photovoltaic (PV) solar energy leads to power losses of 10-70% and an average energy production decrease of 25%.
Current fault detection strategies are costly and often yield unreliable results due to complex data signal profiles.
This research presents a computational model using the PVlib library in Python, incorporating a dynamic loss quantification algorithm.
arXiv Detail & Related papers (2024-08-19T23:52:06Z)
- HGOE: Hybrid External and Internal Graph Outlier Exposure for Graph Out-of-Distribution Detection [78.47008997035158]
Graph data exhibits greater diversity but lower robustness to perturbations, complicating the integration of outliers.
We propose the introduction of Hybrid External and Internal Graph Outlier Exposure (HGOE) to improve graph OOD detection performance.
arXiv Detail & Related papers (2024-07-31T16:55:18Z)
- Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
- Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z)
- Meta-UDA: Unsupervised Domain Adaptive Thermal Object Detection using Meta-Learning [64.92447072894055]
Infrared (IR) cameras are robust under adverse illumination and lighting conditions.
We propose an algorithm-agnostic meta-learning framework to improve existing UDA methods.
We produce a state-of-the-art thermal detector for the KAIST and DSIAC datasets.
arXiv Detail & Related papers (2021-10-07T02:28:18Z)
- Exploring the Efficacy of Automatically Generated Counterfactuals for Sentiment Analysis [17.811597734603144]
We propose an approach to automatically generating counterfactual data for data augmentation and explanation.
A comprehensive evaluation on several different datasets, using a variety of state-of-the-art benchmarks, demonstrates how our approach can achieve significant improvements in model performance.
arXiv Detail & Related papers (2021-06-29T10:27:01Z)
- PSEUDo: Interactive Pattern Search in Multivariate Time Series with Locality-Sensitive Hashing and Relevance Feedback [3.347485580830609]
PSEUDo is an adaptive feature learning technique for exploring visual patterns in multi-track sequential data.
Our algorithm features sub-linear training and inference time.
We demonstrate the superiority of PSEUDo in terms of efficiency, accuracy, and steerability.
arXiv Detail & Related papers (2021-04-30T13:00:44Z)
- Deep convolutional generative adversarial networks for traffic data imputation encoding time series as images [7.053891669775769]
We have developed a generative adversarial network (GAN) based traffic sensor data imputation framework (TGAN).
In this study, we have developed a novel time-dependent encoding method called the Gramian Angular Summation Field (GASF).
This study shows that the proposed model can significantly improve the traffic data imputation accuracy in terms of Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) compared to state-of-the-art models on the benchmark dataset.
arXiv Detail & Related papers (2020-05-05T19:14:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.