Related papers: Bridging Smart Meter Gaps: A Benchmark of Statistical, Machine Learning and Time Series Foundation Models for Data Imputation

Bridging Smart Meter Gaps: A Benchmark of Statistical, Machine Learning and Time Series Foundation Models for Data Imputation

URL: http://arxiv.org/abs/2501.07276v2
Date: Thu, 20 Feb 2025 09:02:33 GMT
Title: Bridging Smart Meter Gaps: A Benchmark of Statistical, Machine Learning and Time Series Foundation Models for Data Imputation
Authors: Amir Sartipi, Joaquín Delgado Fernández, Sergio Potenciano Menci, Alessio Magitteri,
Abstract summary: Gaps in time series data in smart grids can bias consumption analyses and hinder reliable predictions.<n>Generative Artificial Intelligence offers promising solutions that may outperform traditional statistical methods.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The integrity of time series data in smart grids is often compromised by missing values due to sensor failures, transmission errors, or disruptions. Gaps in smart meter data can bias consumption analyses and hinder reliable predictions, causing technical and economic inefficiencies. As smart meter data grows in volume and complexity, conventional techniques struggle with its nonlinear and nonstationary patterns. In this context, Generative Artificial Intelligence offers promising solutions that may outperform traditional statistical methods. In this paper, we evaluate two general-purpose Large Language Models and five Time Series Foundation Models for smart meter data imputation, comparing them with conventional Machine Learning and statistical models. We introduce artificial gaps (30 minutes to one day) into an anonymized public dataset to test inference capabilities. Results show that Time Series Foundation Models, with their contextual understanding and pattern recognition, could significantly enhance imputation accuracy in certain cases. However, the trade-off between computational cost and performance gains remains a critical consideration.

Related papers

Privacy-Preserving Methods for Bug Severity Prediction [0.0]
We investigate method-level bug severity prediction using source code metrics and Large Language Models.<n>We compare the performance of models trained using centralized learning, federated learning, and synthetic data generation.<n>Our finding highlights the potential of privacy-preserving approaches to enable effective bug severity prediction in industrial context.
arXiv Detail & Related papers (2025-06-28T04:40:51Z)
Fairness and Robustness in Machine Unlearning [20.758637391023345]
We focus on fairness and robustness in machine unlearning algorithms. Experiments demonstrate the vulnerability of current state-of-the-art approximated unlearning algorithms to adversarial attacks. We demonstrate that unlearning in the intermediate and last layers is sufficient and cost-effective for time and memory complexity.
arXiv Detail & Related papers (2025-04-18T10:31:44Z)
Data Fusion of Deep Learned Molecular Embeddings for Property Prediction [44.99833362998488]
We use data fusion techniques to combine the learned molecular embeddings of various single-task models and trained a multi-task model on this combined embedding.<n>We show that the fused, multi-task models outperform standard multi-task models for sparse datasets and can provide enhanced prediction on data-limited properties compared to single-task models.
arXiv Detail & Related papers (2025-04-09T21:40:15Z)
Self-attention-based Diffusion Model for Time-series Imputation in Partial Blackout Scenarios [23.160007389272575]
Missing values in time series data can harm machine learning performance and introduce bias. Previous work has tackled the imputation of missing data in random, complete blackouts and forecasting scenarios. We introduce a two-stage imputation process using self-attention and diffusion processes to model feature and temporal correlations.
arXiv Detail & Related papers (2025-03-03T16:58:15Z)
AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling [53.54623137152208]
We introduce AutoElicit to extract knowledge from large language models and construct priors for predictive models. We show these priors are informative and can be refined using natural language. We find that AutoElicit yields priors that can substantially reduce error over uninformative priors, using fewer labels, and consistently outperform in-context learning.
arXiv Detail & Related papers (2024-11-26T10:13:39Z)
Gridded Transformer Neural Processes for Large Unstructured Spatio-Temporal Data [47.14384085714576]
We introduce gridded pseudo-tokenPs to handle unstructured observations and a processor containing gridded pseudo-tokens that leverage efficient attention mechanisms. Our method consistently outperforms a range of strong baselines on various synthetic and real-world regression tasks involving large-scale data. The real-life experiments are performed on weather data, demonstrating the potential of our approach to bring performance and computational benefits when applied at scale in a weather modelling pipeline.
arXiv Detail & Related papers (2024-10-09T10:00:56Z)
Inference for Large Scale Regression Models with Dependent Errors [3.3160726548489015]
This work defines and proves the statistical properties of the Generalized Method of Wavelet Moments with Exogenous variables (GMWMX) It is a highly scalable, stable, and statistically valid method for estimating and delivering inference for linear models using processes in the presence of data complexities like latent dependence structures and missing data.
arXiv Detail & Related papers (2024-09-08T17:01:05Z)
TimeSieve: Extracting Temporal Dynamics through Information Bottlenecks [31.10683149519954]
We propose an innovative time series forecasting model TimeSieve. Our approach employs wavelet transforms to preprocess time series data, effectively capturing multi-scale features. Our results validate the effectiveness of our approach in addressing the key challenges in time series forecasting.
arXiv Detail & Related papers (2024-06-07T15:58:12Z)
A Temporally Disentangled Contrastive Diffusion Model for Spatiotemporal Imputation [35.46631415365955]
We introduce a conditional diffusion framework called C$2$TSD, which incorporates disentangled temporal (trend and seasonality) representations as conditional information. Our experiments on three real-world datasets demonstrate the superior performance of our approach compared to a number of state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-18T11:59:04Z)
Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation for Time Series [45.76310830281876]
We propose Quantile Sub-Ensembles, a novel method to estimate uncertainty with ensemble of quantile-regression-based task networks. Our method not only produces accurate imputations that is robust to high missing rates, but also is computationally efficient due to the fast training of its non-generative model.
arXiv Detail & Related papers (2023-12-03T05:52:30Z)
Learning Sample Difficulty from Pre-trained Models for Reliable Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization. We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z)
DynImp: Dynamic Imputation for Wearable Sensing Data Through Sensory and Temporal Relatedness [78.98998551326812]
We argue that traditional methods have rarely made use of both times-series dynamics of the data as well as the relatedness of the features from different sensors. We propose a model, termed as DynImp, to handle different time point's missingness with nearest neighbors along feature axis. We show that the method can exploit the multi-modality features from related sensors and also learn from history time-series dynamics to reconstruct the data under extreme missingness.
arXiv Detail & Related papers (2022-09-26T21:59:14Z)
Incremental Online Learning Algorithms Comparison for Gesture and Visual Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification. Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z)
TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance. We propose a versatile method that estimates joint distributions using an attention-based decoder. We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
Uncertainty Prediction for Machine Learning Models of Material Properties [0.0]
Uncertainty in AI-based predictions of material properties is of immense importance for the success and reliability of AI applications in material science. We compare 3 different approaches to obtain such individual uncertainty, testing them on 12 ML-physical properties.
arXiv Detail & Related papers (2021-07-16T16:33:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.