Creating synthetic energy meter data using conditional diffusion and building metadata
- URL: http://arxiv.org/abs/2404.00525v1
- Date: Sun, 31 Mar 2024 01:58:38 GMT
- Title: Creating synthetic energy meter data using conditional diffusion and building metadata
- Authors: Chun Fu, Hussain Kazmi, Matias Quintana, Clayton Miller
- Abstract summary: The study proposes a conditional diffusion model for generating high-quality synthetic energy data using relevant metadata.
Using a dataset comprising 1,828 power meters from various buildings and countries, this model is compared with traditional methods.
Results demonstrate the proposed diffusion model's superior performance, with a 36% reduction in Fréchet Inception Distance (FID) score and a 13% decrease in Kullback-Leibler (KL) divergence compared with the next-best method.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Advances in machine learning and increased computational power have driven progress in energy-related research. However, limited access to private energy data from buildings hinders traditional regression models that rely on historical data. While generative models offer a solution, previous studies have primarily focused on short-term generation periods (e.g., daily profiles) and a limited number of meters. Thus, the study proposes a conditional diffusion model for generating high-quality synthetic energy data using relevant metadata. Using a dataset comprising 1,828 power meters from various buildings and countries, this model is compared with traditional methods like Conditional Generative Adversarial Networks (CGAN) and Conditional Variational Auto-Encoders (CVAE). It explicitly handles long-term annual consumption profiles, harnessing metadata such as location, weather, building, and meter type to produce coherent synthetic data that closely resembles real-world energy consumption patterns. The results demonstrate the proposed diffusion model's superior performance, with a 36% reduction in Fréchet Inception Distance (FID) score and a 13% decrease in Kullback-Leibler (KL) divergence compared to the next-best method. The proposed method successfully generates high-quality energy data through metadata, and its code will be open-sourced, establishing a foundation for a broader array of energy data generation models in the future.
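The abstract reports quality via KL divergence between real and synthetic consumption distributions. A minimal histogram-based sketch of that comparison is below; the paper's exact binning and estimator are not specified in the abstract, and the meter readings here are synthetic placeholders for illustration only.

```python
import numpy as np

def kl_divergence(real, synthetic, bins=50, eps=1e-10):
    """Histogram-based estimate of D_KL(P_real || P_synthetic), in nats,
    between two 1-D samples (e.g., hourly meter readings)."""
    lo = min(real.min(), synthetic.min())
    hi = max(real.max(), synthetic.max())
    p, _ = np.histogram(real, bins=bins, range=(lo, hi))
    q, _ = np.histogram(synthetic, bins=bins, range=(lo, hi))
    # Smooth empty bins, then normalize to probability mass functions
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
real = rng.normal(100.0, 15.0, 10_000)  # hypothetical real meter readings
good = rng.normal(100.0, 15.0, 10_000)  # synthetic data from a well-matched generator
poor = rng.normal(120.0, 30.0, 10_000)  # synthetic data from a mismatched generator
```

A lower value indicates a closer match to the real distribution, so `kl_divergence(real, good)` should come out smaller than `kl_divergence(real, poor)`.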
Related papers
- Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3x sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z) - EnergyDiff: Universal Time-Series Energy Data Generation using Diffusion Models [2.677325229270716]
High-resolution time series data are crucial for operation and planning in energy systems.
Due to data collection costs and privacy concerns, such data is often unavailable or insufficient for downstream tasks.
We propose EnergyDiff, a universal data generation framework for energy time series data.
arXiv Detail & Related papers (2024-07-18T14:10:50Z) - Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z) - Enhancing Indoor Temperature Forecasting through Synthetic Data in Low-Data Environments [42.8983261737774]
We investigate the efficacy of data augmentation techniques leveraging SoTA AI-based methods for synthetic data generation.
Inspired by practical and experimental motivations, we explore fusion strategies of real and synthetic data to improve forecasting models.
arXiv Detail & Related papers (2024-06-07T12:36:31Z) - Synthetic Face Datasets Generation via Latent Space Exploration from Brownian Identity Diffusion [20.352548473293993]
Face Recognition (FR) models are trained on large-scale datasets, which have privacy and ethical concerns.
Lately, the use of synthetic data to complement or replace genuine data for the training of FR models has been proposed.
We introduce a new method, inspired by the physical motion of soft particles subjected to Brownian forces, allowing us to sample identities in a latent space under various constraints.
With this in hand, we generate several face datasets and benchmark them by training FR models, showing that data generated with our method exceeds the performance of previously proposed GAN-based datasets and achieves competitive performance with state-of-the-art methods.
arXiv Detail & Related papers (2024-04-30T22:32:02Z) - Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Grid Frequency Forecasting in University Campuses using Convolutional LSTM [0.0]
This paper harnesses Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to establish robust time-series forecasting models for grid frequency.
Individual ConvLSTM models are trained on power consumption data for each campus building and forecast the grid frequency based on historical trends.
An Ensemble Model is formulated to aggregate insights from the building-specific models, delivering comprehensive forecasts for the entire campus.
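The entry above describes per-building ConvLSTM forecasts aggregated into a campus-wide prediction. A minimal sketch of that aggregation step is shown here, assuming simple mean averaging; the paper's actual ensemble scheme is not specified in this summary, and the forecast values are hypothetical.

```python
import numpy as np

def ensemble_forecast(per_building_forecasts):
    """Aggregate per-building grid-frequency forecasts
    (shape: n_buildings x horizon) into one campus-wide forecast
    by averaging across buildings at each horizon step."""
    stacked = np.asarray(per_building_forecasts, dtype=float)
    return stacked.mean(axis=0)

# Three hypothetical building-level forecasts over a 4-step horizon (Hz)
f = [[50.01, 50.00, 49.99, 50.02],
     [49.98, 50.01, 50.00, 50.00],
     [50.00, 49.99, 50.01, 49.98]]
campus = ensemble_forecast(f)
```

Averaging is the simplest aggregation choice; weighted schemes (e.g., by each building's validation error) are a natural extension.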
arXiv Detail & Related papers (2023-10-24T13:53:51Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One [83.5162421521224]
We propose a unique method termed E-ARM for training autoregressive generative models.
E-ARM takes advantage of a well-designed energy-based learning objective.
We show that E-ARM can be trained efficiently and is capable of alleviating the exposure bias problem.
arXiv Detail & Related papers (2022-06-26T10:58:41Z) - A Comparative Study on Energy Consumption Models for Drones [4.660172505713055]
We benchmark the five most popular energy consumption models for drones derived from their physical behaviours.
We propose a novel data-driven energy model using the Long Short-Term Memory (LSTM) based deep learning architecture.
Our experimental results have shown that the LSTM based approach can easily outperform other mathematical models for the dataset under study.
arXiv Detail & Related papers (2022-05-30T23:05:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.