Creating synthetic energy meter data using conditional diffusion and building metadata
- URL: http://arxiv.org/abs/2404.00525v1
- Date: Sun, 31 Mar 2024 01:58:38 GMT
- Title: Creating synthetic energy meter data using conditional diffusion and building metadata
- Authors: Chun Fu, Hussain Kazmi, Matias Quintana, Clayton Miller
- Abstract summary: The study proposes a conditional diffusion model for generating high-quality synthetic energy data using relevant metadata.
Using a dataset comprising 1,828 power meters from various buildings and countries, this model is compared with traditional methods.
Results demonstrate the proposed diffusion model's superior performance, with a 36% reduction in Fréchet Inception Distance (FID) score and a 13% decrease in Kullback-Leibler (KL) divergence compared with the next-best method.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Advances in machine learning and increased computational power have driven progress in energy-related research. However, limited access to private energy data from buildings hinders traditional regression models that rely on historical data. While generative models offer a solution, previous studies have primarily focused on short-term generation periods (e.g., daily profiles) and a limited number of meters. Thus, the study proposes a conditional diffusion model for generating high-quality synthetic energy data using relevant metadata. Using a dataset comprising 1,828 power meters from various buildings and countries, this model is compared with traditional methods like Conditional Generative Adversarial Networks (CGAN) and Conditional Variational Auto-Encoders (CVAE). It explicitly handles long-term annual consumption profiles, harnessing metadata such as location, weather, building, and meter type to produce coherent synthetic data that closely resembles real-world energy consumption patterns. The results demonstrate the proposed diffusion model's superior performance, with a 36% reduction in Fréchet Inception Distance (FID) score and a 13% decrease in Kullback-Leibler (KL) divergence compared to the next-best method. The proposed method successfully generates high-quality energy data through metadata, and its code will be open-sourced, establishing a foundation for a broader array of energy data generation models in the future.
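The abstract reports quality via KL divergence between real and synthetic consumption distributions. A minimal histogram-based sketch of that comparison is below; the paper's exact binning and estimator are not specified in the abstract, and the meter readings here are synthetic placeholders for illustration only.

```python
import numpy as np

def kl_divergence(real, synthetic, bins=50, eps=1e-10):
    """Histogram-based estimate of D_KL(P_real || P_synthetic), in nats,
    between two 1-D samples (e.g., hourly meter readings)."""
    lo = min(real.min(), synthetic.min())
    hi = max(real.max(), synthetic.max())
    p, _ = np.histogram(real, bins=bins, range=(lo, hi))
    q, _ = np.histogram(synthetic, bins=bins, range=(lo, hi))
    # Smooth empty bins, then normalize to probability mass functions
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
real = rng.normal(100.0, 15.0, 10_000)  # hypothetical real meter readings
good = rng.normal(100.0, 15.0, 10_000)  # synthetic data from a well-matched generator
poor = rng.normal(120.0, 30.0, 10_000)  # synthetic data from a mismatched generator
```

A lower value indicates a closer match to the real distribution, so `kl_divergence(real, good)` should come out smaller than `kl_divergence(real, poor)`.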
Related papers
- Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3x sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z) - EnergyDiff: Universal Time-Series Energy Data Generation using Diffusion Models [2.677325229270716]
High-resolution time series data are crucial for operation and planning in energy systems.
Due to data collection costs and privacy concerns, such data is often unavailable or insufficient for downstream tasks.
We propose EnergyDiff, a universal data generation framework for energy time series data.
arXiv Detail & Related papers (2024-07-18T14:10:50Z) - Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z) - Enhancing Indoor Temperature Forecasting through Synthetic Data in Low-Data Environments [42.8983261737774]
We investigate the efficacy of data augmentation techniques leveraging SoTA AI-based methods for synthetic data generation.
Inspired by practical and experimental motivations, we explore fusion strategies of real and synthetic data to improve forecasting models.
arXiv Detail & Related papers (2024-06-07T12:36:31Z) - Synthetic Face Datasets Generation via Latent Space Exploration from Brownian Identity Diffusion [20.352548473293993]
Face Recognition (FR) models are trained on large-scale datasets, which have privacy and ethical concerns.
Lately, the use of synthetic data to complement or replace genuine data for the training of FR models has been proposed.
We introduce a new method, inspired by the physical motion of soft particles subjected to Brownian forces, allowing us to sample identities in a latent space under various constraints.
With this in hand, we generate several face datasets and benchmark them by training FR models, showing that data generated with our method exceeds the performance of previously proposed GAN-based datasets and achieves competitive performance with state-of-the-art methods.
arXiv Detail & Related papers (2024-04-30T22:32:02Z) - Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Grid Frequency Forecasting in University Campuses using Convolutional LSTM [0.0]
This paper harnesses Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to establish robust time-series forecasting models for grid frequency.
Individual ConvLSTM models are trained on power consumption data for each campus building and forecast the grid frequency based on historical trends.
An Ensemble Model is formulated to aggregate insights from the building-specific models, delivering comprehensive forecasts for the entire campus.
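The entry above describes per-building ConvLSTM forecasts aggregated into a campus-wide prediction. A minimal sketch of that aggregation step is shown here, assuming simple mean averaging; the paper's actual ensemble scheme is not specified in this summary, and the forecast values are hypothetical.

```python
import numpy as np

def ensemble_forecast(per_building_forecasts):
    """Aggregate per-building grid-frequency forecasts
    (shape: n_buildings x horizon) into one campus-wide forecast
    by averaging across buildings at each horizon step."""
    stacked = np.asarray(per_building_forecasts, dtype=float)
    return stacked.mean(axis=0)

# Three hypothetical building-level forecasts over a 4-step horizon (Hz)
f = [[50.01, 50.00, 49.99, 50.02],
     [49.98, 50.01, 50.00, 50.00],
     [50.00, 49.99, 50.01, 49.98]]
campus = ensemble_forecast(f)
```

Averaging is the simplest aggregation choice; weighted schemes (e.g., by each building's validation error) are a natural extension.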
arXiv Detail & Related papers (2023-10-24T13:53:51Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One [83.5162421521224]
We propose a unique method termed E-ARM for training autoregressive generative models.
E-ARM takes advantage of a well-designed energy-based learning objective.
We show that E-ARM can be trained efficiently and is capable of alleviating the exposure bias problem.
arXiv Detail & Related papers (2022-06-26T10:58:41Z) - A Comparative Study on Energy Consumption Models for Drones [4.660172505713055]
We benchmark the five most popular energy consumption models for drones derived from their physical behaviours.
We propose a novel data-driven energy model using the Long Short-Term Memory (LSTM) based deep learning architecture.
Our experimental results have shown that the LSTM based approach can easily outperform other mathematical models for the dataset under study.
arXiv Detail & Related papers (2022-05-30T23:05:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.