Knowledge Distillation from Large Language Models for Household Energy Modeling
- URL: http://arxiv.org/abs/2502.03034v1
- Date: Wed, 05 Feb 2025 09:43:14 GMT
- Title: Knowledge Distillation from Large Language Models for Household Energy Modeling
- Authors: Mohannad Takrouri, Nicolás M. Cuadrado, Martin Takáč
- Abstract summary: We propose integrating Large Language Models in energy modeling to generate realistic, culturally sensitive, and behavior-specific data. A four-stage methodology synthesizes contextual daily data, including culturally nuanced activities, realistic weather ranges, and distinct 'energy signatures'. The resulting dataset provides insights into how cultural, climatic, and behavioral factors converge to shape carbon emissions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) is increasingly vital for smart-grid research, yet restricted access to realistic, diverse data - often due to privacy concerns - slows progress and fuels doubts within the energy sector about adopting ML-based strategies. We propose integrating Large Language Models (LLMs) in energy modeling to generate realistic, culturally sensitive, and behavior-specific data for household energy usage across diverse geographies. In this study, we employ and compare five different LLMs to systematically produce family structures, weather patterns, and daily consumption profiles for households in six distinct countries. A four-stage methodology synthesizes contextual daily data, including culturally nuanced activities, realistic weather ranges, HVAC operations, and distinct 'energy signatures' that capture unique consumption footprints. Additionally, we explore an alternative strategy where external weather datasets can be directly integrated, bypassing intermediate weather modeling stages while ensuring physically consistent data inputs. The resulting dataset provides insights into how cultural, climatic, and behavioral factors converge to shape carbon emissions, offering a cost-effective avenue for scenario-based energy optimization. This approach underscores how prompt engineering, combined with knowledge distillation, can advance sustainable energy research and climate mitigation efforts. Source code is available at https://github.com/Singularity-AI-Lab/LLM-Energy-Knowledge-Distillation.
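The staged prompting workflow described in the abstract can be pictured with a short sketch. Everything below is an illustrative assumption rather than the authors' implementation (which lives in the linked repository): `llm_complete`, `synthesize_household`, `daily_emissions_kg`, the prompt wording, the JSON schemas, and the 0.4 kg CO2/kWh grid carbon-intensity factor are all hypothetical stand-ins.

```python
# Minimal sketch of a four-stage, prompt-driven synthesis pipeline in the spirit
# of the paper; prompts, schemas, and function names are illustrative assumptions.
import json


def llm_complete(prompt: str) -> str:
    """Placeholder for any chat-completion client (wire to the LLM of your choice)."""
    raise NotImplementedError


def synthesize_household(country: str, month: str) -> dict:
    # Stage 1: culturally grounded family structure and daily activity schedule.
    family = llm_complete(
        f"Describe a typical household in {country}: members, occupations, and a "
        f"culturally realistic weekday activity schedule. Answer as JSON."
    )
    # Stage 2: plausible weather ranges for the location and month. (The paper's
    # alternative strategy would substitute measured external weather data here.)
    weather = llm_complete(
        f"Give realistic hourly temperature and humidity ranges for {country} in {month}. Answer as JSON."
    )
    # Stage 3: HVAC operation consistent with the weather and the household schedule.
    hvac = llm_complete(
        f"Given weather {weather} and schedule {family}, describe hourly HVAC usage. Answer as JSON."
    )
    # Stage 4: a distinct hourly 'energy signature' combining occupancy, appliances, and HVAC.
    profile = llm_complete(
        f"Combine {family}, {weather}, and {hvac} into a 24-value hourly kWh profile. Answer as JSON."
    )
    return {
        "family": json.loads(family),
        "weather": json.loads(weather),
        "hvac": json.loads(hvac),
        "hourly_kwh": json.loads(profile),
    }


def daily_emissions_kg(hourly_kwh: list[float], grid_kg_co2_per_kwh: float = 0.4) -> float:
    # Illustrative emissions mapping: energy use times an assumed grid carbon intensity.
    return sum(hourly_kwh) * grid_kg_co2_per_kwh
```

In this reading, stage 2 is exactly where the abstract's alternative strategy plugs in: measured weather records can replace the LLM-generated ranges without touching the other stages.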
Related papers
- ClimateBench-M: A Multi-Modal Climate Data Benchmark with a Simple Generative Method [61.76389719956301]
We contribute a multi-modal climate benchmark, i.e., ClimateBench-M, which aligns time series climate data from ERA5, extreme weather events data from NOAA, and satellite image data from NASA.
Under each data modality, we also propose a simple but strong generative method that could produce competitive performance in weather forecasting, thunderstorm alerts, and crop segmentation tasks.
arXiv Detail & Related papers (2025-04-10T02:22:23Z) - Explainable AI for building energy retrofitting under data scarcity [40.14307808809578]
This study presents an Artificial Intelligence (AI) and Machine Learning (ML)-based framework to recommend energy efficiency measures for residential buildings.
Using Latvia as a case study, the methodology addresses challenges associated with limited datasets, class imbalance and data scarcity.
The evaluation of the approach shows that it notably overcomes data limitations, achieving improvements of up to 54% in precision, recall, and F1 score.
arXiv Detail & Related papers (2025-04-08T14:00:08Z) - Physics-guided machine learning predicts the planet-scale performance of solar farms with sparse, heterogeneous, public data [0.0]
To predict the potential and scalability of emerging PV technologies, a global understanding of these systems' performance is essential.
Here, we present a physics-guided machine learning (PGML) scheme to demonstrate that: (a) The world can be divided into a few PV-specific climate zones, called PVZones, illustrating that the relevant meteorological conditions are shared across continents; (b) by exploiting the climatic similarities, high-quality monthly energy yield data from as few as five locations can accurately predict yearly energy yield potential with high spatial resolution and a root mean square error of less than 8 kWh m$^{-2}$.
arXiv Detail & Related papers (2024-07-25T08:06:21Z) - The Price of Prompting: Profiling Energy Use in Large Language Models Inference [5.254805405012678]
This paper introduces MELODI, a framework crafted to monitor and analyze the energy consumed during large language models inference processes.
The dataset, generated using MELODI, encompasses a broad spectrum of LLM deployment frameworks, multiple language models, and extensive prompt datasets.
Our findings indicate substantial disparities in energy efficiency, suggesting ample scope for optimization and adoption of sustainable measures.
arXiv Detail & Related papers (2024-07-04T12:16:28Z) - FaIRGP: A Bayesian Energy Balance Model for Surface Temperatures Emulation [13.745581787463962]
We introduce FaIRGP, a data-driven emulator that satisfies the physical temperature response equations of an energy balance model.
We show how FaIRGP can be used to obtain estimates of top-of-atmosphere radiative forcing.
We hope that this work will contribute to widening the adoption of data-driven methods in climate emulation.
arXiv Detail & Related papers (2023-07-14T08:43:36Z) - A Comparative Study on Generative Models for High Resolution Solar Observation Imaging [59.372588316558826]
This work investigates capabilities of current state-of-the-art generative models to accurately capture the data distribution behind observed solar activity states.
Using distributed training on supercomputers, we are able to train generative models for up to 1024x1024 resolution that produce high-quality samples that human experts cannot distinguish from real observations.
arXiv Detail & Related papers (2023-04-14T14:40:32Z) - Counting Carbon: A Survey of Factors Influencing the Emissions of Machine Learning [77.62876532784759]
Machine learning (ML) requires using energy to carry out computations during the model training process.
The generation of this energy comes with an environmental cost in terms of greenhouse gas emissions, depending on the quantity of energy used and the energy source.
We present a survey of the carbon emissions of 95 ML models across time and different tasks in natural language processing and computer vision.
arXiv Detail & Related papers (2023-02-16T18:35:00Z) - ClimaX: A foundation model for weather and climate [51.208269971019504]
ClimaX is a deep learning model for weather and climate science.
It can be pre-trained with a self-supervised learning objective on climate datasets.
It can be fine-tuned to address a breadth of climate and weather tasks.
arXiv Detail & Related papers (2023-01-24T23:19:01Z) - High-resolution synthetic residential energy use profiles for the United States [12.699816591560712]
We release a large-scale, synthetic, residential energy-use dataset for the residential sector across the contiguous United States.
The data comprises hourly energy use profiles for synthetic households, disaggregated into Thermostatically Controlled Loads (TCL) and appliance use.
arXiv Detail & Related papers (2022-10-14T20:55:10Z) - Modeling and Optimization of a Longitudinally-Distributed Global Solar Grid [0.0]
The experiments consist of a network of model houses at different locations around the world, each producing and consuming only solar energy.
Data gathered from the power system simulation is used to develop optimization models to find the optimal solar panel area required at the different locations.
arXiv Detail & Related papers (2022-06-11T18:20:13Z) - DeepClimGAN: A High-Resolution Climate Data Generator [60.59639064716545]
Earth system models (ESMs) are often used to generate future projections of climate change scenarios.
As a compromise, emulators are substantially less expensive but may not have all of the complexity of an ESM.
Here we demonstrate the use of a conditional generative adversarial network (GAN) to act as an ESM emulator.
arXiv Detail & Related papers (2020-11-23T20:13:37Z) - Learning to Continuously Optimize Wireless Resource In Episodically Dynamic Environment [55.91291559442884]
This work develops a methodology that enables data-driven methods to continuously learn and optimize in a dynamic environment.
We propose to build the notion of continual learning into the modeling process of learning wireless systems.
Our design is based on a novel min-max formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2020-11-16T08:24:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.