On Designing Data Models for Energy Feature Stores
- URL: http://arxiv.org/abs/2205.04267v1
- Date: Mon, 9 May 2022 13:35:53 GMT
- Title: On Designing Data Models for Energy Feature Stores
- Authors: Gregor Cerar, Blaž Bertalanič, Anže Pirnat, Andrej Čampa, Carolina Fortuna
- Abstract summary: We study data models, energy feature engineering and feature management solutions for developing ML-based energy applications.
We first propose a taxonomy for designing data models suitable for energy applications, analyze feature engineering techniques able to transform the data model into features suitable for ML model training, and finally analyze available designs for feature stores.
- Score: 0.5809784853115825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The digitization of the energy infrastructure enables new, data-driven
applications, often supported by machine learning models. However, domain-specific
data transformations, pre-processing and management in modern data-driven
pipelines have yet to be addressed. In this paper we perform a first-of-its-kind
study on data models, energy feature engineering and feature management
solutions for developing ML-based energy applications. We first propose a
taxonomy for designing data models suitable for energy applications, analyze
feature engineering techniques able to transform the data model into features
suitable for ML model training, and finally analyze available designs for
feature stores. Using a short-term forecasting dataset, we show the benefits that
richer data models and engineered features bring to the performance of the
resulting models. Finally, we benchmark three complementary feature
management solutions, including an open-source feature store.
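The kind of feature engineering the abstract refers to can be illustrated with a minimal sketch. The column names, the synthetic load series and the specific lag/rolling/calendar features below are illustrative assumptions, not the paper's actual pipeline; they show the generic idea of transforming a raw time-series data model into ML-ready features for short-term forecasting.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly load series (two weeks); values and names are illustrative.
rng = pd.date_range("2022-01-01", periods=24 * 14, freq="h")
load = pd.Series(
    50 + 10 * np.sin(np.arange(len(rng)) * 2 * np.pi / 24),
    index=rng,
    name="load_kw",
)

def engineer_features(series: pd.Series) -> pd.DataFrame:
    """Derive simple lag, rolling and calendar features for short-term forecasting."""
    df = pd.DataFrame({"load_kw": series})
    df["lag_1h"] = series.shift(1)                             # previous hour's load
    df["lag_24h"] = series.shift(24)                           # same hour yesterday
    df["roll_mean_24h"] = series.shift(1).rolling(24).mean()   # trailing daily mean
    df["hour"] = series.index.hour                             # calendar features come
    df["dayofweek"] = series.index.dayofweek                   # from a richer data model
    return df.dropna()                                         # drop warm-up rows

features = engineer_features(load)
print(features.shape)  # (312, 6)
```

A feature store, as benchmarked in the paper, would sit on top of such a transformation: registering `engineer_features`'s outputs once so that training and serving read the same, consistently computed features.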
Related papers
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches [64.42735183056062]
Large language models (LLMs) have transitioned from specialized models to versatile foundation models.
LLMs exhibit impressive zero-shot ability; however, they require fine-tuning on local datasets and significant resources for deployment.
arXiv Detail & Related papers (2024-08-20T09:42:17Z) - Code Generation for Machine Learning using Model-Driven Engineering and SysML [0.0]
This work aims to facilitate the implementation of data-driven engineering in practice by extending previous work on formalizing machine learning tasks.
The presented method is evaluated for feasibility in a case study on weather prediction.
Results demonstrate the flexibility and simplicity of the method, reducing implementation effort.
arXiv Detail & Related papers (2023-07-10T15:00:20Z) - Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task.
arXiv Detail & Related papers (2023-04-05T07:28:33Z) - T-METASET: Task-Aware Generation of Metamaterial Datasets by Diversity-Based Active Learning [14.668178146934588]
We propose t-METASET: an intelligent data acquisition framework for task-aware dataset generation.
We validate the proposed framework in three hypothetical deployment scenarios, which encompass general use, task-aware use, and tailorable use.
arXiv Detail & Related papers (2022-02-21T22:46:49Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach alone, however, does not supply the procedures and pipelines needed for the actual deployment of machine learning capabilities in real production-grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z) - Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z) - Concept for a Technical Infrastructure for Management of Predictive Models in Industrial Applications [0.0]
We describe our technological concept for a model management system.
This concept includes versioned storage of data, support for different machine learning algorithms, fine-tuning of models, subsequent deployment of models, and monitoring of model performance after deployment.
arXiv Detail & Related papers (2021-07-29T08:38:46Z) - Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration [130.89746032163106]
We propose ALOE, a new algorithm for learning conditional and unconditional EBMs for discrete structured data.
We show that the energy function and sampler can be trained efficiently via a new variational form of power iteration.
We present an energy-model-guided fuzzer for software testing that achieves performance comparable to well-engineered fuzzing engines such as libFuzzer.
arXiv Detail & Related papers (2020-11-10T19:31:29Z) - Gradient-Based Training and Pruning of Radial Basis Function Networks with an Application in Materials Physics [0.24792948967354234]
We propose a gradient-based technique for training radial basis function networks with an efficient and scalable open-source implementation.
We derive novel closed-form optimization criteria for pruning the models for continuous as well as binary data.
arXiv Detail & Related papers (2020-04-06T11:32:37Z) - From Data to Actions in Intelligent Transportation Systems: a Prescription of Functional Requirements for Model Actionability [10.27718355111707]
This work aims to describe how data, coming from diverse ITS sources, can be used to learn and adapt data-driven models for efficiently operating ITS assets, systems and processes.
Grounded in this described data modeling pipeline for ITS, we define the characteristics, engineering requisites and intrinsic challenges of its three compounding stages, namely, data fusion, adaptive learning and model evaluation.
arXiv Detail & Related papers (2020-02-06T12:02:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.