Model Selection, Adaptation, and Combination for Deep Transfer Learning
through Neural Networks in Renewable Energies
- URL: http://arxiv.org/abs/2204.13293v1
- Date: Thu, 28 Apr 2022 05:34:50 GMT
- Title: Model Selection, Adaptation, and Combination for Deep Transfer Learning
through Neural Networks in Renewable Energies
- Authors: Jens Schreiber and Bernhard Sick
- Abstract summary: We conduct the first thorough experiment for model selection and adaptation for transfer learning in renewable power forecasting.
We adopt models based on data from different seasons and limit the amount of training data.
We show how combining multiple models through ensembles can significantly improve the model selection and adaptation approach.
- Score: 5.953831950062808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is recent interest in using model hubs, a collection of pre-trained
models, in computer vision tasks. To utilize the model hub, we first select a
source model and then adapt the model for the target to compensate for
differences. While research on model selection and adaptation is still limited
for computer vision tasks, this holds even more for the field of renewable
power. At the same time, meeting the increasing demand for power forecasts
based on weather features from numerical weather predictions is a crucial
challenge. We close these gaps by conducting the first thorough experiment on
model selection and adaptation for transfer learning in renewable power
forecasting, adopting recent results from the field of
computer vision on six datasets. We adopt models based on data from different
seasons and limit the amount of training data. As an extension of the current
state of the art, we utilize a Bayesian linear regression for forecasting the
response based on features extracted from a neural network. This approach
outperforms the baseline with only seven days of training data. We further show
how combining multiple models through ensembles can significantly improve the
model selection and adaptation approach. In fact, with more than 30 days of
training data, both proposed model combination techniques achieve results
similar to those of models trained on a full year of data.
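The approach described in the abstract (fitting a Bayesian linear regression on features extracted from a pre-trained neural network and combining several adapted source models) can be illustrated with a small sketch. The MLP feature extractor, the data shapes, the use of scikit-learn's BayesianRidge, and the mean ensemble are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): adapt pre-trained source models to a
# target park by fitting a Bayesian linear regression on their extracted
# features, then combine the adapted models. Shapes and models are placeholders.
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import BayesianRidge

class ForecastMLP(nn.Module):
    """Toy source model: NWP weather features -> hidden representation -> power."""
    def __init__(self, n_features: int = 10, hidden: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        return self.head(self.encoder(x))

def extract_features(model: ForecastMLP, x: np.ndarray) -> np.ndarray:
    """Run the frozen encoder of a (pre-trained) source model."""
    with torch.no_grad():
        return model.encoder(torch.as_tensor(x, dtype=torch.float32)).numpy()

# Hypothetical target data: seven days of hourly NWP features and power values.
rng = np.random.default_rng(0)
x_train, y_train = rng.normal(size=(7 * 24, 10)), rng.uniform(size=7 * 24)
x_test = rng.normal(size=(24, 10))

# One Bayesian linear regression per selected source model ...
source_models = [ForecastMLP() for _ in range(3)]  # stand-ins for a model hub
per_model_forecasts = []
for source in source_models:
    blr = BayesianRidge()
    blr.fit(extract_features(source, x_train), y_train)
    per_model_forecasts.append(blr.predict(extract_features(source, x_test)))

# ... and a simple model combination (here: the ensemble mean).
ensemble_forecast = np.mean(per_model_forecasts, axis=0)
print(ensemble_forecast.shape)  # (24,)
```

Because only the lightweight Bayesian readout is refitted per target while the neural encoders stay frozen, such a setup can in principle work with a few days of target data, which mirrors the limited-data regime studied in the paper.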
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Model Selection with Model Zoo via Graph Learning [45.30615308692713]
We introduce TransferGraph, a novel framework that reformulates model selection as a graph learning problem.
We demonstrate TransferGraph's effectiveness in capturing essential model-dataset relationships, yielding up to a 32% improvement in correlation between predicted performance and the actual fine-tuning results compared to the state-of-the-art methods.
arXiv Detail & Related papers (2024-04-05T09:50:00Z)
- A Two-Phase Recall-and-Select Framework for Fast Model Selection [13.385915962994806]
We propose a two-phase (coarse-recall and fine-selection) model selection framework.
It aims to enhance the efficiency of selecting a robust model by leveraging the models' training performances on benchmark datasets.
The proposed methodology is demonstrated to select a high-performing model about three times faster than conventional baseline methods.
arXiv Detail & Related papers (2024-03-28T14:44:44Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z)
- Wrapper Boxes: Faithful Attribution of Model Predictions to Training Data [40.7542543934205]
We propose a "wrapper box'' pipeline: training a neural model as usual and then using its learned feature representation in classic, interpretable models to perform prediction.
Across seven language models of varying sizes, we first show that the predictive performance of wrapper classic models is largely comparable to the original neural models.
Our pipeline thus preserves the predictive performance of neural language models while faithfully attributing classic model decisions to training data.
arXiv Detail & Related papers (2023-11-15T01:50:53Z)
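The wrapper-box pipeline summarized in the entry above (reuse a trained network's representation inside a classic, interpretable model) can be sketched as follows; the feature map, the k-NN wrapper, and the data are placeholder assumptions rather than the authors' setup.

```python
# Sketch of a "wrapper box": feed a network's learned representation into a
# classic model (here k-nearest neighbours), so every prediction can be traced
# back to concrete training examples. Feature map and data are placeholders.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

W = np.random.default_rng(1).normal(size=(16, 8))  # stand-in for a learned encoder
def neural_features(x: np.ndarray) -> np.ndarray:
    return np.tanh(x @ W)

rng = np.random.default_rng(2)
x_train, y_train = rng.normal(size=(100, 16)), rng.integers(0, 2, size=100)
x_new = rng.normal(size=(1, 16))

knn = KNeighborsClassifier(n_neighbors=5).fit(neural_features(x_train), y_train)
label = knn.predict(neural_features(x_new))
_, neighbor_idx = knn.kneighbors(neural_features(x_new))
print(label, neighbor_idx)  # the training rows this prediction is attributed to
```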
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show that the pre-trained method is a strong zero-shot baseline and benefits from further scaling in both model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
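As a rough illustration of merging models in their parameter space (entry above), the snippet below averages two compatible PyTorch state dicts; plain averaging is a simplification and not the specific fusion algorithm proposed in the paper.

```python
# Illustrative simplification: fuse two fine-tuned models with identical
# architecture by averaging their parameters, without any training data.
import torch.nn as nn

def merge_state_dicts(sd_a, sd_b, alpha: float = 0.5):
    """Weighted average of two compatible state dicts in parameter space."""
    return {k: alpha * sd_a[k] + (1.0 - alpha) * sd_b[k] for k in sd_a}

model_a, model_b = nn.Linear(8, 2), nn.Linear(8, 2)  # stand-ins for checkpoints
merged = nn.Linear(8, 2)
merged.load_state_dict(merge_state_dicts(model_a.state_dict(), model_b.state_dict()))
```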
- Revealing Secrets From Pre-trained Models [2.0249686991196123]
Transfer-learning has been widely adopted in many emerging deep learning algorithms.
We show that pre-trained models and the models fine-tuned from them have highly similar weight values.
We propose a new model extraction attack that reveals the model architecture and the pre-trained model used by the black-box victim model.
arXiv Detail & Related papers (2022-07-19T20:19:03Z)
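The weight similarity reported in the entry above can be quantified, for instance, with a per-tensor cosine similarity between a pre-trained model and its fine-tuned counterpart; the snippet below is only a measurement sketch with placeholder models and does not implement the extraction attack itself.

```python
# Sketch: measure how close a fine-tuned model's weights stay to the pre-trained
# starting point via per-tensor cosine similarity (placeholder models).
import torch
import torch.nn as nn

pretrained = nn.Linear(16, 4)
finetuned = nn.Linear(16, 4)
finetuned.load_state_dict(pretrained.state_dict())
with torch.no_grad():                       # simulate a small fine-tuning drift
    finetuned.weight.add_(0.01 * torch.randn_like(finetuned.weight))

for (name, w_pre), (_, w_ft) in zip(pretrained.named_parameters(),
                                    finetuned.named_parameters()):
    sim = torch.cosine_similarity(w_pre.flatten(), w_ft.flatten(), dim=0)
    print(f"{name}: cosine similarity = {sim.item():.4f}")
```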
- Do We Really Need Deep Learning Models for Time Series Forecasting? [4.2698418800007865]
Time series forecasting is a crucial task in machine learning, as it has a wide range of applications.
Deep learning and matrix factorization models have recently been proposed to tackle this problem with more competitive performance.
In this paper, we try to answer whether these highly complex deep learning models are without alternative.
arXiv Detail & Related papers (2021-01-06T16:18:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.