Model Selection, Adaptation, and Combination for Deep Transfer Learning
through Neural Networks in Renewable Energies
- URL: http://arxiv.org/abs/2204.13293v1
- Date: Thu, 28 Apr 2022 05:34:50 GMT
- Title: Model Selection, Adaptation, and Combination for Deep Transfer Learning
through Neural Networks in Renewable Energies
- Authors: Jens Schreiber and Bernhard Sick
- Abstract summary: We conduct the first thorough experiment on model selection and adaptation for transfer learning in renewable power forecasting.
We adapt models based on data from different seasons and limit the amount of training data.
We show how combining multiple models through ensembles can significantly improve the model selection and adaptation approach.
- Score: 5.953831950062808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is recent interest in using model hubs, a collection of pre-trained
models, in computer vision tasks. To utilize the model hub, we first select a
source model and then adapt the model for the target to compensate for
differences. While there is still limited research on model selection and
adaptation for computer vision tasks, this holds even more for the field of
renewable power. At the same time, meeting the increasing demand for power
forecasts based on weather features from numerical weather predictions is a
crucial challenge. We close these gaps by conducting the first thorough
experiment on model selection and adaptation for transfer learning in
renewable power forecasting, adopting recent results from the field of
computer vision on six datasets. We adapt models based on data from different
seasons and limit the amount of training data. As an extension of the current
state of the art, we utilize a Bayesian linear regression for forecasting the
response based on features extracted from a neural network. This approach
outperforms the baseline with only seven days of training data. We further show
how combining multiple models through ensembles can significantly improve the
model selection and adaptation approach. In fact, with more than 30 days of
training data, both proposed model combination techniques achieve similar
results to those models trained with a full year of training data.
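The two techniques the abstract names can be sketched compactly. Below is a minimal NumPy sketch, not the authors' implementation: it assumes a hypothetical feature matrix `Phi` already extracted from a pre-trained network (the extractor, priors, and precision values in the paper may differ), fits a closed-form Bayesian linear regression on those features, and combines several models' forecasts by simple averaging as one elementary realization of the model-combination idea.

```python
import numpy as np

def blr_fit(Phi, y, alpha=1.0, beta=25.0):
    """Fit Bayesian linear regression on extracted features.

    Phi  : (n, d) feature matrix, e.g. activations taken from the
           penultimate layer of a pre-trained forecasting network
           (the extractor itself is an assumption, not shown here).
    y    : (n,) power targets.
    alpha: prior precision; beta: observation-noise precision.
    Returns the posterior mean and covariance of the weights.
    """
    d = Phi.shape[1]
    # Posterior: S^-1 = alpha*I + beta*Phi^T Phi, m = beta*S Phi^T y
    S_inv = alpha * np.eye(d) + beta * Phi.T @ Phi
    S = np.linalg.inv(S_inv)
    m = beta * S @ Phi.T @ y
    return m, S

def blr_predict(Phi_new, m, S, beta=25.0):
    """Predictive mean and variance for new feature rows."""
    mean = Phi_new @ m
    # Per-row quadratic form Phi_new S Phi_new^T (diagonal only)
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", Phi_new, S, Phi_new)
    return mean, var

def ensemble_forecast(predictions):
    """Average the forecasts of several source models: one simple
    way to combine multiple adapted models into an ensemble."""
    return np.mean(np.stack(predictions), axis=0)
```

The closed-form posterior also yields predictive variances for free, which is one motivation for replacing a network's final linear layer with a Bayesian one when only a few days of target data are available.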
Related papers
- An exactly solvable model for emergence and scaling laws [2.598133279943607]
We present a framework where each new ability (a skill) is represented as a basis function.
We find analytic expressions for the emergence of new skills, as well as for scaling laws of the loss with training time, data size, model size, and optimal compute.
Our simple model captures, using a single fit parameter, the sigmoidal emergence of multiple new skills as training time, data size or model size increases in the neural network.
arXiv Detail & Related papers (2024-04-26T17:45:32Z)
- A Two-Phase Recall-and-Select Framework for Fast Model Selection [13.385915962994806]
We propose a two-phase (coarse-recall and fine-selection) model selection framework.
It aims to enhance the efficiency of selecting a robust model by leveraging the models' training performances on benchmark datasets.
It has been demonstrated that the proposed methodology selects a high-performing model about 3x faster than conventional baseline methods.
arXiv Detail & Related papers (2024-03-28T14:44:44Z)
- Diffusion-based Neural Network Weights Generation [85.6725307453325]
We propose an efficient and adaptive transfer learning scheme through dataset-conditioned pretrained weights sampling.
Specifically, we use a latent diffusion model with a variational autoencoder that can reconstruct the neural network weights.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show it is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes [100.69714600180895]
Offline Q-learning algorithms exhibit strong performance that scales with model capacity.
We train a single policy on 40 games with near-human performance using networks of up to 80 million parameters.
Compared to return-conditioned supervised approaches, offline Q-learning scales similarly with model capacity and has better performance, especially when the dataset is suboptimal.
arXiv Detail & Related papers (2022-11-28T08:56:42Z)
- Revealing Secrets From Pre-trained Models [2.0249686991196123]
Transfer-learning has been widely adopted in many emerging deep learning algorithms.
We show that pre-trained models and their fine-tuned derivatives have highly similar weight values.
We propose a new model extraction attack that reveals the model architecture and the pre-trained model used by the black-box victim model.
arXiv Detail & Related papers (2022-07-19T20:19:03Z)
- Do We Really Need Deep Learning Models for Time Series Forecasting? [4.2698418800007865]
Time series forecasting is a crucial task in machine learning, as it has a wide range of applications.
Deep learning and matrix factorization models have recently been proposed to tackle time series forecasting with more competitive performance.
In this paper, we try to answer whether these highly complex deep learning models are truly without simpler alternatives.
arXiv Detail & Related papers (2021-01-06T16:18:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.