Model Selection, Adaptation, and Combination for Deep Transfer Learning
through Neural Networks in Renewable Energies
- URL: http://arxiv.org/abs/2204.13293v1
- Date: Thu, 28 Apr 2022 05:34:50 GMT
- Title: Model Selection, Adaptation, and Combination for Deep Transfer Learning
through Neural Networks in Renewable Energies
- Authors: Jens Schreiber and Bernhard Sick
- Abstract summary: We conduct the first thorough experiment on model selection and adaptation for transfer learning in renewable power forecasting.
We adapt models based on data from different seasons and limit the amount of training data.
We show how combining multiple models through ensembles can significantly improve the model selection and adaptation approach.
- Score: 5.953831950062808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is recent interest in using model hubs, a collection of pre-trained
models, in computer vision tasks. To utilize the model hub, we first select a
source model and then adapt the model for the target to compensate for
differences. While there is still limited research on model selection and
adaptation for computer vision tasks, this holds even more for the field of
renewable power. At the same time, meeting the increasing demand for power
forecasts based on weather features from numerical weather predictions is a
crucial challenge. We close these gaps by conducting the first thorough
experiment on model selection and adaptation for transfer learning in
renewable power forecasting, adopting recent results from the field of
computer vision on six datasets. We adapt models based on data from different
seasons and limit the amount of training data. As an extension of the current
state of the art, we utilize a Bayesian linear regression for forecasting the
response based on features extracted from a neural network. This approach
outperforms the baseline with only seven days of training data. We further show
how combining multiple models through ensembles can significantly improve the
model selection and adaptation approach. In fact, with more than 30 days of
training data, both proposed model combination techniques achieve similar
results to those models trained with a full year of training data.
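The two techniques the abstract names can be sketched compactly. Below is a minimal NumPy sketch, not the authors' implementation: it assumes a hypothetical feature matrix `Phi` already extracted from a pre-trained network (the extractor, priors, and precision values in the paper may differ), fits a closed-form Bayesian linear regression on those features, and combines several models' forecasts by simple averaging as one elementary realization of the model-combination idea.

```python
import numpy as np

def blr_fit(Phi, y, alpha=1.0, beta=25.0):
    """Fit Bayesian linear regression on extracted features.

    Phi  : (n, d) feature matrix, e.g. activations taken from the
           penultimate layer of a pre-trained forecasting network
           (the extractor itself is an assumption, not shown here).
    y    : (n,) power targets.
    alpha: prior precision; beta: observation-noise precision.
    Returns the posterior mean and covariance of the weights.
    """
    d = Phi.shape[1]
    # Posterior: S^-1 = alpha*I + beta*Phi^T Phi, m = beta*S Phi^T y
    S_inv = alpha * np.eye(d) + beta * Phi.T @ Phi
    S = np.linalg.inv(S_inv)
    m = beta * S @ Phi.T @ y
    return m, S

def blr_predict(Phi_new, m, S, beta=25.0):
    """Predictive mean and variance for new feature rows."""
    mean = Phi_new @ m
    # Per-row quadratic form Phi_new S Phi_new^T (diagonal only)
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", Phi_new, S, Phi_new)
    return mean, var

def ensemble_forecast(predictions):
    """Average the forecasts of several source models: one simple
    way to combine multiple adapted models into an ensemble."""
    return np.mean(np.stack(predictions), axis=0)
```

The closed-form posterior also yields predictive variances for free, which is one motivation for replacing a network's final linear layer with a Bayesian one when only a few days of target data are available.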
Related papers
- An exactly solvable model for emergence and scaling laws [2.598133279943607]
We present a framework where each new ability (a skill) is represented as a basis function.
We find analytic expressions for the emergence of new skills, as well as for scaling laws of the loss with training time, data size, model size, and optimal compute.
Our simple model captures, using a single fit parameter, the sigmoidal emergence of multiple new skills as training time, data size or model size increases in the neural network.
arXiv Detail & Related papers (2024-04-26T17:45:32Z)
- A Two-Phase Recall-and-Select Framework for Fast Model Selection [13.385915962994806]
We propose a two-phase (coarse-recall and fine-selection) model selection framework.
It aims to enhance the efficiency of selecting a robust model by leveraging the models' training performances on benchmark datasets.
It has been demonstrated that the proposed methodology selects a high-performing model about 3x faster than conventional baseline methods.
arXiv Detail & Related papers (2024-03-28T14:44:44Z)
- Diffusion-based Neural Network Weights Generation [85.6725307453325]
We propose an efficient and adaptive transfer learning scheme through dataset-conditioned pretrained weights sampling.
Specifically, we use a latent diffusion model with a variational autoencoder that can reconstruct the neural network weights.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show it is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes [100.69714600180895]
Offline Q-learning algorithms exhibit strong performance that scales with model capacity.
We train a single policy on 40 games with near-human performance using networks of up to 80 million parameters.
Compared to return-conditioned supervised approaches, offline Q-learning scales similarly with model capacity and has better performance, especially when the dataset is suboptimal.
arXiv Detail & Related papers (2022-11-28T08:56:42Z)
- Revealing Secrets From Pre-trained Models [2.0249686991196123]
Transfer-learning has been widely adopted in many emerging deep learning algorithms.
We show that pre-trained models and their fine-tuned derivatives have highly similar weight values.
We propose a new model extraction attack that reveals the model architecture and the pre-trained model used by the black-box victim model.
arXiv Detail & Related papers (2022-07-19T20:19:03Z)
- Do We Really Need Deep Learning Models for Time Series Forecasting? [4.2698418800007865]
Time series forecasting is a crucial task in machine learning, as it has a wide range of applications.
Deep learning and matrix factorization models have recently been proposed to tackle time series forecasting with more competitive performance.
In this paper, we try to answer whether these highly complex deep learning models are truly without simpler alternatives.
arXiv Detail & Related papers (2021-01-06T16:18:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.