Transfer Learning with Foundational Models for Time Series Forecasting using Low-Rank Adaptations
- URL: http://arxiv.org/abs/2410.11539v1
- Date: Tue, 15 Oct 2024 12:14:01 GMT
- Title: Transfer Learning with Foundational Models for Time Series Forecasting using Low-Rank Adaptations
- Authors: M. Germán-Morales, A. J. Rivera-Rivas, M. J. del Jesus Díaz, C. J. Carmona,
- Abstract summary: This study proposes LLIAM, the Llama Lora-Integrated Autorregresive Model.
Low-Rank Adaptations are used to enhance the knowledge of the model with diverse time series datasets, known as the fine-tuning phase.
- Score: 0.0
- License:
- Abstract: High computational power and the availability of large datasets have supported the development of Foundational Models. They are a new emerging technique widely used in Generative Artificial Intelligence, characterized by their scalability and their use in Transfer Learning. The enormous and heterogeneous amounts of data used in their initial training phase, known as pre-training, give them a higher generalization capacity than any other specific model, constituting a solid base that can be adapted or adjusted to a wide range of tasks, increasing their applicability. This study proposes LLIAM, the Llama Lora-Integrated Autorregresive Model. Low-Rank Adaptations are used to enhance the knowledge of the model with diverse time series datasets, known as the fine-tuning phase. To illustrate the capabilities of our proposal, two sets of experiments have been carried out that obtained favorable and promising results with lower training times than other Deep Learning approaches. With this work, we also encourage the use of available resources (such as these pre-trained models) to avoid unnecessary and costly training, narrowing the gap between the goals of traditional Artificial Intelligence and those specified by the definition of Green Artificial Intelligence.
Related papers
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - A Survey of Deep Learning and Foundation Models for Time Series
Forecasting [16.814826712022324]
Deep learning has been successfully applied to many application domains, yet its advantages have been slow to emerge for time series forecasting.
Foundation models with extensive pre-training allow models to understand patterns and acquire knowledge that can be applied to new related problems.
There is ongoing research examining how to utilize or inject such knowledge into deep learning models.
arXiv Detail & Related papers (2024-01-25T03:14:07Z) - Fantastic Gains and Where to Find Them: On the Existence and Prospect of
General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - An Emulator for Fine-Tuning Large Language Models using Small Language
Models [91.02498576056057]
We introduce emulated fine-tuning (EFT), a principled and practical method for sampling from a distribution that approximates the result of pre-training and fine-tuning at different scales.
We show that EFT enables test-time adjustment of competing behavioral traits like helpfulness and harmlessness without additional training.
Finally, a special case of emulated fine-tuning, which we call LM up-scaling, avoids resource-intensive fine-tuning of large pre-trained models by ensembling them with small fine-tuned models.
arXiv Detail & Related papers (2023-10-19T17:57:16Z) - PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - Learning to Jump: Thinning and Thickening Latent Counts for Generative
Modeling [69.60713300418467]
Learning to jump is a general recipe for generative modeling of various types of data.
We demonstrate when learning to jump is expected to perform comparably to learning to denoise, and when it is expected to perform better.
arXiv Detail & Related papers (2023-05-28T05:38:28Z) - Phased Data Augmentation for Training a Likelihood-Based Generative Model with Limited Data [0.0]
Generative models excel in creating realistic images, yet their dependency on extensive datasets for training presents significant challenges.
Current data-efficient methods largely focus on GAN architectures, leaving a gap in training other types of generative models.
"phased data augmentation" is a novel technique that addresses this gap by optimizing training in limited data scenarios without altering the inherent data distribution.
arXiv Detail & Related papers (2023-05-22T03:38:59Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of
Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - Leveraging Jumpy Models for Planning and Fast Learning in Robotic
Domains [25.245208731491346]
We study the problem of learning multi-step dynamics prediction models (jumpy models) from unlabeled experience.
We propose to learn a jumpy model alongside a skill embedding space offline, from previously collected experience.
We conduct a set of experiments in the RGB-stacking environment, showing that planning with the learned skills and the associated model can enable zero-shot generalization to new tasks.
arXiv Detail & Related papers (2023-02-24T13:26:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.