Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting
- URL: http://arxiv.org/abs/2405.10216v1
- Date: Thu, 16 May 2024 16:05:33 GMT
- Title: Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting
- Authors: Divij Gupta, Anubhav Bhatti, Suraj Parmar, Chen Dan, Yuwei Liu, Bingjie Shen, San Lee,
- Abstract summary: Low-Rank Adaptation (LoRA) is a technique for fine-tuning large pre-trained or foundational models across different modalities and tasks.
This paper examines the impact of LoRA on contemporary time series foundational models: Lag-Llama, MOIRAI, and Chronos.
- Score: 5.354055742467354
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Low-Rank Adaptation (LoRA) is a widely used technique for fine-tuning large pre-trained or foundational models across different modalities and tasks. However, its application to time series data, particularly within foundational models, remains underexplored. This paper examines the impact of LoRA on contemporary time series foundational models: Lag-Llama, MOIRAI, and Chronos. We demonstrate LoRA's fine-tuning potential for forecasting the vital signs of sepsis patients in intensive care units (ICUs), emphasizing the models' adaptability to previously unseen, out-of-domain modalities. Integrating LoRA aims to enhance forecasting performance while reducing inefficiencies associated with fine-tuning large models on limited domain-specific data. Our experiments show that LoRA fine-tuning of time series foundational models significantly improves forecasting, achieving results comparable to state-of-the-art models trained from scratch on similar modalities. We conduct comprehensive ablation studies to demonstrate the trade-offs between the number of tunable parameters and forecasting performance and assess the impact of varying LoRA matrix ranks on model performance.
Related papers
- Fuxi-DA: A Generalized Deep Learning Data Assimilation Framework for Assimilating Satellite Observations [15.934673617658609]
Deep learning models have shown promise in matching, even surpassing, the forecast accuracy of leading NWP models worldwide.
This study introduces FuxiDA, a generalized DL-based DA framework for assimilating satellite observations.
By assimilating data from Advanced Geosynchronous Radiation Imager (AGRI) aboard Fengyun-4B, FuXi-DA consistently mitigates analysis errors and significantly improves forecast performance.
arXiv Detail & Related papers (2024-04-12T15:02:14Z) - Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - Lag-Llama: Towards Foundation Models for Probabilistic Time Series
Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z) - Enhancing Deep Traffic Forecasting Models with Dynamic Regression [15.31488551912888]
This paper introduces a dynamic regression (DR) framework to enhance existing traffic forecasting models by structured learning for the residual process.
We evaluate the effectiveness of the proposed framework on deep traffic forecasting models using both speed and flow datasets.
arXiv Detail & Related papers (2023-01-17T01:12:44Z) - An Attention Free Long Short-Term Memory for Time Series Forecasting [0.0]
We focused on time series forecasting using attention free mechanism, a more efficient framework, and proposed a new architecture for time series prediction.
We proposed an architecture built using attention free LSTM layers that overcome linear models for conditional variance prediction.
arXiv Detail & Related papers (2022-09-20T08:23:49Z) - Deep Autoregressive Models with Spectral Attention [74.08846528440024]
We propose a forecasting architecture that combines deep autoregressive models with a Spectral Attention (SA) module.
By characterizing in the spectral domain the embedding of the time series as occurrences of a random process, our method can identify global trends and seasonality patterns.
Two spectral attention models, global and local to the time series, integrate this information within the forecast and perform spectral filtering to remove time series's noise.
arXiv Detail & Related papers (2021-07-13T11:08:47Z) - Back2Future: Leveraging Backfill Dynamics for Improving Real-time
Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
'Backfill' phenomenon and its effect on model performance has been barely studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z) - Model-Attentive Ensemble Learning for Sequence Modeling [86.4785354333566]
We present Model-Attentive Ensemble learning for Sequence modeling (MAES)
MAES is a mixture of time-series experts which leverages an attention-based gating mechanism to specialize the experts on different sequence dynamics and adaptively weight their predictions.
We demonstrate that MAES significantly out-performs popular sequence models on datasets subject to temporal shift.
arXiv Detail & Related papers (2021-02-23T05:23:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.