Transformers and Their Roles as Time Series Foundation Models
- URL: http://arxiv.org/abs/2502.03383v1
- Date: Wed, 05 Feb 2025 17:18:55 GMT
- Title: Transformers and Their Roles as Time Series Foundation Models
- Authors: Dennis Wu, Yihan He, Yuan Cao, Jianqing Fan, Han Liu,
- Abstract summary: We analyze transformers as time series foundation models, focusing on their approximation and generalization capabilities.
We prove that MOIRAI is capable of automatically fitting autoregressive models with an arbitrary number of covariates.
Experiments support our theoretical findings, highlighting the efficacy of transformers as time series foundation models.
- Score: 14.61139607588868
- License:
- Abstract: We give a comprehensive analysis of transformers as time series foundation models, focusing on their approximation and generalization capabilities. First, we demonstrate that there exist transformers that fit an autoregressive model on input univariate time series via gradient descent. We then analyze MOIRAI, a multivariate time series foundation model capable of handling an arbitrary number of covariates. We prove that it is capable of automatically fitting autoregressive models with an arbitrary number of covariates, offering insights into its design and empirical success. For generalization, we establish bounds for pretraining when the data satisfies Dobrushin's condition. Experiments support our theoretical findings, highlighting the efficacy of transformers as time series foundation models.
Related papers
- Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts [103.725112190618]
This paper introduces Moirai-MoE, using a single input/output projection layer while delegating the modeling of diverse time series patterns to the sparse mixture of experts.
Extensive experiments on 39 datasets demonstrate the superiority of Moirai-MoE over existing foundation models in both in-distribution and zero-shot scenarios.
arXiv Detail & Related papers (2024-10-14T13:01:11Z) - Enforcing Interpretability in Time Series Transformers: A Concept Bottleneck Framework [2.8470354623829577]
We develop a framework based on Concept Bottleneck Models to enforce interpretability of time series Transformers.
We modify the training objective to encourage a model to develop representations similar to predefined interpretable concepts.
We find that the model performance remains mostly unaffected, while the model shows much improved interpretability.
arXiv Detail & Related papers (2024-10-08T14:22:40Z) - Predictive Modeling in the Reservoir Kernel Motif Space [0.9217021281095907]
This work proposes a time series prediction method based on the kernel view of linear reservoirs.
We provide a geometric interpretation of our approach shedding light on how our approach is related to the core reservoir models.
Empirical experiments then compare predictive performances of our suggested model with those of recent state-of-art transformer based models.
arXiv Detail & Related papers (2024-05-11T16:12:25Z) - tsGT: Stochastic Time Series Modeling With Transformer [0.12905935507312413]
We introduce tsGT, a time series model built on a general-purpose transformer architecture.
We show that tsGT outperforms the state-of-the-art models on MAD and RMSE, and surpasses its peers on QL and CRPS, on four commonly used datasets.
arXiv Detail & Related papers (2024-03-08T22:59:41Z) - Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked-based Universal Time Series Forecasting Transformer (Moirai)
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z) - Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM)
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z) - Lag-Llama: Towards Foundation Models for Probabilistic Time Series
Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z) - Predicting Ordinary Differential Equations with Transformers [65.07437364102931]
We develop a transformer-based sequence-to-sequence model that recovers scalar ordinary differential equations (ODEs) in symbolic form from irregularly sampled and noisy observations of a single solution trajectory.
Our method is efficiently scalable: after one-time pretraining on a large set of ODEs, we can infer the governing law of a new observed solution in a few forward passes of the model.
arXiv Detail & Related papers (2023-07-24T08:46:12Z) - Non-autoregressive Conditional Diffusion Models for Time Series
Prediction [3.9722979176564763]
TimeDiff is a non-autoregressive diffusion model that achieves high-quality time series prediction.
We show that TimeDiff consistently outperforms existing time series diffusion models.
arXiv Detail & Related papers (2023-06-08T08:53:59Z) - Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
This is the first time that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z) - One Fits All:Power General Time Series Analysis by Pretrained LM [23.292260325891032]
We show that pre-trained models on natural language or images can lead to a comparable or state-of-the-art performance in all main time series analysis tasks.
Our results demonstrate that pre-trained models on natural language or images can lead to a comparable or state-of-the-art performance in all main time series analysis tasks.
arXiv Detail & Related papers (2023-02-23T11:37:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.