Scaling Law for Time Series Forecasting
- URL: http://arxiv.org/abs/2405.15124v3
- Date: Thu, 26 Sep 2024 18:11:44 GMT
- Title: Scaling Law for Time Series Forecasting
- Authors: Jingzhe Shi, Qinwei Ma, Huan Ma, Lei Li
- Abstract summary: Scaling law that rewards large datasets, complex models and enhanced data granularity has been observed in various fields of deep learning.
Yet, studies on time series forecasting have cast doubt on scaling behaviors of deep learning methods for time series forecasting.
We propose a theory for scaling law for time series forecasting that can explain these seemingly abnormal behaviors.
- Score: 8.967263259533036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scaling law that rewards large datasets, complex models and enhanced data granularity has been observed in various fields of deep learning. Yet, studies on time series forecasting have cast doubt on scaling behaviors of deep learning methods for time series forecasting: while more training data improves performance, more capable models do not always outperform less capable models, and longer input horizons may hurt performance for some models. We propose a theory for scaling law for time series forecasting that can explain these seemingly abnormal behaviors. We take into account the impact of dataset size and model complexity, as well as time series data granularity, particularly focusing on the look-back horizon, an aspect that has been unexplored in previous theories. Furthermore, we empirically evaluate various models using a diverse set of time series forecasting datasets, which (1) verifies the validity of scaling law on dataset size and model complexity within the realm of time series forecasting, and (2) validates our theoretical framework, particularly regarding the influence of look back horizon. We hope our findings may inspire new models targeting time series forecasting datasets of limited size, as well as large foundational datasets and models for time series forecasting in future works. Codes for our experiments will be made public at: https://github.com/JingzheShi/ScalingLawForTimeSeriesForecasting.
Related papers
- GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation [90.53485251837235]
GIFT-Eval is a pioneering benchmark aimed at promoting evaluation across diverse datasets.
GIFT-Eval encompasses 28 datasets over 144,000 time series and 177 million data points.
We also provide a non-leaking pretraining dataset containing approximately 230 billion data points.
arXiv Detail & Related papers (2024-10-14T11:29:38Z)
- Probing the Robustness of Time-series Forecasting Models with CounterfacTS [1.823020744088554]
We present and publicly release CounterfacTS, a tool to probe the robustness of deep learning models in time-series forecasting tasks.
CounterfacTS has a user-friendly interface that allows the user to visualize, compare and quantify time series data and their forecasts.
arXiv Detail & Related papers (2024-03-06T07:34:47Z)
- Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai).
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z)
- Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z)
- Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z)
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show that our pre-trained method is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z)
- TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.