Pushing the Limits of Pre-training for Time Series Forecasting in the
CloudOps Domain
- URL: http://arxiv.org/abs/2310.05063v3
- Date: Tue, 5 Dec 2023 12:44:36 GMT
- Title: Pushing the Limits of Pre-training for Time Series Forecasting in the
CloudOps Domain
- Authors: Gerald Woo, Chenghao Liu, Akshat Kumar, Doyen Sahoo
- Abstract summary: We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We identify a promising candidate architecture and show that it is a strong zero-shot baseline which benefits from further scaling, both in model and dataset size.
Accompanying these datasets is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
- Score: 54.67888148566323
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series has been left behind in the era of pre-training and transfer
learning. While research in the fields of natural language processing and
computer vision is enjoying progressively larger datasets to train massive
models, the most popular time series datasets consist of only tens of thousands
of time steps, limiting our ability to study the effectiveness of pre-training
and scaling. Recent studies have also cast doubt on the need for expressive
models and scale. To alleviate these issues, we introduce three large-scale
time series forecasting datasets from the cloud operations (CloudOps) domain,
the largest having billions of observations, enabling further study into
pre-training and scaling of time series models. We build the empirical
groundwork for studying pre-training and scaling of time series models and pave
the way for future research by identifying a promising candidate architecture.
We show that it is a strong zero-shot baseline and benefits from further
scaling, both in model and dataset size. Accompanying these datasets is a
suite of comprehensive benchmark results comparing classical and deep
learning baselines to our pre-trained method, which achieves a 27% reduction
in error on the largest dataset. Code and datasets can be found at
https://github.com/SalesforceAIResearch/pretrain-time-series-cloudops.
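For readers who want to get hands-on with the released data, below is a minimal sketch of loading one of the CloudOps datasets and scoring a naive baseline. It assumes the datasets are published on the Hugging Face Hub; the repository ID "Salesforce/cloudops_tsf", the config name "azure_vm_traces_2017", the split name, the forecast horizon, and a univariate "target" field are all assumptions, so consult the GitHub repository above for the exact identifiers.

```python
# A minimal sketch, not the repository's own code. The dataset repo ID, config
# name, split name, horizon, and "target" field are assumptions; check the
# GitHub repository linked above for the actual identifiers.
import numpy as np
from datasets import load_dataset

HORIZON = 48  # assumed forecast horizon

ds = load_dataset("Salesforce/cloudops_tsf", "azure_vm_traces_2017",
                  split="train_test")

errors = []
for record in ds.select(range(100)):        # small sample, for illustration only
    target = np.asarray(record["target"], dtype=float)  # assumed univariate
    if target.ndim != 1 or target.size <= 2 * HORIZON:
        continue
    context, future = target[:-HORIZON], target[-HORIZON:]
    forecast = np.full(HORIZON, context[-1])             # naive last-value forecast
    denom = np.maximum(np.abs(future) + np.abs(forecast), 1e-12)
    errors.append(np.mean(2.0 * np.abs(future - forecast) / denom))

print(f"Naive-forecast sMAPE over a 100-series sample: {np.mean(errors):.3f}")
```

Such a naive forecast is only a sanity check; the paper's benchmark compares classical and deep learning baselines against the pre-trained transformer.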
Related papers
- GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation [90.53485251837235]
GIFT-Eval is a pioneering benchmark aimed at promoting evaluation across diverse datasets.
GIFT-Eval encompasses 28 datasets comprising over 144,000 time series and 177 million data points.
We also provide a non-leaking pretraining dataset containing approximately 230 billion data points.
arXiv Detail & Related papers (2024-10-14T11:29:38Z)
- Scaling Law for Time Series Forecasting [8.967263259533036]
A scaling law that rewards large datasets, complex models and enhanced data granularity has been observed in various fields of deep learning.
Yet, studies on time series forecasting have cast doubt on scaling behaviors of deep learning methods for time series forecasting.
We propose a theory of scaling laws for time series forecasting that can explain these seemingly abnormal behaviors.
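For context, a commonly used parametric form of such scaling laws in the broader deep learning literature (an illustrative form, not necessarily the formulation proposed in this paper) writes the expected test loss as a sum of power-law terms in model and dataset size:

```latex
% Illustrative scaling-law form from the broader deep learning literature,
% not necessarily this paper's formulation: N is the parameter count, D the
% number of training points, and E, A, B, \alpha, \beta are fitted constants.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```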
arXiv Detail & Related papers (2024-05-24T00:46:27Z)
- Chronos: Learning the Language of Time Series [79.38691251254173]
Chronos is a framework for pretrained probabilistic time series models.
We show that Chronos models can leverage time series data from diverse domains to improve zero-shot accuracy on unseen forecasting tasks.
arXiv Detail & Related papers (2024-03-12T16:53:54Z)
- Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai).
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z)
- Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z)
- Large Pre-trained time series models for cross-domain Time series analysis tasks [20.228846068418765]
We propose a novel method of adaptive segmentation that automatically identifies the optimal dataset-specific segmentation strategy during pre-training.
This enables LPTM to perform similarly to or better than domain-specific state-of-the-art models when fine-tuned for different downstream time-series analysis tasks and under zero-shot settings.
arXiv Detail & Related papers (2023-11-19T20:16:16Z)
- Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z)
- NuTime: Numerically Multi-Scaled Embedding for Large-Scale Time-Series Pretraining [28.595342663018627]
We make key technical contributions that are tailored to the numerical properties of time-series data.
We adopt the Transformer architecture by first partitioning the input into non-overlapping windows.
To embed scalar values that may possess arbitrary numerical amplitudes in a high-dimensional space, we propose a numerically multi-scaled embedding module.
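To make the windowing and multi-scaled-embedding idea described above concrete, here is a minimal PyTorch sketch; it is an illustrative interpretation under assumed design choices (fixed scale factors, tanh squashing, per-window means as the embedded scalars), not the exact module proposed in NuTime.

```python
# Minimal sketch, not NuTime's exact module: the scale factors, the tanh
# squashing, and the use of per-window means are illustrative assumptions.
import torch
import torch.nn as nn

class MultiScaledScalarEmbedding(nn.Module):
    """Embed scalars of arbitrary amplitude by viewing them at several
    numerical scales, squashing each view, and projecting to the model dim."""

    def __init__(self, dim: int, scales=(1e-2, 1e-1, 1.0, 1e1, 1e2)):
        super().__init__()
        self.register_buffer("scales", torch.tensor(scales))
        self.proj = nn.Linear(len(scales), dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., 1) scalar values; output: (..., dim) embeddings.
        squashed = torch.tanh(x * self.scales)  # (..., num_scales), bounded
        return self.proj(squashed)

# Partition a batch of series into non-overlapping windows and embed the
# per-window mean; window size 32 and embedding dim 64 are arbitrary choices.
series = torch.randn(8, 1024) * 1e3                 # amplitudes far from unit scale
windows = series.unfold(1, 32, 32)                  # (batch=8, num_windows=32, 32)
window_means = windows.mean(dim=-1, keepdim=True)   # (8, 32, 1)
tokens = MultiScaledScalarEmbedding(dim=64)(window_means)  # (8, 32, 64)
print(tokens.shape)
```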
arXiv Detail & Related papers (2023-10-11T11:38:18Z)
- AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud Dataset [25.935496432142976]
It is a long-term vision of the Autonomous Driving (AD) community that perception models can learn from a large-scale point cloud dataset.
We formulate the point-cloud pre-training task as a semi-supervised problem, which leverages the few-shot labeled and massive unlabeled point-cloud data.
We achieve significant performance gains on a series of downstream perception benchmarks, including nuScenes and KITTI, under different baseline models.
arXiv Detail & Related papers (2023-06-01T12:32:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.