Deep Double Descent for Time Series Forecasting: Avoiding Undertrained
Models
- URL: http://arxiv.org/abs/2311.01442v3
- Date: Thu, 30 Nov 2023 06:51:26 GMT
- Title: Deep Double Descent for Time Series Forecasting: Avoiding Undertrained
Models
- Authors: Valentino Assandri, Sam Heshmati, Burhaneddin Yaman, Anton Iakovlev,
Ariel Emiliano Repetur
- Abstract summary: We investigate deep double descent in several Transformer models trained on public time series data sets.
We achieve state-of-the-art results for long sequence time series forecasting in nearly 70% of the 72 benchmarks tested.
This suggests that many models in the literature may possess untapped potential.
- Score: 1.7243216387069678
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models, particularly Transformers, have achieved impressive
results in various domains, including time series forecasting. While existing
time series literature primarily focuses on model architecture modifications
and data augmentation techniques, this paper explores the training schema of
deep learning models for time series: how models are trained regardless of
their architecture. We perform extensive experiments to investigate the
occurrence of deep double descent in several Transformer models trained on
public time series data sets. We demonstrate epoch-wise deep double descent and
show that overfitting can be reversed by training for more epochs. Leveraging these findings,
we achieve state-of-the-art results for long sequence time series forecasting
in nearly 70% of the 72 benchmarks tested. This suggests that many models in
the literature may possess untapped potential. Additionally, we introduce a
taxonomy for classifying training schema modifications, covering data
augmentation, model inputs, model targets, time series per model, and
computational budget.
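As a minimal, hypothetical sketch of the training schema discussed above (not the authors' code), the Python snippet below trains a forecasting model well past the point where validation loss first rises and keeps the globally best checkpoint, so that an epoch-wise second descent is not cut off by short-patience early stopping. The model, data loaders, learning rate, and epoch budget are placeholder assumptions.

```python
# Minimal sketch (not the authors' implementation): train far past the first
# sign of overfitting and keep the best checkpoint, so an epoch-wise second
# descent is not missed by early stopping with a short patience.
import copy
import torch
from torch import nn

def train_with_extended_budget(model, train_loader, val_loader,
                               max_epochs=1000, lr=1e-4, device="cpu"):
    """Placeholder training loop; max_epochs is deliberately large."""
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    best_val, best_state, history = float("inf"), None, []

    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:                 # x: past window, y: forecast horizon
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()

        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(x.to(device)), y.to(device)).item()
                           for x, y in val_loader) / len(val_loader)
        history.append(val_loss)                  # inspect this curve for a second descent
        if val_loss < best_val:                   # keep the global best, not the
            best_val = val_loss                   # first local minimum
            best_state = copy.deepcopy(model.state_dict())

    model.load_state_dict(best_state)
    return model, history
```

Plotting the returned history over epochs is where an epoch-wise double descent curve (descent, ascent, second descent) would show up, if present.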
Related papers
- Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts [103.725112190618]
This paper introduces Moirai-MoE, which uses a single input/output projection layer while delegating the modeling of diverse time series patterns to a sparse mixture of experts.
Extensive experiments on 39 datasets demonstrate the superiority of Moirai-MoE over existing foundation models in both in-distribution and zero-shot scenarios.
arXiv Detail & Related papers (2024-10-14T13:01:11Z)
- Deep Time Series Models: A Comprehensive Survey and Benchmark [74.28364194333447]
Time series data is of great significance in real-world scenarios.
Recent years have witnessed remarkable breakthroughs in the time series community.
We release Time Series Library (TSLib) as a fair benchmark of deep time series models for diverse analysis tasks.
arXiv Detail & Related papers (2024-07-18T08:31:55Z)
- Scaling Law for Time Series Forecasting [8.967263259533036]
Scaling laws that reward large datasets, complex models, and enhanced data granularity have been observed in various fields of deep learning.
Yet, studies on time series forecasting have cast doubt on scaling behaviors of deep learning methods for time series forecasting.
We propose a theory of scaling laws for time series forecasting that can explain these seemingly abnormal behaviors.
arXiv Detail & Related papers (2024-05-24T00:46:27Z)
- Chronos: Learning the Language of Time Series [79.38691251254173]
Chronos is a framework for pretrained probabilistic time series models.
We show that Chronos models can leverage time series data from diverse domains to improve zero-shot accuracy on unseen forecasting tasks.
arXiv Detail & Related papers (2024-03-12T16:53:54Z)
- Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present Moirai, a Masked Encoder-based Universal Time Series Forecasting Transformer.
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z)
- Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z)
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show that our pre-trained method is a strong zero-shot baseline and benefits from further scaling in both model and dataset size.
Accompanying these datasets is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z)
- TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting [24.834846119163885]
We propose a novel framework, TEMPO, that can effectively learn time series representations.
TEMPO expands the capability for dynamically modeling real-world temporal phenomena from data within diverse domains.
arXiv Detail & Related papers (2023-10-08T00:02:25Z)
- Boosted Embeddings for Time Series Forecasting [0.6042845803090501]
We propose a novel time series forecasting model, DeepGB.
We formulate and implement a variant of gradient boosting in which the weak learners are DNNs whose weights are found incrementally, in a greedy manner, over iterations (a generic sketch of this scheme appears after this list).
We demonstrate that our model outperforms existing comparable state-of-the-art models on real-world sensor data and a public dataset.
arXiv Detail & Related papers (2021-04-10T14:38:11Z)
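For illustration, here is a rough, generic sketch (an assumption-laden toy, not the paper's DeepGB implementation) of gradient boosting under squared error with small MLPs as weak learners, each fit greedily to the current residuals; the number of rounds, layer sizes, and shrinkage factor are arbitrary placeholders.

```python
# Generic sketch of boosting with DNN weak learners (squared-error residual fitting).
import torch
from torch import nn

def fit_boosted_mlps(x, y, n_rounds=5, lr=1e-3, epochs=200, shrinkage=0.5):
    """Fit small MLP weak learners stage-wise to residuals; x: [N, d], y: [N, 1]."""
    learners, residual = [], y.clone()
    for _ in range(n_rounds):
        weak = nn.Sequential(nn.Linear(x.shape[1], 32), nn.ReLU(), nn.Linear(32, 1))
        opt = torch.optim.Adam(weak.parameters(), lr=lr)
        for _ in range(epochs):                        # greedy fit of this learner only
            opt.zero_grad()
            loss = nn.functional.mse_loss(weak(x), residual)
            loss.backward()
            opt.step()
        with torch.no_grad():
            residual = residual - shrinkage * weak(x)  # stage-wise residual update
        learners.append(weak)
    return learners

def predict_boosted(learners, x, shrinkage=0.5):
    """Ensemble prediction: sum of shrunken weak-learner outputs."""
    with torch.no_grad():
        return sum(shrinkage * weak(x) for weak in learners)
```

With float tensors x of shape [N, d] and y of shape [N, 1], fit_boosted_mlps(x, y) returns the list of weak learners, and predict_boosted sums their shrunken contributions.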
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.