Two Steps Forward and One Behind: Rethinking Time Series Forecasting
with Deep Learning
- URL: http://arxiv.org/abs/2304.04553v3
- Date: Mon, 8 May 2023 07:57:24 GMT
- Title: Two Steps Forward and One Behind: Rethinking Time Series Forecasting
with Deep Learning
- Authors: Riccardo Ughi, Eugenio Lomurno and Matteo Matteucci
- Abstract summary: The Transformer is a highly successful deep learning model that has revolutionised the world of artificial neural networks.
We investigate the effectiveness of Transformer-based models applied to the domain of time series forecasting.
We propose a set of alternative models that are better performing and significantly less complex.
- Score: 7.967995669387532
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The Transformer is a highly successful deep learning model that has
revolutionised the world of artificial neural networks, first in natural
language processing and later in computer vision. This model is based on the
attention mechanism and is able to capture complex semantic relationships
between a variety of patterns present in the input data. Precisely because of
these characteristics, the Transformer has recently been exploited for time
series forecasting problems, assuming a natural adaptability to the domain of
continuous numerical series. Despite the acclaimed results in the literature,
some works have raised doubts about the robustness and effectiveness of this
approach. In this paper, we further investigate the effectiveness of
Transformer-based models applied to the domain of time series forecasting,
demonstrate their limitations, and propose a set of alternative models that are
better performing and significantly less complex. In particular, we empirically
show how simplifying Transformer-based forecasting models almost always leads
to an improvement, reaching state of the art performance. We also propose
shallow models without the attention mechanism, which compete with the overall
state of the art in long time series forecasting, and demonstrate their ability
to accurately predict time series over extremely long windows. From a
methodological perspective, we show how it is always necessary to use a simple
baseline to verify the effectiveness of proposed models, and finally, we
conclude the paper with a reflection on recent research paths and the
opportunity to follow trends and hypes even where it may not be necessary.
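As a rough illustration of the kind of simple, attention-free baseline the paper argues for (not the authors' exact architecture), a single linear map from the lookback window to the forecast horizon already makes a competitive reference point; the window and horizon sizes below are arbitrary.

```python
# Minimal sketch of an attention-free linear baseline for long-horizon
# forecasting, in the spirit of the simple models the paper advocates.
# Not the authors' exact architecture; sizes are illustrative.
import torch
import torch.nn as nn

class LinearForecaster(nn.Module):
    def __init__(self, lookback: int, horizon: int):
        super().__init__()
        # One shared linear map per channel from past window to future window.
        self.proj = nn.Linear(lookback, horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, lookback, channels)
        x = x.transpose(1, 2)          # (batch, channels, lookback)
        y = self.proj(x)               # (batch, channels, horizon)
        return y.transpose(1, 2)       # (batch, horizon, channels)

model = LinearForecaster(lookback=336, horizon=720)
dummy = torch.randn(8, 336, 7)         # e.g. a 7-variate series
print(model(dummy).shape)              # torch.Size([8, 720, 7])
```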
Related papers
- Powerformer: A Transformer with Weighted Causal Attention for Time-series Forecasting [50.298817606660826]
We introduce Powerformer, a novel Transformer variant that replaces noncausal attention weights with causal weights that are reweighted according to a smooth heavy-tailed decay.
Our empirical results demonstrate that Powerformer achieves state-of-the-art accuracy on public time-series benchmarks.
Our analyses show that the model's locality bias is amplified during training, demonstrating an interplay between time-series data and power-law-based attention.
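As a hedged sketch of the stated idea (not the paper's implementation), causal attention scores can be biased with a power-law decay over time lags before the softmax; the decay exponent alpha below is illustrative.

```python
# Sketch of causally masked attention with a heavy-tailed (power-law) decay
# over time lags, loosely in the spirit of weighted causal attention.
# Not the Powerformer implementation; alpha is an illustrative exponent.
import torch

def power_law_causal_attention(q, k, v, alpha: float = 1.0):
    # q, k, v: (batch, heads, seq_len, dim)
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5           # (B, H, T, T)
    t = torch.arange(q.size(-2), device=q.device)
    lag = (t[:, None] - t[None, :]).float()               # lag[i, j] = i - j
    causal = lag >= 0                                      # attend only to past/current steps
    decay = (lag.clamp(min=0.0) + 1.0).pow(-alpha).log()   # power-law decay as log-space bias
    bias = torch.where(causal, decay, torch.full_like(decay, float("-inf")))
    weights = torch.softmax(scores + bias, dim=-1)         # reweighted causal attention
    return weights @ v

# Usage: q = k = v = torch.randn(2, 4, 96, 16); out = power_law_causal_attention(q, k, v)
```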
arXiv Detail & Related papers (2025-02-10T04:42:11Z)
- A Comparative Study of Pruning Methods in Transformer-based Time Series Forecasting [0.07916635054977067]
Pruning is an established approach to reduce neural network parameter count and save compute.
We study the effects of these pruning strategies on model predictive performance and computational aspects like model size, operations, and inference time.
We demonstrate that even with corresponding hardware and software support, structured pruning is unable to provide significant time savings.
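A minimal sketch of the two pruning families compared, using PyTorch's built-in pruning utilities rather than the study's benchmark code; the pruning amounts below are arbitrary.

```python
# Sketch (not the study's benchmark code): unstructured vs. structured pruning
# of a layer with torch.nn.utils.prune, followed by a sparsity check.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer_unstructured = nn.Linear(512, 512)
layer_structured = nn.Linear(512, 512)

# Unstructured: zero the 50% of weights with smallest |w|.
prune.l1_unstructured(layer_unstructured, name="weight", amount=0.5)
# Structured: remove 50% of entire output rows by L2 norm (whether this yields
# real wall-clock savings is exactly what the study questions).
prune.ln_structured(layer_structured, name="weight", amount=0.5, n=2, dim=0)

for tag, layer in [("unstructured", layer_unstructured), ("structured", layer_structured)]:
    prune.remove(layer, "weight")  # bake the mask into the weight tensor
    sparsity = (layer.weight == 0).float().mean().item()
    print(f"{tag}: {sparsity:.0%} of weights are zero")
```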
arXiv Detail & Related papers (2024-12-17T13:07:31Z)
- Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization [74.3339999119713]
We develop a wavelet-based tokenizer that allows models to learn complex representations directly in the space of time-localized frequencies.
Our method first scales and decomposes the input time series, then thresholds and quantizes the wavelet coefficients, and finally pre-trains an autoregressive model to forecast coefficients for the forecast horizon.
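A hedged sketch of the described tokenization steps, using PyWavelets; this is not the paper's tokenizer, and the wavelet, threshold, and number of quantization bins are illustrative choices.

```python
# Sketch of the stated pipeline: scale, wavelet-decompose, threshold, quantize.
# Not the paper's tokenizer; wavelet, threshold, and bin count are illustrative.
import numpy as np
import pywt

def wavelet_tokenize(series: np.ndarray, wavelet: str = "db4",
                     level: int = 3, threshold: float = 0.1, n_bins: int = 256):
    # 1) Scale the series to zero mean, unit variance.
    x = (series - series.mean()) / (series.std() + 1e-8)
    # 2) Multi-level discrete wavelet decomposition (time-localized frequencies).
    coeffs = np.concatenate(pywt.wavedec(x, wavelet, level=level))
    # 3) Threshold small coefficients to zero.
    coeffs = np.where(np.abs(coeffs) < threshold, 0.0, coeffs)
    # 4) Quantize the remaining coefficients into discrete token ids
    #    that an autoregressive model could be pre-trained on.
    edges = np.linspace(coeffs.min(), coeffs.max(), n_bins - 1)
    return np.digitize(coeffs, edges)

tokens = wavelet_tokenize(np.sin(np.linspace(0, 20, 512)))
print(tokens[:10])
```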
arXiv Detail & Related papers (2024-12-06T18:22:59Z)
- LSEAttention is All You Need for Time Series Forecasting [0.0]
Transformer-based architectures have achieved remarkable success in natural language processing and computer vision.
Previous research has identified the traditional attention mechanism as a key factor limiting their effectiveness in time series forecasting.
We introduce LATST, a novel approach designed to mitigate entropy collapse and training instability, two common challenges in Transformer-based time series forecasting.
arXiv Detail & Related papers (2024-10-31T09:09:39Z)
- Are Self-Attentions Effective for Time Series Forecasting? [4.990206466948269]
Time series forecasting is crucial for applications across multiple domains and various scenarios.
Recent findings have indicated that simpler linear models might outperform complex Transformer-based approaches.
We introduce a new architecture, the Cross-Attention-only Time Series Transformer (CATS).
Our model achieves superior performance with the lowest mean squared error and uses fewer parameters compared to existing models.
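A minimal sketch of the cross-attention-only idea, in which learnable queries for the forecast horizon attend directly to the embedded past window and self-attention among inputs is dropped; this is not the CATS implementation, and all sizes are arbitrary.

```python
# Sketch of a cross-attention-only forecaster: horizon queries attend to the
# embedded past window; no self-attention. Not the CATS implementation.
import torch
import torch.nn as nn

class CrossAttentionForecaster(nn.Module):
    def __init__(self, channels: int, horizon: int, d_model: int = 64, heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(channels, d_model)            # embed each time step
        self.queries = nn.Parameter(torch.randn(horizon, d_model))
        self.cross_attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.head = nn.Linear(d_model, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, lookback, channels)
        kv = self.embed(x)                                    # (B, L, d_model)
        q = self.queries.unsqueeze(0).expand(x.size(0), -1, -1)
        out, _ = self.cross_attn(q, kv, kv)                   # horizon queries attend to past
        return self.head(out)                                 # (B, horizon, channels)

model = CrossAttentionForecaster(channels=7, horizon=96)
print(model(torch.randn(4, 336, 7)).shape)                    # torch.Size([4, 96, 7])
```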
arXiv Detail & Related papers (2024-05-27T06:49:39Z)
- On the Resurgence of Recurrent Models for Long Sequences -- Survey and Research Opportunities in the Transformer Era [59.279784235147254]
This survey is aimed at providing an overview of these trends framed under the unifying umbrella of Recurrence.
It emphasizes novel research opportunities that become prominent when abandoning the idea of processing long sequences.
arXiv Detail & Related papers (2024-02-12T23:55:55Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
- Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z)
- Learning Robust Precipitation Forecaster by Temporal Frame Interpolation [65.5045412005064]
We develop a robust precipitation forecasting model that demonstrates resilience against spatial-temporal discrepancies.
Our approach has led to significant improvements in forecasting precision, culminating in our model securing 1st place in the transfer learning leaderboard of the Weather4cast'23 competition.
arXiv Detail & Related papers (2023-11-30T08:22:08Z)
- MPR-Net: Multi-Scale Pattern Reproduction Guided Universality Time Series Interpretable Forecasting [13.790498420659636]
Time series forecasting has received wide interest from existing research due to its broad applications and inherent challenges.
This paper proposes a forecasting model, MPR-Net. It first adaptively decomposes multi-scale historical series patterns using a convolution operation, then constructs a pattern extension forecasting method based on the prior knowledge of pattern reproduction, and finally reconstructs future patterns into future series using a deconvolution operation.
By leveraging the temporal dependencies present in the time series, MPR-Net not only achieves linear time complexity, but also makes the forecasting process interpretable.
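A hedged sketch of the stated convolution-decompose / deconvolution-reconstruct idea using Conv1d and ConvTranspose1d; this is not the MPR-Net implementation, and the kernel sizes and extension factor are illustrative.

```python
# Sketch of the convolution-decompose / deconvolution-reconstruct idea:
# multi-scale Conv1d branches extract patterns from the history and a
# ConvTranspose1d maps them back to a longer future series. Illustrative only,
# not the MPR-Net implementation.
import torch
import torch.nn as nn

class ConvPatternForecaster(nn.Module):
    def __init__(self, channels: int, scales=(3, 7, 15), hidden: int = 16):
        super().__init__()
        # One convolutional branch per pattern scale (kernel size).
        self.decompose = nn.ModuleList(
            [nn.Conv1d(channels, hidden, k, padding=k // 2) for k in scales]
        )
        # Deconvolution produces a series twice as long as its input,
        # standing in for the "pattern extension" step.
        self.reconstruct = nn.ConvTranspose1d(hidden * len(scales), channels,
                                              kernel_size=2, stride=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, lookback, channels) -> conv layout (batch, channels, lookback)
        h = x.transpose(1, 2)
        patterns = torch.cat([branch(h) for branch in self.decompose], dim=1)
        future = self.reconstruct(patterns)                  # (B, channels, 2 * lookback)
        return future.transpose(1, 2)

model = ConvPatternForecaster(channels=7)
print(model(torch.randn(4, 96, 7)).shape)                    # torch.Size([4, 192, 7])
```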
arXiv Detail & Related papers (2023-07-13T13:16:01Z)
- Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case [2.997238772148965]
Time series data are prevalent in many scientific and engineering disciplines.
We present a new approach to time series forecasting using Transformer-based machine learning models.
We show that the forecasting results produced by our approach are favorably comparable to the state-of-the-art.
arXiv Detail & Related papers (2020-01-23T00:22:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.