First De-Trend then Attend: Rethinking Attention for Time-Series Forecasting
- URL: http://arxiv.org/abs/2212.08151v1
- Date: Thu, 15 Dec 2022 21:34:19 GMT
- Title: First De-Trend then Attend: Rethinking Attention for Time-Series Forecasting
- Authors: Xiyuan Zhang, Xiaoyong Jin, Karthick Gopalswamy, Gaurav Gupta,
Youngsuk Park, Xingjian Shi, Hao Wang, Danielle C. Maddix, Yuyang Wang
- Abstract summary: We seek to understand the relationships between attention models in different time and frequency domains.
We propose a new method: TDformer (Trend Decomposition Transformer), that first applies seasonal-trend decomposition.
Experiments on benchmark time-series forecasting datasets demonstrate that TDformer achieves state-of-the-art performance against existing attention-based models.
- Score: 17.89566168289471
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Transformer-based models have gained wide popularity and demonstrated
promising results in long-term time-series forecasting in recent years. In
addition to learning attention in the time domain, recent works also explore
learning attention in frequency domains (e.g., the Fourier domain, the wavelet
domain), given that seasonal patterns can be better captured in these domains.
In this work, we seek to understand the relationships between attention models
in different time and frequency domains. Theoretically, we show that attention
models in different domains are equivalent under linear conditions (i.e., a
linear kernel applied to the attention scores). Empirically, we analyze how
attention models in different domains behave differently through various
synthetic experiments with seasonality, trend and noise, with emphasis on the
role of the softmax operation therein. Both the theoretical and empirical
analyses motivate us to propose a new method, TDformer (Trend Decomposition
Transformer), which first applies seasonal-trend decomposition, and then
additively combines an MLP, which predicts the trend component, with Fourier
attention, which predicts the seasonal component, to obtain the final
prediction. Extensive experiments on benchmark time-series forecasting datasets
demonstrate that TDformer achieves state-of-the-art performance against
existing attention-based models.
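The claimed equivalence under a linear kernel can be checked numerically: with a unitary DFT matrix F, the frequency-domain computation (FQ)(FK)^H(FV) reduces to F(QK^T V), so transforming back recovers the time-domain output exactly. A minimal NumPy sketch, assuming softmax-free (linear-kernel) attention and small illustrative array shapes not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 16, 4
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))

# Time-domain attention with a linear kernel: no softmax, O = (Q K^T) V.
O_time = Q @ K.T @ V

# Frequency-domain attention: transform Q, K, V along the time axis with a
# unitary DFT, attend there, then transform back. Since F is unitary,
# F^H F = I cancels inside (F Q)(F K)^H (F V) = F (Q K^T V).
Fq, Fk, Fv = (np.fft.fft(A, axis=0, norm="ortho") for A in (Q, K, V))
O_freq = np.fft.ifft((Fq @ Fk.conj().T) @ Fv, axis=0, norm="ortho")

print(np.allclose(O_freq.real, O_time))  # True: the two domains agree
print(np.allclose(O_freq.imag, 0))       # True: imaginary part vanishes
```

With softmax reinserted, the cancellation no longer goes through, which is why the softmax operation is singled out in the empirical analysis.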
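The TDformer-style pipeline described above can be sketched end to end: decompose the series into trend and seasonal components, handle each branch separately, and recombine additively. The centered moving-average decomposition, the degree-1 polynomial fit standing in for the MLP, and the softmax-free Fourier attention below are simplifying assumptions for illustration, not the paper's exact architecture:

```python
import numpy as np

def decompose(x, kernel=5):
    """Seasonal-trend decomposition via a centered moving average.
    Returns (trend, seasonal) with x == trend + seasonal."""
    pad = kernel // 2
    padded = np.concatenate([np.repeat(x[0], pad), x, np.repeat(x[-1], pad)])
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    return trend, x - trend

def fourier_attention(q, k, v):
    """Softmax-free attention on DFT-transformed inputs; q, k, v: (T, d)."""
    Fq, Fk, Fv = (np.fft.fft(a, axis=0, norm="ortho") for a in (q, k, v))
    out = np.fft.ifft((Fq @ Fk.conj().T) @ Fv, axis=0, norm="ortho")
    return out.real

T = 64
t = np.arange(T)
x = 0.05 * t + np.sin(2 * np.pi * t / 8)   # linear trend + seasonality

trend, seasonal = decompose(x)

# Trend branch: a degree-1 polynomial fit stands in for the paper's MLP.
trend_out = np.polyval(np.polyfit(t, trend, deg=1), t)

# Seasonal branch: Fourier attention with q = k = v = seasonal component.
s = seasonal[:, None]
seasonal_out = fourier_attention(s, s, s)[:, 0]

# Additive recombination of the two branches, as in TDformer.
pred = trend_out + seasonal_out
```

The decomposition is exact by construction (trend + seasonal reconstructs the input), so each branch sees only the component it is suited to: a smooth, low-frequency trend and a zero-centered periodic residual.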
Related papers
- FAITH: Frequency-domain Attention In Two Horizons for Time Series Forecasting [13.253624747448935]
Time Series Forecasting plays a crucial role in various fields such as industrial equipment maintenance, meteorology, energy consumption, traffic flow and financial investment.
Current deep learning-based predictive models often exhibit a significant deviation between their forecasting outcomes and the ground truth.
We propose a novel model Frequency-domain Attention In Two Horizons, which decomposes time series into trend and seasonal components.
arXiv Detail & Related papers (2024-05-22T02:37:02Z)
- PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations [49.80959046861793]
We present PDETime, a novel LMTF model inspired by the principles of Neural PDE solvers.
Our experiments across seven diverse real-world temporal LMTF datasets reveal that PDETime adapts effectively to the intrinsic nature of the data.
arXiv Detail & Related papers (2024-02-25T17:39:44Z)
- FreDF: Learning to Forecast in Frequency Domain [56.24773675942897]
Time series modeling is uniquely challenged by the presence of autocorrelation in both historical and label sequences.
We introduce the Frequency-enhanced Direct Forecast (FreDF) which bypasses the complexity of label autocorrelation by learning to forecast in the frequency domain.
arXiv Detail & Related papers (2024-02-04T08:23:41Z)
- Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z)
- TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting [24.834846119163885]
We propose a novel framework, TEMPO, that can effectively learn time series representations.
TEMPO expands the capability for dynamically modeling real-world temporal phenomena from data within diverse domains.
arXiv Detail & Related papers (2023-10-08T00:02:25Z)
- FreDo: Frequency Domain-based Long-Term Time Series Forecasting [12.268979675200779]
We show that due to error accumulation, sophisticated models might not outperform baseline models for long-term forecasting.
We propose FreDo, a frequency domain-based neural network model that is built on top of the baseline model to enhance its performance.
arXiv Detail & Related papers (2022-05-24T18:19:15Z)
- Feature-weighted Stacking for Nonseasonal Time Series Forecasts: A Case Study of the COVID-19 Epidemic Curves [0.0]
We investigate ensembling techniques in forecasting and examine their potential for use in nonseasonal time-series.
We propose using late data fusion, using a stacked ensemble of two forecasting models and two meta-features that prove their predictive power during a preliminary forecasting stage.
arXiv Detail & Related papers (2021-08-19T14:44:46Z)
- Deep Autoregressive Models with Spectral Attention [74.08846528440024]
We propose a forecasting architecture that combines deep autoregressive models with a Spectral Attention (SA) module.
By characterizing in the spectral domain the embedding of the time series as occurrences of a random process, our method can identify global trends and seasonality patterns.
Two spectral attention models, one global and one local to the time series, integrate this information into the forecast and perform spectral filtering to remove noise from the time series.
arXiv Detail & Related papers (2021-07-13T11:08:47Z)
- Model-Attentive Ensemble Learning for Sequence Modeling [86.4785354333566]
We present Model-Attentive Ensemble learning for Sequence modeling (MAES).
MAES is a mixture of time-series experts that leverages an attention-based gating mechanism to specialize the experts on different sequence dynamics and adaptively weight their predictions.
We demonstrate that MAES significantly outperforms popular sequence models on datasets subject to temporal shift.
arXiv Detail & Related papers (2021-02-23T05:23:35Z)
- A Multi-Channel Neural Graphical Event Model with Negative Evidence [76.51278722190607]
Event datasets are sequences of events of various types occurring irregularly over the timeline.
We propose a non-parametric deep neural network approach in order to estimate the underlying intensity functions.
arXiv Detail & Related papers (2020-02-21T23:10:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.