First De-Trend then Attend: Rethinking Attention for Time-Series
Forecasting
- URL: http://arxiv.org/abs/2212.08151v1
- Date: Thu, 15 Dec 2022 21:34:19 GMT
- Title: First De-Trend then Attend: Rethinking Attention for Time-Series
Forecasting
- Authors: Xiyuan Zhang, Xiaoyong Jin, Karthick Gopalswamy, Gaurav Gupta,
Youngsuk Park, Xingjian Shi, Hao Wang, Danielle C. Maddix, Yuyang Wang
- Abstract summary: We seek to understand the relationships between attention models in different time and frequency domains.
We propose a new method, TDformer (Trend Decomposition Transformer), which first applies seasonal-trend decomposition.
Experiments on benchmark time-series forecasting datasets demonstrate that TDformer achieves state-of-the-art performance against existing attention-based models.
- Score: 17.89566168289471
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Transformer-based models have gained large popularity and demonstrated
promising results in long-term time-series forecasting in recent years. In
addition to learning attention in time domain, recent works also explore
learning attention in frequency domains (e.g., Fourier domain, wavelet domain),
given that seasonal patterns can be better captured in these domains. In this
work, we seek to understand the relationships between attention models in
different time and frequency domains. Theoretically, we show that attention
models in different domains are equivalent under linear conditions (i.e., with a
linear kernel applied to the attention scores). Empirically, we analyze how
attention models in different domains behave differently through various synthetic
experiments with seasonality, trend and noise, with emphasis on the role of the
softmax operation therein. Both these theoretical and empirical analyses
motivate us to propose a new method: TDformer (Trend Decomposition
Transformer), that first applies seasonal-trend decomposition, and then
additively combines an MLP which predicts the trend component with Fourier
attention which predicts the seasonal component to obtain the final prediction.
Extensive experiments on benchmark time-series forecasting datasets demonstrate
that TDformer achieves state-of-the-art performance against existing
attention-based models.
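As a rough illustration of the linear-condition equivalence claimed above (a simplified sketch, not the paper's theorem: the notation Q, K, V for queries, keys and values of length L, the unitary DFT matrix F, and the omission of normalization are assumptions made here), a unitary DFT applied along the time axis commutes with linear-kernel attention:
```latex
% Simplified sketch: Q, K, V real L x d matrices, F the unitary DFT matrix (F^H F = I).
% Linear kernel, i.e. no softmax; normalization omitted.
\[
\begin{aligned}
\text{time domain:}\qquad      & O = (Q K^{\top})\, V, \\
\text{frequency domain:}\qquad & \tilde{O} = (F Q)\,(F K)^{H}\,(F V)
  = F Q\, K^{\top} F^{H} F\, V
  = F\,(Q K^{\top} V) = F O .
\end{aligned}
\]
```
The frequency-domain output is simply the transform of the time-domain output, so the two attention models coincide after inverting the DFT. Once a softmax is inserted between the score and value products this cancellation no longer goes through, which is why the abstract emphasizes the role of the softmax in the empirical analysis.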
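The architecture described in the abstract can be summarized in a minimal sketch like the following. Assumptions made here for illustration: a moving-average seasonal-trend decomposition, a toy softmax attention over rFFT coefficients, and the module names SeriesDecomp, FourierAttention and TDformerBlock; this is not the authors' reference implementation.
```python
import torch
import torch.nn as nn


class SeriesDecomp(nn.Module):
    """Seasonal-trend decomposition via a moving average (an assumed, common choice)."""

    def __init__(self, kernel_size: int = 25):
        super().__init__()
        self.avg = nn.AvgPool1d(kernel_size, stride=1, padding=kernel_size // 2,
                                count_include_pad=False)

    def forward(self, x):                      # x: (batch, length, channels)
        trend = self.avg(x.transpose(1, 2)).transpose(1, 2)
        return x - trend, trend                # (seasonal, trend)


class FourierAttention(nn.Module):
    """Toy frequency-domain attention: softmax-scored attention over rFFT coefficients."""

    def forward(self, q, k, v):                # each: (batch, length, d_model)
        qf, kf, vf = (torch.fft.rfft(t, dim=1) for t in (q, k, v))
        scores = torch.softmax((qf @ kf.conj().transpose(1, 2)).real, dim=-1)
        return torch.fft.irfft(scores.to(vf.dtype) @ vf, n=q.size(1), dim=1)


class TDformerBlock(nn.Module):
    """Decompose, model the trend with an MLP and the seasonality with Fourier
    attention, then recombine additively."""

    def __init__(self, d_model: int):
        super().__init__()
        self.decomp = SeriesDecomp()
        self.trend_mlp = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                                       nn.Linear(d_model, d_model))
        self.seasonal_attn = FourierAttention()

    def forward(self, x):                      # x: (batch, length, d_model)
        seasonal, trend = self.decomp(x)
        return self.trend_mlp(trend) + self.seasonal_attn(seasonal, seasonal, seasonal)


# Usage sketch: a batch of 8 series, 96 time steps, 64-dimensional embeddings.
block = TDformerBlock(d_model=64)
y = block(torch.randn(8, 96, 64))              # -> (8, 96, 64)
```
A full forecaster would additionally embed the raw inputs and project the combined output to the prediction horizon; those details are omitted here.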
Related papers
- FlexTSF: A Universal Forecasting Model for Time Series with Variable Regularities [17.164913785452367]
We propose FlexTSF, a universal time series forecasting model that possesses better generalization and supports both regular and irregular time series.
Experiments on 12 datasets show that FlexTSF outperforms state-of-the-art forecasting models designed respectively for regular and irregular time series.
arXiv Detail & Related papers (2024-10-30T16:14:09Z)
- Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts [103.725112190618]
This paper introduces Moirai-MoE, which uses a single input/output projection layer while delegating the modeling of diverse time series patterns to a sparse mixture of experts.
Extensive experiments on 39 datasets demonstrate the superiority of Moirai-MoE over existing foundation models in both in-distribution and zero-shot scenarios.
arXiv Detail & Related papers (2024-10-14T13:01:11Z)
- Learning Pattern-Specific Experts for Time Series Forecasting Under Patch-level Distribution Shift [30.581736814767606]
Time series forecasting aims to predict future values based on historical data.
Real-world time series often exhibit complex, non-uniform distributions with patterns that vary across segments, for example by season, operating condition, or semantic meaning.
We propose a novel architecture that leverages pattern-specific experts for more accurate and adaptable time series forecasting.
arXiv Detail & Related papers (2024-10-13T13:35:29Z)
- TSI: A Multi-View Representation Learning Approach for Time Series Forecasting [29.05140751690699]
This study introduces a novel multi-view approach for time series forecasting.
It integrates trend and seasonal representations with an Independent Component Analysis (ICA)-based representation.
This approach offers a holistic understanding of time series data, going beyond traditional models that often miss nuanced, nonlinear relationships.
arXiv Detail & Related papers (2024-09-30T02:11:57Z)
- FAITH: Frequency-domain Attention In Two Horizons for Time Series Forecasting [13.253624747448935]
Time Series Forecasting plays a crucial role in various fields such as industrial equipment maintenance, meteorology, energy consumption, traffic flow and financial investment.
Current deep learning-based predictive models often exhibit a significant deviation between their forecasting outcomes and the ground truth.
We propose Frequency-domain Attention In Two Horizons (FAITH), a novel model that decomposes time series into trend and seasonal components.
arXiv Detail & Related papers (2024-05-22T02:37:02Z)
- PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations [49.80959046861793]
We present PDETime, a novel LMTF model inspired by the principles of Neural PDE solvers.
Our experimentation across seven diverse temporal real-world LMTF datasets reveals that PDETime adapts effectively to the intrinsic nature of the data.
arXiv Detail & Related papers (2024-02-25T17:39:44Z)
- FreDF: Learning to Forecast in Frequency Domain [56.24773675942897]
Time series modeling is uniquely challenged by the presence of autocorrelation in both historical and label sequences.
We introduce the Frequency-enhanced Direct Forecast (FreDF) which bypasses the complexity of label autocorrelation by learning to forecast in the frequency domain.
arXiv Detail & Related papers (2024-02-04T08:23:41Z)
- Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z)
- Deep Autoregressive Models with Spectral Attention [74.08846528440024]
We propose a forecasting architecture that combines deep autoregressive models with a Spectral Attention (SA) module.
By characterizing the embedding of the time series in the spectral domain as occurrences of a random process, our method can identify global trends and seasonality patterns.
Two spectral attention models, global and local to the time series, integrate this information within the forecast and perform spectral filtering to remove noise from the time series.
arXiv Detail & Related papers (2021-07-13T11:08:47Z)
- Model-Attentive Ensemble Learning for Sequence Modeling [86.4785354333566]
We present Model-Attentive Ensemble learning for Sequence modeling (MAES).
MAES is a mixture of time-series experts which leverages an attention-based gating mechanism to specialize the experts on different sequence dynamics and adaptively weight their predictions.
We demonstrate that MAES significantly outperforms popular sequence models on datasets subject to temporal shift.
arXiv Detail & Related papers (2021-02-23T05:23:35Z)