Related papers: FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting

FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting

URL: http://arxiv.org/abs/2201.12740v1
Date: Sun, 30 Jan 2022 06:24:25 GMT
Title: FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
Authors: Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, Rong Jin
Abstract summary: We propose to combine Transformer with the seasonal-trend decomposition method, in which the decomposition method captures the global profile of time series. We exploit the fact that most time series tend to have a sparse representation in well-known basis such as Fourier transform. Besides being more effective, the proposed method, termed as Frequency Enhanced Decomposed Transformer (bf FEDformer), is more efficient than standard Transformer.
Score: 23.199388386249215
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Although Transformer-based methods have significantly improved state-of-the-art results for long-term series forecasting, they are not only computationally expensive but more importantly, are unable to capture the global view of time series (e.g. overall trend). To address these problems, we propose to combine Transformer with the seasonal-trend decomposition method, in which the decomposition method captures the global profile of time series while Transformers capture more detailed structures. To further enhance the performance of Transformer for long-term prediction, we exploit the fact that most time series tend to have a sparse representation in well-known basis such as Fourier transform, and develop a frequency enhanced Transformer. Besides being more effective, the proposed method, termed as Frequency Enhanced Decomposed Transformer ({\bf FEDformer}), is more efficient than standard Transformer with a linear complexity to the sequence length. Our empirical studies with six benchmark datasets show that compared with state-of-the-art methods, FEDformer can reduce prediction error by $14.8\%$ and $22.6\%$ for multivariate and univariate time series, respectively. the code will be released soon.

Related papers

Decomposition-based multi-scale transformer framework for time series anomaly detection [0.9438207505148947]
We propose a transformer-based framework built on decomposition (TransDe) for time series anomaly detection. A multi-scale patch-based transformer architecture is proposed to exploit the representative dependencies of each decomposed component of the time series. A novel asynchronous loss function with a stop-gradient strategy is introduced to enhance the performance of TransDe effectively.
arXiv Detail & Related papers (2025-04-19T06:47:38Z)
Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting. Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z)
PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
Self-attention mechanism in Transformer architecture requires positional embeddings to encode temporal order in time series prediction. We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences. We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z)
iTransformer: Inverted Transformers Are Effective for Time Series Forecasting [62.40166958002558]
We propose iTransformer, which simply applies the attention and feed-forward network on the inverted dimensions. The iTransformer model achieves state-of-the-art on challenging real-world datasets.
arXiv Detail & Related papers (2023-10-10T13:44:09Z)
U-shaped Transformer: Retain High Frequency Context in Time Series Analysis [0.5710971447109949]
In this paper, we consider the low-pass characteristics of transformers and try to incorporate the advantages of them. We introduce patch merge and split operation to extract features with different scales and use larger datasets to fully make use of the transformer backbone. Our experiments demonstrate that the model performs at an advanced level across multiple datasets with relatively low cost.
arXiv Detail & Related papers (2023-07-18T07:15:26Z)
Stecformer: Spatio-temporal Encoding Cascaded Transformer for Multivariate Long-term Time Series Forecasting [11.021398675773055]
We propose a complete solution to address problems in terms of feature extraction and target prediction. For extraction, we design an efficient-temporal encoding extractor including a semi-adaptive graph to acquire sufficient-temporal information. For prediction, we propose a Cascaded De Predictor (CDP) to strengthen the correlation between different intervals.
arXiv Detail & Related papers (2023-05-25T13:00:46Z)
CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting. First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals. Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions. Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z)
A K-variate Time Series Is Worth K Words: Evolution of the Vanilla Transformer Architecture for Long-term Multivariate Time Series Forecasting [52.33042819442005]
Transformer has become the de facto solution for MTSF, especially for the long-term cases. In this study, we point out that the current tokenization strategy in MTSF Transformer architectures ignores the token inductive bias of Transformers. We make a series of evolution on the basic architecture of the vanilla MTSF transformer. Surprisingly, the evolved simple transformer architecture is highly effective, which successfully avoids the over-smoothing phenomena in the vanilla MTSF transformer.
arXiv Detail & Related papers (2022-12-06T07:00:31Z)
W-Transformers : A Wavelet-based Transformer Framework for Univariate Time Series Forecasting [7.075125892721573]
We build a transformer model for non-stationary time series using wavelet-based transformer encoder architecture. We evaluate our framework on several publicly available benchmark time series datasets from various domains.
arXiv Detail & Related papers (2022-09-08T17:39:38Z)
Are Transformers Effective for Time Series Forecasting? [13.268196448051308]
Recently, there has been a surge of Transformer-based solutions for the time series forecasting (TSF) task. This study investigates whether Transformer-based techniques are the right solutions for long-term time series forecasting. We find that the relatively higher long-term forecasting accuracy of Transformer-based solutions has little to do with the temporal relation extraction capabilities of the Transformer architecture.
arXiv Detail & Related papers (2022-05-26T17:17:08Z)
ETSformer: Exponential Smoothing Transformers for Time-series Forecasting [35.76867542099019]
We propose ETSFormer, a novel time-series Transformer architecture, which exploits the principle of exponential smoothing in improving Transformers for time-series forecasting. In particular, inspired by the classical exponential smoothing methods in time-series forecasting, we propose the novel exponential smoothing attention (ESA) and frequency attention (FA) to replace the self-attention mechanism in vanilla Transformers, thus improving both accuracy and efficiency.
arXiv Detail & Related papers (2022-02-03T02:50:44Z)
Finetuning Pretrained Transformers into RNNs [81.72974646901136]
Transformers have outperformed recurrent neural networks (RNNs) in natural language generation. A linear-complexity recurrent variant has proven well suited for autoregressive generation. This work aims to convert a pretrained transformer into its efficient recurrent counterpart.
arXiv Detail & Related papers (2021-03-24T10:50:43Z)
Applying the Transformer to Character-level Transduction [68.91664610425114]
The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks. We show that with a large enough batch size, the transformer does indeed outperform recurrent models for character-level tasks.
arXiv Detail & Related papers (2020-05-20T17:25:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.