Preformer: Predictive Transformer with Multi-Scale Segment-wise
Correlations for Long-Term Time Series Forecasting
- URL: http://arxiv.org/abs/2202.11356v1
- Date: Wed, 23 Feb 2022 08:49:35 GMT
- Title: Preformer: Predictive Transformer with Multi-Scale Segment-wise
Correlations for Long-Term Time Series Forecasting
- Authors: Dazhao Du, Bing Su, Zhewei Wei
- Abstract summary: Transformer-based methods have shown great potential in long-term time series forecasting.
This paper proposes a predictive Transformer-based model called Preformer.
- Score: 29.89267034806925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer-based methods have shown great potential in long-term time series
forecasting. However, most of these methods adopt the standard point-wise
self-attention mechanism, which not only becomes intractable for long-term
forecasting since its complexity increases quadratically with the length of
time series, but also cannot explicitly capture the predictive dependencies
from contexts since the corresponding key and value are transformed from the
same point. This paper proposes a predictive Transformer-based model called
Preformer. Preformer introduces a novel efficient Multi-Scale
Segment-Correlation mechanism that divides time series into segments and
utilizes segment-wise correlation-based attention for encoding time series. A
multi-scale structure is developed to aggregate dependencies at different
temporal scales and facilitate the selection of segment length. Preformer
further designs a predictive paradigm for decoding, where the key and value
come from two successive segments rather than the same segment. In this way, if
a key segment has a high correlation score with the query segment, its
successive segment contributes more to the prediction of the query segment.
Extensive experiments demonstrate that our Preformer outperforms other
Transformer-based methods.
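The core mechanism is easy to picture in code. Below is a minimal, hedged sketch of segment-wise correlation attention with the predictive key/value shift described above; the segment length, the cosine-similarity correlation score, and the softmax normalization are illustrative assumptions rather than the authors' exact formulation, and the multi-scale aggregation over several segment lengths is omitted.

```python
# Illustrative sketch only: segment-wise correlation attention with the
# "predictive" shift (value = segment following the key segment), as described
# in the abstract. Shapes, similarity measure, and normalization are assumptions.
import torch
import torch.nn.functional as F


def segment_correlation_attention(queries, keys, values, seg_len):
    """queries/keys/values: (batch, length, d_model); length must be divisible by seg_len."""
    B, L, D = queries.shape
    n = L // seg_len

    # Split sequences into non-overlapping segments.
    q_seg = queries.reshape(B, n, seg_len * D)       # (B, n, seg_len*D)
    k_seg = keys.reshape(B, n, seg_len * D)
    v_seg = values.reshape(B, n, seg_len, D)

    # Segment-wise correlation scores (cosine similarity as a stand-in).
    scores = torch.einsum(
        "bqd,bkd->bqk",
        F.normalize(q_seg, dim=-1),
        F.normalize(k_seg, dim=-1),
    )                                                # (B, n, n)

    # Predictive paradigm: pair each key segment with its *successor* segment
    # as the value, so a key segment that correlates strongly with the query
    # lets the segment that follows it drive the prediction. The last key
    # segment has no successor and is dropped.
    attn = torch.softmax(scores[:, :, :-1], dim=-1)  # keys 0..n-2
    succ_values = v_seg[:, 1:]                       # values 1..n-1

    out = torch.einsum("bqk,bksd->bqsd", attn, succ_values)
    return out.reshape(B, L, D)


if __name__ == "__main__":
    x = torch.randn(2, 96, 16)                       # toy series: batch=2, length=96, d=16
    y = segment_correlation_attention(x, x, x, seg_len=8)
    print(y.shape)                                   # torch.Size([2, 96, 16])
```

A multi-scale version along the lines of the paper would run this routine with several segment lengths and aggregate the outputs, which is what lets the model sidestep committing to a single hand-picked segment length.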
Related papers
- PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting [21.033660755921737]
Time series forecasting remains a critical challenge across various domains, often complicated by high-dimensional data and long-term dependencies.
This paper presents a novel transformer architecture for time series forecasting, incorporating two key innovations: parameter sharing (PS) and Spatial-Temporal Segment Attention (SegAtt).
arXiv Detail & Related papers (2024-11-03T03:04:00Z) - Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting.
Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z) - PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
The self-attention mechanism in the Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z) - MultiResFormer: Transformer with Adaptive Multi-Resolution Modeling for
General Time Series Forecasting [18.990322695844675]
Transformer-based models have greatly pushed the boundaries of time series forecasting recently.
Existing methods typically encode time series data into patches using one or a fixed set of patch lengths.
We propose MultiResFormer, which dynamically models temporal variations by adaptively choosing optimal patch lengths.
arXiv Detail & Related papers (2023-11-30T18:24:33Z) - Compatible Transformer for Irregularly Sampled Multivariate Time Series [75.79309862085303]
We propose a transformer-based encoder to achieve comprehensive temporal-interaction feature learning for each individual sample.
We conduct extensive experiments on 3 real-world datasets and validate that the proposed CoFormer significantly and consistently outperforms existing methods.
arXiv Detail & Related papers (2023-10-17T06:29:09Z) - MPPN: Multi-Resolution Periodic Pattern Network For Long-Term Time
Series Forecasting [19.573651104129443]
Long-term time series forecasting plays an important role in various real-world scenarios.
Recent deep learning methods for long-term time series forecasting tend to capture the intricate patterns of time series by decomposition-based or sampling-based methods.
We propose a novel deep learning network architecture, named Multi-resolution Periodic Pattern Network (MPPN), for long-term time series forecasting.
arXiv Detail & Related papers (2023-06-12T07:00:37Z) - Stecformer: Spatio-temporal Encoding Cascaded Transformer for
Multivariate Long-term Time Series Forecasting [11.021398675773055]
We propose a complete solution to address the problems of feature extraction and target prediction.
For extraction, we design an efficient spatio-temporal encoding extractor including a semi-adaptive graph to acquire sufficient spatio-temporal information.
For prediction, we propose a Cascaded Decoding Predictor (CDP) to strengthen the correlation between different intervals.
arXiv Detail & Related papers (2023-05-25T13:00:46Z) - CARD: Channel Aligned Robust Blend Transformer for Time Series
Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals and dynamical dependence among multiple variables over time.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z) - Cluster-Former: Clustering-based Sparse Transformer for Long-Range
Dependency Encoding [90.77031668988661]
Cluster-Former is a novel clustering-based sparse Transformer to perform attention across chunked sequences.
The proposed framework is pivoted on two unique types of Transformer layer: Sliding-Window Layer and Cluster-Former Layer.
Experiments show that Cluster-Former achieves state-of-the-art performance on several major QA benchmarks.
arXiv Detail & Related papers (2020-09-13T22:09:30Z) - Transformer Hawkes Process [79.16290557505211]
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies.
THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin.
We provide a concrete example in which THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z)