Informer: Beyond Efficient Transformer for Long Sequence Time-Series
Forecasting
- URL: http://arxiv.org/abs/2012.07436v3
- Date: Sun, 28 Mar 2021 14:45:04 GMT
- Title: Informer: Beyond Efficient Transformer for Long Sequence Time-Series
Forecasting
- Authors: Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui
Xiong, Wancai Zhang
- Abstract summary: Long sequence time-series forecasting (LSTF) demands a high prediction capacity.
Recent studies have shown the potential of Transformer to increase the prediction capacity.
We design an efficient transformer-based model for LSTF, named Informer, with three distinctive characteristics.
- Score: 25.417560221400347
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many real-world applications require the prediction of long sequence
time-series, such as electricity consumption planning. Long sequence
time-series forecasting (LSTF) demands a high prediction capacity of the model,
which is the ability to capture precise long-range dependency coupling between
output and input efficiently. Recent studies have shown the potential of
Transformer to increase the prediction capacity. However, there are several
severe issues with Transformer that prevent it from being directly applicable
to LSTF, including quadratic time complexity, high memory usage, and inherent
limitation of the encoder-decoder architecture. To address these issues, we
design an efficient transformer-based model for LSTF, named Informer, with
three distinctive characteristics: (i) a $ProbSparse$ self-attention mechanism,
which achieves $O(L \log L)$ in time complexity and memory usage, and has
comparable performance on sequences' dependency alignment. (ii) the
self-attention distilling highlights dominating attention by halving cascading
layer input, and efficiently handles extremely long input sequences. (iii) the
generative-style decoder, while conceptually simple, predicts the long
time-series sequences in one forward operation rather than step by step,
which drastically improves the inference speed of long-sequence predictions.
Extensive experiments on four large-scale datasets demonstrate that Informer
significantly outperforms existing methods and provides a new solution to the
LSTF problem.
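
To make mechanism (i) concrete, here is a minimal single-head NumPy sketch of the query-sparsity idea behind the $ProbSparse$ self-attention: each query is scored by a max-minus-mean measurement on a randomly sampled subset of keys, only the top-u queries (u on the order of c·ln L) attend to all keys, and the remaining outputs are filled with the mean of the values. The sampling sizes and the constant c are illustrative assumptions; the paper's actual implementation differs in detail.

```python
import numpy as np

def probsparse_attention(Q, K, V, c=5, seed=0):
    """Single-head sketch: score each query's 'activeness' on a sampled key
    subset, run full attention only for the top-u queries (u ~ c * ln L),
    and fill the remaining rows with the mean of V."""
    rng = np.random.default_rng(seed)
    L_Q, d = Q.shape
    L_K = K.shape[0]

    # 1) Cheap sparsity measurement M(q) = max(score) - mean(score) on a
    #    random subset of keys; near-uniform ("lazy") queries score ~0.
    n_keys = max(1, min(L_K, int(c * np.ceil(np.log(L_K)))))
    idx = rng.choice(L_K, size=n_keys, replace=False)
    sample_scores = Q @ K[idx].T / np.sqrt(d)            # (L_Q, n_keys)
    M = sample_scores.max(axis=1) - sample_scores.mean(axis=1)

    # 2) Only the top-u "active" queries attend to all keys.
    u = max(1, min(L_Q, int(c * np.ceil(np.log(L_Q)))))
    top = np.argsort(-M)[:u]

    # 3) Lazy queries get the mean of V; active queries get full attention.
    out = np.repeat(V.mean(axis=0, keepdims=True), L_Q, axis=0)
    scores = Q[top] @ K.T / np.sqrt(d)                   # (u, L_K)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    out[top] = (w / w.sum(axis=1, keepdims=True)) @ V
    return out

Q = K = V = np.random.randn(96, 64)
print(probsparse_attention(Q, K, V).shape)               # (96, 64)
```

Because only O(ln L) queries require full dot products against the keys, time and memory scale as O(L log L) rather than O(L^2).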
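Mechanism (ii), self-attention distilling, halves the temporal dimension between encoder layers so that later layers see a progressively condensed input. Below is a minimal PyTorch sketch of one distilling step, assuming the Conv1d + ELU + stride-2 max-pooling composition described for the encoder; kernel sizes and the batch-norm choice are illustrative.

```python
import torch
import torch.nn as nn

class DistillingLayer(nn.Module):
    """Halves the sequence length between encoder layers so that dominant
    attention features are kept while per-layer memory use shrinks."""
    def __init__(self, d_model: int):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)
        self.norm = nn.BatchNorm1d(d_model)
        self.act = nn.ELU()
        self.pool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> (batch, seq_len // 2, d_model)
        x = x.transpose(1, 2)              # Conv1d expects (batch, channels, seq_len)
        x = self.pool(self.act(self.norm(self.conv(x))))
        return x.transpose(1, 2)

x = torch.randn(8, 96, 512)
print(DistillingLayer(512)(x).shape)       # torch.Size([8, 48, 512])
```

Stacking such layers between attention blocks is what lets the encoder accept extremely long inputs without the memory footprint growing with depth.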
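Mechanism (iii) avoids step-by-step decoding by feeding the decoder a "start token" (a slice of the known history) concatenated with placeholders for the prediction horizon, so the entire forecast is emitted in one forward pass. The sketch below shows how such a decoder input can be assembled; the names label_len and pred_len are assumptions mirroring common Informer-style implementations.

```python
import torch

def make_decoder_input(history: torch.Tensor, label_len: int, pred_len: int):
    """Build the generative decoder input: a 'start token' of known recent
    values followed by zero placeholders that are filled with all pred_len
    forecasts in a single forward pass (no autoregressive loop)."""
    start_token = history[:, -label_len:, :]                       # (B, label_len, D)
    placeholder = torch.zeros(history.size(0), pred_len, history.size(-1),
                              dtype=history.dtype, device=history.device)
    return torch.cat([start_token, placeholder], dim=1)            # (B, label_len + pred_len, D)

hist = torch.randn(8, 96, 7)       # 96 observed steps, 7 variables
dec_in = make_decoder_input(hist, label_len=48, pred_len=24)
print(dec_in.shape)                # torch.Size([8, 72, 7])
```

Since no prediction is fed back as input, inference cost no longer grows step by step with the forecast horizon.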
Related papers
- UmambaTSF: A U-shaped Multi-Scale Long-Term Time Series Forecasting Method Using Mamba [7.594115034632109]
We propose UmambaTSF, a novel long-term time series forecasting framework.
It integrates multi-scale feature extraction capabilities of U-shaped encoder-decoder multilayer perceptrons (MLP) with Mamba's long sequence representation.
UmambaTSF achieves state-of-the-art performance and excellent generality on widely used benchmark datasets.
arXiv Detail & Related papers (2024-10-15T04:56:43Z)
- Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting.
Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z)
- PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
The self-attention mechanism in the Transformer architecture requires positional embeddings to encode temporal order in time series prediction (the standard sinusoidal form is sketched after this list).
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z)
- CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI (channel-independent) Transformers in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals and dynamical dependence among multiple variables over time.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z)
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving classification capacity on the multivariate time series classification task.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity thanks to the self-attention mechanism, albeit at a high computational cost.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Infomaxformer: Maximum Entropy Transformer for Long Time-Series Forecasting Problem [6.497816402045097]
The Transformer architecture yields state-of-the-art results in many tasks such as natural language processing (NLP) and computer vision (CV).
Despite this advanced capability, however, quadratic time complexity and high memory usage prevent the Transformer from dealing with long time-series forecasting problems.
We propose a method that combines the encoder-decoder architecture with seasonal-trend decomposition to capture more specific seasonal components (a generic decomposition sketch appears after this list).
arXiv Detail & Related papers (2023-01-04T14:08:21Z)
- CLMFormer: Mitigating Data Redundancy to Revitalize Transformer-based Long-Term Time Series Forecasting System [46.39662315849883]
Long-term time-series forecasting (LTSF) plays a crucial role in various practical applications.
Existing Transformer-based models, such as Fedformer and Informer, often achieve their best performances on validation sets after just a few epochs.
We propose a novel approach to address this issue by employing curriculum learning and introducing a memory-driven decoder.
arXiv Detail & Related papers (2022-07-16T04:05:15Z)
- Triformer: Triangular, Variable-Specific Attentions for Long Sequence Multivariate Time Series Forecasting--Full Version [50.43914511877446]
We propose a triangular, variable-specific attention to ensure high efficiency and accuracy.
We show that Triformer outperforms state-of-the-art methods w.r.t. both accuracy and efficiency.
arXiv Detail & Related papers (2022-04-28T20:41:49Z)
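
As referenced in the PRformer entry above: the positional embeddings a vanilla Transformer relies on to encode temporal order are typically the fixed sinusoidal encoding of Vaswani et al. A small NumPy sketch (assuming an even d_model) is shown below; PRformer's own pyramidal recurrent embedding (PRE) is not reproduced here.

```python
import numpy as np

def sinusoidal_positional_embedding(seq_len: int, d_model: int) -> np.ndarray:
    """Classic fixed encoding: PE[pos, 2i]   = sin(pos / 10000^(2i/d_model)),
                               PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model)).
    Assumes d_model is even."""
    pos = np.arange(seq_len)[:, None]                    # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]                # (1, d_model/2)
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

print(sinusoidal_positional_embedding(96, 64).shape)     # (96, 64)
```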
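As referenced in the Infomaxformer entry above: seasonal-trend decomposition is commonly realized with a centred moving average that extracts the trend and leaves a seasonal/remainder component. The sketch below is a generic illustration under that assumption, not necessarily Infomaxformer's exact procedure.

```python
import numpy as np

def seasonal_trend_decompose(x: np.ndarray, window: int = 25):
    """Moving-average decomposition: the trend is a centred moving average of
    the series and the seasonal/remainder part is what is left over.
    x: (seq_len,) univariate series; window should be odd for a centred average."""
    pad = window // 2
    padded = np.pad(x, (pad, pad), mode="edge")          # replicate ends to keep length
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")    # (seq_len,)
    seasonal = x - trend
    return trend, seasonal

t = np.arange(200)
series = 0.05 * t + np.sin(2 * np.pi * t / 24)           # linear trend + daily-like cycle
trend, seasonal = seasonal_trend_decompose(series)
```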
This list is automatically generated from the titles and abstracts of the papers on this site.