Take an Irregular Route: Enhance the Decoder of Time-Series Forecasting
Transformer
- URL: http://arxiv.org/abs/2312.05792v1
- Date: Sun, 10 Dec 2023 06:50:56 GMT
- Title: Take an Irregular Route: Enhance the Decoder of Time-Series Forecasting
Transformer
- Authors: Li Shen, Yuning Wei, Yangzhu Wang, Hongguang Li
- Abstract summary: We propose FPPformer, which utilizes bottom-up and top-down architectures in the encoder and decoder, respectively, to build a full and rational hierarchy.
Extensive experiments with six state-of-the-art baselines on twelve benchmarks verify the promising performance of FPPformer.
- Score: 9.281993269355544
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the development of Internet of Things (IoT) systems, precise long-term forecasting methods are requisite for decision makers to evaluate current statuses and formulate future policies. Currently, Transformer and MLP are the two paradigms for deep time-series forecasting, and the former is more prevalent by virtue of its attention mechanism and encoder-decoder architecture. However, researchers tend to dive into the study of the encoder while leaving the decoder largely unexamined; some even adopt linear projections in lieu of the decoder to reduce complexity. We argue that both extracting the features of the input sequence and modeling the relations between the input and prediction sequences, which are the respective functions of the encoder and decoder, are of paramount significance. Motivated by the success of FPN in computer vision, we propose FPPformer, which employs bottom-up and top-down architectures in the encoder and decoder, respectively, to build a full and rational hierarchy. The cutting-edge patch-wise attention is exploited and further developed by combining it, in formats that differ between encoder and decoder, with a revamped element-wise attention. Extensive experiments with six state-of-the-art baselines on twelve benchmarks verify the promising performance of FPPformer and the importance of elaborately designing the decoder in time-series forecasting Transformers. The source code is released at https://github.com/OrigamiSL/FPPformer.
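Below is a minimal, illustrative sketch of how patch-wise attention can be combined with element-wise (point-wise) attention inside a single stage. The patch length, dimensions, and the way the two attentions are stacked are assumptions made for demonstration only; this is not the authors' released FPPformer implementation (see the GitHub link above for that).

```python
# Illustrative sketch only -- not the official FPPformer code.
# Patch size, dimensions, and the way the two attentions are combined are
# assumptions; see https://github.com/OrigamiSL/FPPformer for the authors' code.
import torch
import torch.nn as nn


class PatchPlusElementAttention(nn.Module):
    """One stage: attention restricted to non-overlapping patches,
    followed by element-wise (point-wise) attention over the full sequence."""

    def __init__(self, d_model: int = 64, n_heads: int = 4, patch_len: int = 6):
        super().__init__()
        self.patch_len = patch_len
        self.patch_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.point_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); seq_len must be divisible by patch_len here.
        b, l, d = x.shape
        p = self.patch_len
        # Patch-wise attention: fold patches into the batch dimension so that
        # attention only mixes tokens inside the same patch.
        xp = x.reshape(b * l // p, p, d)
        xp, _ = self.patch_attn(xp, xp, xp)
        x = self.norm1(x + xp.reshape(b, l, d))
        # Element-wise attention: ordinary attention across all time steps.
        xe, _ = self.point_attn(x, x, x)
        return self.norm2(x + xe)


if __name__ == "__main__":
    stage = PatchPlusElementAttention()
    out = stage(torch.randn(2, 96, 64))   # 96-step input window
    print(out.shape)                      # torch.Size([2, 96, 64])
```

In the paper's hierarchy, such stages would be stacked bottom-up in the encoder and top-down in the decoder; the patch-merging and patch-splitting logic of that hierarchy is omitted from this sketch.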
Related papers
- Transformer-based Video Saliency Prediction with High Temporal Dimension
Decoding [12.595019348741042]
We propose a transformer-based video saliency prediction approach with high temporal dimension network decoding (THTDNet).
This architecture yields comparable performance to multi-branch and over-complicated models on common benchmarks such as DHF1K, UCF-sports and Hollywood-2.
arXiv Detail & Related papers (2024-01-15T20:09:56Z)
- DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder Transformer Models [22.276574156358084]
We build a multi-exit encoder-decoder transformer model which is trained with deep supervision so that each of its decoder layers is capable of generating plausible predictions.
We show our approach can reduce overall inference latency by 30%-60% with comparable or even higher accuracy compared to baselines.
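A minimal sketch of the decoder early-exit idea described above, assuming a simple confidence-threshold criterion; the exit rule, threshold, and head design are illustrative assumptions, not DEED's exact mechanism.

```python
# Illustrative early-exit decoding loop; the confidence criterion and threshold
# are assumptions, not the exact DEED mechanism.
import torch
import torch.nn as nn


def decode_with_early_exit(decoder_layers: nn.ModuleList,
                           exit_heads: nn.ModuleList,
                           hidden: torch.Tensor,
                           memory: torch.Tensor,
                           threshold: float = 0.9) -> torch.Tensor:
    """Run decoder layers one by one; stop as soon as an intermediate
    prediction head is confident enough (deep supervision is assumed to make
    every head produce plausible outputs)."""
    logits = None
    for layer, head in zip(decoder_layers, exit_heads):
        hidden = layer(hidden, memory)              # one decoder layer
        logits = head(hidden[:, -1])                # predict from the last position
        confidence = logits.softmax(-1).max(-1).values
        if bool((confidence > threshold).all()):    # assumed exit criterion
            return logits                           # early exit: skip remaining layers
    return logits                                   # fall back to the final layer
```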
arXiv Detail & Related papers (2023-11-15T01:01:02Z)
- Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z)
- Improving Position Encoding of Transformers for Multivariate Time Series Classification [5.467400475482668]
We propose a new absolute position encoding method dedicated to time series data, called time Absolute Position Encoding (tAPE).
We then propose a novel multivariate time series classification (MTSC) model, named ConvTran, which combines tAPE/eRPE with convolution-based input encoding to improve the position and data embedding of time series data.
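For context, here is a sketch of the vanilla sinusoidal absolute position encoding that tAPE builds on; the exact tAPE reformulation for time series (and the eRPE relative variant) is not reproduced here.

```python
# Standard sinusoidal absolute position encoding (Vaswani et al., 2017).
# tAPE adapts this idea to time series; its exact reformulation is not shown.
import math
import torch


def sinusoidal_position_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Return a (seq_len, d_model) table of absolute position encodings.
    Assumes an even d_model for simplicity."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)      # (L, 1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )                                                                        # (d/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe   # added to the input embeddings before the first attention layer
```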
arXiv Detail & Related papers (2023-05-26T05:30:04Z)
- Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving [74.28510044056706]
Existing methods usually adopt the decoupled encoder-decoder paradigm.
In this work, we aim to alleviate the problem by two principles.
We first predict a coarse-grained future position and action based on the encoder features.
Then, conditioned on the position and action, the future scene is imagined in order to check the ramifications of driving accordingly.
arXiv Detail & Related papers (2023-05-10T15:22:02Z)
- Decoder Tuning: Efficient Language Understanding as Decoding [84.68266271483022]
We present Decoder Tuning (DecT), which in contrast optimizes task-specific decoder networks on the output side.
Through gradient-based optimization, DecT can be trained within several seconds and requires only one query of the pre-trained language model (PLM) per sample.
We conduct extensive natural language understanding experiments and show that DecT significantly outperforms state-of-the-art algorithms with a $200\times$ speed-up.
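A rough sketch of output-side decoder tuning under the stated constraints: the large pre-trained model is frozen and queried once per sample, and only a small decoder head is trained on the cached output representations. The head architecture, optimizer, and loss below are assumptions, not DecT's exact design.

```python
# Illustrative output-side tuning: the frozen pretrained model is queried
# exactly once per sample; only a small decoder head is trained.
# The head, optimizer, and loss are assumptions, not DecT's exact design.
import torch
import torch.nn as nn


def tune_output_decoder(frozen_model: nn.Module,
                        inputs: torch.Tensor,
                        labels: torch.Tensor,
                        num_classes: int,
                        hidden_dim: int = 768,
                        epochs: int = 100) -> nn.Module:
    # Assumes frozen_model is in eval mode and returns pooled (N, hidden_dim) outputs.
    with torch.no_grad():                      # one query per sample, then cached
        reps = frozen_model(inputs)
    head = nn.Linear(hidden_dim, num_classes)  # the only trainable parameters
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):                    # seconds of training on cached reps
        opt.zero_grad()
        loss = loss_fn(head(reps), labels)
        loss.backward()
        opt.step()
    return head
```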
arXiv Detail & Related papers (2022-12-16T11:15:39Z)
- Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on a standard database demonstrate the superiority of the proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z)
- Sparsity and Sentence Structure in Encoder-Decoder Attention of Summarization Systems [38.672160430296536]
Transformer models have achieved state-of-the-art results in a wide range of NLP tasks including summarization.
Previous work has focused on one important bottleneck, the quadratic self-attention mechanism in the encoder.
This work focuses on the transformer's encoder-decoder attention mechanism.
arXiv Detail & Related papers (2021-09-08T19:32:42Z)
- On the Sub-Layer Functionalities of Transformer Decoder [74.83087937309266]
We study how Transformer-based decoders leverage information from the source and target languages.
Based on these insights, we demonstrate that the residual feed-forward module in each Transformer decoder layer can be dropped with minimal loss of performance.
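As an illustration of that finding, here is a generic Transformer decoder layer in which the residual feed-forward sub-layer can be switched off; this is a textbook layer (attention masks omitted), not the paper's exact experimental setup.

```python
# Generic Transformer decoder layer with an optional residual feed-forward
# sub-layer, illustrating the reported finding; masks are omitted for brevity.
import torch
import torch.nn as nn


class DecoderLayer(nn.Module):
    def __init__(self, d_model: int, n_heads: int, use_ffn: bool = True):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.use_ffn = use_ffn
        if use_ffn:
            self.ffn = nn.Sequential(
                nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model)
            )
            self.norm3 = nn.LayerNorm(d_model)

    def forward(self, tgt: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        tgt = self.norm1(tgt + self.self_attn(tgt, tgt, tgt)[0])
        tgt = self.norm2(tgt + self.cross_attn(tgt, memory, memory)[0])
        if self.use_ffn:              # the paper reports dropping this costs little
            tgt = self.norm3(tgt + self.ffn(tgt))
        return tgt
```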
arXiv Detail & Related papers (2020-10-06T11:50:54Z)
- On Sparsifying Encoder Outputs in Sequence-to-Sequence Models [90.58793284654692]
We take Transformer as the testbed and introduce a layer of gates in-between the encoder and the decoder.
The gates are regularized using the expected value of the sparsity-inducing L0 penalty.
We investigate the effects of this sparsification on two machine translation and two summarization tasks.
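A sketch of a gate layer placed between encoder and decoder, using the hard-concrete relaxation (Louizos et al., 2018) that is commonly used to make the expected L0 penalty differentiable; the hyper-parameters, the single shared gate parameter, and the exact placement are assumptions rather than the paper's configuration.

```python
# Hard-concrete gates applied to encoder outputs before they reach the decoder.
# Hyper-parameters and the shared gate parameter are assumptions, not the
# paper's exact setup.
import math
import torch
import torch.nn as nn


class L0Gate(nn.Module):
    """Per-position stochastic gate with a differentiable expected-L0 penalty."""

    def __init__(self, beta: float = 2.0 / 3.0, gamma: float = -0.1, zeta: float = 1.1):
        super().__init__()
        self.log_alpha = nn.Parameter(torch.zeros(1))   # shared gate parameter
        self.beta, self.gamma, self.zeta = beta, gamma, zeta

    def forward(self, encoder_out: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Sample one gate per position via the hard-concrete distribution.
            u = torch.rand_like(encoder_out[..., :1]).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid(
                (torch.log(u) - torch.log(1 - u) + self.log_alpha) / self.beta
            )
        else:
            s = torch.sigmoid(self.log_alpha).expand_as(encoder_out[..., :1])
        z = torch.clamp(s * (self.zeta - self.gamma) + self.gamma, 0.0, 1.0)
        return encoder_out * z                           # gated states go to the decoder

    def expected_l0(self) -> torch.Tensor:
        # Probability that a gate is non-zero; add (scaled) to the training loss.
        return torch.sigmoid(
            self.log_alpha - self.beta * math.log(-self.gamma / self.zeta)
        )
```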
arXiv Detail & Related papers (2020-04-24T16:57:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.