LSEAttention is All You Need for Time Series Forecasting
- URL: http://arxiv.org/abs/2410.23749v2
- Date: Fri, 01 Nov 2024 02:47:29 GMT
- Title: LSEAttention is All You Need for Time Series Forecasting
- Authors: Dizhen Liang
- Abstract summary: Transformer-based architectures have achieved remarkable success in natural language processing and computer vision.
I introduce LSEAttention, an approach designed to address entropy collapse and training instability commonly observed in transformer models.
- Abstract: Transformer-based architectures have achieved remarkable success in natural language processing and computer vision. However, their performance in multivariate long-term forecasting often lags behind simpler linear baselines. Previous studies have identified the traditional attention mechanism as a significant factor contributing to this limitation. To unlock the full potential of transformers for multivariate time series forecasting, I introduce LSEAttention, an approach designed to address entropy collapse and training instability commonly observed in transformer models. I validate the effectiveness of LSEAttention across various real-world multivariate time series datasets, demonstrating that it not only outperforms existing time series transformer models but also exceeds the performance of some state-of-the-art models on specific datasets.
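The abstract does not spell out the mechanism behind LSEAttention, but the name suggests the log-sum-exp (LSE) trick for computing softmax stably, one standard way to mitigate the numerical issues loosely associated with attention entropy collapse. The sketch below is an illustrative assumption along those lines, not the paper's implementation; the function name and shapes are hypothetical.

```python
import numpy as np

def lse_stable_attention(Q, K, V):
    """Illustrative sketch only: single-head scaled dot-product attention
    whose softmax is computed via the log-sum-exp (LSE) trick.  This is an
    assumption based on the name "LSEAttention"; the paper's actual
    mechanism may differ.

    Q, K, V: arrays of shape (T, d) for one attention head.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # (T, T) raw attention logits
    # LSE trick: subtract the row-wise max before exponentiating so that
    # exp() cannot overflow and the softmax rows stay well-conditioned.
    m = scores.max(axis=-1, keepdims=True)
    lse = m + np.log(np.exp(scores - m).sum(axis=-1, keepdims=True))
    weights = np.exp(scores - lse)         # numerically stable softmax rows
    return weights @ V                     # (T, d) attended values
```

A naive softmax computed directly as exp(scores) / sum(exp(scores)) can overflow for large logits; the LSE formulation is algebraically identical but avoids that failure mode, which is how the trick is commonly used to stabilize attention training.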
Related papers
- Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting.
Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z) - PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
The self-attention mechanism in the Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z) - UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting [98.12558945781693]
We propose UniTST, a transformer-based model with a unified attention mechanism over flattened patch tokens.
Although our proposed model employs a simple architecture, it offers compelling performance as shown in our experiments on several datasets for time series forecasting.
arXiv Detail & Related papers (2024-06-07T14:39:28Z) - Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai).
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z) - Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z) - Spatiotemporal-Linear: Towards Universal Multivariate Time Series Forecasting [10.404951989266191]
We introduce the Spatio-Temporal-Linear (STL) framework.
STL seamlessly integrates time-embedded and spatially-informed bypasses to augment the Linear-based architecture.
Empirical evidence highlights STL's prowess, outpacing both Linear and Transformer benchmarks across varied observation and prediction durations and datasets.
arXiv Detail & Related papers (2023-12-22T17:46:34Z) - Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity, but their self-attention mechanism is computationally expensive.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z) - W-Transformers: A Wavelet-based Transformer Framework for Univariate Time Series Forecasting [7.075125892721573]
We build a transformer model for non-stationary time series using a wavelet-based transformer encoder architecture.
We evaluate our framework on several publicly available benchmark time series datasets from various domains.
arXiv Detail & Related papers (2022-09-08T17:39:38Z) - ETSformer: Exponential Smoothing Transformers for Time-series Forecasting [35.76867542099019]
We propose ETSformer, a novel time-series Transformer architecture that exploits the principle of exponential smoothing to improve Transformers for time-series forecasting.
In particular, inspired by classical exponential smoothing methods in time-series forecasting, we propose the novel exponential smoothing attention (ESA) and frequency attention (FA) to replace the self-attention mechanism in vanilla Transformers, improving both accuracy and efficiency; a rough illustrative sketch of the exponential-decay weighting idea follows this entry.
arXiv Detail & Related papers (2022-02-03T02:50:44Z)
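For intuition about the exponential smoothing attention mentioned in the ETSformer entry above, here is a rough sketch, assuming ESA amounts to replacing learned attention weights with causal weights that decay exponentially with temporal distance, as in classical simple exponential smoothing. This illustrates the general idea only, not the authors' exact formulation.

```python
import numpy as np

def exponential_smoothing_attention(values, alpha=0.3):
    """Rough sketch (assumption): attention-like weighting where each output
    is a causal, exponentially weighted average of past tokens, mirroring
    classical simple exponential smoothing rather than ETSformer's exact ESA.

    values: (T, d) array of token representations.
    Returns: (T, d) array of smoothed representations.
    """
    T, d = values.shape
    out = np.zeros_like(values)
    for t in range(T):
        lags = np.arange(t, -1, -1)             # temporal distance t - j for j = 0..t
        w = alpha * (1.0 - alpha) ** lags       # weight alpha * (1 - alpha)^(t - j)
        w[0] = (1.0 - alpha) ** t               # oldest token keeps the residual mass so weights sum to 1
        out[t] = w @ values[: t + 1]
    return out
```

With alpha close to 1 each output tracks the most recent token; with alpha close to 0 it averages over a long history, mirroring the role of the smoothing parameter in classical exponential smoothing.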