Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers
- URL: http://arxiv.org/abs/2405.13810v1
- Date: Wed, 22 May 2024 16:41:21 GMT
- Title: Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers
- Authors: Xin Cheng, Xiuying Chen, Shuqi Li, Di Luo, Xun Wang, Dongyan Zhao, Rui Yan
- Abstract summary: Time series prediction is crucial for understanding and forecasting complex dynamics in various domains.
We introduce GridTST, a model that combines the benefits of both approaches using innovative multi-directional attention.
The model consistently delivers state-of-the-art performance across various real-world datasets.
- Score: 55.475142494272724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series prediction is crucial for understanding and forecasting complex dynamics in various domains, ranging from finance and economics to climate and healthcare. Based on the Transformer architecture, one approach encodes multiple variables from the same timestamp into a single temporal token to model global dependencies. In contrast, another approach embeds the time points of individual series into separate variate tokens. The former method faces challenges in learning variate-centric representations, while the latter risks missing essential temporal information critical for accurate forecasting. In our work, we introduce GridTST, a model that combines the benefits of both approaches using innovative multi-directional attentions based on a vanilla Transformer. We regard the input time series data as a grid, where the $x$-axis represents the time steps and the $y$-axis represents the variates. A vertical slicing of this grid combines the variates at each time step into a \textit{time token}, while a horizontal slicing embeds the individual series across all time steps into a \textit{variate token}. Correspondingly, a \textit{horizontal attention mechanism} focuses on time tokens to comprehend the correlations between data at various time steps, while a \textit{vertical}, variate-aware \textit{attention} is employed to grasp multivariate correlations. This combination enables efficient processing of information across both the time and variate dimensions, thereby enhancing the model's analytical strength. We also integrate the patch technique, segmenting time tokens into subseries-level patches, ensuring that local semantic information is retained in the embedding. The GridTST model consistently delivers state-of-the-art performance across various real-world datasets.
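To make the grid view concrete, here is a minimal PyTorch sketch of the two attention directions, assuming illustrative dimensions and module names (GridTSTSketch and its layers are not the authors' released implementation, and the patching of time tokens described above is omitted for brevity):

```python
import torch
import torch.nn as nn

class GridTSTSketch(nn.Module):
    """Toy grid-view model: horizontal (time) and vertical (variate) attention."""

    def __init__(self, n_vars: int, seq_len: int, pred_len: int, d_model: int = 64):
        super().__init__()
        # Vertical slice of the grid: all variates at one time step -> a time token.
        self.time_embed = nn.Linear(n_vars, d_model)
        # Horizontal slice: one variate across all time steps -> a variate token.
        self.var_embed = nn.Linear(seq_len, d_model)
        self.horizontal_attn = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.vertical_attn = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        # Forecast each variate from its variate token plus a time-axis summary.
        self.head = nn.Linear(2 * d_model, pred_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq_len, n_vars], read as a time-by-variate grid.
        time_tokens = self.horizontal_attn(self.time_embed(x))              # [B, T, d]
        var_tokens = self.vertical_attn(self.var_embed(x.transpose(1, 2)))  # [B, V, d]
        ctx = time_tokens.mean(dim=1, keepdim=True).expand(-1, var_tokens.size(1), -1)
        out = self.head(torch.cat([var_tokens, ctx], dim=-1))               # [B, V, pred_len]
        return out.transpose(1, 2)                                          # [B, pred_len, V]

model = GridTSTSketch(n_vars=7, seq_len=96, pred_len=24)
print(model(torch.randn(2, 96, 7)).shape)  # torch.Size([2, 24, 7])
```

Slicing the same [batch, time, variate] tensor along each axis is what lets a vanilla Transformer encoder layer serve both attention directions.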
Related papers
- Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting.
Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z)
- DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting [3.420673126033772]
We propose a dynamic tokenizer with a dynamic sparse learning algorithm to capture diverse receptive fields and sparse patterns of time series data.
Our proposed model, named DRFormer, is evaluated on various real-world datasets, and experimental results demonstrate its superiority compared to existing methods.
arXiv Detail & Related papers (2024-08-05T07:26:47Z)
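The DRFormer abstract names its dynamic tokenizer without detailing it, so the following is only a rough illustration of the general idea of tokenizing a series at several receptive fields; MultiScaleTokenizer and its patch sizes are hypothetical, not the paper's method:

```python
import torch
import torch.nn as nn

class MultiScaleTokenizer(nn.Module):
    """Tokenize one series at several patch sizes (receptive fields)."""

    def __init__(self, patch_sizes=(8, 16, 32), d_model: int = 64):
        super().__init__()
        # One linear patch embedding per receptive field.
        self.embeds = nn.ModuleList(nn.Linear(p, d_model) for p in patch_sizes)
        self.patch_sizes = patch_sizes

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq_len] (one variate) -> [batch, n_tokens, d_model].
        tokens = []
        for p, embed in zip(self.patch_sizes, self.embeds):
            patches = x.unfold(dimension=1, size=p, step=p)  # [B, seq_len // p, p]
            tokens.append(embed(patches))
        return torch.cat(tokens, dim=1)  # tokens from all scales, side by side

tok = MultiScaleTokenizer()
print(tok(torch.randn(2, 96)).shape)  # 12 + 6 + 3 = 21 tokens: [2, 21, 64]
```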
- EdgeConvFormer: Dynamic Graph CNN and Transformer based Anomaly Detection in Multivariate Time Series [7.514010315664322]
We propose a novel anomaly detection method, named EdgeConvFormer, which integrates stacked Time2vec embedding, dynamic graph CNN, and Transformer to extract global and local spatial-time information.
Experiments demonstrate that EdgeConvFormer learns spatial-temporal relationships from multivariate time series data and achieves better anomaly detection performance than state-of-the-art approaches on many real-world datasets of different scales.
arXiv Detail & Related papers (2023-12-04T08:38:54Z)
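EdgeConvFormer's input stack includes Time2vec embeddings; the standard Time2vec formulation (Kazemi et al., 2019), one linear component plus learned-frequency sine components, can be sketched as follows (this is the generic layer, not EdgeConvFormer's exact code):

```python
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    """Time2vec: one linear component plus periodic sine components."""

    def __init__(self, out_dim: int):
        super().__init__()
        self.w = nn.Parameter(torch.randn(out_dim))  # learned frequencies
        self.b = nn.Parameter(torch.randn(out_dim))  # learned phases

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: [..., 1] scalar timestamps -> [..., out_dim] embeddings.
        v = self.w * t + self.b
        # Component 0 stays linear (captures trend); the rest become periodic.
        return torch.cat([v[..., :1], torch.sin(v[..., 1:])], dim=-1)

t2v = Time2Vec(out_dim=16)
print(t2v(torch.arange(10.0).unsqueeze(-1)).shape)  # torch.Size([10, 16])
```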
- TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders [55.00904795497786]
We propose TimeMAE, a novel self-supervised paradigm for learning transferable time series representations based on transformer networks.
TimeMAE learns enriched contextual representations of time series with a bidirectional encoding scheme.
To solve the discrepancy issue incurred by newly injected masked embeddings, we design a decoupled autoencoder architecture.
arXiv Detail & Related papers (2023-03-01T08:33:16Z)
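A hedged sketch of the masked-modeling setup the TimeMAE summary describes: patch a series, mask a random subset, encode only what is visible, and score reconstruction on the masked positions. The decoupled decoder is simplified to a linear head here, so this is a schematic of the training signal rather than the paper's architecture:

```python
import torch
import torch.nn as nn

def mask_patches(x: torch.Tensor, patch_len: int = 8, mask_ratio: float = 0.6):
    # x: [batch, seq_len] -> non-overlapping patches plus a boolean mask.
    b = x.size(0)
    patches = x.unfold(1, patch_len, patch_len)       # [B, n, patch_len]
    n = patches.size(1)
    n_masked = int(n * mask_ratio)
    idx = torch.rand(b, n).argsort(dim=1)             # a random order per sample
    mask = torch.zeros(b, n, dtype=torch.bool)
    mask[torch.arange(b).unsqueeze(1), idx[:, :n_masked]] = True  # True = masked
    return patches, mask

x = torch.randn(4, 96)
patches, mask = mask_patches(x)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=8, nhead=2, batch_first=True), num_layers=2)
head = nn.Linear(8, 8)                                  # stand-in for the decoupled decoder
visible = patches.masked_fill(mask.unsqueeze(-1), 0.0)  # hide the masked patches
recon = head(encoder(visible))                          # predict patch values
loss = ((recon - patches) ** 2)[mask].mean()            # score only masked positions
print(patches.shape, round(loss.item(), 3))
```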
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving classification performance on multivariate time series.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis [80.56913334060404]
Time series analysis is of immense importance in applications such as weather forecasting, anomaly detection, and action recognition.
Previous methods attempt to accomplish this directly from the 1D time series.
We disentangle the complex temporal variations into multiple intraperiod and interperiod variations.
arXiv Detail & Related papers (2022-10-05T12:19:51Z)
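TimesNet's central move is recoverable from the summary above: detect a dominant period with the FFT, then fold the 1D series into a 2D map so intraperiod variation runs along one axis and interperiod variation along the other. A minimal single-period sketch (the paper uses the top-k periods and applies 2D convolutions to each map):

```python
import torch

def fold_by_period(x: torch.Tensor) -> torch.Tensor:
    # x: [seq_len] -> 2D view [n_periods, period] of the same values.
    amp = torch.fft.rfft(x).abs()
    amp[0] = 0                                # ignore the DC component
    freq = int(amp.argmax())                  # dominant frequency (cycles per window)
    period = max(1, x.numel() // freq)
    n = x.numel() // period                   # keep only whole periods
    return x[: n * period].reshape(n, period)

t = torch.arange(128.0)
x = torch.sin(2 * torch.pi * t / 16) + 0.1 * torch.randn(128)
grid = fold_by_period(x)
print(grid.shape)  # torch.Size([8, 16]): rows = interperiod, columns = intraperiod
```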
- Expressing Multivariate Time Series as Graphs with Time Series Attention Transformer [14.172091921813065]
We propose the Time Series Attention Transformer (TSAT) for multivariate time series representation learning.
Using TSAT, we represent both temporal information and inter-dependencies of time series in terms of edge-enhanced dynamic graphs.
We show that TSAT clearly outperforms six state-of-the-art baseline methods across various forecasting horizons.
arXiv Detail & Related papers (2022-08-19T12:25:56Z)
- Long-Range Transformers for Dynamic Spatiotemporal Forecasting [16.37467119526305]
Methods based on graph neural networks explicitly model variable relationships.
Long-Range Transformers can learn interactions between space, time, and value information jointly along this extended sequence.
arXiv Detail & Related papers (2021-09-24T22:11:46Z)
- Instance-wise Graph-based Framework for Multivariate Time Series Forecasting [69.38716332931986]
We propose a simple yet efficient instance-wise graph-based framework to utilize the inter-dependencies of different variables at different timestamps.
The key idea of our framework is aggregating information from the historical time series of different variables to the current time series that we need to forecast.
arXiv Detail & Related papers (2021-09-14T07:38:35Z)
- Time Series Alignment with Global Invariances [14.632733235929926]
We propose a novel distance that accounts for both feature-space and temporal variabilities by learning a latent global transformation of the feature space together with a temporal alignment.
We present two algorithms for the computation of time series barycenters under this new geometry.
We illustrate the benefits of our approach on both simulated and real-world data and show its robustness compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-02-10T15:11:50Z)
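The alignment entry above suggests an alternating scheme: fix the global feature-space transformation and align in time, then fix the alignment and refit the transformation. A hedged sketch with DTW for the temporal step and orthogonal Procrustes for the rotation step (all names are illustrative, not the authors' API):

```python
import numpy as np

def dtw_path(x: np.ndarray, y: np.ndarray):
    # Classic O(n*m) dynamic program; returns the matched index pairs.
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.sum((x[i - 1] - y[j - 1]) ** 2)
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    path, i, j = [], n, m                     # backtrack from (n, m)
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        i, j = (i - 1, j - 1) if step == 0 else (i - 1, j) if step == 1 else (i, j - 1)
    return path[::-1]

def align_with_rotation(x: np.ndarray, y: np.ndarray, n_iters: int = 5):
    # Alternate: DTW under the current rotation W, then refit W in closed form.
    W = np.eye(x.shape[1])
    for _ in range(n_iters):
        path = dtw_path(x @ W, y)
        xs = np.stack([x[i] for i, _ in path])
        ys = np.stack([y[j] for _, j in path])
        U, _, Vt = np.linalg.svd(xs.T @ ys)   # orthogonal Procrustes step
        W = U @ Vt
    return W, path

rng = np.random.default_rng(0)
x = rng.normal(size=(40, 3))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta), np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
W, path = align_with_rotation(x, x @ R)      # same series, rotated features
print(np.linalg.norm(W - R))                 # near zero once the alternation converges
```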
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences.