Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers
- URL: http://arxiv.org/abs/2405.13810v1
- Date: Wed, 22 May 2024 16:41:21 GMT
- Title: Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers
- Authors: Xin Cheng, Xiuying Chen, Shuqi Li, Di Luo, Xun Wang, Dongyan Zhao, Rui Yan
- Abstract summary: Time series prediction is crucial for understanding and forecasting complex dynamics in various domains.
We introduce GridTST, a model that combines the benefits of both approaches using innovative multi-directional attention.
The model consistently delivers state-of-the-art performance across various real-world datasets.
- Score: 55.475142494272724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series prediction is crucial for understanding and forecasting complex dynamics in various domains, ranging from finance and economics to climate and healthcare. Based on the Transformer architecture, one approach encodes multiple variables from the same timestamp into a single temporal token to model global dependencies. In contrast, another approach embeds the time points of individual series into separate variate tokens. The former method faces challenges in learning variate-centric representations, while the latter risks missing essential temporal information critical for accurate forecasting. In our work, we introduce GridTST, a model that combines the benefits of both approaches using innovative multi-directional attentions based on a vanilla Transformer. We regard the input time series data as a grid, where the $x$-axis represents the time steps and the $y$-axis represents the variates. A vertical slicing of this grid combines the variates at each time step into a \textit{time token}, while a horizontal slicing embeds the individual series across all time steps into a \textit{variate token}. Correspondingly, a \textit{horizontal attention mechanism} focuses on time tokens to comprehend the correlations between data at various time steps, while a \textit{vertical}, variate-aware \textit{attention} is employed to grasp multivariate correlations. This combination enables efficient processing of information across both the time and variate dimensions, thereby enhancing the model's analytical strength. We also integrate the patch technique, segmenting time tokens into subseries-level patches, ensuring that local semantic information is retained in the embedding. The GridTST model consistently delivers state-of-the-art performance across various real-world datasets.
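To make the grid view concrete, here is a minimal PyTorch sketch of the two attention directions, assuming illustrative dimensions and module names (GridTSTSketch and its layers are not the authors' released implementation, and the patching of time tokens described above is omitted for brevity):

```python
import torch
import torch.nn as nn

class GridTSTSketch(nn.Module):
    """Toy grid-view model: horizontal (time) and vertical (variate) attention."""

    def __init__(self, n_vars: int, seq_len: int, pred_len: int, d_model: int = 64):
        super().__init__()
        # Vertical slice of the grid: all variates at one time step -> a time token.
        self.time_embed = nn.Linear(n_vars, d_model)
        # Horizontal slice: one variate across all time steps -> a variate token.
        self.var_embed = nn.Linear(seq_len, d_model)
        self.horizontal_attn = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.vertical_attn = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        # Forecast each variate from its variate token plus a time-axis summary.
        self.head = nn.Linear(2 * d_model, pred_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq_len, n_vars], read as a time-by-variate grid.
        time_tokens = self.horizontal_attn(self.time_embed(x))              # [B, T, d]
        var_tokens = self.vertical_attn(self.var_embed(x.transpose(1, 2)))  # [B, V, d]
        ctx = time_tokens.mean(dim=1, keepdim=True).expand(-1, var_tokens.size(1), -1)
        out = self.head(torch.cat([var_tokens, ctx], dim=-1))               # [B, V, pred_len]
        return out.transpose(1, 2)                                          # [B, pred_len, V]

model = GridTSTSketch(n_vars=7, seq_len=96, pred_len=24)
print(model(torch.randn(2, 96, 7)).shape)  # torch.Size([2, 24, 7])
```

Slicing the same [batch, time, variate] tensor along each axis is what lets a vanilla Transformer encoder layer serve both attention directions.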
Related papers
- Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting.
Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z)
- DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting [3.420673126033772]
We propose a dynamic tokenizer with a dynamic sparse learning algorithm to capture diverse receptive fields and sparse patterns of time series data.
Our proposed model, named DRFormer, is evaluated on various real-world datasets, and experimental results demonstrate its superiority compared to existing methods.
arXiv Detail & Related papers (2024-08-05T07:26:47Z)
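The DRFormer abstract names its dynamic tokenizer without detailing it, so the following is only a rough illustration of the general idea of tokenizing a series at several receptive fields; MultiScaleTokenizer and its patch sizes are hypothetical, not the paper's method:

```python
import torch
import torch.nn as nn

class MultiScaleTokenizer(nn.Module):
    """Tokenize one series at several patch sizes (receptive fields)."""

    def __init__(self, patch_sizes=(8, 16, 32), d_model: int = 64):
        super().__init__()
        # One linear patch embedding per receptive field.
        self.embeds = nn.ModuleList(nn.Linear(p, d_model) for p in patch_sizes)
        self.patch_sizes = patch_sizes

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq_len] (one variate) -> [batch, n_tokens, d_model].
        tokens = []
        for p, embed in zip(self.patch_sizes, self.embeds):
            patches = x.unfold(dimension=1, size=p, step=p)  # [B, seq_len // p, p]
            tokens.append(embed(patches))
        return torch.cat(tokens, dim=1)  # tokens from all scales, side by side

tok = MultiScaleTokenizer()
print(tok(torch.randn(2, 96)).shape)  # 12 + 6 + 3 = 21 tokens: [2, 21, 64]
```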
- EdgeConvFormer: Dynamic Graph CNN and Transformer based Anomaly Detection in Multivariate Time Series [7.514010315664322]
We propose a novel anomaly detection method, named EdgeConvFormer, which integrates stacked Time2vec embedding, dynamic graph CNN, and Transformer to extract global and local spatial-time information.
Experiments demonstrate that EdgeConvFormer learns spatial-temporal relationships from multivariate time series data and achieves better anomaly detection performance than state-of-the-art approaches on many real-world datasets of different scales.
arXiv Detail & Related papers (2023-12-04T08:38:54Z)
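EdgeConvFormer's input stack includes Time2vec embeddings; the standard Time2vec formulation (Kazemi et al., 2019), one linear component plus learned-frequency sine components, can be sketched as follows (this is the generic layer, not EdgeConvFormer's exact code):

```python
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    """Time2vec: one linear component plus periodic sine components."""

    def __init__(self, out_dim: int):
        super().__init__()
        self.w = nn.Parameter(torch.randn(out_dim))  # learned frequencies
        self.b = nn.Parameter(torch.randn(out_dim))  # learned phases

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: [..., 1] scalar timestamps -> [..., out_dim] embeddings.
        v = self.w * t + self.b
        # Component 0 stays linear (captures trend); the rest become periodic.
        return torch.cat([v[..., :1], torch.sin(v[..., 1:])], dim=-1)

t2v = Time2Vec(out_dim=16)
print(t2v(torch.arange(10.0).unsqueeze(-1)).shape)  # torch.Size([10, 16])
```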
- TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders [55.00904795497786]
We propose TimeMAE, a novel self-supervised paradigm for learning transferable time series representations based on transformer networks.
TimeMAE learns enriched contextual representations of time series with a bidirectional encoding scheme.
To solve the discrepancy issue incurred by newly injected masked embeddings, we design a decoupled autoencoder architecture.
arXiv Detail & Related papers (2023-03-01T08:33:16Z)
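A hedged sketch of the masked-modeling setup the TimeMAE summary describes: patch a series, mask a random subset, encode only what is visible, and score reconstruction on the masked positions. The decoupled decoder is simplified to a linear head here, so this is a schematic of the training signal rather than the paper's architecture:

```python
import torch
import torch.nn as nn

def mask_patches(x: torch.Tensor, patch_len: int = 8, mask_ratio: float = 0.6):
    # x: [batch, seq_len] -> non-overlapping patches plus a boolean mask.
    b = x.size(0)
    patches = x.unfold(1, patch_len, patch_len)       # [B, n, patch_len]
    n = patches.size(1)
    n_masked = int(n * mask_ratio)
    idx = torch.rand(b, n).argsort(dim=1)             # a random order per sample
    mask = torch.zeros(b, n, dtype=torch.bool)
    mask[torch.arange(b).unsqueeze(1), idx[:, :n_masked]] = True  # True = masked
    return patches, mask

x = torch.randn(4, 96)
patches, mask = mask_patches(x)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=8, nhead=2, batch_first=True), num_layers=2)
head = nn.Linear(8, 8)                                  # stand-in for the decoupled decoder
visible = patches.masked_fill(mask.unsqueeze(-1), 0.0)  # hide the masked patches
recon = head(encoder(visible))                          # predict patch values
loss = ((recon - patches) ** 2)[mask].mean()            # score only masked positions
print(patches.shape, round(loss.item(), 3))
```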
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving classification performance on multivariate time series.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis [80.56913334060404]
Time series analysis is of immense importance in applications such as weather forecasting, anomaly detection, and action recognition.
Previous methods attempt to accomplish this directly from the 1D time series.
We disentangle the complex temporal variations into multiple intraperiod and interperiod variations.
arXiv Detail & Related papers (2022-10-05T12:19:51Z)
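TimesNet's central move is recoverable from the summary above: detect a dominant period with the FFT, then fold the 1D series into a 2D map so intraperiod variation runs along one axis and interperiod variation along the other. A minimal single-period sketch (the paper uses the top-k periods and applies 2D convolutions to each map):

```python
import torch

def fold_by_period(x: torch.Tensor) -> torch.Tensor:
    # x: [seq_len] -> 2D view [n_periods, period] of the same values.
    amp = torch.fft.rfft(x).abs()
    amp[0] = 0                                # ignore the DC component
    freq = int(amp.argmax())                  # dominant frequency (cycles per window)
    period = max(1, x.numel() // freq)
    n = x.numel() // period                   # keep only whole periods
    return x[: n * period].reshape(n, period)

t = torch.arange(128.0)
x = torch.sin(2 * torch.pi * t / 16) + 0.1 * torch.randn(128)
grid = fold_by_period(x)
print(grid.shape)  # torch.Size([8, 16]): rows = interperiod, columns = intraperiod
```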
- Expressing Multivariate Time Series as Graphs with Time Series Attention Transformer [14.172091921813065]
We propose the Time Series Attention Transformer (TSAT) for multivariate time series representation learning.
Using TSAT, we represent both temporal information and inter-dependencies of time series in terms of edge-enhanced dynamic graphs.
We show that TSAT clearly outperforms six state-of-the-art baseline methods across various forecasting horizons.
arXiv Detail & Related papers (2022-08-19T12:25:56Z)
- Long-Range Transformers for Dynamic Spatiotemporal Forecasting [16.37467119526305]
Methods based on graph neural networks explicitly model variable relationships.
Long-Range Transformers can learn interactions between space, time, and value information jointly along this extended sequence.
arXiv Detail & Related papers (2021-09-24T22:11:46Z)
- Instance-wise Graph-based Framework for Multivariate Time Series Forecasting [69.38716332931986]
We propose a simple yet efficient instance-wise graph-based framework to utilize the inter-dependencies of different variables at different timestamps.
The key idea of our framework is aggregating information from the historical time series of different variables to the current time series that we need to forecast.
arXiv Detail & Related papers (2021-09-14T07:38:35Z)
- Time Series Alignment with Global Invariances [14.632733235929926]
We propose a novel distance that accounts for both feature-space and temporal variabilities by learning a latent global transformation of the feature space together with a temporal alignment.
We present two algorithms for the computation of time series barycenters under this new geometry.
We illustrate the benefits of our approach on both simulated and real-world data and show its robustness compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-02-10T15:11:50Z)
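The alignment entry above suggests an alternating scheme: fix the global feature-space transformation and align in time, then fix the alignment and refit the transformation. A hedged sketch with DTW for the temporal step and orthogonal Procrustes for the rotation step (all names are illustrative, not the authors' API):

```python
import numpy as np

def dtw_path(x: np.ndarray, y: np.ndarray):
    # Classic O(n*m) dynamic program; returns the matched index pairs.
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.sum((x[i - 1] - y[j - 1]) ** 2)
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    path, i, j = [], n, m                     # backtrack from (n, m)
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        i, j = (i - 1, j - 1) if step == 0 else (i - 1, j) if step == 1 else (i, j - 1)
    return path[::-1]

def align_with_rotation(x: np.ndarray, y: np.ndarray, n_iters: int = 5):
    # Alternate: DTW under the current rotation W, then refit W in closed form.
    W = np.eye(x.shape[1])
    for _ in range(n_iters):
        path = dtw_path(x @ W, y)
        xs = np.stack([x[i] for i, _ in path])
        ys = np.stack([y[j] for _, j in path])
        U, _, Vt = np.linalg.svd(xs.T @ ys)   # orthogonal Procrustes step
        W = U @ Vt
    return W, path

rng = np.random.default_rng(0)
x = rng.normal(size=(40, 3))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta), np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
W, path = align_with_rotation(x, x @ R)      # same series, rotated features
print(np.linalg.norm(W - R))                 # near zero once the alternation converges
```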
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences.