Sentinel: Multi-Patch Transformer with Temporal and Channel Attention for Time Series Forecasting
- URL: http://arxiv.org/abs/2503.17658v1
- Date: Sat, 22 Mar 2025 06:01:50 GMT
- Title: Sentinel: Multi-Patch Transformer with Temporal and Channel Attention for Time Series Forecasting
- Authors: Davide Villaboni, Alberto Castellini, Ivan Luciano Danesi, Alessandro Farinelli,
- Abstract summary: Transformer-based time series forecasting has recently gained strong interest due to the ability of transformers to model sequential data.<n>We propose Sentinel, a transformer-based architecture composed of an encoder able to extract contextual information from the channel dimension.<n>We introduce a multi-patch attention mechanism, which leverages the patching process to structure the input sequence in a way that can be naturally integrated into the transformer architecture.
- Score: 48.52101281458809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer-based time series forecasting has recently gained strong interest due to the ability of transformers to model sequential data. Most of the state-of-the-art architectures exploit either temporal or inter-channel dependencies, limiting their effectiveness in multivariate time-series forecasting where both types of dependencies are crucial. We propose Sentinel, a full transformer-based architecture composed of an encoder able to extract contextual information from the channel dimension, and a decoder designed to capture causal relations and dependencies across the temporal dimension. Additionally, we introduce a multi-patch attention mechanism, which leverages the patching process to structure the input sequence in a way that can be naturally integrated into the transformer architecture, replacing the multi-head splitting process. Extensive experiments on standard benchmarks demonstrate that Sentinel, because of its ability to "monitor" both the temporal and the inter-channel dimension, achieves better or comparable performance with respect to state-of-the-art approaches.
Related papers
- Sensorformer: Cross-patch attention with global-patch compression is effective for high-dimensional multivariate time series forecasting [12.103678233732584]
We propose a new Transformer, Sensorformer, which first compresses the global patch information and then simultaneously extracts cross-variable and cross-time dependencies from the compressed representations.<n>Sensorformer can effectively capture the correct inter-variable correlations and causal relationships, even in the presence of dynamic causal lags between variables.
arXiv Detail & Related papers (2025-01-06T03:14:47Z) - PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
Self-attention mechanism in Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z) - UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting [98.12558945781693]
We propose a transformer-based model UniTST containing a unified attention mechanism on the flattened patch tokens.
Although our proposed model employs a simple architecture, it offers compelling performance as shown in our experiments on several datasets for time series forecasting.
arXiv Detail & Related papers (2024-06-07T14:39:28Z) - Rough Transformers: Lightweight and Continuous Time Series Modelling through Signature Patching [46.58170057001437]
We introduce the Rough Transformer, a variation of the Transformer model that operates on continuous-time representations of input sequences.<n>We find that, on a variety of time-series-related tasks, Rough Transformers consistently outperform their vanilla attention counterparts.
arXiv Detail & Related papers (2024-05-31T14:00:44Z) - Multi-resolution Time-Series Transformer for Long-term Forecasting [24.47302799009906]
We propose a novel framework, Multi-resolution Time-Series Transformer (MTST), for simultaneous modeling of diverse temporal patterns at different resolutions.
In contrast to many existing time-series transformers, we employ relative positional encoding, which is better suited for extracting periodic components at different scales.
arXiv Detail & Related papers (2023-11-07T17:18:52Z) - iTransformer: Inverted Transformers Are Effective for Time Series Forecasting [62.40166958002558]
We propose iTransformer, which simply applies the attention and feed-forward network on the inverted dimensions.
The iTransformer model achieves state-of-the-art on challenging real-world datasets.
arXiv Detail & Related papers (2023-10-10T13:44:09Z) - CARD: Channel Aligned Robust Blend Transformer for Time Series
Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z) - FormerTime: Hierarchical Multi-Scale Representations for Multivariate
Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving the classification capacity for the multivariate time series classification task.
It exhibits three aspects of merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strength of both transformers and convolutional networks, and (3) tacking the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z) - W-Transformers : A Wavelet-based Transformer Framework for Univariate
Time Series Forecasting [7.075125892721573]
We build a transformer model for non-stationary time series using wavelet-based transformer encoder architecture.
We evaluate our framework on several publicly available benchmark time series datasets from various domains.
arXiv Detail & Related papers (2022-09-08T17:39:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.