SCFormer: Structured Channel-wise Transformer with Cumulative Historical State for Multivariate Time Series Forecasting
- URL: http://arxiv.org/abs/2505.02655v1
- Date: Mon, 05 May 2025 13:59:55 GMT
- Title: SCFormer: Structured Channel-wise Transformer with Cumulative Historical State for Multivariate Time Series Forecasting
- Authors: Shiwei Guo, Ziang Chen, Yupeng Ma, Yunfei Han, Yi Wang
- Abstract summary: We propose the Structured Channel-wise Transformer with Cumulative Historical state (SCFormer). SCFormer introduces temporal constraints to all linear transformations, including the query, key, and value matrices, as well as the fully connected layers within the Transformer. Experiments on multiple real-world datasets demonstrate that SCFormer significantly outperforms mainstream baselines.
- Score: 6.186790384993048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Transformer model has shown strong performance in multivariate time series forecasting by leveraging channel-wise self-attention. However, this approach lacks temporal constraints when computing temporal features and does not utilize cumulative historical series effectively. To address these limitations, we propose the Structured Channel-wise Transformer with Cumulative Historical state (SCFormer). SCFormer introduces temporal constraints to all linear transformations, including the query, key, and value matrices, as well as the fully connected layers within the Transformer. Additionally, SCFormer employs High-order Polynomial Projection Operators (HiPPO) to deal with cumulative historical time series, allowing the model to incorporate information beyond the look-back window during prediction. Extensive experiments on multiple real-world datasets demonstrate that SCFormer significantly outperforms mainstream baselines, highlighting its effectiveness in enhancing time series forecasting. The code is publicly available at https://github.com/ShiweiGuo1995/SCFormer
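The HiPPO idea mentioned in the abstract can be pictured concretely. The sketch below is a minimal illustration, not SCFormer's actual implementation: it uses the standard HiPPO-LegS operator (Gu et al., 2020) with a simple forward-Euler-style recurrence to compress an arbitrarily long history, beyond any fixed look-back window, into a fixed-size coefficient vector per channel. The choice of the LegS measure, the state size N=32, and the per-channel application are assumptions made for illustration only.

```python
import numpy as np

def hippo_legs_matrices(N):
    """HiPPO-LegS state matrix A and input vector B (Gu et al., 2020)."""
    A = np.zeros((N, N))
    B = np.zeros(N)
    for n in range(N):
        B[n] = np.sqrt(2 * n + 1)
        for k in range(N):
            if n > k:
                A[n, k] = np.sqrt((2 * n + 1) * (2 * k + 1))
            elif n == k:
                A[n, k] = n + 1
    return A, B

def cumulative_state(series, N=32):
    """Compress an arbitrarily long 1-D history into an N-dim coefficient vector
    via the discretized LegS recurrence  c_k = (I - A/k) c_{k-1} + (B/k) f_k,
    so the summary covers all past values, not just a fixed look-back window.
    (Indexing and discretization conventions vary slightly across implementations.)
    """
    A, B = hippo_legs_matrices(N)
    I = np.eye(N)
    c = np.zeros(N)
    for k, f_k in enumerate(series, start=1):
        c = (I - A / k) @ c + (B / k) * f_k
    return c

# Toy usage: summarize one channel's full history; a multivariate series
# would be compressed channel by channel.
history = np.sin(np.linspace(0.0, 50.0, 5000))
state = cumulative_state(history, N=32)
print(state.shape)  # (32,)
```

In a forecaster along SCFormer's lines, such a cumulative state would be fused with the representation of the look-back window before prediction; the repository linked above is authoritative for how the paper actually integrates it.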
Related papers
- Gateformer: Advancing Multivariate Time Series Forecasting through Temporal and Variate-Wise Attention with Gated Representations [2.4302562182247636]
We re-purpose the Transformer architecture to model both cross-time and cross-variate dependencies. Our method achieves state-of-the-art performance across 13 real-world datasets, delivering performance improvements of up to 20.7% over original models.
arXiv Detail & Related papers (2025-05-01T04:59:05Z) - LSEAttention is All You Need for Time Series Forecasting [0.0]
Transformer-based architectures have achieved remarkable success in natural language processing and computer vision. Previous research has identified the traditional attention mechanism as a key factor limiting their effectiveness in this domain. We introduce LATST, a novel approach designed to mitigate entropy collapse and training instability, which are common challenges in Transformer-based time series forecasting.
arXiv Detail & Related papers (2024-10-31T09:09:39Z) - Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a causal Transformer for unified time series forecasting. Based on large-scale pre-training, Timer-XL achieves state-of-the-art zero-shot performance.
arXiv Detail & Related papers (2024-10-07T07:27:39Z) - PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
The self-attention mechanism in the Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z) - sTransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting [6.434378359932152]
We review and categorize existing Transformer-based models into two main types: (1) modifications to the model structure and (2) modifications to the input data.
We propose sTransformer, which introduces the Sequence and Temporal Convolutional Network (STCN) to fully capture both sequential and temporal information.
We compare our model with linear models and existing forecasting models on long-term time-series forecasting, achieving new state-of-the-art results.
arXiv Detail & Related papers (2024-08-19T06:23:41Z) - Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai).
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z) - Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z) - MultiResFormer: Transformer with Adaptive Multi-Resolution Modeling for General Time Series Forecasting [18.990322695844675]
Transformer-based models have greatly pushed the boundaries of time series forecasting recently.
Existing methods typically encode time series data into patches using one or a fixed set of patch lengths.
We propose MultiResFormer, which dynamically models temporal variations by adaptively choosing optimal patch lengths.
arXiv Detail & Related papers (2023-11-30T18:24:33Z) - CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of channel-independent (CI) Transformers in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals and dynamical dependence among multiple variables over time.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z) - FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving classification capacity on the multivariate time series classification task.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both Transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z) - A Time Series is Worth 64 Words: Long-term Forecasting with Transformers [4.635547236305835]
We propose an efficient design of Transformer-based models for time series forecasting and self-supervised representation learning.
It is based on two key components: (i) segmentation of the time series into subseries-level patches that serve as input tokens to the Transformer, and (ii) channel-independence, where each channel contains a single univariate time series sharing the same embedding and Transformer weights across all series (see the patching sketch after this list).
PatchTST significantly improves long-term forecasting accuracy compared with SOTA Transformer-based models.
arXiv Detail & Related papers (2022-11-27T05:15:42Z) - Mitigating Data Redundancy to Revitalize Transformer-based Long-Term Time Series Forecasting System [46.39662315849883]
We introduce CLMFormer, a novel framework that mitigates redundancy through curriculum learning and a memory-driven decoder. CLMFormer consistently improves Transformer-based models by up to 30%, demonstrating its effectiveness in long-horizon forecasting.
arXiv Detail & Related papers (2022-07-16T04:05:15Z)
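For the PatchTST entry above, the patching step is simple enough to sketch. The snippet below is a minimal illustration rather than the authors' implementation; the patch length of 16 and stride of 8 are assumed illustrative defaults.

```python
import numpy as np

def patchify(series, patch_len=16, stride=8):
    """Split a univariate look-back window into overlapping subseries-level
    patches; each patch becomes one input token for the Transformer."""
    starts = range(0, len(series) - patch_len + 1, stride)
    return np.stack([series[s:s + patch_len] for s in starts])

lookback = np.sin(np.arange(336) / 24.0)   # one channel's 336-step look-back window
tokens = patchify(lookback)                # shape (41, 16): 41 patch tokens of length 16
print(tokens.shape)
```

Channel-independence then means the same patch embedding and Transformer weights are applied to each univariate channel separately.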