CMamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting
- URL: http://arxiv.org/abs/2406.05316v3
- Date: Thu, 26 Sep 2024 03:54:15 GMT
- Title: CMamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting
- Authors: Chaolv Zeng, Zhanyu Liu, Guanjie Zheng, Linghe Kong,
- Abstract summary: Mamba, a state space model, has emerged with robust sequence and feature mixing capabilities.
Capturing cross-channel dependencies is critical in enhancing performance of time series prediction.
We introduce a refined Mamba variant tailored for time series forecasting.
- Score: 18.50360049235537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in multivariate time series forecasting have been propelled by Linear-based, Transformer-based, and Convolution-based models, with Transformer-based architectures gaining prominence for their efficacy in temporal and cross-channel mixing. More recently, Mamba, a state space model, has emerged with robust sequence and feature mixing capabilities. However, the suitability of the vanilla Mamba design for time series forecasting remains an open question, particularly due to its inadequate handling of cross-channel dependencies. Capturing cross-channel dependencies is critical in enhancing the performance of multivariate time series prediction. Recent findings show that self-attention excels in capturing cross-channel dependencies, whereas other simpler mechanisms, such as MLP, may degrade model performance. This is counterintuitive, as MLP, being a learnable architecture, should theoretically capture both correlations and irrelevances, potentially leading to neutral or improved performance. Diving into the self-attention mechanism, we attribute the observed degradation in MLP performance to its lack of data dependence and global receptive field, which result in MLP's lack of generalization ability. Based on the above insights, we introduce a refined Mamba variant tailored for time series forecasting. Our proposed model, \textbf{CMamba}, incorporates a modified Mamba (M-Mamba) module for temporal dependencies modeling, a global data-dependent MLP (GDD-MLP) to effectively capture cross-channel dependencies, and a Channel Mixup mechanism to mitigate overfitting. Comprehensive experiments conducted on seven real-world datasets demonstrate the efficacy of our model in improving forecasting performance.
Related papers
- A Mamba Foundation Model for Time Series Forecasting [13.593170999506889]
We introduce TSMamba, a linear-complexity foundation model for time series forecasting built on the Mamba architecture.
The model captures temporal dependencies through both forward and backward Mamba encoders, achieving high prediction accuracy.
It also achieves competitive or superior full-shot performance compared to task-specific prediction models.
arXiv Detail & Related papers (2024-11-05T09:34:05Z) - UmambaTSF: A U-shaped Multi-Scale Long-Term Time Series Forecasting Method Using Mamba [7.594115034632109]
We propose UmambaTSF, a novel long-term time series forecasting framework.
It integrates multi-scale feature extraction capabilities of U-shaped encoder-decoder multilayer perceptrons (MLP) with Mamba's long sequence representation.
UmambaTSF achieves state-of-the-art performance and excellent generality on widely used benchmark datasets.
arXiv Detail & Related papers (2024-10-15T04:56:43Z) - Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need [28.301119776877822]
Time series forecasting requires balancing short-term and long-term dependencies for accurate predictions.
Transformers are superior in modeling long-term dependencies but are criticized for their quadratic computational cost.
Mamba provides a near-linear alternative but is reported less effective in time series longterm forecasting due to potential information loss.
arXiv Detail & Related papers (2024-08-28T17:59:27Z) - Bidirectional Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction.
We introduce a new framework named Selective Gated Mamba ( SIGMA) for Sequential Recommendation.
Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z) - Mamba-PTQ: Outlier Channels in Recurrent Large Language Models [49.1574468325115]
We show that Mamba models exhibit the same pattern of outlier channels observed in attention-based LLMs.
We show that the reason for the difficulty of quantizing SSMs is caused by activation outliers, similar to those observed in transformer-based LLMs.
arXiv Detail & Related papers (2024-07-17T08:21:06Z) - UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting [98.12558945781693]
We propose a transformer-based model UniTST containing a unified attention mechanism on the flattened patch tokens.
Although our proposed model employs a simple architecture, it offers compelling performance as shown in our experiments on several datasets for time series forecasting.
arXiv Detail & Related papers (2024-06-07T14:39:28Z) - PDMLP: Patch-based Decomposed MLP for Long-Term Time Series Forecasting [0.0]
Recent studies have attempted to refine the Transformer architecture to demonstrate its effectiveness in Long-Term Time Series Forecasting (LTSF) tasks.
We attribute the effectiveness of these models largely to the adopted Patch mechanism, which enhances sequence locality.
Further investigation suggests that simple linear layers augmented with the Patch mechanism may outperform complex Transformer-based LTSF models.
arXiv Detail & Related papers (2024-05-22T12:12:20Z) - SOFTS: Efficient Multivariate Time Series Forecasting with Series-Core Fusion [59.96233305733875]
Time series forecasting plays a crucial role in various fields such as finance, traffic management, energy, and healthcare.
Several methods utilize mechanisms like attention or mixer to address this by capturing channel correlations.
This paper presents an efficient-based model, the Series-cOre Fused Time Series forecaster (SOFTS)
arXiv Detail & Related papers (2024-04-22T14:06:35Z) - MTS-Mixers: Multivariate Time Series Forecasting via Factorized Temporal
and Channel Mixing [18.058617044421293]
This paper investigates the contributions and deficiencies of attention mechanisms on the performance of time series forecasting.
We propose MTS-Mixers, which use two factorized modules to capture temporal and channel dependencies.
Experimental results on several real-world datasets show that MTS-Mixers outperform existing Transformer-based models with higher efficiency.
arXiv Detail & Related papers (2023-02-09T08:52:49Z) - On Continual Model Refinement in Out-of-Distribution Data Streams [64.62569873799096]
Real-world natural language processing (NLP) models need to be continually updated to fix the prediction errors in out-of-distribution (OOD) data streams.
Existing continual learning (CL) problem setups cannot cover such a realistic and complex scenario.
We propose a new CL problem formulation dubbed continual model refinement (CMR)
arXiv Detail & Related papers (2022-05-04T11:54:44Z) - Transformer Hawkes Process [79.16290557505211]
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies.
THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin.
We provide a concrete example, where THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.