Related papers: Integrating Mamba and Transformer for Long-Short Range Time Series Forecasting

URL: http://arxiv.org/abs/2404.14757v1
Date: Tue, 23 Apr 2024 05:43:44 GMT
Title: Integrating Mamba and Transformer for Long-Short Range Time Series Forecasting
Authors: Xiongxiao Xu, Yueqing Liang, Baixiang Huang, Zhiling Lan, Kai Shu
Abstract summary: Time series forecasting is an important problem and plays a key role in a variety of applications including weather forecasting, stock market, and scientific simulations. Recent progress on state space models (SSMs) has shown impressive performance in modeling long-range dependencies. We propose a hybrid framework, Mambaformer, that internally combines Mamba for long-range dependencies and Transformer for short-range dependencies.
Score: 14.476978391383405
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Time series forecasting is an important problem and plays a key role in a variety of applications including weather forecasting, stock market, and scientific simulations. Although Transformers have proven effective at capturing dependencies, the quadratic complexity of their attention mechanism prevents their further adoption in long-range time series forecasting, limiting them to short-range dependencies. Recent progress on state space models (SSMs) has shown impressive performance in modeling long-range dependencies due to their subquadratic complexity. Mamba, as a representative SSM, enjoys linear time complexity and has achieved strong scalability on tasks that require scaling to long sequences, such as language, audio, and genomics. In this paper, we propose a hybrid framework, Mambaformer, that internally combines Mamba for long-range dependencies and Transformer for short-range dependencies, for long-short range forecasting. To the best of our knowledge, this is the first paper to combine the Mamba and Transformer architectures for time series data. We investigate possible hybrid architectures that combine Mamba layers and attention layers for long-short range time series forecasting. The comparative study shows that the Mambaformer family can outperform Mamba and Transformer on the long-short range time series forecasting problem. The code is available at https://github.com/XiongxiaoXu/Mambaformerin-Time-Series.

Related papers

TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state [12.940694192516059]
In long-term time series forecasting, different variables often influence the target variable over distinct time intervals. Traditional models typically process all variables or time points uniformly, which limits their ability to capture complex variable relationships. We propose TimePro, an innovative Mamba-based model that constructs variate- and time-aware hyper-states.
arXiv Detail & Related papers (2025-05-27T06:24:21Z)
LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics [56.99021951927683]
Time Series Forecasting (TSF) is critical in many real-world domains like financial planning and health monitoring. Existing Large Language Models (LLMs) usually perform suboptimally because they neglect the inherent characteristics of time series data. We propose LLM-PS to empower the LLM for TSF by learning the fundamental Patterns and meaningful Semantics from time series data.
arXiv Detail & Related papers (2025-03-12T11:45:11Z)
S2TX: Cross-Attention Multi-Scale State-Space Transformer for Time Series Forecasting [31.19126944008011]
Time series forecasting has recently achieved significant progress with multi-scale models to address the heterogeneity between long and short range patterns. We propose State Space Transformer with cross-attention (S2TX) to address these concerns. S2TX can achieve highly robust SOTA results while maintaining a low memory footprint.
arXiv Detail & Related papers (2025-02-17T01:40:45Z)
UmambaTSF: A U-shaped Multi-Scale Long-Term Time Series Forecasting Method Using Mamba [7.594115034632109]
We propose UmambaTSF, a novel long-term time series forecasting framework. It integrates multi-scale feature extraction capabilities of U-shaped encoder-decoder multilayer perceptrons (MLP) with Mamba's long sequence representation. UmambaTSF achieves state-of-the-art performance and excellent generality on widely used benchmark datasets.
arXiv Detail & Related papers (2024-10-15T04:56:43Z)
Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts [103.725112190618]
This paper introduces Moirai-MoE, using a single input/output projection layer while delegating the modeling of diverse time series patterns to the sparse mixture of experts. Extensive experiments on 39 datasets demonstrate the superiority of Moirai-MoE over existing foundation models in both in-distribution and zero-shot scenarios.
arXiv Detail & Related papers (2024-10-14T13:01:11Z)
Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting. Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z)
Oscillatory State-Space Models [61.923849241099184]
We propose Linear Oscillatory State-Space models (LinOSS) for efficiently learning on long sequences. A stable discretization, integrated over time using fast associative parallel scans, yields the proposed state-space model. We show that LinOSS is universal, i.e., it can approximate any continuous and causal operator mapping between time-varying functions.
arXiv Detail & Related papers (2024-10-04T22:00:13Z)
MixLinear: Extreme Low Resource Multivariate Time Series Forecasting with 0.1K Parameters [6.733646592789575]
Long-term Time Series Forecasting (LTSF) involves predicting long-term values by analyzing a large amount of historical time-series data to identify patterns and trends. Transformer-based models offer high forecasting accuracy, but they are often too compute-intensive to be deployed on devices with hardware constraints. We propose MixLinear, an ultra-lightweight time series forecasting model specifically designed for resource-constrained devices.
arXiv Detail & Related papers (2024-10-02T23:04:57Z)
Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics [7.745945701278489]
Long-short range time series forecasting is essential for predicting future trends and patterns over extended periods. Deep learning models such as Transformers have made significant strides in advancing time series forecasting. This article examines the advantages and disadvantages of both Mamba and Transformer models.
arXiv Detail & Related papers (2024-09-13T04:23:54Z)
Bidirectional Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction. We introduce a new framework named Selective Gated Mamba (SIGMA) for Sequential Recommendation. Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z)
TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification [13.110156202816112]
We propose a novel multi-view approach to capture patterns with properties like shift equivariance. Our method integrates diverse features, including spectral, temporal, local, and global features, to obtain rich, complementary contexts for TSC. Our approach achieves average accuracy improvements of 4.01-6.45% and 7.93% respectively, over leading TSC models.
arXiv Detail & Related papers (2024-06-06T18:05:10Z)
MambaTS: Improved Selective State Space Models for Long-term Time Series Forecasting [12.08746904573603]
Mamba, based on selective state space models (SSMs), has emerged as a competitive alternative to Transformer. We propose four targeted improvements, leading to MambaTS. Experiments conducted on eight public datasets demonstrate that MambaTS achieves new state-of-the-art performance.
arXiv Detail & Related papers (2024-05-26T05:50:17Z)
TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting [13.110156202816112]
TimeMachine exploits the unique properties of time series data to produce salient contextual cues at multi-scales. TimeMachine achieves superior performance in prediction accuracy, scalability, and memory efficiency, as extensively validated using benchmark datasets.
arXiv Detail & Related papers (2024-03-14T22:19:37Z)
Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked-based Universal Time Series Forecasting Transformer (Moirai). Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains. Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z)
Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM). During pre-training, we curate large-scale datasets with up to 1 billion time points. To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z)
Grouped self-attention mechanism for a memory-efficient Transformer [64.0125322353281]
Real-world tasks such as forecasting weather, electricity consumption, and stock market involve predicting data that vary over time. Time-series data are generally recorded over a long period of observation with long sequences owing to their periodic characteristics and long-range dependencies over time. We propose two novel modules, Grouped Self-Attention (GSA) and Compressed Cross-Attention (CCA). Our proposed model exhibited reduced computational complexity and performance comparable to or better than existing methods.
arXiv Detail & Related papers (2022-10-02T06:58:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.