Repetitive Contrastive Learning Enhances Mamba's Selectivity in Time Series Prediction
- URL: http://arxiv.org/abs/2504.09185v1
- Date: Sat, 12 Apr 2025 11:57:27 GMT
- Title: Repetitive Contrastive Learning Enhances Mamba's Selectivity in Time Series Prediction
- Authors: Wenbo Yan, Hanzhong Cao, Ying Tan
- Abstract summary: We introduce Repetitive Contrastive Learning (RCL), a token-level contrastive pretraining framework aimed at enhancing Mamba's selective capabilities. RCL pretrains a single Mamba block to strengthen its selective abilities and then transfers these pretrained parameters to initialize Mamba blocks in various backbone models. Extensive experiments show that RCL consistently boosts the performance of backbone models, surpassing existing methods and achieving state-of-the-art results.
- Score: 1.6590638305972631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long sequence prediction is a key challenge in time series forecasting. While Mamba-based models have shown strong performance due to their sequence selection capabilities, they still struggle with insufficient focus on critical time steps and incomplete noise suppression, caused by limited selective abilities. To address this, we introduce Repetitive Contrastive Learning (RCL), a token-level contrastive pretraining framework aimed at enhancing Mamba's selective capabilities. RCL pretrains a single Mamba block to strengthen its selective abilities and then transfers these pretrained parameters to initialize Mamba blocks in various backbone models, improving their temporal prediction performance. RCL uses sequence augmentation with Gaussian noise and applies inter-sequence and intra-sequence contrastive learning to help the Mamba module prioritize information-rich time steps while ignoring noisy ones. Extensive experiments show that RCL consistently boosts the performance of backbone models, surpassing existing methods and achieving state-of-the-art results. Additionally, we propose two metrics to quantify Mamba's selective capabilities, providing theoretical, qualitative, and quantitative evidence for the improvements brought by RCL.
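The abstract outlines the core recipe: corrupt a subset of time steps with Gaussian noise, pretrain a single Mamba block with a token-level contrastive objective so that informative steps are emphasized and noisy ones suppressed, then transfer the pretrained weights into the Mamba blocks of a downstream forecasting backbone. Below is a minimal PyTorch sketch of that idea. It is not the paper's implementation: the function names (`augment_with_gaussian_noise`, `token_level_info_nce`, `rcl_pretrain_step`), the hyperparameters (`noise_ratio`, `sigma`, `temperature`), and the collapsing of inter- and intra-sequence contrast into a single InfoNCE term are illustrative assumptions.

```python
# Hedged sketch of token-level repetitive contrastive pretraining.
# Assumes `mamba_block` maps (batch, seq_len, dim) -> (batch, seq_len, dim);
# exact loss layout and hyperparameters are placeholders, not the paper's.
import torch
import torch.nn.functional as F


def augment_with_gaussian_noise(x: torch.Tensor, noise_ratio: float = 0.3,
                                sigma: float = 0.5):
    """Inject Gaussian noise into a random subset of time steps.

    x: (batch, seq_len, dim). Returns the noised sequence and a boolean
    mask marking which steps were corrupted.
    """
    mask = torch.rand(x.shape[:2], device=x.device) < noise_ratio   # (B, T)
    noise = torch.randn_like(x) * sigma
    x_aug = torch.where(mask.unsqueeze(-1), x + noise, x)
    return x_aug, mask


def token_level_info_nce(h_clean: torch.Tensor, h_aug: torch.Tensor,
                         temperature: float = 0.1) -> torch.Tensor:
    """Token-level InfoNCE: each clean-step embedding should match the
    embedding of the same step in the augmented sequence, with all other
    steps in the batch serving as negatives."""
    B, T, D = h_clean.shape
    z1 = F.normalize(h_clean.reshape(B * T, D), dim=-1)
    z2 = F.normalize(h_aug.reshape(B * T, D), dim=-1)
    logits = z1 @ z2.t() / temperature                              # (BT, BT)
    targets = torch.arange(B * T, device=logits.device)
    return F.cross_entropy(logits, targets)


def rcl_pretrain_step(mamba_block: torch.nn.Module, x: torch.Tensor,
                      optimizer: torch.optim.Optimizer) -> float:
    """One pretraining step on a single block; the resulting weights would
    later initialize the Mamba blocks of a downstream backbone."""
    x_aug, _ = augment_with_gaussian_noise(x)
    h_clean, h_aug = mamba_block(x), mamba_block(x_aug)
    loss = token_level_info_nce(h_clean, h_aug)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, the contrastive objective is only a pretraining device: after convergence, the block's parameters are copied into each Mamba block of the forecasting backbone, which is then trained on the usual prediction loss.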
Related papers
- Learning Mamba as a Continual Learner: Meta-learning Selective State Space Models for Efficient Continual Learning [12.697915176594314]
Continual learning (CL) aims to efficiently learn from a non-stationary data stream without storing or recomputing all seen samples. We focus on meta-learning sequence-prediction-based continual learners without retaining all past representations. Given Mamba's strong sequence modeling performance and attention-free nature, we explore a key question: can attention-free models like Mamba perform well on meta-continual learning?
arXiv Detail & Related papers (2024-12-01T11:43:46Z)
- Mamba-CL: Optimizing Selective State Space Model in Null Space for Continual Learning [54.19222454702032]
Continual Learning aims to equip AI models with the ability to learn a sequence of tasks over time, without forgetting previously learned knowledge.
State Space Models (SSMs) have achieved notable success in computer vision.
We introduce Mamba-CL, a framework that continuously fine-tunes the core SSMs of the large-scale Mamba foundation model.
arXiv Detail & Related papers (2024-11-23T06:36:16Z)
- A Mamba Foundation Model for Time Series Forecasting [13.593170999506889]
We introduce TSMamba, a linear-complexity foundation model for time series forecasting built on the Mamba architecture.
The model captures temporal dependencies through both forward and backward Mamba encoders, achieving high prediction accuracy.
It also achieves competitive or superior full-shot performance compared to task-specific prediction models.
arXiv Detail & Related papers (2024-11-05T09:34:05Z)
- ReMamba: Equip Mamba with Effective Long-Sequence Modeling [50.530839868893786]
We propose ReMamba, which enhances Mamba's ability to comprehend long contexts. ReMamba incorporates selective compression and adaptation techniques within a two-stage re-forward process.
arXiv Detail & Related papers (2024-08-28T02:47:27Z)
- Simplified Mamba with Disentangled Dependency Encoding for Long-Term Time Series Forecasting [8.841699904757506]
In this paper, we identify and formally define three critical dependencies essential for improving forecasting accuracy.
We propose SAMBA, a simplified Mamba with disentangled dependency encoding.
Experiments on nine real-world datasets demonstrate the effectiveness of SAMBA over state-of-the-art forecasting models.
arXiv Detail & Related papers (2024-08-22T02:14:59Z)
- SIGMA: Selective Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction. We introduce a new framework named Selective Gated Mamba (SIGMA) for Sequential Recommendation. Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z)
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba [89.07242846058023]
We introduce DeciMamba, a context-extension method specifically designed for Mamba.
Experiments over real-world long-range NLP tasks show that DeciMamba can extrapolate to context lengths significantly longer than the ones seen during training.
arXiv Detail & Related papers (2024-06-20T17:40:18Z)
- MambaLRP: Explaining Selective State Space Sequence Models [18.133138020777295]
Recent sequence modeling approaches using selective state space sequence models, referred to as Mamba models, have seen a surge of interest.
These models allow efficient processing of long sequences in linear time and are rapidly being adopted in a wide range of applications such as language modeling.
To foster their reliable use in real-world scenarios, it is crucial to augment their transparency.
arXiv Detail & Related papers (2024-06-11T12:15:47Z)
- MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology [10.933433327636918]
Multiple Instance Learning (MIL) has emerged as a dominant paradigm to extract discriminative feature representations within Whole Slide Images (WSIs) in computational pathology.
In this paper, we incorporate the Selective Scan Space State Sequential Model (Mamba) in Multiple Instance Learning (MIL) for long sequence modeling with linear complexity.
Our proposed framework performs favorably against state-of-the-art MIL methods.
arXiv Detail & Related papers (2024-03-11T15:17:25Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but further adaptation of CLIP on downstream tasks undesirably degrades out-of-distribution (OOD) performance.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z)