Sequential Order-Robust Mamba for Time Series Forecasting
- URL: http://arxiv.org/abs/2410.23356v1
- Date: Wed, 30 Oct 2024 18:05:22 GMT
- Title: Sequential Order-Robust Mamba for Time Series Forecasting
- Authors: Seunghan Lee, Juri Hong, Kibok Lee, Taeyoung Park
- Abstract summary: Mamba has emerged as a promising alternative to Transformers, offering near-linear complexity in processing sequential data.
We propose SOR-Mamba, a time series (TS) forecasting method that incorporates a regularization strategy to minimize the discrepancy between two embedding vectors generated from data with reversed channel orders.
We also introduce channel correlation modeling (CCM), a pretraining task aimed at preserving correlations between channels from the data space to the latent space in order to enhance the ability to capture channel dependencies (CD).
- Score: 5.265578815577529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mamba has recently emerged as a promising alternative to Transformers, offering near-linear complexity in processing sequential data. However, while channels in time series (TS) data have no specific order in general, recent studies have adopted Mamba to capture channel dependencies (CD) in TS, introducing a sequential order bias. To address this issue, we propose SOR-Mamba, a TS forecasting method that 1) incorporates a regularization strategy to minimize the discrepancy between two embedding vectors generated from data with reversed channel orders, thereby enhancing robustness to channel order, and 2) eliminates the 1D-convolution originally designed to capture local information in sequential data. Furthermore, we introduce channel correlation modeling (CCM), a pretraining task aimed at preserving correlations between channels from the data space to the latent space in order to enhance the ability to capture CD. Extensive experiments demonstrate the efficacy of the proposed method across standard and transfer learning scenarios. Code is available at https://github.com/seunghan96/SOR-Mamba.
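The two ideas in the abstract, a regularizer that penalizes the gap between embeddings of channel-reversed inputs and a CCM loss that matches channel correlations across data and latent space, can be sketched numerically. The snippet below is a minimal illustration with NumPy, not the paper's implementation: the embedding map, its dimensions, and the mean-squared losses are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multivariate time series: C channels, each of length L.
C, L, D = 4, 16, 8
x = rng.normal(size=(C, L))

# Hypothetical embedding that mixes channels, so it is sensitive
# to channel order (as a sequential model like Mamba would be).
W_time = rng.normal(size=(L, D))
W_mix = rng.normal(size=(C, C))

def embed(series):
    """Embed (C, L) data into a (C, D) latent, mixing channels."""
    return W_mix @ (series @ W_time)

# 1) Order-robustness regularizer: embed the original and the
# channel-reversed data, undo the reversal, and penalize the gap.
z_fwd = embed(x)
z_rev = embed(x[::-1])[::-1]      # reverse channels, embed, re-align
reg_loss = np.mean((z_fwd - z_rev) ** 2)

# 2) CCM-style objective: channel correlations computed in data
# space should be preserved among the latent embeddings.
corr_data = np.corrcoef(x)        # (C, C) correlations in data space
corr_latent = np.corrcoef(z_fwd)  # (C, C) correlations in latent space
ccm_loss = np.mean((corr_data - corr_latent) ** 2)

print(reg_loss, ccm_loss)
```

In training, both scalars would be added to the forecasting loss; a channel-order-robust model drives `reg_loss` toward zero, while `ccm_loss` encourages the latent space to retain the channel-correlation structure of the data.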
Related papers
- Time-Correlated Video Bridge Matching [49.94768097995648]
Time-Correlated Video Bridge Matching (TCVBM) is a framework that extends Bridge Matching (BM) to time-correlated data sequences in the video domain. TCVBM achieves superior performance across multiple quantitative metrics, demonstrating enhanced generation quality and reconstruction fidelity.
arXiv Detail & Related papers (2025-10-14T12:35:30Z) - ATTS: Asynchronous Test-Time Scaling via Conformal Prediction [112.54016379556073]
Large language models (LLMs) benefit from test-time scaling but are often hampered by high inference latency. We introduce ATTS (Asynchronous Test-Time Scaling), a statistically guaranteed adaptive scaling framework. We show that ATTS delivers up to 56.7x speedup in test-time scaling and a 4.14x throughput improvement.
arXiv Detail & Related papers (2025-09-18T16:55:09Z) - Streaming Sequence-to-Sequence Learning with Delayed Streams Modeling [57.708486655254966]
Delayed Streams Modeling is a flexible formulation for sequence-to-sequence learning. It provides streaming inference of arbitrary output sequences from any input combination.
arXiv Detail & Related papers (2025-09-10T16:43:01Z) - FindRec: Stein-Guided Entropic Flow for Multi-Modal Sequential Recommendation [50.438552588818]
We propose FindRec (Flexible unified information disentanglement for multi-modal sequential Recommendation). A Stein kernel-based Integrated Information Coordination Module (IICM) theoretically guarantees distribution consistency between multimodal features and ID streams. A cross-modal expert routing mechanism adaptively filters and combines multimodal features based on their contextual relevance.
arXiv Detail & Related papers (2025-07-07T04:09:45Z) - DC-Mamber: A Dual Channel Prediction Model based on Mamba and Linear Transformer for Multivariate Time Series Forecasting [6.238490256097465]
Current mainstream models are mostly based on Transformer and the emerging Mamba. DC-Mamber is a dual-channel forecasting model based on Mamba and linear Transformer for time series forecasting. Experiments on eight public datasets confirm DC-Mamber's superior accuracy over existing models.
arXiv Detail & Related papers (2025-07-06T12:58:52Z) - Addressing Missing Data Issue for Diffusion-based Recommendation [26.605773432154518]
We propose a novel dual-side Thompson sampling-based Diffusion Model (TDM). TDM simulates extra missing data in the guidance signals and allows diffusion models to handle existing missing data through extrapolation. Experiments and theoretical analysis validate the effectiveness of TDM in addressing missing data in sequential recommendations.
arXiv Detail & Related papers (2025-05-18T07:45:46Z) - Unifying Autoregressive and Diffusion-Based Sequence Generation [3.1853022872760186]
We present significant extensions to diffusion-based sequence generation models, blurring the line with autoregressive language models. First, we introduce hyperschedules, which assign distinct noise schedules to individual token positions. Second, we propose two hybrid token-wise noising processes that interpolate between absorbing and uniform processes, enabling the model to fix past mistakes.
arXiv Detail & Related papers (2025-04-08T20:32:10Z) - NIMBA: Towards Robust and Principled Processing of Point Clouds With SSMs [9.978766637766373]
We introduce a method to convert point clouds into 1D sequences that maintain 3D spatial structure with no need for data replication.
Our method does not require positional embeddings and allows for shorter sequence lengths while still achieving state-of-the-art results.
arXiv Detail & Related papers (2024-10-31T18:58:40Z) - Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model [66.91323540178739]
Sequential recommendation (SR) aims to predict items that users may be interested in based on their historical behavior.
We revisit SR from a novel information-theoretic perspective and find that sequential modeling methods fail to adequately capture randomness and unpredictability of user behavior.
Inspired by fuzzy information processing theory, this paper introduces the fuzzy sets of interaction sequences to overcome the limitations and better capture the evolution of users' real interests.
arXiv Detail & Related papers (2024-10-31T14:52:01Z) - Bidirectional Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction.
We introduce a new framework named Selective Gated Mamba (SIGMA) for Sequential Recommendation.
Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z) - MambaTS: Improved Selective State Space Models for Long-term Time Series Forecasting [12.08746904573603]
Mamba, based on selective state space models (SSMs), has emerged as a competitive alternative to Transformer.
We propose four targeted improvements, leading to MambaTS.
Experiments conducted on eight public datasets demonstrate that MambaTS achieves new state-of-the-art performance.
arXiv Detail & Related papers (2024-05-26T05:50:17Z) - Transform-Equivariant Consistency Learning for Temporal Sentence Grounding [66.10949751429781]
We introduce a novel Equivariant Consistency Regulation Learning framework to learn more discriminative representations for each video.
Our motivation comes from that the temporal boundary of the query-guided activity should be consistently predicted.
In particular, we devise a self-supervised consistency loss module to enhance the completeness and smoothness of the augmented video.
arXiv Detail & Related papers (2023-05-06T19:29:28Z) - Reverse Ordering Techniques for Attention-Based Channel Prediction [11.630651920572221]
This work aims to predict channels in wireless communication systems based on noisy observations.
Models are adapted from natural language processing to tackle the complex challenge of channel prediction.
arXiv Detail & Related papers (2023-02-01T09:53:57Z) - Sampling in Dirichlet Process Mixture Models for Clustering Streaming Data [5.660207256468972]
Dirichlet Process Mixture Model (DPMM) seems a natural choice for the streaming-data case.
Existing methods for online DPMM inference are too slow to handle rapid data streams.
We propose adapting both the DPMM and a known DPMM sampling-based non-streaming inference method for streaming-data clustering.
arXiv Detail & Related papers (2022-02-27T08:51:50Z) - Continuous-Time Sequential Recommendation with Temporal Graph Collaborative Transformer [69.0621959845251]
We propose a new framework Temporal Graph Sequential Recommender (TGSRec) upon our defined continuous-time bi-partite graph.
TCT layer can simultaneously capture collaborative signals from both users and items, as well as considering temporal dynamics inside sequential patterns.
Empirical results on five datasets show that TGSRec significantly outperforms other baselines.
arXiv Detail & Related papers (2021-08-14T22:50:53Z) - Contrastive Self-supervised Sequential Recommendation with Robust Augmentation [101.25762166231904]
Sequential recommendation describes a set of techniques to model dynamic user behavior in order to predict future interactions in sequential user data.
Old and new issues remain, including data-sparsity and noisy data.
We propose Contrastive Self-Supervised Learning for sequential Recommendation (CoSeRec)
arXiv Detail & Related papers (2021-08-14T07:15:25Z) - Modeling Sequences as Distributions with Uncertainty for Sequential Recommendation [63.77513071533095]
Most existing sequential methods assume users are deterministic.
Item-item transitions might fluctuate significantly in several item aspects and exhibit randomness of user interests.
We propose a Distribution-based Transformer Sequential Recommendation (DT4SR) which injects uncertainties into sequential modeling.
arXiv Detail & Related papers (2021-06-11T04:35:21Z) - Composably secure data processing for Gaussian-modulated continuous variable quantum key distribution [58.720142291102135]
Continuous-variable quantum key distribution (QKD) employs the quadratures of a bosonic mode to establish a secret key between two remote parties.
We consider a protocol with homodyne detection in the general setting of composable finite-size security.
In particular, we analyze the high signal-to-noise regime which requires the use of high-rate (non-binary) low-density parity check codes.
arXiv Detail & Related papers (2021-03-30T18:02:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.