Related papers: TIMBA: Time series Imputation with Bi-directional Mamba Blocks and Diffusion models

TIMBA: Time series Imputation with Bi-directional Mamba Blocks and Diffusion models

URL: http://arxiv.org/abs/2410.05916v1
Date: Tue, 8 Oct 2024 11:10:06 GMT
Title: TIMBA: Time series Imputation with Bi-directional Mamba Blocks and Diffusion models
Authors: Javier Solís-García, Belén Vega-Márquez, Juan A. Nepomuceno, Isabel A. Nepomuceno-Chamorro,
Abstract summary: We propose replacing time-oriented Transformers with State-Space Models (SSM) We develop a model that integrates SSM, Graph Neural Networks, and node-oriented Transformers to achieve enhanced representations.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The problem of imputing multivariate time series spans a wide range of fields, from clinical healthcare to multi-sensor systems. Initially, Recurrent Neural Networks (RNNs) were employed for this task; however, their error accumulation issues led to the adoption of Transformers, leveraging attention mechanisms to mitigate these problems. Concurrently, the promising results of diffusion models in capturing original distributions have positioned them at the forefront of current research, often in conjunction with Transformers. In this paper, we propose replacing time-oriented Transformers with State-Space Models (SSM), which are better suited for temporal data modeling. Specifically, we utilize the latest SSM variant, S6, which incorporates attention-like mechanisms. By embedding S6 within Mamba blocks, we develop a model that integrates SSM, Graph Neural Networks, and node-oriented Transformers to achieve enhanced spatiotemporal representations. Implementing these architectural modifications, previously unexplored in this field, we present Time series Imputation with Bi-directional mamba blocks and diffusion models (TIMBA). TIMBA achieves superior performance in almost all benchmark scenarios and performs comparably in others across a diverse range of missing value situations and three real-world datasets. We also evaluate how the performance of our model varies with different amounts of missing values and analyse its performance on downstream tasks. In addition, we provide the original code to replicate the results.

Related papers

DC-Mamber: A Dual Channel Prediction Model based on Mamba and Linear Transformer for Multivariate Time Series Forecasting [6.238490256097465]
Current mainstream models are mostly based on Transformer and the emerging Mamba.<n>DC-Mamber is a dual-channel forecasting model based on Mamba and linear Transformer for time series forecasting.<n>Experiments on eight public datasets confirm DC-Mamber's superior accuracy over existing models.
arXiv Detail & Related papers (2025-07-06T12:58:52Z)
Transformers Use Causal World Models in Maze-Solving Tasks [49.67445252528868]
We identify World Models in transformers trained on maze-solving tasks. We find that it is easier to activate features than to suppress them. positional encoding schemes appear to influence how World Models are structured within the model's residual stream.
arXiv Detail & Related papers (2024-12-16T15:21:04Z)
Bidirectional Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction. We introduce a new framework named Selective Gated Mamba ( SIGMA) for Sequential Recommendation. Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z)
UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting [98.12558945781693]
We propose a transformer-based model UniTST containing a unified attention mechanism on the flattened patch tokens. Although our proposed model employs a simple architecture, it offers compelling performance as shown in our experiments on several datasets for time series forecasting.
arXiv Detail & Related papers (2024-06-07T14:39:28Z)
TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification [13.110156202816112]
We propose a novel multi-view approach to capture patterns with properties like shift equivariance. Our method integrates diverse features, including spectral, temporal, local, and global features, to obtain rich, complementary contexts for TSC. Our approach achieves average accuracy improvements of 4.01-6.45% and 7.93% respectively, over leading TSC models.
arXiv Detail & Related papers (2024-06-06T18:05:10Z)
Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models [5.37935922811333]
State Space Models (SSMs) are classical approaches for univariate time series modeling. We present Chimera that uses two input-dependent 2-D SSM heads with different discretization processes to learn long-term progression and seasonal patterns. Our experimental evaluation shows the superior performance of Chimera on extensive and diverse benchmarks.
arXiv Detail & Related papers (2024-06-06T17:58:09Z)
Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs. Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction. We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection [5.37935922811333]
MambaMixer is a new architecture with data-dependent weights that uses a dual selection mechanism across tokens and channels. As a proof of concept, we design Vision MambaMixer (ViM2) and Time Series MambaMixer (TSM2) architectures based on the MambaMixer block.
arXiv Detail & Related papers (2024-03-29T00:05:13Z)
Mamba: Linear-Time Sequence Modeling with Selective State Spaces [31.985243136674146]
Foundation models are almost universally based on the Transformer architecture and its core attention module. We identify that a key weakness of such models is their inability to perform content-based reasoning. We integrate these selective SSMs into a simplified end-to-end neural network architecture without attention or even blocks (Mamba) As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics.
arXiv Detail & Related papers (2023-12-01T18:01:34Z)
Robust representations of oil wells' intervals via sparse attention mechanism [2.604557228169423]
We introduce the class of efficient Transformers named Regularized Transformers (Reguformers) The focus in our experiments is on oil&gas data, namely, well logs. To evaluate our models for such problems, we work with an industry-scale open dataset consisting of well logs of more than 20 wells.
arXiv Detail & Related papers (2022-12-29T09:56:33Z)
Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
gait recognition in the wild is a more practical problem that has attracted the attention of the community of multimedia and computer vision. This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z)
Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow(MANF) Our model avoids the influence of cumulative error and does not increase the time complexity. Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z)
Wake Word Detection with Streaming Transformers [72.66551640048405]
We show that our proposed Transformer model outperforms the baseline convolution network by 25% on average in false rejection rate at the same false alarm rate. Our experiments on the Mobvoi wake word dataset demonstrate that our proposed Transformer model outperforms the baseline convolution network by 25%.
arXiv Detail & Related papers (2021-02-08T19:14:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.