MSDformer: Multi-scale Discrete Transformer For Time Series Generation
- URL: http://arxiv.org/abs/2505.14202v1
- Date: Tue, 20 May 2025 11:01:36 GMT
- Title: MSDformer: Multi-scale Discrete Transformer For Time Series Generation
- Authors: Zhicheng Chen, Shibo Feng, Xi Xiao, Zhong Zhang, Qing Li, Xingyu Gao, Peilin Zhao,
- Abstract summary: We propose a novel multi-scale DTM-based time series generation method, called Multi-Scale Discrete Transformer (MSDformer)<n>MSDformer employs a multi-scale time series tokenizer to learn discrete token representations at multiple scales, which jointly characterize the complex nature of time series data.<n>Experiments demonstrate that MSDformer significantly outperforms state-of-the-art methods.
- Score: 34.850082388835034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discrete Token Modeling (DTM), which employs vector quantization techniques, has demonstrated remarkable success in modeling non-natural language modalities, particularly in time series generation. While our prior work SDformer established the first DTM-based framework to achieve state-of-the-art performance in this domain, two critical limitations persist in existing DTM approaches: 1) their inability to capture multi-scale temporal patterns inherent to complex time series data, and 2) the absence of theoretical foundations to guide model optimization. To address these challenges, we proposes a novel multi-scale DTM-based time series generation method, called Multi-Scale Discrete Transformer (MSDformer). MSDformer employs a multi-scale time series tokenizer to learn discrete token representations at multiple scales, which jointly characterize the complex nature of time series data. Subsequently, MSDformer applies a multi-scale autoregressive token modeling technique to capture the multi-scale patterns of time series within the discrete latent space. Theoretically, we validate the effectiveness of the DTM method and the rationality of MSDformer through the rate-distortion theorem. Comprehensive experiments demonstrate that MSDformer significantly outperforms state-of-the-art methods. Both theoretical analysis and experimental results demonstrate that incorporating multi-scale information and modeling multi-scale patterns can substantially enhance the quality of generated time series in DTM-based approaches. The code will be released upon acceptance.
Related papers
- FindRec: Stein-Guided Entropic Flow for Multi-Modal Sequential Recommendation [50.438552588818]
We propose textbfFindRec (textbfFlexible unified textbfinformation textbfdisentanglement for multi-modal sequential textbfRecommendation)<n>A Stein kernel-based Integrated Information Coordination Module (IICM) theoretically guarantees distribution consistency between multimodal features and ID streams.<n>A cross-modal expert routing mechanism that adaptively filters and combines multimodal features based on their contextual relevance.
arXiv Detail & Related papers (2025-07-07T04:09:45Z) - Continual Multimodal Contrastive Learning [70.60542106731813]
Multimodal contrastive learning (MCL) advances in aligning different modalities and generating multimodal representations in a joint space.<n>However, a critical yet often overlooked challenge remains: multimodal data is rarely collected in a single process, and training from scratch is computationally expensive.<n>In this paper, we formulate CMCL through two specialized principles of stability and plasticity.<n>We theoretically derive a novel optimization-based method, which projects updated gradients from dual sides onto subspaces where any gradient is prevented from interfering with the previously learned knowledge.
arXiv Detail & Related papers (2025-03-19T07:57:08Z) - TimeDiT: General-purpose Diffusion Transformers for Time Series Foundation Model [11.281386703572842]
TimeDiT is a diffusion transformer model that combines temporal dependency learning with probabilistic sampling.<n>TimeDiT employs a unified masking mechanism to harmonize the training and inference process across diverse tasks.<n>Our systematic evaluation demonstrates TimeDiT's effectiveness both in fundamental tasks, i.e., forecasting and imputation, through zero-shot/fine-tuning.
arXiv Detail & Related papers (2024-09-03T22:31:57Z) - LTSM-Bundle: A Toolbox and Benchmark on Large Language Models for Time Series Forecasting [69.33802286580786]
We introduce LTSM-Bundle, a comprehensive toolbox, and benchmark for training LTSMs.<n>It modularized and benchmarked LTSMs from multiple dimensions, encompassing prompting strategies, tokenization approaches, base model selection, data quantity, and dataset diversity.<n> Empirical results demonstrate that this combination achieves superior zero-shot and few-shot performances compared to state-of-the-art LTSMs and traditional TSF methods.
arXiv Detail & Related papers (2024-06-20T07:09:19Z) - UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting [98.12558945781693]
We propose a transformer-based model UniTST containing a unified attention mechanism on the flattened patch tokens.
Although our proposed model employs a simple architecture, it offers compelling performance as shown in our experiments on several datasets for time series forecasting.
arXiv Detail & Related papers (2024-06-07T14:39:28Z) - Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models [5.37935922811333]
State Space Models (SSMs) are classical approaches for univariate time series modeling.
We present Chimera that uses two input-dependent 2-D SSM heads with different discretization processes to learn long-term progression and seasonal patterns.
Our experimental evaluation shows the superior performance of Chimera on extensive and diverse benchmarks.
arXiv Detail & Related papers (2024-06-06T17:58:09Z) - Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting [26.141054975797868]
We propose a novel Adaptive Multi-Scale Decomposition (AMD) framework for time series forecasting.<n>Our framework decomposes time series into distinct temporal patterns at multiple scales, leveraging the Multi-Scale Decomposable Mixing (MDM) block.<n>Our approach effectively models both temporal and channel dependencies and utilizes autocorrelation to refine multi-scale data integration.
arXiv Detail & Related papers (2024-06-06T05:27:33Z) - Multitask Learning for Time Series Data with 2D Convolution [32.72419542473646]
Multitask learning (MTL) aims to develop a unified model that can handle a set of closely related tasks simultaneously.
In this paper, we investigate the application of MTL to the time series classification problem.
We show that when we integrate the state-of-the-art 1D convolution-based TSC model with MTL, the performance of the TSC model actually deteriorates.
arXiv Detail & Related papers (2023-10-05T22:00:17Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
gait recognition in the wild is a more practical problem that has attracted the attention of the community of multimedia and computer vision.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow(MANF)
Our model avoids the influence of cumulative error and does not increase the time complexity.
Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z) - Variational Dynamic Mixtures [18.730501689781214]
We develop variational dynamic mixtures (VDM) to infer sequential latent variables.
In an empirical study, we show that VDM outperforms competing approaches on highly multi-modal datasets.
arXiv Detail & Related papers (2020-10-20T16:10:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.