MSTN: Fast and Efficient Multivariate Time Series Model
- URL: http://arxiv.org/abs/2511.20577v1
- Date: Tue, 25 Nov 2025 18:09:42 GMT
- Title: MSTN: Fast and Efficient Multivariate Time Series Model
- Authors: Sumit S Shevtekar, Chandresh K Maurya, Gourab Sil
- Abstract summary: We introduce the Multi-scale Temporal Network (MSTN), a novel deep learning architecture founded on a hierarchical multi-scale and sequence modeling principle. MSTN integrates a convolutional encoder that constructs a hierarchical feature pyramid for local patterns and a sequence modeling component for long-range temporal dependencies. Extensive evaluations across time-series long-horizon forecasting, imputation, classification, and generalizability studies demonstrate that MSTN achieves competitive state-of-the-art (SOTA) performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-world time-series data is highly non-stationary, with complex dynamics that operate across multiple timescales, ranging from fast, short-term changes to slow, long-term trends. Most existing models rely on fixed-scale structural priors, such as patch-based tokenization, fixed frequency transformations, or frozen backbone architectures. This often leads to over-regularization of temporal dynamics, which limits their ability to adaptively model the full spectrum of temporal variations and impairs their performance on unpredictable, sudden, high-magnitude events. To address this, we introduce the Multi-scale Temporal Network (MSTN), a novel deep learning architecture founded on a hierarchical multi-scale and sequence modeling principle. The MSTN framework integrates: (i) a multi-scale convolutional encoder that constructs a hierarchical feature pyramid for local patterns; (ii) a sequence modeling component for long-range temporal dependencies, which we empirically validate with BiLSTM and Transformer variants, establishing a flexible foundation for future architectural advancements; and (iii) a gated fusion mechanism augmented with squeeze-and-excitation (SE) and multi-head temporal attention (MHTA) for dynamic, context-aware feature integration. This design enables MSTN to adaptively model temporal patterns from milliseconds to long-range dependencies within a unified framework. Extensive evaluations across time-series long-horizon forecasting, imputation, classification, and generalizability studies demonstrate that MSTN achieves competitive state-of-the-art (SOTA) performance, showing improvements over contemporary approaches including EMTSF, LLM4TS, HiMTM, TIME-LLM, MTST, SOFTS, iTransformer, TimesNet, and PatchTST. In total, MSTN establishes new SOTA performance on 24 of 32 benchmark datasets, demonstrating consistent performance across diverse temporal tasks.
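The abstract describes the pipeline but includes no code; the following is a minimal, hedged PyTorch sketch of what an MSTN-style block could look like. All module names, kernel sizes, and dimensions are assumptions for illustration, not the authors' implementation; the sigmoid gate shown is one common way to realize "gated fusion".

```python
# Hypothetical sketch of an MSTN-style block (not the authors' code):
# a multi-scale Conv1d feature pyramid, a BiLSTM sequence model, and
# SE-gated fusion followed by multi-head temporal attention.
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Channel-wise gating (squeeze-and-excitation) over a (B, C, T) tensor."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                 # x: (B, C, T)
        w = self.fc(x.mean(dim=-1))       # squeeze over time -> (B, C)
        return x * w.unsqueeze(-1)        # excite: rescale channels

class MSTNSketch(nn.Module):
    def __init__(self, n_vars: int, d_model: int = 64,
                 kernel_sizes=(3, 5, 9), n_heads: int = 4):
        super().__init__()
        # (i) multi-scale convolutional encoder: one branch per kernel size
        self.branches = nn.ModuleList(
            nn.Conv1d(n_vars, d_model, k, padding=k // 2) for k in kernel_sizes
        )
        self.se = SqueezeExcite(d_model * len(kernel_sizes))
        # gated fusion of the concatenated pyramid down to d_model channels
        self.fuse = nn.Conv1d(d_model * len(kernel_sizes), d_model, 1)
        self.gate = nn.Sequential(
            nn.Conv1d(d_model * len(kernel_sizes), d_model, 1), nn.Sigmoid()
        )
        # (ii) sequence model for long-range dependencies (BiLSTM variant)
        self.bilstm = nn.LSTM(d_model, d_model // 2, batch_first=True,
                              bidirectional=True)
        # (iii) multi-head temporal attention over the fused sequence
        self.mhta = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, n_vars)  # placeholder task head

    def forward(self, x):                 # x: (B, T, n_vars)
        z = x.transpose(1, 2)             # (B, n_vars, T) for Conv1d
        pyramid = torch.cat([b(z) for b in self.branches], dim=1)
        pyramid = self.se(pyramid)        # channel recalibration
        fused = self.fuse(pyramid) * self.gate(pyramid)   # gated fusion
        seq, _ = self.bilstm(fused.transpose(1, 2))       # (B, T, d_model)
        attn, _ = self.mhta(seq, seq, seq)                # temporal attention
        return self.head(attn)            # (B, T, n_vars)

# Usage: y = MSTNSketch(n_vars=7)(torch.randn(2, 96, 7))  # -> (2, 96, 7)
```

The gate lets the network suppress scales that are uninformative for a given input, which is one plausible reading of "dynamic, context-aware feature integration".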
Related papers
- MEMTS: Internalizing Domain Knowledge via Parameterized Memory for Retrieval-Free Domain Adaptation of Time Series Foundation Models [51.506429027626005]
Memory for Time Series (MEMTS) is a lightweight and plug-and-play method for retrieval-free domain adaptation in time series forecasting. The key component of MEMTS is a Knowledge Persistence Module (KPM), which internalizes domain-specific temporal dynamics. This paradigm shift enables MEMTS to achieve accurate domain adaptation with constant-time inference and near-zero latency.
arXiv Detail & Related papers (2026-02-14T14:00:06Z)
- DiTS: Multimodal Diffusion Transformers Are Time Series Forecasters [50.43534351968113]
Existing generative time series models do not address the multi-dimensional properties of time series data well. Inspired by Multimodal Diffusion Transformers that integrate textual guidance into video generation, we propose Diffusion Transformers for Time Series (DiTS).
arXiv Detail & Related papers (2026-02-06T10:48:13Z)
- UniDiff: A Unified Diffusion Framework for Multimodal Time Series Forecasting [90.47915032778366]
We propose UniDiff, a unified diffusion framework for multimodal time series forecasting. At its core lies a unified and parallel fusion module, where a single cross-attention mechanism integrates structural information from timestamps and semantic context from texts. Experiments on real-world benchmark datasets across eight domains demonstrate that the proposed UniDiff model achieves state-of-the-art performance.
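The summary names a single cross-attention fusion but gives no details; below is a minimal sketch of one plausible form, in which series tokens attend once over concatenated timestamp and text embeddings. The class name, shapes, and residual/norm placement are assumptions, not the UniDiff implementation.

```python
# Hypothetical sketch of a single cross-attention fusion step in the spirit
# of UniDiff's description (not the authors' code): series tokens attend to
# a shared sequence of timestamp and text embeddings in one pass.
import torch
import torch.nn as nn

class UnifiedFusion(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, series, timestamps, text):
        # series: (B, T, d), timestamps: (B, T, d), text: (B, L, d)
        context = torch.cat([timestamps, text], dim=1)   # one joint context
        fused, _ = self.attn(series, context, context)   # single cross-attention
        return self.norm(series + fused)                 # residual + norm

# Usage:
# f = UnifiedFusion()
# out = f(torch.randn(2, 96, 64), torch.randn(2, 96, 64), torch.randn(2, 8, 64))
```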
arXiv Detail & Related papers (2025-12-08T05:36:14Z)
- Multi-modal Spatio-Temporal Transformer for High-resolution Land Subsidence Prediction [3.3295066998131637]
We propose a novel framework that fuses dynamic displacement data with static physical priors. On the public EGMS dataset, MM-STT establishes a new state-of-the-art, reducing the long-range forecast RMSE by an order of magnitude.
arXiv Detail & Related papers (2025-09-29T18:49:04Z)
- Generative Models for Long Time Series: Approximately Equivariant Recurrent Network Structures for an Adjusted Training Scheme [4.327763441385371]
We present a simple yet effective generative model for time series data based on a Variational Autoencoder (VAE) with recurrent layers. Our method introduces an adapted training scheme that progressively increases the sequence length. By leveraging the recurrent architecture, the model maintains a constant number of parameters regardless of sequence length.
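The blurb outlines the training scheme without code; here is a minimal sketch of a progressive sequence-length curriculum, with a plain GRU autoencoder standing in for the paper's VAE. The schedule, model, and epoch counts are assumptions for illustration only.

```python
# Hypothetical sketch of a progressive sequence-length curriculum for a
# recurrent autoencoder (not the paper's code). The recurrent model has a
# fixed parameter count, so the same network trains on ever-longer windows.
import torch
import torch.nn as nn

class RecurrentAE(nn.Module):
    def __init__(self, n_vars: int = 1, hidden: int = 32):
        super().__init__()
        self.enc = nn.GRU(n_vars, hidden, batch_first=True)
        self.dec = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_vars)

    def forward(self, x):                      # x: (B, T, n_vars)
        _, h = self.enc(x)                     # summary state: (1, B, hidden)
        z = h.transpose(0, 1).expand(-1, x.size(1), -1)  # broadcast over T
        y, _ = self.dec(z)
        return self.out(y)                     # reconstruction: (B, T, n_vars)

model = RecurrentAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
full = torch.randn(16, 256, 1)                 # toy data: (B, T_max, n_vars)

# Curriculum: start with short windows, double the length each stage.
for seq_len in (32, 64, 128, 256):
    for _ in range(100):                       # epochs per stage (assumed)
        x = full[:, :seq_len]                  # truncate to current length
        loss = nn.functional.mse_loss(model(x), x)
        opt.zero_grad()
        loss.backward()
        opt.step()
```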
arXiv Detail & Related papers (2025-05-08T07:52:37Z)
- UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines [64.84631333071728]
We introduce UniSTD, a unified Transformer-based framework for spatio-temporal modeling. Our work demonstrates that a task-specific vision-text model can build a generalizable model for spatio-temporal learning. We also introduce a temporal module to incorporate temporal dynamics explicitly.
arXiv Detail & Related papers (2025-03-26T17:33:23Z)
- Higher-order Cross-structural Embedding Model for Time Series Analysis [12.35149125898563]
Time series analysis has gained significant attention due to its critical applications in diverse fields such as healthcare, finance, and sensor networks.
Current approaches struggle to model higher-order interactions within time series, and focus on learning temporal or spatial dependencies separately.
We propose Higher-order Cross-structural Embedding Model for Time Series (High-TS), a novel framework that jointly models both temporal and spatial perspectives.
arXiv Detail & Related papers (2024-10-30T12:51:14Z)
- TiVaT: A Transformer with a Single Unified Mechanism for Capturing Asynchronous Dependencies in Multivariate Time Series Forecasting [4.733959271565453]
TiVaT is a novel architecture incorporating a single unified module, a Joint-Axis (JA) attention module. The JA attention module dynamically selects relevant features, capturing asynchronous interactions in particular. Extensive experiments demonstrate TiVaT's strong overall performance across diverse datasets.
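The summary does not specify how Joint-Axis attention is computed; one plausible reading, sketched below under that assumption, is standard self-attention over a flattened (time x variate) token axis, so any variate at any lag can attend to any other. Nothing here is taken from the TiVaT paper beyond the name.

```python
# Hypothetical sketch of joint-axis attention in the spirit of TiVaT's
# description (not the authors' code): variate and time are flattened into
# one axis so attention can link any variate at any lag to any other.
import torch
import torch.nn as nn

class JointAxisAttention(nn.Module):
    def __init__(self, d_model: int = 32, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):                      # x: (B, T, V, d_model)
        B, T, V, d = x.shape
        tokens = x.reshape(B, T * V, d)        # joint (time x variate) axis
        out, _ = self.attn(tokens, tokens, tokens)
        return out.reshape(B, T, V, d)

# Usage: JointAxisAttention()(torch.randn(2, 24, 7, 32)).shape  # (2, 24, 7, 32)
```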
arXiv Detail & Related papers (2024-10-02T13:24:24Z)
- Disentangling Structured Components: Towards Adaptive, Interpretable and Scalable Time Series Forecasting [52.47493322446537]
We develop an adaptive, interpretable and scalable forecasting framework, which seeks to individually model each component of the spatial-temporal patterns.
SCNN works with a pre-defined generative process of MTS, which arithmetically characterizes the latent structure of the spatial-temporal patterns.
Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets.
arXiv Detail & Related papers (2023-05-22T13:39:44Z)
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving classification capacity on multivariate time series classification tasks.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strength of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding [60.292702363839716]
Current SOTA backbone networks for speaker embedding aggregate multi-scale features from an utterance using multi-branch network architectures.
We propose an effective temporal multi-scale (TMS) model in which multi-scale branches can be designed efficiently within a speaker embedding network, almost without increasing computational costs.
arXiv Detail & Related papers (2022-03-17T05:49:35Z)