Time Tracker: Mixture-of-Experts-Enhanced Foundation Time Series Forecasting Model with Decoupled Training Pipelines
- URL: http://arxiv.org/abs/2505.15151v1
- Date: Wed, 21 May 2025 06:18:41 GMT
- Title: Time Tracker: Mixture-of-Experts-Enhanced Foundation Time Series Forecasting Model with Decoupled Training Pipelines
- Authors: Xiaohou Shi, Ke Li, Aobo Liang, Yan Sun
- Abstract summary: Time series often exhibit significant diversity in their temporal patterns across different time spans and domains. Time Tracker achieves state-of-the-art performance in prediction accuracy, model generalization and adaptability.
- Score: 5.543238821368548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the past few years, time series foundation models have achieved superior prediction accuracy. However, real-world time series often exhibit significant diversity in their temporal patterns across different time spans and domains, making it challenging for a single model architecture to fit all complex scenarios. In addition, time series data may have multiple variables that exhibit complex correlations with each other. Recent mainstream works have focused on modeling time series in a channel-independent manner in both the pretraining and finetuning stages, overlooking valuable inter-series dependencies. To this end, we propose \textbf{Time Tracker} for better predictions on multivariate time series data. First, we leverage sparse mixture of experts (MoE) within Transformers to handle the modeling of diverse time series patterns, thereby alleviating the learning difficulties of a single model while improving its generalization. In addition, we propose Any-variate Attention, enabling a unified model structure to seamlessly handle both univariate and multivariate time series, thereby supporting channel-independent modeling during pretraining and channel-mixed modeling for finetuning. Furthermore, we design a graph learning module that constructs relations among sequences from frequency-domain features, providing more precise guidance for capturing inter-series dependencies in channel-mixed modeling. Based on these advancements, Time Tracker achieves state-of-the-art performance in prediction accuracy, model generalization and adaptability.
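To make the abstract's mechanisms concrete, here is a minimal PyTorch sketch of a sparse MoE feed-forward layer of the kind described: a router sends each token to a small number of experts so that different experts can specialize in different temporal patterns. The expert count, top-k routing, and all names are illustrative assumptions, not Time Tracker's published implementation.

```python
import torch
import torch.nn as nn

class SparseMoEFFN(nn.Module):
    """Transformer FFN replaced by top-k expert routing (illustrative sketch)."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # token -> expert logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, d_model)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)  # k experts per token
        weights = weights.softmax(dim=-1)                       # normalize gate weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            hit = idx == e                     # (batch, tokens, k): routed to expert e?
            tok = hit.any(dim=-1)              # (batch, tokens): token uses expert e at all
            if tok.any():
                w = (weights * hit).sum(dim=-1)[tok]            # gate weight per routed token
                out[tok] += w.unsqueeze(-1) * expert(x[tok])    # weighted expert output
        return out
```

The graph learning module can be sketched in the same hedged spirit: each variate is summarized by the low-frequency amplitudes of its spectrum, and pairwise similarity of those features yields an inter-series adjacency for channel-mixed modeling. The similarity measure and normalization below are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def frequency_graph(x: torch.Tensor, keep: int = 16) -> torch.Tensor:
    """x: (batch, n_vars, seq_len) -> row-normalized adjacency (batch, n_vars, n_vars)."""
    spec = torch.fft.rfft(x, dim=-1).abs()[..., :keep]  # low-frequency amplitudes
    feat = F.normalize(spec, dim=-1)                    # unit-norm spectral signatures
    adj = feat @ feat.transpose(-1, -2)                 # cosine similarity between variates
    return adj.softmax(dim=-1)                          # edge weights sum to 1 per row
```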
Related papers
- Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting [64.45587649141842]
Time-series forecasting plays a critical role in many real-world applications. No single model consistently outperforms others across different test samples; instead, each model excels in specific cases. We introduce TimeFuse, a framework for collective time-series forecasting with sample-level adaptive fusion of heterogeneous models (a minimal sketch of such fusion follows this entry).
arXiv Detail & Related papers (2025-05-24T00:45:07Z)
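Below is a hedged sketch of the sample-level fusion idea: a small gate reads a summary of the input window and produces per-sample weights over the base models' forecasts. The dimensions and names are illustrative assumptions, not TimeFuse's actual design.

```python
import torch
import torch.nn as nn

class SampleLevelFusion(nn.Module):
    """Weighs heterogeneous forecasters per input sample (illustrative sketch)."""
    def __init__(self, feat_dim: int, n_models: int, hidden: int = 64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_models)
        )

    def forward(self, window_feat: torch.Tensor, preds: torch.Tensor) -> torch.Tensor:
        # window_feat: (batch, feat_dim) summary features of the input window
        # preds:       (batch, n_models, horizon) forecasts from the base models
        w = self.gate(window_feat).softmax(dim=-1)      # per-sample model weights
        return (w.unsqueeze(-1) * preds).sum(dim=1)     # fused forecast: (batch, horizon)
```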
- Generalized Prompt Tuning: Adapting Frozen Univariate Time Series Foundation Models for Multivariate Healthcare Time Series [3.9599054392856483]
Time series foundation models are pre-trained on large datasets and are able to achieve state-of-the-art performance in diverse tasks.
We propose a prompt-tuning-inspired fine-tuning technique, Gen-P-Tuning, that enables us to adapt an existing univariate time series foundation model to multivariate time series.
We demonstrate the effectiveness of our fine-tuning approach against various baselines on two MIMIC classification tasks and on influenza-like illness forecasting (a hedged sketch of the prompt-tuning idea follows this entry).
arXiv Detail & Related papers (2024-11-19T19:20:58Z)
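A hedged sketch of the idea: trainable prompt tokens, generated from a cross-channel summary, are prepended to each channel's sequence before it passes through the frozen univariate backbone. The backbone's call signature and all dimensions are assumptions for illustration, not Gen-P-Tuning's actual design.

```python
import torch
import torch.nn as nn

class PromptAdapter(nn.Module):
    """Trainable prompts around a frozen univariate backbone (illustrative sketch)."""
    def __init__(self, backbone: nn.Module, n_vars: int, d_model: int, n_prompts: int = 4):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad_(False)                      # backbone stays frozen
        self.prompt_gen = nn.Linear(n_vars * d_model, n_prompts * d_model)
        self.n_prompts = n_prompts

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, n_vars, n_patches, d_model) per-channel patch embeddings
        b, v, n, d = tokens.shape
        summary = tokens.mean(dim=2).reshape(b, v * d)   # cross-channel summary
        prompts = self.prompt_gen(summary).reshape(b, self.n_prompts, d)
        outs = []
        for i in range(v):                               # backbone runs channel by channel
            seq = torch.cat([prompts, tokens[:, i]], dim=1)
            outs.append(self.backbone(seq))  # assumed: (batch, seq, d) -> (batch, seq, d)
        return torch.stack(outs, dim=1)
```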
- DisenTS: Disentangled Channel Evolving Pattern Modeling for Multivariate Time Series Forecasting [43.071713191702486]
DisenTS is a tailored framework for modeling disentangled channel evolving patterns in general time series forecasting.
We introduce a novel Forecaster Aware Gate (FAG) module that generates routing signals adaptively according to both the forecasters' states and the input series' characteristics (a rough sketch follows this entry).
arXiv Detail & Related papers (2024-10-30T12:46:14Z)
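As a rough illustration, the gate below conditions routing weights on both a pooled summary of the forecasters' states and features of the input series; the pooling and concatenation scheme are assumptions, not DisenTS's actual FAG.

```python
import torch
import torch.nn as nn

class ForecasterAwareGate(nn.Module):
    """Routing weights from input features plus forecaster states (illustrative sketch)."""
    def __init__(self, series_dim: int, state_dim: int, n_forecasters: int):
        super().__init__()
        self.proj = nn.Linear(series_dim + state_dim, n_forecasters)

    def forward(self, series_feat: torch.Tensor, states: torch.Tensor) -> torch.Tensor:
        # series_feat: (batch, series_dim); states: (batch, n_forecasters, state_dim)
        pooled = states.mean(dim=1)                      # summarize forecaster states
        logits = self.proj(torch.cat([series_feat, pooled], dim=-1))
        return logits.softmax(dim=-1)                    # routing weights: (batch, n_forecasters)
```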
- Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts [103.725112190618]
This paper introduces Moirai-MoE, which uses a single input/output projection layer while delegating the modeling of diverse time series patterns to a sparse mixture of experts.
Extensive experiments on 39 datasets demonstrate the superiority of Moirai-MoE over existing foundation models in both in-distribution and zero-shot scenarios.
arXiv Detail & Related papers (2024-10-14T13:01:11Z)
- UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting [98.12558945781693]
We propose UniTST, a transformer-based model with a unified attention mechanism over flattened patch tokens (sketched after this entry).
Although our proposed model employs a simple architecture, it offers compelling performance as shown in our experiments on several datasets for time series forecasting.
arXiv Detail & Related papers (2024-06-07T14:39:28Z)
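The flattened-token idea can be shown compactly: the variate and patch axes are merged so a single attention pass covers both inter-series and intra-series dependencies. Shapes and module choices are assumptions for illustration, not UniTST's exact architecture.

```python
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)

def unified_attention(tokens: torch.Tensor) -> torch.Tensor:
    # tokens: (batch, n_vars, n_patches, 64) patch embeddings per variate
    b, v, n, d = tokens.shape
    flat = tokens.reshape(b, v * n, d)   # merge variate and patch axes into one token axis
    out, _ = attn(flat, flat, flat)      # one attention pass sees all variates and patches
    return out.reshape(b, v, n, d)
```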
- Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai).
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models (a generic masked-prediction sketch follows this entry).
arXiv Detail & Related papers (2024-02-04T20:00:45Z)
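As a generic illustration of masked-prediction pretraining for forecasting (not Moirai's exact recipe), the step below hides the trailing patches and trains the model to reconstruct them from the visible context:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def masked_forecast_step(encoder: nn.Module, head: nn.Module,
                         patches: torch.Tensor, mask_len: int = 8) -> torch.Tensor:
    # patches: (batch, n_patches, d_model) embedded input patches
    target = patches[:, -mask_len:]            # the "future" patches to predict
    masked = patches.clone()
    masked[:, -mask_len:] = 0.0                # replace them with a zero placeholder
    hidden = encoder(masked)                   # encode the full (partially masked) context
    pred = head(hidden[:, -mask_len:])         # predict the masked patches
    return F.mse_loss(pred, target)            # reconstruction loss mimics forecasting
```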
- Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z)
- TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series [57.4208255711412]
Building on copula theory, we propose a simplified objective for the recently introduced transformer-based attentional copulas (TACTiS).
We show that the resulting model has significantly better training dynamics and achieves state-of-the-art performance across diverse real-world forecasting tasks.
arXiv Detail & Related papers (2023-10-02T16:45:19Z)
- Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow (MANF).
Our model avoids the influence of cumulative error and does not increase the time complexity.
Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z)
- Multivariate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows [8.859284959951204]
Time series forecasting is fundamental to scientific and engineering problems.
Deep learning methods are well suited for this problem.
We show that the proposed conditioned normalizing flow improves over the state-of-the-art for standard metrics on many real-world data sets.
arXiv Detail & Related papers (2020-02-14T16:16:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.