Related papers: AdaMixT: Adaptive Weighted Mixture of Multi-Scale Expert Transformers for Time Series Forecasting

URL: http://arxiv.org/abs/2509.18107v1
Date: Tue, 09 Sep 2025 15:30:53 GMT
Title: AdaMixT: Adaptive Weighted Mixture of Multi-Scale Expert Transformers for Time Series Forecasting
Authors: Huanyao Zhang, Jiaye Lin, Wentao Zhang, Haitao Yuan, Guoliang Li
Abstract summary: We propose a novel architecture named Adaptive Weighted Mixture of Multi-Scale Expert Transformers (AdaMixT). AdaMixT introduces various patches and leverages both General Pre-trained Models (GPM) and Domain-specific Models (DSM) for multi-scale feature extraction. Comprehensive experiments on eight widely used benchmarks, including Weather, Traffic, Electricity, ILI, and four ETT datasets, consistently demonstrate the effectiveness of AdaMixT.
Score: 15.522567372502762
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multivariate time series forecasting involves predicting future values based on historical observations. However, existing approaches primarily rely on predefined single-scale patches or lack effective mechanisms for multi-scale feature fusion. These limitations hinder them from fully capturing the complex patterns inherent in time series, leading to constrained performance and insufficient generalizability. To address these challenges, we propose a novel architecture named Adaptive Weighted Mixture of Multi-Scale Expert Transformers (AdaMixT). Specifically, AdaMixT introduces various patches and leverages both General Pre-trained Models (GPM) and Domain-specific Models (DSM) for multi-scale feature extraction. To accommodate the heterogeneity of temporal features, AdaMixT incorporates a gating network that dynamically allocates weights among different experts, enabling more accurate predictions through adaptive multi-scale fusion. Comprehensive experiments on eight widely used benchmarks, including Weather, Traffic, Electricity, ILI, and four ETT datasets, consistently demonstrate the effectiveness of AdaMixT in real-world scenarios.
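
The gated multi-scale fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the GPM and DSM experts are replaced by toy forecasters that repeat the mean of the most recent patch, and the gate is a hypothetical linear layer over two summary features followed by a softmax. All function names and parameters below are assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[Sequence[float]], List[float]]


def make_patches(series: Sequence[float], patch_len: int) -> List[List[float]]:
    """Split a series into non-overlapping patches aligned to the most recent
    value; leftover values at the start of the series are dropped."""
    if patch_len <= 0 or patch_len > len(series):
        raise ValueError("patch_len must be between 1 and len(series)")
    start = len(series) % patch_len
    return [list(series[i:i + patch_len])
            for i in range(start, len(series), patch_len)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def patch_mean_expert(patch_len: int, horizon: int) -> Expert:
    """Hypothetical stand-in for one scale-specific expert: it forecasts the
    mean of the latest patch, repeated over the horizon."""
    def expert(series: Sequence[float]) -> List[float]:
        last = make_patches(series, patch_len)[-1]
        level = sum(last) / len(last)
        return [level] * horizon
    return expert


def gated_forecast(
    series: Sequence[float],
    experts: Sequence[Expert],
    gate_weights: Sequence[Sequence[float]],
    gate_bias: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Fuse expert forecasts with input-dependent softmax weights.

    The gate features (series mean and last value) are an assumption chosen
    to keep the example small; a real gate would use learned representations.
    """
    features = [sum(series) / len(series), series[-1]]
    scores = [sum(w * f for w, f in zip(ws, features)) + b
              for ws, b in zip(gate_weights, gate_bias)]
    weights = softmax(scores)
    forecasts = [expert(series) for expert in experts]
    horizon = len(forecasts[0])
    fused = [sum(w * f[t] for w, f in zip(weights, forecasts))
             for t in range(horizon)]
    return fused, weights


# Two experts working at patch lengths 2 and 4, forecasting 3 steps ahead.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
experts = [patch_mean_expert(2, 3), patch_mean_expert(4, 3)]
fused, weights = gated_forecast(
    series, experts, gate_weights=[[0.0, 0.0], [0.0, 0.0]], gate_bias=[0.0, 0.0]
)
```

With all gate parameters set to zero, the softmax yields uniform weights, so the fused forecast is the plain average of the two experts. Non-zero gate parameters shift weight toward one scale depending on the input.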

Related papers

SDMixer: Sparse Dual-Mixer for Time Series Forecasting [8.124083509364981]
This paper proposes a dual-stream sparse prediction framework that extracts global trends and local dynamic features from sequences in both the frequency and time domains. It employs a sparsity mechanism to filter out invalid information, thereby enhancing the accuracy of cross-variable dependency modeling.
arXiv Detail & Related papers (2026-02-27T01:13:56Z)
FusAD: Time-Frequency Fusion with Adaptive Denoising for General Time Series Analysis [92.23551599659186]
Time series analysis plays a vital role in fields such as finance, healthcare, industry, and meteorology. FusAD is a unified analysis framework designed for diverse time series tasks.
arXiv Detail & Related papers (2025-12-16T04:34:27Z)
Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting [64.45587649141842]
Time-series forecasting plays a critical role in many real-world applications. No single model consistently outperforms others across different test samples; instead, each model excels in specific cases. We introduce TimeFuse, a framework for collective time-series forecasting with sample-level adaptive fusion of heterogeneous models.
arXiv Detail & Related papers (2025-05-24T00:45:07Z)
A Multi-scale Representation Learning Framework for Long-Term Time Series Forecasting [6.344911113059126]
Long-term time series forecasting (LTSF) offers broad utility in practical settings like energy consumption and weather prediction. This work confronts key issues in LTSF, including the suboptimal use of multi-granularity information. Our method adeptly disentangles complex temporal dynamics using clear, concurrent predictions across various scales.
arXiv Detail & Related papers (2025-05-13T03:26:44Z)
Mixing It Up: Exploring Mixer Networks for Irregular Multivariate Time Series Forecasting [9.642976236410833]
We introduce IMTS-Mixer, a novel forecasting architecture designed specifically for IMTS. Our approach retains the core principles of TS mixer models while introducing innovative methods to transform IMTS into fixed-size matrix representations. Our results demonstrate that IMTS-Mixer establishes a new state-of-the-art in forecasting accuracy while also improving computational efficiency.
arXiv Detail & Related papers (2025-02-17T14:06:36Z)
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories [20.773694998061707]
Time series data is prevalent across numerous fields, necessitating the development of robust and accurate forecasting models. We introduce xLSTM-Mixer, a model designed to effectively integrate temporal sequences, joint time-variable information, and multiple perspectives for robust forecasting. Our evaluations demonstrate xLSTM-Mixer's superior long-term forecasting performance compared to recent state-of-the-art methods.
arXiv Detail & Related papers (2024-10-22T11:59:36Z)
UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting [98.12558945781693]
We propose a transformer-based model UniTST containing a unified attention mechanism on the flattened patch tokens. Although our proposed model employs a simple architecture, it offers compelling performance as shown in our experiments on several datasets for time series forecasting.
arXiv Detail & Related papers (2024-06-07T14:39:28Z)
MGCP: A Multi-Grained Correlation based Prediction Network for Multivariate Time Series [54.91026286579748]
We propose a Multi-Grained Correlations-based Prediction Network. It simultaneously considers correlations at three levels to enhance prediction performance. It employs adversarial training with an attention mechanism-based predictor and conditional discriminator to optimize prediction results at coarse-grained level.
arXiv Detail & Related papers (2024-05-30T03:32:44Z)
TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting [19.88184356154215]
Time series forecasting is widely used in applications, such as traffic planning and weather forecasting. TimeMixer is able to achieve consistent state-of-the-art performances in both long-term and short-term forecasting tasks.
arXiv Detail & Related papers (2024-05-23T14:27:07Z)
Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai). Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains. Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z)
A Multi-Scale Decomposition MLP-Mixer for Time Series Analysis [14.40202378972828]
We propose MSD-Mixer, a Multi-Scale Decomposition-Mixer, which learns to explicitly decompose and represent the input time series in its different layers. We demonstrate that MSD-Mixer consistently and significantly outperforms other state-of-the-art algorithms with better efficiency.
arXiv Detail & Related papers (2023-10-18T13:39:07Z)
Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow (MANF). Our model avoids the influence of cumulative error and does not increase the time complexity. Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.