M$^2$FMoE: Multi-Resolution Multi-View Frequency Mixture-of-Experts for Extreme-Adaptive Time Series Forecasting
- URL: http://arxiv.org/abs/2601.08631v1
- Date: Tue, 13 Jan 2026 15:05:49 GMT
- Title: M$^2$FMoE: Multi-Resolution Multi-View Frequency Mixture-of-Experts for Extreme-Adaptive Time Series Forecasting
- Authors: Yaohui Huang, Runmin Zou, Yun Wang, Laeeq Aslam, Ruipeng Dong,
- Abstract summary: M$2$FMoE learns both regular and extreme patterns through multi-resolution and multi-view frequency modeling.<n>Experiments on real-world hydrological datasets with extreme patterns demonstrate that M$2$FMoE outperforms state-of-the-art baselines.
- Score: 6.89383702832204
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Forecasting time series with extreme events is critical yet challenging due to their high variance, irregular dynamics, and sparse but high-impact nature. While existing methods excel in modeling dominant regular patterns, their performance degrades significantly during extreme events, constituting the primary source of forecasting errors in real-world applications. Although some approaches incorporate auxiliary signals to improve performance, they still fail to capture extreme events' complex temporal dynamics. To address these limitations, we propose M$^2$FMoE, an extreme-adaptive forecasting model that learns both regular and extreme patterns through multi-resolution and multi-view frequency modeling. It comprises three modules: (1) a multi-view frequency mixture-of-experts module assigns experts to distinct spectral bands in Fourier and Wavelet domains, with cross-view shared band splitter aligning frequency partitions and enabling inter-expert collaboration to capture both dominant and rare fluctuations; (2) a multi-resolution adaptive fusion module that hierarchically aggregates frequency features from coarse to fine resolutions, enhancing sensitivity to both short-term variations and sudden changes; (3) a temporal gating integration module that dynamically balances long-term trends and short-term frequency-aware features, improving adaptability to both regular and extreme temporal patterns. Experiments on real-world hydrological datasets with extreme patterns demonstrate that M$^2$FMoE outperforms state-of-the-art baselines without requiring extreme-event labels.
Related papers
- Dualformer: Time-Frequency Dual Domain Learning for Long-term Time Series Forecasting [5.4806374384787695]
Transformer-based models suffer from an inherent low-pass filtering effect that limits their effectiveness.<n>We propose Dualformer, a principled dual-domain framework that rethinks frequency modeling from a layer-wise perspective.<n>Tests conducted on eight widely used benchmarks demonstrate Dualformer's robustness and superior performance.
arXiv Detail & Related papers (2026-01-22T05:51:56Z) - Improving Day-Ahead Grid Carbon Intensity Forecasting by Joint Modeling of Local-Temporal and Cross-Variable Dependencies Across Different Frequencies [3.1953619915392246]
Accurate forecasting of the grid carbon intensity factor (CIF) is critical for enabling demand-side management and reducing emissions in modern electricity systems.<n>Despite advances in deep learning-based methods, it remains challenging to capture the fine-grained local-temporal dependencies.<n>We propose a novel model that integrates two parallel modules.
arXiv Detail & Related papers (2026-01-10T11:20:55Z) - FusAD: Time-Frequency Fusion with Adaptive Denoising for General Time Series Analysis [92.23551599659186]
Time series analysis plays a vital role in fields such as finance, healthcare, industry, and meteorology.<n>FusAD is a unified analysis framework designed for diverse time series tasks.
arXiv Detail & Related papers (2025-12-16T04:34:27Z) - WaveTuner: Comprehensive Wavelet Subband Tuning for Time Series Forecasting [22.28753360953834]
We propose a Wavelet decomposition framework empowered by full-spectrum subband Tuning for time series forecasting.<n>WaveTuner comprises two key modules: (i) Adaptive Wavelet Refinement module, that transforms time series into time-frequency coefficients, and (ii) Multi-Branch module, that employs multiple functional branches, each instantiated as a flexible Kolmogorov-Arnold Network (KAN) with a distinct functional order to model a specific spectral subband.
arXiv Detail & Related papers (2025-11-24T07:33:35Z) - A Unified Frequency Domain Decomposition Framework for Interpretable and Robust Time Series Forecasting [81.73338008264115]
Current approaches for time series forecasting, whether in the time or frequency domain, predominantly use deep learning models based on linear layers or transformers.<n>We propose FIRE, a unified frequency domain decomposition framework that provides a mathematical abstraction for diverse types of time series.<n>Fire consistently outperforms state-of-the-art models on long-term forecasting benchmarks.
arXiv Detail & Related papers (2025-10-11T09:59:25Z) - MFRS: A Multi-Frequency Reference Series Approach to Scalable and Accurate Time-Series Forecasting [51.94256702463408]
Time series predictability is derived from periodic characteristics at different frequencies.<n>We propose a novel time series forecasting method based on multi-frequency reference series correlation analysis.<n> Experiments on major open and synthetic datasets show state-of-the-art performance.
arXiv Detail & Related papers (2025-03-11T11:40:14Z) - Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts [103.725112190618]
This paper introduces Moirai-MoE, using a single input/output projection layer while delegating the modeling of diverse time series patterns to the sparse mixture of experts.
Extensive experiments on 39 datasets demonstrate the superiority of Moirai-MoE over existing foundation models in both in-distribution and zero-shot scenarios.
arXiv Detail & Related papers (2024-10-14T13:01:11Z) - Learning Pattern-Specific Experts for Time Series Forecasting Under Patch-level Distribution Shift [51.01356105618118]
Time series often exhibit complex non-uniform distribution with varying patterns across segments, such as season, operating condition, or semantic meaning.<n>Existing approaches, which typically train a single model to capture all these diverse patterns, often struggle with the pattern drifts between patches.<n>We propose TFPS, a novel architecture that leverages pattern-specific experts for more accurate and adaptable time series forecasting.
arXiv Detail & Related papers (2024-10-13T13:35:29Z) - MMFNet: Multi-Scale Frequency Masking Neural Network for Multivariate Time Series Forecasting [6.733646592789575]
Long-term Time Series Forecasting (LTSF) is critical for numerous real-world applications, such as electricity consumption planning, financial forecasting, and disease propagation analysis.
We introduce MMFNet, a novel model designed to enhance long-term multivariate forecasting by leveraging a multi-scale masked frequency decomposition approach.
MMFNet captures fine, intermediate, and coarse-grained temporal patterns by converting time series into frequency segments at varying scales while employing a learnable mask to filter out irrelevant components adaptively.
arXiv Detail & Related papers (2024-10-02T22:38:20Z) - FAITH: Frequency-domain Attention In Two Horizons for Time Series Forecasting [13.253624747448935]
Time Series Forecasting plays a crucial role in various fields such as industrial equipment maintenance, meteorology, energy consumption, traffic flow and financial investment.
Current deep learning-based predictive models often exhibit a significant deviation between their forecasting outcomes and the ground truth.
We propose a novel model Frequency-domain Attention In Two Horizons, which decomposes time series into trend and seasonal components.
arXiv Detail & Related papers (2024-05-22T02:37:02Z) - Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow(MANF)
Our model avoids the influence of cumulative error and does not increase the time complexity.
Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.