Mixture-of-Linear-Experts for Long-term Time Series Forecasting
- URL: http://arxiv.org/abs/2312.06786v3
- Date: Wed, 1 May 2024 22:23:58 GMT
- Title: Mixture-of-Linear-Experts for Long-term Time Series Forecasting
- Authors: Ronghao Ni, Zinan Lin, Shuaiqi Wang, Giulia Fanti
- Abstract summary: We propose a Mixture-of-Experts-style augmentation for linear-centric models.
Instead of training a single model, MoLE trains multiple linear-centric models and a router model that weighs and mixes their outputs.
- Score: 13.818468255379969
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-term time series forecasting (LTSF) aims to predict future values of a time series given the past values. The current state-of-the-art (SOTA) on this problem is attained in some cases by linear-centric models, which primarily feature a linear mapping layer. However, due to their inherent simplicity, they are not able to adapt their prediction rules to periodic changes in time series patterns. To address this challenge, we propose a Mixture-of-Experts-style augmentation for linear-centric models, called Mixture-of-Linear-Experts (MoLE). Instead of training a single model, MoLE trains multiple linear-centric models (i.e., experts) and a router model that weighs and mixes their outputs. While the entire framework is trained end-to-end, each expert learns to specialize in a specific temporal pattern, and the router model learns to compose the experts adaptively. Experiments show that MoLE reduces the forecasting error of linear-centric models, including DLinear, RLinear, and RMLP, in over 78% of the datasets and settings we evaluated. By using MoLE, existing linear-centric models can achieve SOTA LTSF results in 68% of the experiments that PatchTST reports and we compare to, whereas existing single-head linear-centric models achieve SOTA results in only 25% of cases.
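The mixing step described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `MixtureOfLinearExperts`, the single-channel input, the plain linear experts, and the linear-plus-softmax router are all hypothetical choices, and the abstract does not say what the router takes as input, so a generic feature vector `r` is assumed here. Only the forward pass is shown; end-to-end training is omitted.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


class MixtureOfLinearExperts:
    """Hypothetical forward-pass sketch: K linear experts mixed by a router."""

    def __init__(self, lookback, horizon, n_experts, router_dim, seed=0):
        rng = np.random.default_rng(seed)
        # One linear map (lookback -> horizon) per expert.
        self.W = rng.normal(scale=0.1, size=(n_experts, lookback, horizon))
        self.b = np.zeros((n_experts, horizon))
        # Assumed router: a single linear layer followed by a softmax.
        self.R = rng.normal(scale=0.1, size=(router_dim, n_experts))

    def forward(self, x, r):
        # x: (batch, lookback) past values; r: (batch, router_dim) router features.
        expert_out = np.einsum("bl,klh->bkh", x, self.W) + self.b  # (batch, K, horizon)
        weights = softmax(r @ self.R, axis=-1)                      # (batch, K)
        # The forecast is the router-weighted sum of the expert forecasts.
        y = np.einsum("bk,bkh->bh", weights, expert_out)            # (batch, horizon)
        return y, weights


# Usage with random data: 4 windows of 96 past steps, 24-step forecasts, 3 experts.
rng = np.random.default_rng(1)
model = MixtureOfLinearExperts(lookback=96, horizon=24, n_experts=3, router_dim=4)
x = rng.normal(size=(4, 96))
r = rng.normal(size=(4, 4))
y, weights = model.forward(x, r)
```

Because the router weights are non-negative and sum to one for each sample, every forecast is a convex combination of the expert forecasts. This lets gradient-based training push different experts toward different temporal patterns while the router learns when to favor each one.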
Related papers
- UrbanAI 2025 Challenge: Linear vs Transformer Models for Long-Horizon Exogenous Temperature Forecasting [0.0]
We study long-horizon-only temperature forecasting using linear and Transformer-family models. Results show that linear baselines consistently outperform more complex Transformer-family architectures.
arXiv Detail & Related papers (2025-12-11T17:59:44Z) - Efficiently Generating Correlated Sample Paths from Multi-step Time Series Foundation Models [66.60042743462175]
We present a copula-based approach to efficiently generate accurate, correlated sample paths from time series foundation models. Our approach generates correlated sample paths orders of magnitude faster than autoregressive sampling.
arXiv Detail & Related papers (2025-10-02T17:08:58Z) - CMoS: Rethinking Time Series Prediction Through the Lens of Chunk-wise Spatial Correlations [13.515201493037917]
We present CMoS, a super-lightweight time series forecasting model. CMoS directly models the spatial correlations between different time series chunks.
arXiv Detail & Related papers (2025-05-25T11:01:53Z) - Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting [64.45587649141842]
Time-series forecasting plays a critical role in many real-world applications. No single model consistently outperforms others across different test samples; instead, each model excels in specific cases. We introduce TimeFuse, a framework for collective time-series forecasting with sample-level adaptive fusion of heterogeneous models.
arXiv Detail & Related papers (2025-05-24T00:45:07Z) - Not-So-Optimal Transport Flows for 3D Point Cloud Generation [58.164908756416615]
Learning generative models of 3D point clouds is one of the fundamental problems in 3D generative learning.
In this paper, we analyze the recently proposed equivariant OT flows that learn permutation invariant generative models for point-based molecular data.
We show that our proposed model outperforms prior diffusion- and flow-based approaches on a wide range of unconditional generation and shape completion.
arXiv Detail & Related papers (2025-02-18T02:37:34Z) - LeMoLE: LLM-Enhanced Mixture of Linear Experts for Time Series Forecasting [9.132953776171808]
This paper introduces an LLM-enhanced mixture of linear experts for precise and efficient time series forecasting.
The use of a mixture of linear experts is efficient due to its simplicity, while the multimodal fusion mechanism adaptively combines multiple linear experts.
Our experimental results show that the proposed LeMoLE model presents lower prediction errors and higher computational efficiency than existing LLM models.
arXiv Detail & Related papers (2024-11-24T12:40:50Z) - Self-Supervised Learning for Time Series: A Review & Critique of FITS [0.0]
The recently proposed model FITS claims competitive performance with significantly reduced parameter counts.
By training a one-layer neural network in the complex frequency domain, we are able to replicate these results.
Our experiments reveal that FITS especially excels at capturing periodic and seasonal patterns, but struggles with trending, non-periodic, or random-resembling behavior.
arXiv Detail & Related papers (2024-10-23T23:03:09Z) - Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts [103.725112190618]
This paper introduces Moirai-MoE, using a single input/output projection layer while delegating the modeling of diverse time series patterns to the sparse mixture of experts.
Extensive experiments on 39 datasets demonstrate the superiority of Moirai-MoE over existing foundation models in both in-distribution and zero-shot scenarios.
arXiv Detail & Related papers (2024-10-14T13:01:11Z) - LTSM-Bundle: A Toolbox and Benchmark on Large Language Models for Time Series Forecasting [69.33802286580786]
We introduce LTSM-Bundle, a comprehensive toolbox, and benchmark for training LTSMs.
It modularized and benchmarked LTSMs from multiple dimensions, encompassing prompting strategies, tokenization approaches, base model selection, data quantity, and dataset diversity.
Empirical results demonstrate that this combination achieves superior zero-shot and few-shot performances compared to state-of-the-art LTSMs and traditional TSF methods.
arXiv Detail & Related papers (2024-06-20T07:09:19Z) - Predictive Modeling in the Reservoir Kernel Motif Space [0.9217021281095907]
This work proposes a time series prediction method based on the kernel view of linear reservoirs.
We provide a geometric interpretation of our approach shedding light on how our approach is related to the core reservoir models.
Empirical experiments then compare predictive performances of our suggested model with those of recent state-of-the-art transformer-based models.
arXiv Detail & Related papers (2024-05-11T16:12:25Z) - An Analysis of Linear Time Series Forecasting Models [0.0]
We show that several popular variants of linear models for time series forecasting are equivalent and functionally indistinguishable from standard, unconstrained linear regression.
We provide experimental evidence that the models under inspection learn nearly identical solutions, and finally demonstrate that the simpler closed form solutions are superior forecasters across 72% of test settings.
arXiv Detail & Related papers (2024-03-21T17:42:45Z) - Timer: Generative Pre-trained Transformers Are Large Time Series Models [83.03091523806668]
This paper aims at the early development of large time series models (LTSM).
During pre-training, we curate large-scale datasets with up to 1 billion time points.
To meet diverse application needs, we convert forecasting, imputation, and anomaly detection of time series into a unified generative task.
arXiv Detail & Related papers (2024-02-04T06:55:55Z) - OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling [65.93805881841119]
We propose Online ensembling Network (OneNet) to address the concept drifting problem.
OneNet reduces online forecasting error by more than 50% compared to the State-Of-The-Art (SOTA) method.
arXiv Detail & Related papers (2023-09-22T06:59:14Z) - Online Evolutionary Neural Architecture Search for Multivariate Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z) - Grasping Core Rules of Time Series through Pure Models [6.849905754473385]
PureTS is a network with three pure linear layers that achieved state-of-the-art in 80% of the long sequence prediction tasks.
We discuss the potential of pure linear layers in both phenomena and essence.
arXiv Detail & Related papers (2022-08-15T10:22:15Z) - Learning Mixtures of Linear Dynamical Systems [94.49754087817931]
We develop a two-stage meta-algorithm to efficiently recover each ground-truth LDS model up to error $\tilde{O}(\sqrt{d/T})$.
We validate our theoretical studies with numerical experiments, confirming the efficacy of the proposed algorithm.
arXiv Detail & Related papers (2022-01-26T22:26:01Z) - Global Models for Time Series Forecasting: A Simulation Study [2.580765958706854]
We simulate time series from simple data generating processes (DGP), such as Auto Regressive (AR) and Seasonal AR, to complex DGPs, such as Chaotic Logistic Map, Self-Exciting Threshold Auto-Regressive, and Mackey-Glass equations.
The lengths and the number of series in the dataset are varied in different scenarios.
We perform experiments on these datasets using global forecasting models including Recurrent Neural Networks (RNN), Feed-Forward Neural Networks, Pooled Regression (PR) models, and Light Gradient Boosting Models (LGBM).
arXiv Detail & Related papers (2020-12-23T04:45:52Z) - Haar Wavelet based Block Autoregressive Flows for Trajectories [129.37479472754083]
Prediction of trajectories such as that of pedestrians is crucial to the performance of autonomous agents.
We introduce a novel Haar wavelet based block autoregressive model leveraging split couplings.
We illustrate the advantages of our approach for generating diverse and accurate trajectories on two real-world datasets.
arXiv Detail & Related papers (2020-09-21T13:57:10Z) - Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)