TiMi: Empower Time Series Transformers with Multimodal Mixture of Experts
- URL: http://arxiv.org/abs/2602.21693v1
- Date: Wed, 25 Feb 2026 08:51:03 GMT
- Title: TiMi: Empower Time Series Transformers with Multimodal Mixture of Experts
- Authors: Jiafeng Lin, Yuxuan Wang, Huakun Luo, Zhongyi Pei, Jianmin Wang,
- Abstract summary: We propose Time series transformers with Multimodal Mixture-of-Experts, TiMi, to unleash the causal reasoning capabilities of LLMs. To seamlessly integrate both factors and time series into predictions, we introduce a Multimodal Mixture-of-Experts (MMoE) module.
- Score: 16.497819301793538
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multimodal time series forecasting has garnered significant attention for its potential to provide more accurate predictions than traditional single-modality models by leveraging rich information inherent in other modalities. However, due to fundamental challenges in modality alignment, existing methods often struggle to effectively incorporate multimodal data into predictions, particularly textual information that has a causal influence on time series fluctuations, such as emergency reports and policy announcements. In this paper, we reflect on the role of textual information in numerical forecasting and propose Time series transformers with Multimodal Mixture-of-Experts, TiMi, to unleash the causal reasoning capabilities of LLMs. Concretely, TiMi utilizes LLMs to generate inferences on future developments, which serve as guidance for time series forecasting. To seamlessly integrate both exogenous factors and time series into predictions, we introduce a Multimodal Mixture-of-Experts (MMoE) module as a lightweight plug-in to empower Transformer-based time series models for multimodal forecasting, eliminating the need for explicit representation-level alignment. Experimentally, our proposed TiMi demonstrates consistent state-of-the-art performance on sixteen real-world multimodal forecasting benchmarks, outperforming advanced baselines while offering both strong adaptability and interpretability.
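The MMoE plug-in described in the abstract can be pictured, in highly simplified form, as softmax-gated mixing of expert outputs over combined series and text features. The sketch below is an illustrative toy in plain Python, not the paper's method: the expert functions, the gate design, and the names `moe_combine` and `gate_weights` are all assumptions made for illustration.

```python
# Toy sketch of mixture-of-experts gating over two modalities
# (hypothetical simplification; the paper's actual expert and
# gating designs are not specified in the abstract).
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_combine(series_feat, text_feat, experts, gate_weights):
    """Concatenate series and text features, score each expert with a
    linear gate, and return the softmax-weighted mix of expert outputs."""
    x = series_feat + text_feat  # naive concatenation of the two modalities
    scores = softmax(
        [sum(w * xi for w, xi in zip(gw, x)) for gw in gate_weights]
    )
    outputs = [expert(x) for expert in experts]
    dim = len(outputs[0])
    # convex combination of expert outputs, weighted by gate scores
    return [sum(s * out[i] for s, out in zip(scores, outputs)) for i in range(dim)]
```

Because the gate produces a convex combination, the mixed prediction always lies within the range spanned by the individual experts, which is one reason such plug-ins can be attached to a frozen backbone without destabilizing it.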
Related papers
- DiTS: Multimodal Diffusion Transformers Are Time Series Forecasters [50.43534351968113]
Existing generative time series models do not address the multi-dimensional properties of time series data well. Inspired by Multimodal Diffusion Transformers that integrate textual guidance into video generation, we propose Diffusion Transformers for Time Series (DiTS).
arXiv Detail & Related papers (2026-02-06T10:48:13Z) - Multi-Modal Time Series Prediction via Mixture of Modulated Experts [28.358760170766004]
We propose Expert Modulation, a new paradigm for multi-modal time series prediction. Our proposed method demonstrates substantial improvements in multi-modal time series prediction.
arXiv Detail & Related papers (2026-01-29T11:03:09Z) - A Unified Frequency Domain Decomposition Framework for Interpretable and Robust Time Series Forecasting [81.73338008264115]
Current approaches for time series forecasting, whether in the time or frequency domain, predominantly use deep learning models based on linear layers or transformers. We propose FIRE, a unified frequency domain decomposition framework that provides a mathematical abstraction for diverse types of time series. FIRE consistently outperforms state-of-the-art models on long-term forecasting benchmarks.
arXiv Detail & Related papers (2025-10-11T09:59:25Z) - When Does Multimodality Lead to Better Time Series Forecasting? [96.26052272121615]
We investigate whether and under what conditions such multimodal integration consistently yields gains. Our findings reveal that the benefits of multimodality are highly condition-dependent. Our study offers a rigorous, quantitative foundation for understanding when multimodality can be expected to aid forecasting tasks.
arXiv Detail & Related papers (2025-06-20T23:55:56Z) - Context-Aware Probabilistic Modeling with LLM for Multimodal Time Series Forecasting [24.56167831047955]
We propose CAPTime, a context-aware probabilistic multimodal time series forecasting method. Our method first encodes temporal patterns using a pretrained time series encoder, then aligns them with textual contexts via learnable interactions. Experiments on diverse time series forecasting tasks demonstrate the superior accuracy and generalization of CAPTime.
arXiv Detail & Related papers (2025-05-16T01:23:53Z) - ChronoSteer: Bridging Large Language Model and Time Series Foundation Model via Synthetic Data [22.81326423408988]
We introduce ChronoSteer, a multimodal TSFM that can be steered through textual revision instructions. To mitigate the shortage of cross-modal instruction-series paired data, we devise a two-stage training strategy based on synthetic data. ChronoSteer achieves a 25.7% improvement in prediction accuracy compared to the unimodal backbone and a 22.5% gain over the previous state-of-the-art multimodal method.
arXiv Detail & Related papers (2025-05-15T08:37:23Z) - Dual-Forecaster: A Multimodal Time Series Model Integrating Descriptive and Predictive Texts [5.873261646876953]
We propose Dual-Forecaster, a pioneering multimodal time series model that combines both descriptively historical textual information and predictive textual insights. Our comprehensive evaluations on fifteen multimodal time series datasets demonstrate that Dual-Forecaster is a distinctly effective multimodal time series model.
arXiv Detail & Related papers (2025-05-02T09:24:31Z) - TimeXL: Explainable Multi-modal Time Series Prediction with LLM-in-the-Loop [79.5773512667468]
TimeXL is a multi-modal prediction framework that integrates a prototype-based time series encoder with three collaborating Large Language Models. A reflection LLM compares the predicted values against the ground truth, identifying textual inconsistencies or noise. This closed-loop workflow of prediction, critique (reflection), and refinement continuously boosts the framework's performance and interpretability.
arXiv Detail & Related papers (2025-03-02T20:40:53Z) - MGCP: A Multi-Grained Correlation based Prediction Network for Multivariate Time Series [54.91026286579748]
We propose a Multi-Grained Correlations-based Prediction Network.
It simultaneously considers correlations at three levels to enhance prediction performance.
It employs adversarial training with an attention mechanism-based predictor and conditional discriminator to optimize prediction results at coarse-grained level.
arXiv Detail & Related papers (2024-05-30T03:32:44Z) - Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems.
We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting.
Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z) - Modality-aware Transformer for Financial Time series Forecasting [3.401797102198429]
We introduce a novel multimodal transformer-based model named the Modality-aware Transformer.
Our model excels in exploring the power of both categorical text and numerical time series to forecast the target time series effectively.
Our experiments on financial datasets demonstrate that Modality-aware Transformer outperforms existing methods.
arXiv Detail & Related papers (2023-10-02T14:22:41Z) - Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow (MANF).
Our model avoids the influence of cumulative error and does not increase the time complexity.
Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.