VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Visual Backbones
- URL: http://arxiv.org/abs/2508.04379v1
- Date: Wed, 06 Aug 2025 12:17:09 GMT
- Title: VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Visual Backbones
- Authors: Lefei Shen, Mouxiang Chen, Xu Liu, Han Fu, Xiaoxue Ren, Jianling Sun, Zhuo Li, Chenghao Liu
- Abstract summary: We propose VisionTS++, a vision-model-based TSFM that performs continual pre-training on large-scale time series datasets. Our work establishes a new paradigm for cross-modal knowledge transfer, advancing the development of universal TSFMs.
- Score: 27.97547118858576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have revealed that vision models pre-trained on images can perform well in time series forecasting by reformulating forecasting as an image reconstruction task, suggesting their potential as universal time series foundation models. However, effective cross-modal transfer from vision to time series remains challenging due to three key discrepancies: (1) the data-modality gap between structured, bounded image data and unbounded, heterogeneous time series; (2) the multivariate-forecasting gap between standard three-channel RGB vision models and the need to model time series with arbitrary numbers of variates; and (3) the probabilistic-forecasting gap between the deterministic output formats of most vision models and the requirement for uncertainty-aware probabilistic predictions. To bridge these gaps, we propose VisionTS++, a vision-model-based TSFM that performs continual pre-training on large-scale time series datasets, incorporating three innovations: (1) a vision-model-based filtering mechanism that identifies high-quality time series data, thereby mitigating the modality gap and improving pre-training stability; (2) a colorized multivariate conversion method that transforms multivariate time series into multi-subfigure RGB images, capturing complex inter-variate dependencies; and (3) a multi-quantile forecasting approach that uses parallel reconstruction heads to generate forecasts at different quantile levels, thus more flexibly approximating arbitrary output distributions without restrictive prior distributional assumptions. Evaluated on both in-distribution and out-of-distribution TSF benchmarks, VisionTS++ achieves SOTA results, outperforming specialized TSFMs by 6%-44% in MSE reduction and ranking first in 9 out of 12 probabilistic forecasting settings. Our work establishes a new paradigm for cross-modal knowledge transfer, advancing the development of universal TSFMs.
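To make innovations (2) and (3) concrete, two minimal Python sketches follow. They are illustrative assumptions only: the period-folding layout, the fixed tint palette, the linear heads, and all names and shapes are ours, not the authors' implementation.

First, a sketch of a colorized multivariate conversion: each variate is folded by its period into a 2D grayscale patch (as in the predecessor VisionTS), tinted with a distinct RGB color, and the patches are stacked into one multi-subfigure image.

```python
import numpy as np

def colorize_multivariate(series: np.ndarray, period: int) -> np.ndarray:
    """Render a (V, T) multivariate series as one multi-subfigure RGB image."""
    n_vars, length = series.shape
    rows = length // period
    # Hypothetical fixed palette, cycled when there are more variates than tints.
    palette = np.array([[1.0, 0.3, 0.3], [0.3, 1.0, 0.3], [0.3, 0.3, 1.0]])
    patches = []
    for v in range(n_vars):
        patch = series[v, : rows * period].reshape(rows, period)  # fold by period
        gray = (patch - patch.min()) / (patch.max() - patch.min() + 1e-8)
        patches.append(gray[..., None] * palette[v % len(palette)])  # tint to RGB
    return np.concatenate(patches, axis=0)  # stack one subfigure per variate

img = colorize_multivariate(np.random.randn(4, 1024), period=32)  # -> (128, 32, 3)
```

Second, a sketch of multi-quantile forecasting with parallel heads: one head per quantile level, trained jointly with the standard quantile (pinball) loss.

```python
import torch
import torch.nn as nn

class MultiQuantileHeads(nn.Module):
    """Parallel reconstruction heads, one forecast per quantile level."""
    def __init__(self, d_model: int, horizon: int, quantiles=(0.1, 0.5, 0.9)):
        super().__init__()
        self.quantiles = quantiles
        self.heads = nn.ModuleList([nn.Linear(d_model, horizon) for _ in quantiles])

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, d_model) backbone features -> (batch, n_quantiles, horizon)
        return torch.stack([head(z) for head in self.heads], dim=1)

def pinball_loss(pred: torch.Tensor, target: torch.Tensor, quantiles) -> torch.Tensor:
    err = target.unsqueeze(1) - pred          # positive where the model under-predicts
    q = torch.tensor(quantiles, dtype=pred.dtype).view(1, -1, 1)
    return torch.maximum(q * err, (q - 1) * err).mean()

heads = MultiQuantileHeads(d_model=256, horizon=96)
pred = heads(torch.randn(8, 256))             # (8, 3, 96)
loss = pinball_loss(pred, torch.randn(8, 96), heads.quantiles)
loss.backward()
```

Fitting several quantile levels at once approximates the target's quantile function directly, which is how such a design can flexibly match arbitrary output distributions without a parametric prior.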
Related papers
- T3Time: Tri-Modal Time Series Forecasting via Adaptive Multi-Head Alignment and Residual Fusion [0.4915744683251151]
T3Time is a novel trimodal framework consisting of time, spectral, and prompt branches. It learns prioritization between temporal and spectral features based on the prediction horizon. Our model consistently outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2025-08-06T09:31:44Z) - MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings [75.0617088717528]
MoCa is a framework for transforming pre-trained VLM backbones into effective bidirectional embedding models. MoCa consistently improves performance across MMEB and ViDoRe-v2 benchmarks, achieving new state-of-the-art results.
arXiv Detail & Related papers (2025-06-29T06:41:00Z) - General Time-series Model for Universal Knowledge Representation of Multivariate Time-Series data [61.163542597764796]
We show that time series with different time granularities (or corresponding frequency resolutions) exhibit distinct joint distributions in the frequency domain. A novel Fourier knowledge attention mechanism is proposed to enable learning time-aware representations from both the temporal and frequency domains. An autoregressive blank-infilling pre-training framework is incorporated into time series analysis for the first time, leading to a generative, task-agnostic pre-training strategy.
arXiv Detail & Related papers (2025-02-05T15:20:04Z) - UTSD: Unified Time Series Diffusion Model [13.555837288440946]
A Unified Time Series Diffusion model is established for the first time to model the multi-domain probability distribution. We conduct extensive experiments on mainstream benchmarks, and the pre-trained UTSD outperforms existing foundation models on all data domains.
arXiv Detail & Related papers (2024-12-04T06:42:55Z) - DisenTS: Disentangled Channel Evolving Pattern Modeling for Multivariate Time Series Forecasting [43.071713191702486]
DisenTS is a tailored framework for modeling disentangled channel evolving patterns in general time series forecasting.
We introduce a novel Forecaster Aware Gate (FAG) module that generates the routing signals adaptively according to both the forecasters' states and input series' characteristics.
arXiv Detail & Related papers (2024-10-30T12:46:14Z) - DAM: Towards A Foundation Model for Time Series Forecasting [0.8231118867997028]
We propose a neural model that takes randomly sampled histories and outputs an adjustable basis composition as a continuous function of time.
It involves three key components: (1) a flexible approach for using randomly sampled histories from a long-tail distribution; (2) a transformer backbone that is trained on these actively sampled histories to produce, as representational output, (3) the basis coefficients of a continuous function of time.
arXiv Detail & Related papers (2024-07-25T08:48:07Z) - MGCP: A Multi-Grained Correlation based Prediction Network for Multivariate Time Series [54.91026286579748]
We propose a Multi-Grained Correlations-based Prediction Network.
It simultaneously considers correlations at three levels to enhance prediction performance.
It employs adversarial training with an attention mechanism-based predictor and conditional discriminator to optimize prediction results at the coarse-grained level.
arXiv Detail & Related papers (2024-05-30T03:32:44Z) - Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present the Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai).
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z) - TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series [57.4208255711412]
Building on copula theory, we propose a simplified objective for the recently introduced transformer-based attentional copulas (TACTiS).
We show that the resulting model has significantly better training dynamics and achieves state-of-the-art performance across diverse real-world forecasting tasks.
arXiv Detail & Related papers (2023-10-02T16:45:19Z) - Generative Time Series Forecasting with Diffusion, Denoise, and Disentanglement [51.55157852647306]
Time series forecasting has been a widely explored task of great importance in many applications.
Real-world time series data are commonly recorded over a short time period, which results in a large gap between deep models and the limited, noisy time series.
We address the time series forecasting problem with generative modeling, proposing a bidirectional variational auto-encoder equipped with diffusion, denoise, and disentanglement.
arXiv Detail & Related papers (2023-01-08T12:20:46Z) - Temporal Saliency Detection Towards Explainable Transformer-based Timeseries Forecasting [3.046315755726937]
This paper introduces Temporal Saliency Detection (TSD), an effective approach that builds upon the attention mechanism and applies it to multi-horizon time series prediction.
The TSD approach facilitates multiresolution analysis of saliency patterns by condensing multiple attention heads, thereby progressively enhancing the forecasting of complex time series data.
arXiv Detail & Related papers (2022-12-15T12:47:59Z) - Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow (MANF).
Our model avoids the influence of cumulative error and does not increase the time complexity.
Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z) - Improving the Accuracy of Global Forecasting Models using Time Series Data Augmentation [7.38079566297881]
Forecasting models that are trained across sets of many time series, known as Global Forecasting Models (GFM), have shown promising results in forecasting competitions and real-world applications.
We propose a novel data-augmentation-based forecasting framework that is capable of improving the baseline accuracy of GFM models in less data-abundant settings.
arXiv Detail & Related papers (2020-08-06T13:52:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.