Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
- URL: http://arxiv.org/abs/2506.14087v2
- Date: Fri, 10 Oct 2025 10:32:07 GMT
- Title: Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
- Authors: Zhongzheng Qiao, Chenghao Liu, Yiming Zhang, Ming Jin, Quang Pham, Qingsong Wen, P. N. Suganthan, Xudong Jiang, Savitha Ramasamy
- Abstract summary: Time series foundation models (TSFMs) demonstrate impressive zero-shot performance for time series forecasting. While naive finetuning can yield performance gains, we argue that it falls short of fully leveraging TSFMs' capabilities. We propose Multiscale finetuning (MSFT), a simple yet general framework that explicitly integrates multi-scale modeling into the finetuning process.
- Score: 67.95907033226585
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series foundation models (TSFMs) demonstrate impressive zero-shot performance for time series forecasting. However, an important yet underexplored challenge is how to effectively finetune TSFMs on specific downstream tasks. While naive finetuning can yield performance gains, we argue that it falls short of fully leveraging TSFMs' capabilities, often resulting in overfitting and suboptimal performance. Given the diverse temporal patterns across sampling scales and the inherent multi-scale forecasting capabilities of TSFMs, we adopt a causal perspective to analyze the finetuning process, through which we highlight the critical importance of explicitly modeling multiple scales and reveal the shortcomings of naive approaches. Focusing on encoder-based TSFMs, we propose Multiscale finetuning (MSFT), a simple yet general framework that explicitly integrates multi-scale modeling into the finetuning process. Experimental results on three different backbones (Moirai, Moment, and Units) demonstrate that TSFMs finetuned with MSFT not only outperform naive and typical parameter-efficient finetuning methods but also surpass state-of-the-art deep learning methods. Code is available at https://github.com/zqiao11/MSFT.
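The abstract describes the multi-scale idea only at a high level; the concrete MSFT framework is defined in the paper and its repository. As a rough, hypothetical sketch of one common way to expose multiple sampling scales to an encoder, a series can be average-pooled at several downsampling factors before encoding (function and parameter names below are illustrative, not the paper's):

```python
import numpy as np

def make_scales(series, scales=(1, 2, 4)):
    """Downsample a 1-D series by average pooling at each scale.

    Illustrates the generic multi-scale idea only; the actual MSFT
    framework differs and is described in the paper.
    """
    views = []
    for s in scales:
        n = len(series) // s * s          # trim so the length divides s
        pooled = series[:n].reshape(-1, s).mean(axis=1)
        views.append(pooled)
    return views

x = np.arange(8, dtype=float)             # toy input of length 8
views = make_scales(x)
print([v.shape[0] for v in views])        # → [8, 4, 2]
```

Each pooled view preserves coarser temporal patterns, which is the kind of information a multi-scale finetuning scheme could exploit.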
Related papers
- Benchmarking Few-shot Transferability of Pre-trained Models with Improved Evaluation Protocols [123.73663884421272]
Few-shot transfer has been revolutionized by stronger pre-trained models and improved adaptation algorithms. We establish FEWTRANS, a comprehensive benchmark containing 10 diverse datasets. By releasing FEWTRANS, we aim to provide a rigorous "ruler" to streamline reproducible advances in few-shot transfer learning research.
arXiv Detail & Related papers (2026-02-28T05:41:57Z) - Diversified Scaling Inference in Time Series Foundation Models [17.268760626931517]
This work systematically investigates two questions: how do TSFMs behave under standard sampling-based inference scaling, and can controlled sampling diversity enhance performance? We first examine the properties of TSFMs under standard sampling, finding that they often fail to adhere to scaling laws due to insufficient exploration of the solution space. We then delve into diversified inference scaling via tailored time series perturbations to expand the generative distribution's support.
arXiv Detail & Related papers (2026-01-24T08:53:42Z) - Time Series Foundation Models for Process Model Forecasting [8.339024524110828]
Process Model Forecasting aims to predict how the control-flow structure of a process evolves over time. Machine learning and deep learning models provide only modest gains over statistical baselines. We investigate Time Series Foundation Models (TSFMs) as an alternative for PMF.
arXiv Detail & Related papers (2025-12-08T15:08:50Z) - Synapse: Adaptive Arbitration of Complementary Expertise in Time Series Foundational Models [50.877082340479085]
We study how different Time Series Foundational Models (TSFMs) exhibit specialized performance profiles across various forecasting settings. We propose Synapse, a novel arbitration framework for TSFMs. Results demonstrate that Synapse consistently outperforms other popular ensembling techniques as well as individual TSFMs.
arXiv Detail & Related papers (2025-11-07T18:01:51Z) - TSGym: Design Choices for Deep Multivariate Time-Series Forecasting [38.12202305030755]
This work bridges gaps by decomposing deep MTSF methods into their core, fine-grained components. We propose a novel automated solution called TSGym for MTSF tasks. Extensive experiments indicate that TSGym significantly outperforms existing state-of-the-art MTSF and AutoML methods.
arXiv Detail & Related papers (2025-09-21T12:49:31Z) - One-Embedding-Fits-All: Efficient Zero-Shot Time Series Forecasting by a Model Zoo [82.65837388129746]
Time Series Foundation Models (TSFMs) have significantly advanced zero-shot forecasting. No single TSFM excels universally, as different models exhibit preferences for distinct temporal patterns. We propose ZooCast, which characterizes each model's distinct forecasting strengths.
arXiv Detail & Related papers (2025-09-04T13:34:54Z) - Wavelet Mixture of Experts for Time Series Forecasting [7.478995447422547]
We propose a novel, lightweight time series prediction model, WaveTS-B. This model combines wavelet transforms with a mechanism to capture both periodic and non-stationary characteristics of data in the wavelet domain. We show that our model achieves state-of-the-art (SOTA) performance with significantly fewer parameters.
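The summary mentions working in the wavelet domain but gives no details of WaveTS-B itself. As a minimal, self-contained stand-in for the wavelet decomposition step (not the paper's model), a single-level Haar transform splits a series into smooth and detail parts:

```python
import numpy as np

def haar_dwt(x):
    """Single-level Haar wavelet transform of an even-length 1-D array.

    Returns (approximation, detail) coefficients; a toy illustration of
    the wavelet-domain decomposition the abstract refers to.
    """
    x = np.asarray(x, dtype=float)
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)   # low-pass half
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)   # high-pass half
    return approx, detail

a, d = haar_dwt([1.0, 1.0, 2.0, 2.0])
print(np.allclose(d, 0.0))   # constant pairs carry no detail energy → True
```

The approximation coefficients carry the slow (periodic or trend-like) structure, while the detail coefficients isolate rapid, non-stationary changes.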
arXiv Detail & Related papers (2025-08-12T10:32:51Z) - Can Time-Series Foundation Models Perform Building Energy Management Tasks? [5.450531952940644]
Building energy management tasks require processing and learning from a variety of time-series data. Existing solutions rely on bespoke task- and data-specific models to perform these tasks. Inspired by the transformative success of Large Language Models (LLMs), Time-Series Foundation Models (TSFMs) have the potential to change this.
arXiv Detail & Related papers (2025-06-12T19:45:10Z) - Less is More: Unlocking Specialization of Time Series Foundation Models via Structured Pruning [29.377178687865136]
Time Series Foundation Models pre-train vast parameters and achieve remarkable zero-shot forecasting performance. Surprisingly, even after fine-tuning, TSFMs cannot consistently outperform smaller, specialized models trained on full-shot downstream data. We propose a structured pruning method to regularize the subsequent fine-tuning process by focusing it on a more relevant and compact parameter space.
arXiv Detail & Related papers (2025-05-29T07:33:49Z) - Investigating Compositional Reasoning in Time Series Foundation Models [16.421597202235112]
We study the impact of TSFM architecture design on compositional reasoning and generalization. We find that patch-based Transformers have the best reasoning performance. In some zero-shot out-of-distribution scenarios, these models can outperform moving average and exponential smoothing statistical baselines trained on in-distribution data.
arXiv Detail & Related papers (2025-02-09T21:21:55Z) - Time Series Foundational Models: Their Role in Anomaly Detection and Prediction [0.0]
Time series foundational models (TSFM) have gained prominence in time series forecasting. This paper critically evaluates the efficacy of TSFM in anomaly detection and prediction tasks.
arXiv Detail & Related papers (2024-12-26T17:15:30Z) - MFF-FTNet: Multi-scale Feature Fusion across Frequency and Temporal Domains for Time Series Forecasting [18.815152183468673]
Time series forecasting is crucial in many fields, yet current deep learning models struggle with noise, data sparsity, and capturing complex patterns.
This paper presents MFF-FTNet, a novel framework addressing these challenges by combining contrastive learning with multi-scale feature extraction.
Extensive experiments on five real-world datasets demonstrate that MFF-FTNet significantly outperforms state-of-the-art models.
arXiv Detail & Related papers (2024-11-26T12:41:42Z) - Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning [63.55145330447408]
We propose a novel Self-Perception Tuning (SPT) method for anomaly segmentation. The SPT method incorporates a self-drafting tuning strategy, which generates an initial coarse draft of the anomaly mask, followed by a refinement process.
arXiv Detail & Related papers (2024-11-26T08:33:25Z) - Visual Fourier Prompt Tuning [63.66866445034855]
We propose the Visual Fourier Prompt Tuning (VFPT) method as a general and effective solution for adapting large-scale transformer-based models.
Our approach incorporates the Fast Fourier Transform into prompt embeddings and harmoniously considers both spatial and frequency domain information.
Our results demonstrate that our approach outperforms current state-of-the-art baselines on two benchmarks.
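The VFPT summary says the method injects Fast Fourier Transform information into prompt embeddings, without giving the exact formulation. As a loose, hypothetical sketch of combining spatial and frequency views of a prompt (the paper's actual design may differ), one could concatenate FFT magnitudes onto the embedding:

```python
import numpy as np

def fourier_prompt(prompt):
    """Augment prompt embeddings with their FFT magnitudes.

    A rough illustration of mixing spatial and frequency information;
    not the VFPT paper's exact formulation.
    """
    prompt = np.asarray(prompt, dtype=float)
    freq = np.fft.fft(prompt, axis=-1)
    # keep magnitudes so the result stays a real-valued embedding
    return np.concatenate([prompt, np.abs(freq)], axis=-1)

p = np.zeros((4, 16))        # 4 prompt tokens, embedding dim 16
out = fourier_prompt(p)
print(out.shape)             # → (4, 32)
```

Doubling the embedding width like this is one simple way to let downstream layers attend to both domains at once.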
arXiv Detail & Related papers (2024-11-02T18:18:35Z) - TSFM-Bench: A Comprehensive and Unified Benchmark of Foundation Models for Time Series Forecasting [35.505530132151]
Time Series Forecasting (TSF) is a key functionality in numerous fields, such as financial investment, weather services, and energy management. Many TSF methods require domain-specific data collection and model training and do not generalize well when applied in other domains. Time Series Foundation Models (TSFMs) that are pre-trained on massive heterogeneous time series data aim to overcome these limitations.
arXiv Detail & Related papers (2024-10-15T17:23:49Z) - ViTime: Foundation Model for Time Series Forecasting Powered by Vision Intelligence [49.60944381032587]
Time series forecasting (TSF) possesses great practical value in various fields, including power and energy, transportation, etc. TSF models have long been known to be problem-specific and lacking application generalizability. This paper proposes a vision intelligence-powered framework, ViTime, for the first time.
arXiv Detail & Related papers (2024-07-10T02:11:01Z) - Parsimony or Capability? Decomposition Delivers Both in Long-term Time Series Forecasting [46.63798583414426]
Long-term time series forecasting (LTSF) represents a critical frontier in time series analysis.
Our study demonstrates, through both analytical and empirical evidence, that decomposition is key to containing excessive model inflation.
Remarkably, by tailoring decomposition to the intrinsic dynamics of time series data, our proposed model outperforms existing benchmarks.
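The summary argues that decomposition contains model inflation but does not describe the paper's architecture. As a generic, self-contained illustration of series decomposition in that spirit (a plain moving-average trend/residual split, not the paper's method):

```python
import numpy as np

def decompose(x, kernel=5):
    """Split a series into a moving-average trend and a residual.

    A generic decomposition in the spirit the abstract describes,
    not the specific model proposed in the paper.
    """
    x = np.asarray(x, dtype=float)
    pad = kernel // 2
    padded = np.pad(x, pad, mode="edge")              # avoid edge shrinkage
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    return trend, x - trend

x = np.sin(np.linspace(0.0, 6.28, 50)) + np.linspace(0.0, 1.0, 50)
trend, resid = decompose(x)
print(trend.shape == x.shape and resid.shape == x.shape)   # → True
```

Because trend and residual sum back to the input, each component can be modeled with a far smaller sub-network than one monolithic forecaster.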
arXiv Detail & Related papers (2024-01-22T13:15:40Z) - Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks [55.36987468073152]
This paper proposes a novel Dual-Guided Spatial-Channel-Temporal (DG-SCT) attention mechanism.
The DG-SCT module incorporates trainable cross-modal interaction layers into pre-trained audio-visual encoders.
Our proposed model achieves state-of-the-art results across multiple downstream tasks, including AVE, AVVP, AVS, and AVQA.
arXiv Detail & Related papers (2023-11-09T05:24:20Z) - MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering [66.05768870785548]
Finetuning pretrained Vision-Language Models (VLMs) has been a prevailing paradigm for achieving state-of-the-art performance in Visual Question Answering (VQA).
Current parameter-efficient tuning methods dramatically reduce the number of tunable parameters, but there still exists a significant performance gap with full finetuning.
We propose MixPHM, a redundancy-aware parameter-efficient tuning method that outperforms full finetuning in low-resource VQA.
arXiv Detail & Related papers (2023-03-02T13:28:50Z) - Model ensemble instead of prompt fusion: a sample-specific knowledge transfer method for few-shot prompt tuning [85.55727213502402]
We focus on improving the few-shot performance of prompt tuning by transferring knowledge from soft prompts of source tasks.
We propose Sample-specific Ensemble of Source Models (SESoM).
SESoM learns to adjust the contribution of each source model for each target sample separately when ensembling source model outputs.
arXiv Detail & Related papers (2022-10-23T01:33:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.