Estimating Time Series Foundation Model Transferability via In-Context Learning
- URL: http://arxiv.org/abs/2509.23695v1
- Date: Sun, 28 Sep 2025 07:07:13 GMT
- Title: Estimating Time Series Foundation Model Transferability via In-Context Learning
- Authors: Qingren Yao, Ming Jin, Chengqi Zhang, Chao-Han Huck Yang, Jun Qi, Shirui Pan
- Abstract summary: Time series foundation models (TSFMs) offer strong zero-shot forecasting via large-scale pre-training. Fine-tuning remains critical for boosting performance in domains with limited public data. We introduce TimeTic, a transferability estimation framework that recasts model selection as an in-context-learning problem.
- Score: 74.65355820906355
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series foundation models (TSFMs) offer strong zero-shot forecasting via large-scale pre-training, yet fine-tuning remains critical for boosting performance in domains with limited public data. With the growing number of TSFMs, efficiently identifying the best model for downstream fine-tuning becomes increasingly challenging. In this work, we introduce TimeTic, a transferability estimation framework that recasts model selection as an in-context-learning problem: given observations on known (source) datasets, it predicts how a TSFM will perform after fine-tuning on a downstream (target) dataset. TimeTic flexibly organizes the observed model-data relationships as contextual information, allowing it to adapt seamlessly to various test-time scenarios. Leveraging the natural tabular structure formed by dataset meta-features, model characteristics, and fine-tuned performance, we employ tabular foundation models to serve as in-context learners. We further introduce a novel model characterization based on entropy evolution across model layers, capturing embedding-space distinctions and enabling TimeTic to generalize across arbitrary model sets. We establish a comprehensive benchmark for transferability estimation including 10 datasets, 10 foundation models, and 3 forecasting tasks. On this benchmark, TimeTic's estimation demonstrates strong alignment with actual fine-tuned performance for previously unseen datasets, achieving a mean rank correlation of approximately 0.6 and a 30% improvement compared to using zero-shot performance as the transferability score.
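The abstract reports agreement between estimated and actual fine-tuned performance as a mean rank correlation. The sketch below shows how such a score could be computed for one target dataset, assuming Spearman correlation (the abstract does not say which rank correlation is used). The function names and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(estimated, actual):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(estimated), ranks(actual)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for four candidate models on one target dataset
# (higher is better in both lists; values are illustrative only).
estimated_scores = [0.9, 0.7, 0.5, 0.3]
finetuned_scores = [0.8, 0.85, 0.4, 0.2]

print(spearman(estimated_scores, finetuned_scores))
```

A value near 1 means the estimator orders the candidate models almost exactly as fine-tuning does; averaging this value over target datasets would give a mean rank correlation of the kind quoted above.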
Related papers
- It's TIME: Towards the Next Generation of Time Series Forecasting Benchmarks [87.7937890373758]
Time series foundation models (TSFMs) are revolutionizing the forecasting landscape from specific dataset modeling to generalizable task evaluation.
We introduce TIME, a next-generation task-centric benchmark comprising 50 fresh datasets and 98 forecasting tasks.
We propose a novel pattern-level evaluation perspective that moves beyond traditional dataset-level evaluations based on static meta labels.
arXiv Detail & Related papers (2026-02-12T16:31:01Z)
- Lightweight Time Series Data Valuation on Time Series Foundation Models via In-Context Finetuning [40.495409835752746]
Time series foundation models (TSFMs) have demonstrated increasing capabilities due to their extensive pretraining on large volumes of diverse time series data.
We propose LTSV, a Lightweight Time Series Valuation on TSFMs via in-context finetuning.
We show that LTSV consistently provides reliable and strong valuation performance, while maintaining manageable computational requirements.
arXiv Detail & Related papers (2025-11-10T13:06:46Z)
- Measuring Pre-training Data Quality without Labels for Time Series Foundation Models [10.64362760848387]
We introduce contrastive accuracy, a new measure to evaluate the quality of the representation space learned by the foundation model.
Our experiments reveal the positive correlation between the proposed measure and the accuracy of the model on a collection of downstream tasks.
arXiv Detail & Related papers (2024-12-09T10:38:30Z)
- Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data [39.40116554523575]
We present Drift-Resilient TabPFN, a fresh approach based on In-Context Learning with a Prior-Data Fitted Network.
It learns to approximate Bayesian inference on synthetic datasets drawn from a prior.
It improves accuracy from 0.688 to 0.744 and ROC AUC from 0.786 to 0.832 while maintaining stronger calibration.
arXiv Detail & Related papers (2024-11-15T23:49:23Z)
- GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation [90.53485251837235]
Time series foundation models excel in zero-shot forecasting, handling diverse tasks without explicit training.
GIFT-Eval is a pioneering benchmark aimed at promoting evaluation across diverse datasets.
GIFT-Eval encompasses 23 datasets over 144,000 time series and 177 million data points.
arXiv Detail & Related papers (2024-10-14T11:29:38Z)
- ViTime: Foundation Model for Time Series Forecasting Powered by Vision Intelligence [49.60944381032587]
Time series forecasting (TSF) possesses great practical value in various fields, including power and energy, transportation, etc.
TSF models have long been known to be problem-specific and lacking application generalizability.
This paper proposes a vision intelligence-powered framework, ViTime, for the first time.
arXiv Detail & Related papers (2024-07-10T02:11:01Z)
- Chronos: Learning the Language of Time Series [79.38691251254173]
Chronos is a framework for pretrained probabilistic time series models.
We show that Chronos models can leverage time series data from diverse domains to improve zero-shot accuracy on unseen forecasting tasks.
arXiv Detail & Related papers (2024-03-12T16:53:54Z)
- EXPRTS: Exploring and Probing the Robustness of Time Series Forecasting Models [1.23187154417297]
We develop an interpretable and simple framework for generating time series.
Our method combines time-series decompositions with analytic functions, and is able to generate time series with characteristics matching both in- and out-of-distribution data.
We show how our framework can generate meaningful OOD time series that improve model robustness.
arXiv Detail & Related papers (2024-03-06T07:34:47Z)
- TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting [24.834846119163885]
We propose a novel framework, TEMPO, that can effectively learn time series representations.
TEMPO expands the capability for dynamically modeling real-world temporal phenomena from data within diverse domains.
arXiv Detail & Related papers (2023-10-08T00:02:25Z)
- Unified Long-Term Time-Series Forecasting Benchmark [0.6526824510982802]
We present a comprehensive dataset designed explicitly for long-term time-series forecasting.
We incorporate a collection of datasets obtained from diverse, dynamic systems and real-life records.
To determine the most effective model in diverse scenarios, we conduct an extensive benchmarking analysis using classical and state-of-the-art models.
Our findings reveal intriguing performance comparisons among these models, highlighting the dataset-dependent nature of model effectiveness.
arXiv Detail & Related papers (2023-09-27T18:59:00Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)