On Identifying Why and When Foundation Models Perform Well on Time-Series Forecasting Using Automated Explanations and Rating
- URL: http://arxiv.org/abs/2508.20437v1
- Date: Thu, 28 Aug 2025 05:27:45 GMT
- Title: On Identifying Why and When Foundation Models Perform Well on Time-Series Forecasting Using Automated Explanations and Rating
- Authors: Michael Widener, Kausik Lakkaraju, John Aydin, Biplav Srivastava
- Abstract summary: Time-series forecasting models (TSFM) have evolved from classical statistical methods to sophisticated foundation models. This work addresses concerns by combining traditional explainable AI (XAI) methods with Rating Driven Explanations (RDE). We evaluate four distinct model architectures: ARIMA, Gradient Boosting, Chronos (a time-series-specific foundation model), and Llama (general-purpose; both fine-tuned and base models).
- Score: 7.375605655806626
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Time-series forecasting models (TSFM) have evolved from classical statistical methods to sophisticated foundation models, yet understanding why and when these models succeed or fail remains challenging. Despite this known limitation, time-series forecasting models are increasingly used to generate information that informs real-world actions with equally real consequences. Understanding the complexity, performance variability, and opaque nature of these models is therefore a valuable endeavor, as it addresses serious concerns about how users should interact with and rely on these models' outputs. This work addresses these concerns by combining traditional explainable AI (XAI) methods with Rating Driven Explanations (RDE) to assess TSFM performance and interpretability across diverse domains and use cases. We evaluate four distinct model architectures: ARIMA, Gradient Boosting, Chronos (a time-series-specific foundation model), and Llama (general-purpose; both fine-tuned and base models), on four heterogeneous datasets spanning the finance, energy, transportation, and automotive sales domains. In doing so, we demonstrate that feature-engineered models (e.g., Gradient Boosting) consistently outperform foundation models (e.g., Chronos) in volatile or sparse domains (e.g., power, car parts) while providing more interpretable explanations, whereas foundation models excel only in stable or trend-driven contexts (e.g., finance).
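To make the evaluation protocol concrete, below is a minimal sketch (not the authors' released code) of the kind of holdout comparison the abstract describes, using statsmodels and scikit-learn on a synthetic series. The dataset, horizon, hyperparameters, and the naive last-value stand-in for the foundation models are all illustrative assumptions.

```python
# Hedged sketch of a cross-model forecasting comparison, assuming a
# univariate series and a fixed holdout horizon. A Chronos or Llama
# pipeline would slot in as another entry in the `forecasts` dict; a
# naive persistence model stands in here so the script stays self-contained.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(0.1, 1.0, 500))  # synthetic trend-driven series
horizon = 24
train, test = series[:-horizon], series[-horizon:]

def lag_features(y, n_lags=12):
    # Supervised reframing: predict y[t] from the previous n_lags values.
    X = np.stack([y[i:i + n_lags] for i in range(len(y) - n_lags)])
    return X, y[n_lags:]

def forecast_gbm(train, horizon, n_lags=12):
    # Feature-engineered model: gradient boosting on lagged windows.
    X, y = lag_features(train, n_lags)
    model = GradientBoostingRegressor(random_state=0).fit(X, y)
    window, preds = list(train[-n_lags:]), []
    for _ in range(horizon):  # recursive multi-step forecasting
        preds.append(float(model.predict([window])[0]))
        window = window[1:] + [preds[-1]]
    return np.array(preds)

def forecast_arima(train, horizon):
    # Classical statistical baseline; (1,1,1) order is an arbitrary choice.
    return ARIMA(train, order=(1, 1, 1)).fit().forecast(steps=horizon)

def forecast_naive(train, horizon):
    # Last-value persistence: stand-in for zero-shot foundation models.
    return np.repeat(train[-1], horizon)

forecasts = {
    "ARIMA": forecast_arima(train, horizon),
    "GradientBoosting": forecast_gbm(train, horizon),
    "NaiveBaseline": forecast_naive(train, horizon),
}
for name, pred in forecasts.items():
    rmse = float(np.sqrt(np.mean((pred - test) ** 2)))
    mape = float(np.mean(np.abs((pred - test) / test)) * 100)
    print(f"{name:>16}  RMSE={rmse:7.3f}  MAPE={mape:6.2f}%")

# One facet of the paper's interpretability claim: a lag-feature GBM
# exposes per-lag importances out of the box, unlike a black-box
# foundation-model forecast (column n_lags-1 corresponds to lag -1).
X_all, y_all = lag_features(train)
gbm = GradientBoostingRegressor(random_state=0).fit(X_all, y_all)
for lag, imp in zip(range(1, 13), gbm.feature_importances_[::-1]):
    print(f"lag -{lag}: importance {imp:.3f}")
```

A full reproduction would replace the naive baseline with Chronos and Llama inference and layer the RDE rating step on top of these raw error metrics.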
Related papers
- Assessing Electricity Demand Forecasting with Exogenous Data in Time Series Foundation Models [0.0]
This paper empirically evaluates foundation models capable of modeling cross-channel correlations against a baseline LSTM. We find that the simple baseline frequently outperforms all foundation models in Singapore's stable climate, particularly for short-term horizons. These results challenge assumptions about universal foundation model superiority and highlight the need for domain-specific models.
arXiv Detail & Related papers (2026-02-05T07:17:21Z) - TABL-ABM: A Hybrid Framework for Synthetic LOB Generation [0.0]
Recent application of deep learning models to financial trading has heightened the need for high-fidelity financial time series data. State-of-the-art models for this generative task often rely on huge amounts of historical data and large, complicated architectures. Agent-based approaches to modelling limit order book dynamics can also recreate trading activity.
arXiv Detail & Related papers (2025-10-26T14:04:49Z) - Understanding the Implicit Biases of Design Choices for Time Series Foundation Models [90.894232610821]
Time series foundation models (TSFMs) are a class of potentially powerful, general-purpose tools for time series forecasting and related temporal tasks. Their behavior is strongly shaped by subtle inductive biases in their design. We show how these biases can be intuitive or very counterintuitive, depending on properties of the model and data.
arXiv Detail & Related papers (2025-10-22T04:42:35Z) - How Foundational are Foundation Models for Time Series Forecasting? [2.692427265051276]
We argue that the inherent diversity of time series data makes the foundation-model approach less suited to building effective forecasting models. We show that the zero-shot capabilities of a time series foundation model are significantly influenced by, and tied to, the specific domains it has been pretrained on.
arXiv Detail & Related papers (2025-10-01T10:25:43Z) - Estimating Time Series Foundation Model Transferability via In-Context Learning [74.65355820906355]
Time series foundation models (TSFMs) offer strong zero-shot forecasting via large-scale pre-training. Fine-tuning remains critical for boosting performance in domains with limited public data. We introduce TimeTic, a transferability estimation framework that recasts model selection as an in-context-learning problem.
arXiv Detail & Related papers (2025-09-28T07:07:13Z) - Tailored Architectures for Time Series Forecasting: Evaluating Deep Learning Models on Gaussian Process-Generated Data [0.5573267589690007]
Research in this area aims to uncover clear connections between time series characteristics and particular models. We present TimeFlex, a new model with a modular architecture tailored to handle diverse temporal dynamics. The model is compared to current state-of-the-art models, offering a deeper understanding of how models perform under varied time series conditions.
arXiv Detail & Related papers (2025-06-10T16:46:02Z) - Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric [99.56567010306807]
Large Language Models (LLMs) have become indispensable across academia, industry, and daily applications. One core challenge of evaluation in the LLM era is generalization. We propose the Model Utilization Index (MUI), a mechanism-interpretability-enhanced metric that complements traditional performance scores.
arXiv Detail & Related papers (2025-04-10T04:09:47Z) - Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models [104.17057231661371]
Time series analysis is crucial for understanding the dynamics of complex systems. Recent advances in foundation models have led to task-agnostic Time Series Foundation Models (TSFMs) and Large Language Model-based Time Series Models (TSLLMs). Their success depends on large, diverse, and high-quality datasets, which are challenging to build due to regulatory, diversity, quality, and quantity constraints. This survey provides a comprehensive review of synthetic data for TSFMs and TSLLMs, analyzing data generation strategies, their role in model pretraining, fine-tuning, and evaluation, and identifying future research directions.
arXiv Detail & Related papers (2025-03-14T13:53:46Z) - Exploring Representations and Interventions in Time Series Foundation Models [17.224575072056627]
Time series foundation models (TSFMs) promise to be powerful tools for a wide range of applications. Their internal representations and learned concepts are still not well understood. This study investigates the structure and redundancy of representations across various TSFMs.
arXiv Detail & Related papers (2024-09-19T17:11:27Z) - PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis [14.298235969992877]
We introduce a comprehensive framework for perturbation response modeling in single cells. Our approach includes a modular and user-friendly model development and evaluation platform. We highlight the limitations of widely used models, such as mode collapse.
arXiv Detail & Related papers (2024-08-20T07:40:20Z) - The Bayesian Context Trees State Space Model for time series modelling and forecasting [7.018547803286913]
A hierarchical Bayesian framework is introduced for developing tree-based mixture models for time series. We call this the Bayesian Context Trees State Space Model, or the BCT-X framework.
arXiv Detail & Related papers (2023-08-02T02:40:42Z) - ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce Denoising Diffusion Probabilistic Models (DDPMs), a powerful model class, for chirographic data.
Our model, ChiroDiff, is non-autoregressive: it learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rates.
arXiv Detail & Related papers (2023-04-07T15:17:48Z) - Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from their extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to limited computational resources or efficiency considerations.
This poses a critical challenge for the real-world application of foundation models: the knowledge of a foundation model must be transferred to the downstream task under these constraints.
arXiv Detail & Related papers (2023-04-05T07:28:33Z) - Deep incremental learning models for financial temporal tabular datasets with distribution shifts [0.9790236766474201]
The framework uses a simple building block (decision trees) to build self-similar models of any required complexity.
We demonstrate our scheme using XGBoost models trained on the Numerai dataset and show that a two-layer deep ensemble of XGBoost models over different model snapshots delivers high-quality predictions.
arXiv Detail & Related papers (2023-03-14T14:10:37Z) - Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster than their ODE-based counterparts.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)