TimeSeriesScientist: A General-Purpose AI Agent for Time Series Analysis
- URL: http://arxiv.org/abs/2510.01538v2
- Date: Mon, 06 Oct 2025 19:00:27 GMT
- Title: TimeSeriesScientist: A General-Purpose AI Agent for Time Series Analysis
- Authors: Haokun Zhao, Xiang Zhang, Jiaqi Wei, Yiwei Xu, Yuting He, Siqi Sun, Chenyu You,
- Abstract summary: TimeSeriesScientist (TSci) is a general, domain-agnostic framework for time series forecasting.<n>It reduces forecast error by an average of 10.4% and 38.2%, respectively.<n>With transparent natural-language rationales and comprehensive reports, TSci transforms the forecasting into a white-box system.
- Score: 25.377586527585503
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series forecasting is central to decision-making in domains as diverse as energy, finance, climate, and public health. In practice, forecasters face thousands of short, noisy series that vary in frequency, quality, and horizon, where the dominant cost lies not in model fitting, but in the labor-intensive preprocessing, validation, and ensembling required to obtain reliable predictions. Prevailing statistical and deep learning models are tailored to specific datasets or domains and generalize poorly. A general, domain-agnostic framework that minimizes human intervention is urgently in demand. In this paper, we introduce TimeSeriesScientist (TSci), the first LLM-driven agentic framework for general time series forecasting. The framework comprises four specialized agents: Curator performs LLM-guided diagnostics augmented by external tools that reason over data statistics to choose targeted preprocessing; Planner narrows the hypothesis space of model choice by leveraging multi-modal diagnostics and self-planning over the input; Forecaster performs model fitting and validation and, based on the results, adaptively selects the best model configuration as well as ensemble strategy to make final predictions; and Reporter synthesizes the whole process into a comprehensive, transparent report. With transparent natural-language rationales and comprehensive reports, TSci transforms the forecasting workflow into a white-box system that is both interpretable and extensible across tasks. Empirical results on eight established benchmarks demonstrate that TSci consistently outperforms both statistical and LLM-based baselines, reducing forecast error by an average of 10.4% and 38.2%, respectively. Moreover, TSci produces a clear and rigorous report that makes the forecasting workflow more transparent and interpretable.
Related papers
- It's TIME: Towards the Next Generation of Time Series Forecasting Benchmarks [87.7937890373758]
Time series foundation models (TSFMs) are revolutionizing the forecasting landscape from specific dataset modeling to generalizable task evaluation.<n>We introduce TIME, a next-generation task-centric benchmark comprising 50 fresh datasets and 98 forecasting tasks.<n>We propose a novel pattern-level evaluation perspective that moves beyond traditional dataset-level evaluations based on static meta labels.
arXiv Detail & Related papers (2026-02-12T16:31:01Z) - STAR : Bridging Statistical and Agentic Reasoning for Large Model Performance Prediction [78.0692157478247]
We propose STAR, a framework that bridges data-driven STatistical expectations with knowledge-driven Agentic Reasoning.<n>We show that STAR consistently outperforms all baselines on both score-based and rank-based metrics.
arXiv Detail & Related papers (2026-02-12T16:30:07Z) - Enhancing Transformer-Based Foundation Models for Time Series Forecasting via Bagging, Boosting and Statistical Ensembles [7.787518725874443]
Time series foundation models (TSFMs) have shown strong generalization and zero-shot capabilities for time series forecasting, anomaly detection, classification, and imputation.<n>This paper investigates a suite of statistical and ensemble-based enhancement techniques to improve robustness and accuracy.
arXiv Detail & Related papers (2025-08-18T04:06:26Z) - BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models [47.66064662912721]
We introduce a novel pre-training corpus designed to enhance data diversity through a balanced sampling strategy.<n>BLT incorporates 321 billion observations from publicly available datasets and employs a comprehensive suite of statistical metrics to characterize time series patterns.<n>Our findings highlight the pivotal role of data diversity in improving both training efficiency and model performance for the universal forecasting task.
arXiv Detail & Related papers (2025-05-23T13:20:47Z) - Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z) - Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.<n>We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.<n>We propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark.
arXiv Detail & Related papers (2024-10-24T17:56:08Z) - DAM: Towards A Foundation Model for Time Series Forecasting [0.8231118867997028]
We propose a neural model that takes randomly sampled histories and outputs an adjustable basis composition as a continuous function of time.
It involves three key components: (1) a flexible approach for using randomly sampled histories from a long-tail distribution; (2) a transformer backbone that is trained on these actively sampled histories to produce, as representational output; and (3) the basis coefficients of a continuous function of time.
arXiv Detail & Related papers (2024-07-25T08:48:07Z) - LoMEF: A Framework to Produce Local Explanations for Global Model Time
Series Forecasts [2.3096751699592137]
Global Forecasting Models (GFM) that are trained across a set of multiple time series have shown superior results in many forecasting competitions and real-world applications.
However, GFMs typically lack interpretability, especially towards particular time series.
We propose a novel local model-agnostic interpretability approach to explain the forecasts from GFMs.
arXiv Detail & Related papers (2021-11-13T00:17:52Z) - Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z) - A Worrying Analysis of Probabilistic Time-series Models for Sales
Forecasting [10.690379201437015]
Probabilistic time-series models become popular in the forecasting field as they help to make optimal decisions under uncertainty.
We analyze the performance of three prominent probabilistic time-series models for sales forecasting.
arXiv Detail & Related papers (2020-11-21T03:31:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.