TimeRecipe: A Time-Series Forecasting Recipe via Benchmarking Module Level Effectiveness
- URL: http://arxiv.org/abs/2506.06482v1
- Date: Fri, 06 Jun 2025 19:11:48 GMT
- Title: TimeRecipe: A Time-Series Forecasting Recipe via Benchmarking Module Level Effectiveness
- Authors: Zhiyuan Zhao, Juntong Ni, Shangqing Xu, Haoxin Liu, Wei Jin, B. Aditya Prakash
- Abstract summary: TimeRecipe is a framework that systematically evaluates time-series forecasting methods at the module level. TimeRecipe conducts over 10,000 experiments to assess the effectiveness of individual components. Our results reveal that exhaustive exploration of the design space can yield models that outperform existing state-of-the-art methods.
- Score: 23.143208640116253
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time-series forecasting is an essential task with wide real-world applications across domains. While recent advances in deep learning have enabled time-series forecasting models with accurate predictions, there remains considerable debate over which architectures and design components, such as series decomposition or normalization, are most effective under varying conditions. Existing benchmarks primarily evaluate models at a high level, offering limited insight into why certain designs work better. To address this gap, we propose TimeRecipe, a unified benchmarking framework that systematically evaluates time-series forecasting methods at the module level. TimeRecipe conducts over 10,000 experiments to assess the effectiveness of individual components across a diverse range of datasets, forecasting horizons, and task settings. Our results reveal that exhaustive exploration of the design space can yield models that outperform existing state-of-the-art methods and uncover meaningful intuitions linking specific design choices to forecasting scenarios. Furthermore, we release a practical toolkit within TimeRecipe that recommends suitable model architectures based on these empirical insights. The benchmark is available at: https://github.com/AdityaLab/TimeRecipe.
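The toolkit's actual interface is not described in the abstract (see the GitHub link above), so the sketch below only illustrates what module-level design-space search means: toggle individual components, here moving-average detrending and instance normalization around a closed-form linear forecaster, and score every combination on held-out data. All names and the toy data are hypothetical, not TimeRecipe's API.

```python
import itertools
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (input, target) training pairs."""
    X, Y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t:t + lookback])
        Y.append(series[t + lookback:t + lookback + horizon])
    return np.array(X), np.array(Y)

def fit_predict(Xtr, Ytr, Xte, decompose, normalize):
    """Closed-form linear forecaster with two module toggles:
    moving-average detrending and per-window (instance) normalization."""
    if normalize:  # RevIN-style per-window standardization, simplified
        mu, sd = Xtr.mean(1, keepdims=True), Xtr.std(1, keepdims=True) + 1e-8
        Xtr, Ytr = (Xtr - mu) / sd, (Ytr - mu) / sd
        mu_te, sd_te = Xte.mean(1, keepdims=True), Xte.std(1, keepdims=True) + 1e-8
        Xte = (Xte - mu_te) / sd_te
    if decompose:  # subtract a moving-average trend from each input window
        kern = np.ones(5) / 5
        detrend = lambda M: M - np.apply_along_axis(np.convolve, 1, M, kern, "same")
        Xtr, Xte = detrend(Xtr), detrend(Xte)
    W, *_ = np.linalg.lstsq(Xtr, Ytr, rcond=None)  # lookback -> horizon map
    pred = Xte @ W
    if normalize:                                   # undo instance scaling
        pred = pred * sd_te + mu_te
    return pred

rng = np.random.default_rng(0)
y = np.sin(np.arange(600) / 10) + 0.1 * rng.standard_normal(600)  # toy series
X, Y = make_windows(y, lookback=48, horizon=12)
split = int(0.8 * len(X))
scores = {}
for decompose, normalize in itertools.product([False, True], repeat=2):
    pred = fit_predict(X[:split], Y[:split], X[split:], decompose, normalize)
    scores[(decompose, normalize)] = ((pred - Y[split:]) ** 2).mean()
print(min(scores, key=scores.get))  # best (decompose, normalize) combination
```

On real workloads the grid would span many more modules (backbones, decomposition variants, embedding choices), which is what pushes TimeRecipe's study past 10,000 experiments.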
Related papers
- Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting [64.45587649141842]
Time-series forecasting plays a critical role in many real-world applications. We observe that (i) no single model consistently outperforms others across different test samples, and instead (ii) each model excels in specific cases. We introduce TimeFuse, a framework for collective time-series forecasting with sample-level adaptive fusion of heterogeneous models.
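As a rough illustration of sample-level adaptive fusion (not TimeFuse's actual method), the sketch below learns a linear gate from per-sample input statistics and blends two deliberately simple base forecasters; every name and the toy data are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
h, L = 8, 32
# toy batch: flat-ish series favor the mean model, trending ones the last value
flat = 0.1 * rng.standard_normal((64, L + h))
trend = np.linspace(0, 3, L + h) + 0.1 * rng.standard_normal((64, L + h))
data = np.concatenate([flat, trend])
Xv, Yv = data[:, :L], data[:, L:]

def naive_last(x):  return np.repeat(x[:, -1:], h, axis=1)                 # base model 1
def window_mean(x): return np.repeat(x.mean(1, keepdims=True), h, axis=1)  # base model 2

def features(x):  # per-sample statistics the gate conditions on
    return np.stack([x.std(1), x[:, -1] - x[:, 0], np.ones(len(x))], axis=1)

# fit a linear gate: target = one-hot of whichever base model errs less per sample
errs = np.stack([((f(Xv) - Yv) ** 2).mean(1) for f in (naive_last, window_mean)], 1)
gate_w, *_ = np.linalg.lstsq(features(Xv), np.eye(2)[errs.argmin(1)], rcond=None)

def fuse(x):
    w = np.exp(features(x) @ gate_w)
    w /= w.sum(1, keepdims=True)                   # softmax fusion weights
    preds = np.stack([naive_last(x), window_mean(x)], axis=1)
    return (w[:, :, None] * preds).sum(1)          # sample-level blend

print(((fuse(Xv) - Yv) ** 2).mean(), errs.mean(0))  # fused vs. each base model
```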
arXiv Detail & Related papers (2025-05-24T00:45:07Z)
- The Relevance of AWS Chronos: An Evaluation of Standard Methods for Time Series Forecasting with Limited Tuning [0.0]
Chronos is a transformer-based time series forecasting framework. Our analysis reveals that while Chronos demonstrates superior performance for longer-term predictions, traditional models show significant degradation as context length increases. This study makes a case for deploying Chronos in real-world applications where only limited model tuning is feasible.
arXiv Detail & Related papers (2025-01-17T14:23:54Z) - In-Context Fine-Tuning for Time-Series Foundation Models [18.348874079298298]
In particular, we design a pretrained foundation model that can be prompted with multiple time-series examples.
Our foundation model is specifically trained to utilize examples from multiple related time-series in its context window.
We show that such a foundation model that uses in-context examples at inference time can obtain much better performance on popular forecasting benchmarks.
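The abstract describes prompting a pretrained forecaster with related series at inference time; below is a minimal sketch of just the prompt packing (the separator token, shapes, and truncation policy are assumptions, not the paper's design).

```python
import numpy as np

SEP = np.array([np.nan])  # hypothetical separator between in-context examples

def build_context(related, target_history, max_len=512):
    """Pack (history, future) pairs from related series, then the target's
    own history, into one context window for the foundation model."""
    parts = []
    for hist, fut in related:
        parts.extend([hist, fut, SEP])
    parts.append(target_history)
    ctx = np.concatenate(parts)
    return ctx[-max_len:]  # left-truncate if the window overflows

rng = np.random.default_rng(0)
examples = [(rng.standard_normal(64), rng.standard_normal(16)) for _ in range(2)]
ctx = build_context(examples, rng.standard_normal(64))
print(ctx.shape)  # the model would be asked to continue the target from ctx
```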
arXiv Detail & Related papers (2024-10-31T16:20:04Z)
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.<n>We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.<n>We propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation [90.53485251837235]
Time series foundation models excel in zero-shot forecasting, handling diverse tasks without explicit training.
GIFT-Eval is a pioneering benchmark aimed at promoting evaluation across diverse datasets.
GIFT-Eval encompasses 23 datasets comprising 144,000 time series and 177 million data points.
arXiv Detail & Related papers (2024-10-14T11:29:38Z)
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Experts (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce UNCURL, an adaptive task-aware pruning technique that reduces the number of experts per MoE layer offline, after training.
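UNCURL's exact pruning criterion is not given in the abstract; the sketch below shows a common baseline for offline, task-aware expert pruning: record router assignments on one task's data after training, then keep only the most-used experts in each layer. All details are assumptions.

```python
import numpy as np

def prune_experts(router_logits, keep):
    """Count top-1 routing assignments over a task's tokens and keep the
    `keep` most-used experts in this layer; the rest would be dropped offline.
    router_logits: (n_tokens, n_experts) gate scores recorded post-training."""
    counts = np.bincount(router_logits.argmax(1), minlength=router_logits.shape[1])
    return np.sort(np.argsort(counts)[::-1][:keep])

rng = np.random.default_rng(0)
# toy gate scores where the task strongly prefers the first two experts
logits = rng.standard_normal((10_000, 8)) + np.array([2, 1, 0, 0, 0, 0, 0, 0])
print(prune_experts(logits, keep=4))  # indices of experts worth keeping
```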
arXiv Detail & Related papers (2024-09-02T22:35:03Z)
- PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines [86.36060279469304]
We introduce PredBench, a benchmark tailored for the holistic evaluation of spatio-temporal prediction networks.
This benchmark integrates 12 widely adopted methods with diverse datasets across multiple application domains.
Its multi-dimensional evaluation framework broadens the analysis with a comprehensive set of metrics.
arXiv Detail & Related papers (2024-07-11T11:51:36Z)
- TSPP: A Unified Benchmarking Tool for Time-series Forecasting [3.5415344166235534]
We propose a unified benchmarking framework that exposes the crucial modelling and machine learning decisions involved in developing time series forecasting models.
This framework fosters seamless integration of models and datasets, aiding both practitioners and researchers in their development efforts.
We benchmark recently proposed models within this framework, demonstrating that carefully implemented deep learning models can, with minimal effort, rival gradient-boosted decision trees.
arXiv Detail & Related papers (2023-12-28T16:23:58Z)
- Unified Long-Term Time-Series Forecasting Benchmark [0.6526824510982802]
We present a comprehensive dataset designed explicitly for long-term time-series forecasting.
We incorporate a collection of datasets obtained from diverse, dynamic systems and real-life records.
To determine the most effective model in diverse scenarios, we conduct an extensive benchmarking analysis using classical and state-of-the-art models.
Our findings reveal intriguing performance comparisons among these models, highlighting the dataset-dependent nature of model effectiveness.
arXiv Detail & Related papers (2023-09-27T18:59:00Z)
- Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
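A minimal sketch of the three stages when instantiated, as the abstract mentions, with simple linear autoregressive models: cluster the series, fit one pooled AR model per cluster, and forecast each series with its cluster's model. The clustering choice (k-means on raw series) and the pooled fitting are assumptions, not the paper's exact design.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means over raw series (stage 1: cluster)."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        lbl = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(1)
        C = np.stack([X[lbl == j].mean(0) if (lbl == j).any() else C[j]
                      for j in range(k)])
    return lbl

def fit_ar(group, p):
    """Pooled AR(p) via least squares over all series in one cluster (stage 2)."""
    X, y = [], []
    for s in group:
        for t in range(p, len(s)):
            X.append(s[t - p:t]); y.append(s[t])
    w, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    return w

rng = np.random.default_rng(0)
S = np.concatenate([np.sin(np.arange(200) / 5) + 0.1 * rng.standard_normal((50, 200)),
                    np.cos(np.arange(200) / 17) + 0.1 * rng.standard_normal((50, 200))])
labels = kmeans(S, k=2)
models = {j: fit_ar(S[labels == j], p=8) for j in range(2)}
# stage 3: forecast each series one step ahead with its own cluster's model
preds = np.array([S[i, -8:] @ models[labels[i]] for i in range(len(S))])
print(preds.shape)
```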
arXiv Detail & Related papers (2021-10-26T20:41:19Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon relates to exploration, and how some models that score lower on standard benchmarks perform on par with the best-performing models when trained on the same data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
- The Effectiveness of Discretization in Forecasting: An Empirical Study on Neural Time Series Models [15.281725756608981]
We investigate the effect of data input and output transformations on the predictive performance of neural forecasting architectures.
We find that binning almost always improves performance compared to using normalized real-valued inputs.
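The binning transform itself is the key ingredient in that finding; below is a sketch of one plausible variant (quantile binning with bin-center dequantization; the paper's exact scheme may differ).

```python
import numpy as np

def quantile_bin(series, n_bins=64):
    """Discretize a real-valued series into quantile bins; return token ids
    plus per-bin centers for mapping discrete predictions back to values."""
    edges = np.quantile(series, np.linspace(0, 1, n_bins + 1)[1:-1])
    tokens = np.digitize(series, edges)  # ids in [0, n_bins - 1]
    centers = np.array([series[tokens == b].mean() if (tokens == b).any() else 0.0
                        for b in range(n_bins)])
    return tokens, centers

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(1000))  # toy random-walk series
tokens, centers = quantile_bin(y)
recon = centers[tokens]                   # dequantized series
print(np.abs(y - recon).mean())           # error introduced by binning
```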
arXiv Detail & Related papers (2020-05-20T15:09:28Z)