A Unified Hyperparameter Optimization Pipeline for Transformer-Based Time Series Forecasting Models
- URL: http://arxiv.org/abs/2501.01394v1
- Date: Thu, 02 Jan 2025 18:12:42 GMT
- Title: A Unified Hyperparameter Optimization Pipeline for Transformer-Based Time Series Forecasting Models
- Authors: Jingjing Xu, Caesar Wu, Yuan-Fang Li, Grégoire Danoy, Pascal Bouvry
- Abstract summary: Transformer-based models for time series forecasting (TSF) have attracted significant attention in recent years due to their effectiveness and versatility.
We present a unified hyperparameter optimization (HPO) pipeline and conduct extensive experiments on several state-of-the-art (SOTA) transformer-based TSF models.
Our pipeline is generalizable beyond transformer-based architectures and can be applied to other SOTA models, such as Mamba and TimeMixer.
- Score: 36.31269406067809
- License:
- Abstract: Transformer-based models for time series forecasting (TSF) have attracted significant attention in recent years due to their effectiveness and versatility. However, these models often require extensive hyperparameter optimization (HPO) to achieve the best possible performance, and a unified pipeline for HPO in transformer-based TSF remains lacking. In this paper, we present one such pipeline and conduct extensive experiments on several state-of-the-art (SOTA) transformer-based TSF models. These experiments are conducted on standard benchmark datasets to evaluate and compare the performance of different models, generating practical insights and examples. Our pipeline is generalizable beyond transformer-based architectures and can be applied to other SOTA models, such as Mamba and TimeMixer, as demonstrated in our experiments. The goal of this work is to provide valuable guidance to both industry practitioners and academic researchers in efficiently identifying optimal hyperparameters suited to their specific domain applications. The code and complete experimental results are available on GitHub.
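As a concrete illustration of what such an HPO pipeline can look like, here is a minimal sketch using Optuna to tune a few hyperparameters that are common to transformer-based TSF models (model dimension, number of attention heads, input window length, learning rate). The search space and the `build_model` / `evaluate_mse` helpers are hypothetical placeholders for illustration only and are not taken from the authors' released pipeline.

```python
# Minimal, illustrative HPO loop for a transformer-based TSF model (sketch only).
# build_model() and evaluate_mse() are hypothetical placeholders; plug in your
# own model construction, training, and validation code.
import optuna


def build_model(d_model: int, n_heads: int, seq_len: int, lr: float):
    # Placeholder: construct and train a transformer forecaster here
    # (e.g. an Informer/PatchTST-style model) with the given hyperparameters.
    return {"d_model": d_model, "n_heads": n_heads, "seq_len": seq_len, "lr": lr}


def evaluate_mse(model) -> float:
    # Placeholder: return the validation MSE of the trained model.
    # A dummy value is returned here so the sketch runs end to end.
    return 0.0


def objective(trial: optuna.Trial) -> float:
    # Typical transformer TSF hyperparameters; the ranges are illustrative only.
    d_model = trial.suggest_categorical("d_model", [64, 128, 256, 512])
    n_heads = trial.suggest_categorical("n_heads", [2, 4, 8])
    seq_len = trial.suggest_categorical("seq_len", [96, 192, 336, 720])
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    model = build_model(d_model, n_heads, seq_len, lr)
    return evaluate_mse(model)  # minimize validation MSE


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print("Best hyperparameters:", study.best_params)
```

Because the search loop is defined independently of any particular backbone, the same structure can be reused for non-transformer forecasters such as Mamba or TimeMixer, which is the sense in which a pipeline of this kind generalizes beyond transformer architectures.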
Related papers
- Visual Fourier Prompt Tuning [63.66866445034855]
We propose the Visual Fourier Prompt Tuning (VFPT) method as a general and effective solution for adapting large-scale transformer-based models.
Our approach incorporates the Fast Fourier Transform into prompt embeddings and harmoniously considers both spatial and frequency domain information.
Our results demonstrate that our approach outperforms current state-of-the-art baselines on two benchmarks.
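A minimal sketch of the general idea of Fourier-enhanced prompts, assuming a PyTorch backbone, is shown below; it is illustrative only and not the official VFPT implementation (the prompt length, initialization, and the way frequency information is combined are assumptions).

```python
# Illustrative sketch: learnable prompt tokens enriched with an FFT so that both
# spatial- and frequency-domain information enter a (frozen) transformer backbone.
import torch
import torch.nn as nn


class FourierPrompt(nn.Module):
    def __init__(self, num_prompts: int = 10, dim: int = 768):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)

    def forward(self, patch_embeddings: torch.Tensor) -> torch.Tensor:
        # patch_embeddings: (batch, seq_len, dim)
        batch = patch_embeddings.size(0)
        # FFT over the embedding dimension; keep the real part so shapes match.
        freq_prompts = torch.fft.fft(self.prompts, dim=-1).real
        prompts = (self.prompts + freq_prompts).unsqueeze(0).expand(batch, -1, -1)
        # Prepend the prompts to the backbone's input token sequence.
        return torch.cat([prompts, patch_embeddings], dim=1)


x = torch.randn(2, 196, 768)   # e.g. ViT patch embeddings
out = FourierPrompt()(x)       # shape: (2, 206, 768)
```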
arXiv Detail & Related papers (2024-11-02T18:18:35Z)
- A Systematic Review for Transformer-based Long-term Series Forecasting [7.414422194379818]
The Transformer architecture has proven to be the most successful solution for extracting semantic correlations.
Various variants have enabled the Transformer architecture to handle long-term time series forecasting tasks.
arXiv Detail & Related papers (2023-10-31T06:37:51Z)
- A Transformer-based Framework For Multi-variate Time Series: A Remaining Useful Life Prediction Use Case [4.0466311968093365]
This work proposes a transformer-encoder-based framework for time series prediction.
We validate the effectiveness of the proposed framework on all four subsets of the C-MAPSS benchmark dataset.
To make the model aware of the initial stages of machine life and its degradation path, a novel expanding window method is proposed.
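For intuition, a generic expanding-window split looks like the sketch below; this is a standard construction and may differ in detail from the specific method proposed in the paper.

```python
# Generic expanding-window splitting for time series (illustrative only; the
# paper's expanding window method for RUL prediction may differ in detail).
from typing import Iterator, Tuple

import numpy as np


def expanding_windows(series: np.ndarray, initial: int, step: int
                      ) -> Iterator[Tuple[np.ndarray, np.ndarray]]:
    """Yield (history, target) pairs whose history window grows over time."""
    end = initial
    while end < len(series):
        yield series[:end], series[end:end + step]
        end += step


series = np.arange(100, dtype=float)  # toy univariate series
for history, target in expanding_windows(series, initial=20, step=10):
    print(f"history length {len(history)} -> target length {len(target)}")
```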
arXiv Detail & Related papers (2023-08-19T02:30:35Z)
- PETformer: Long-term Time Series Forecasting via Placeholder-enhanced Transformer [5.095882718779794]
This study investigates key issues when applying Transformer to long-term time series forecasting tasks.
We introduce the Placeholder-enhanced Technique (PET) to enhance the computational efficiency and predictive accuracy of Transformer in LTSF tasks.
PETformer achieves state-of-the-art performance on eight commonly used public datasets for LTSF, surpassing all existing models.
arXiv Detail & Related papers (2023-08-09T08:30:22Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches; to our knowledge, this is the first time a simple transformer-based model has done so.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z)
- CLMFormer: Mitigating Data Redundancy to Revitalize Transformer-based Long-Term Time Series Forecasting System [46.39662315849883]
Long-term time-series forecasting (LTSF) plays a crucial role in various practical applications.
Existing Transformer-based models, such as FEDformer and Informer, often achieve their best performance on validation sets after just a few epochs.
We propose a novel approach to address this issue by employing curriculum learning and introducing a memory-driven decoder.
arXiv Detail & Related papers (2022-07-16T04:05:15Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- Development of Deep Transformer-Based Models for Long-Term Prediction of Transient Production of Oil Wells [9.832272256738452]
We propose a novel approach to data-driven modeling of the transient production of oil wells.
We apply transformer-based neural networks trained on multivariate time series composed of various parameters of oil wells.
We generalize the single-well model based on the transformer architecture to multiple wells to simulate complex transient oilfield-level patterns.
arXiv Detail & Related papers (2021-10-12T15:00:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.