Quadratic Direct Forecast for Training Multi-Step Time-Series Forecast Models
- URL: http://arxiv.org/abs/2511.00053v1
- Date: Tue, 28 Oct 2025 14:48:25 GMT
- Title: Quadratic Direct Forecast for Training Multi-Step Time-Series Forecast Models
- Authors: Hao Wang, Licheng Pan, Yuan Lu, Zhichao Chen, Tianqiao Liu, Shuting He, Zhixuan Chu, Qingsong Wen, Haoxuan Li, Zhouchen Lin
- Abstract summary: Existing training objectives mostly treat each future step as an independent, equally weighted task. We propose a novel quadratic-form weighted training objective, addressing both of the issues simultaneously. Experiments show that our QDF effectively improves performance of various forecast models.
- Score: 88.18038107198218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The design of the training objective is central to training time-series forecasting models. Existing training objectives such as mean squared error mostly treat each future step as an independent, equally weighted task, which we find leads to two issues: (1) they overlook the label autocorrelation among future steps, biasing the training objective; (2) they fail to set heterogeneous task weights for the forecasting tasks corresponding to different future steps, limiting forecasting performance. To fill this gap, we propose a novel quadratic-form weighted training objective that addresses both issues simultaneously. Specifically, the off-diagonal elements of the weighting matrix account for the label autocorrelation effect, whereas the non-uniform diagonal entries are expected to match the most preferable weights of the forecasting tasks at different future steps. To achieve this, we propose the Quadratic Direct Forecast (QDF) learning algorithm, which trains the forecast model using an adaptively updated quadratic-form weighting matrix. Experiments show that QDF effectively improves the performance of various forecast models, achieving state-of-the-art results. Code is available at https://anonymous.4open.science/r/QDF-8937.
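The objective described above is a quadratic form L = e^T W e, where e stacks the per-step forecast errors and W carries task weights on its diagonal and label-autocorrelation terms off it. Below is a minimal PyTorch sketch of such a loss under an illustrative, fixed W; the paper's QDF algorithm instead updates W adaptively during training (see the released code for the actual rule).

```python
import torch

def quadratic_form_loss(pred, target, W):
    """Quadratic-form weighted multi-step loss: batch mean of e^T W e,
    where e is one sample's error across the H future steps."""
    e = pred - target                                   # (batch, H)
    return torch.einsum('bi,ij,bj->b', e, W, e).mean()

H = 96
# Illustrative W (hypothetical values): non-uniform diagonal task weights
# plus decaying off-diagonal terms for label autocorrelation.
idx = torch.arange(H)
W = torch.diag(torch.linspace(1.5, 0.5, H))
W = W + 0.3 * torch.exp(-(idx[:, None] - idx[None, :]).abs().float()) \
        * (1 - torch.eye(H))
# Project onto the PSD cone so the loss stays bounded below.
eigval, eigvec = torch.linalg.eigh(W)
W = eigvec @ torch.diag(eigval.clamp(min=0.0)) @ eigvec.T

pred = torch.randn(32, H, requires_grad=True)
target = torch.randn(32, H)
quadratic_form_loss(pred, target, W).backward()
```

With W = I this reduces to the usual step-wise MSE, which makes the two claimed failure modes concrete: uniform diagonals ignore task heterogeneity, and zero off-diagonals ignore autocorrelated labels.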
Related papers
- Enhanced Multi-model Online Conformal Prediction [25.495949162960624]
Conformal prediction is a framework for uncertainty quantification that constructs prediction sets for previously unseen data. The efficiency of these prediction sets, measured by their size, depends on the choice of the underlying learning model. This work develops a novel multi-model online conformal prediction algorithm that reduces computational complexity and improves prediction efficiency.
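For context, the split conformal recipe that online variants extend fits in a few lines; the multi-model selection and online threshold updates of the cited method are not reproduced in this sketch.

```python
import numpy as np

def split_conformal_interval(cal_pred, cal_y, test_pred, alpha=0.1):
    """Split conformal prediction: calibrate a residual quantile on held-out
    data, then widen test predictions into sets with >= 1 - alpha coverage."""
    scores = np.abs(cal_y - cal_pred)                   # nonconformity scores
    n = len(scores)
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample fix
    q = np.quantile(scores, q_level, method='higher')
    return test_pred - q, test_pred + q

rng = np.random.default_rng(0)
cal_pred = rng.normal(size=500)
cal_y = cal_pred + rng.normal(scale=0.5, size=500)      # residuals to calibrate on
lo, hi = split_conformal_interval(cal_pred, cal_y, rng.normal(size=10))
```

The interval width 2q is the efficiency measure the summary refers to: a better underlying model yields smaller residual quantiles and hence tighter sets.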
arXiv Detail & Related papers (2026-01-04T23:44:43Z)
- Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training [11.179110411255708]
We propose a direct framework to model the scaling of benchmark performance from the training budget. Our results show that the direct approach extrapolates better than the previously proposed two-stage procedure. We release the complete set of pretraining losses and downstream evaluation results.
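The "direct" framing can be illustrated by regressing benchmark accuracy on training budget with a single saturating curve, rather than going through an intermediate loss prediction; the sigmoidal form and the data points below are assumptions for the sketch, not the paper's parameterization.

```python
import numpy as np
from scipy.optimize import curve_fit

def acc_from_budget(c, a, b, k, c0):
    """Illustrative sigmoidal link from log training compute to accuracy."""
    return a + b / (1.0 + np.exp(-k * (np.log(c) - c0)))

# Hypothetical (compute, benchmark accuracy) observations from smaller runs.
compute = np.array([1e18, 3e18, 1e19, 3e19, 1e20, 3e20])
acc = np.array([0.27, 0.31, 0.38, 0.47, 0.55, 0.60])

params, _ = curve_fit(acc_from_budget, compute, acc,
                      p0=[0.25, 0.5, 1.0, np.log(1e19)], maxfev=20000)
print(acc_from_budget(1e21, *params))    # extrapolate to a larger budget
```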
arXiv Detail & Related papers (2025-12-09T18:33:48Z)
- Enhancing Training Data Attribution with Representational Optimization [57.61977909113113]
Training data attribution methods aim to measure how training data impacts a model's predictions. We propose AirRep, a representation-based approach that closes this gap by learning task-specific and model-aligned representations explicitly for TDA. AirRep introduces two key innovations: a trainable encoder tuned for attribution quality, and an attention-based pooling mechanism that enables accurate estimation of group-wise influence.
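The attention-based pooling piece can be sketched as below; the encoder architecture, the attribution-quality objective, and all shapes here are assumptions rather than AirRep's actual implementation.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Attention-based pooling: aggregate per-example representations into
    one group embedding using learned importance weights."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, reps):                        # reps: (group_size, dim)
        w = torch.softmax(self.score(reps), dim=0)  # weights sum to 1
        return (w * reps).sum(dim=0)                # (dim,)

pool = AttentionPool(128)
group = torch.randn(16, 128)     # representations of 16 training examples
query = torch.randn(128)         # representation of a test prediction
# A simple group-influence proxy: similarity between group and query embeddings.
influence = torch.dot(pool(group), query)
```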
arXiv Detail & Related papers (2025-05-24T05:17:53Z)
- Time-o1: Time-Series Forecasting Needs Transformed Label Alignment [50.54348432664401]
Time-o1 is a transformation-augmented learning objective tailored for time-series forecasting. The central idea is to transform the label sequence into decorrelated components with discriminated significance. Time-o1 achieves state-of-the-art performance and is compatible with various forecast models.
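One standard way to get decorrelated components with graded significance is a PCA-style eigendecomposition of the label covariance; the summary does not say Time-o1 uses exactly this transform, so treat the sketch as illustrative.

```python
import numpy as np

def decorrelate_labels(Y):
    """Rotate label sequences onto decorrelated components via the
    eigenvectors of the empirical label covariance."""
    Yc = Y - Y.mean(axis=0, keepdims=True)
    cov = Yc.T @ Yc / (len(Y) - 1)
    eigval, eigvec = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return Yc @ eigvec, eigval             # components and their variances

rng = np.random.default_rng(0)
Y = rng.normal(size=(1000, 24)).cumsum(axis=1)   # autocorrelated labels
Z, var = decorrelate_labels(Y)
weights = var / var.sum()   # variance share as a per-component significance weight
```

Fitting the components Z instead of the raw steps removes the cross-step correlation that a plain per-step MSE ignores, which is the same pathology the QDF paper targets with off-diagonal weights.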
arXiv Detail & Related papers (2025-05-23T13:00:35Z)
- Can Pre-training Indicators Reliably Predict Fine-tuning Outcomes of LLMs? [42.608899417822656]
We construct a dataset using 50 1B-parameter LLM variants with systematically varied pre-training configurations. We introduce novel unsupervised and supervised proxy metrics derived from pre-training that reduce the relative performance prediction error rate by over 50%.
arXiv Detail & Related papers (2025-04-16T21:19:09Z)
- Establishing Task Scaling Laws via Compute-Efficient Model Ladders [136.76316239300363]
We develop task scaling laws and model ladders to predict the individual task performance of pretrained language models (LMs) in the overtrained setting. We train a set of small-scale "ladder" models, collect data points to fit the parameterized functions of the two prediction steps, and make predictions for two target models. On four multiple-choice tasks formatted as ranked classification, we can predict the accuracy of both target models within 2 points of absolute error.
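The two prediction steps (compute to task loss, task loss to accuracy) can be sketched as two chained curve fits; the functional forms and ladder measurements below are hypothetical stand-ins, not the paper's fitted parameterizations.

```python
import numpy as np
from scipy.optimize import curve_fit

def loss_from_compute(c, A, alpha, E):
    """Step 1 (illustrative): task loss as a saturating power law in compute."""
    return A * c ** (-alpha) + E

def acc_from_loss(L, a, b, k, L0):
    """Step 2 (illustrative): task accuracy as a sigmoid of task loss."""
    return a + b / (1.0 + np.exp(k * (L - L0)))

# Hypothetical measurements from small "ladder" models.
compute = np.array([1e18, 3e18, 1e19, 3e19, 1e20])
loss = np.array([3.1, 2.8, 2.5, 2.3, 2.1])
acc = np.array([0.28, 0.33, 0.41, 0.49, 0.56])

p1, _ = curve_fit(loss_from_compute, compute, loss, p0=[10.0, 0.05, 1.0], maxfev=20000)
p2, _ = curve_fit(acc_from_loss, loss, acc, p0=[0.25, 0.5, 5.0, 2.5], maxfev=20000)

# Chain the fits to predict a larger target model's accuracy.
print(acc_from_loss(loss_from_compute(1e21, *p1), *p2))
```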
arXiv Detail & Related papers (2024-12-05T18:21:49Z)
- TaskMet: Task-Driven Metric Learning for Model Learning [29.0053868393653]
Deep learning models are often deployed in downstream tasks that the training procedure may not be aware of.
We propose to take the task loss signal one level deeper than the parameters of the model and use it to learn the parameters of the loss function the model is trained on.
This approach does not alter the optimal prediction model itself, but rather changes the model learning to emphasize the information important for the downstream task.
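A one-step-unrolled sketch of the idea: differentiate the downstream task loss through an inner model update to learn the loss's metric parameters. The diagonal metric, the toy task, and the single unrolled step are simplifying assumptions; TaskMet's actual formulation differs in detail.

```python
import torch

torch.manual_seed(0)
X, y = torch.randn(256, 8), torch.randn(256, 2)
W = torch.zeros(8, 2)                            # linear prediction model
log_metric = torch.zeros(2, requires_grad=True)  # learned (diagonal) metric
opt_metric = torch.optim.Adam([log_metric], lr=1e-2)
lr_inner = 0.05

def task_loss(pred, target):
    # Hypothetical downstream task: only the first output dimension matters.
    return ((pred[:, 0] - target[:, 0]) ** 2).mean()

for step in range(300):
    W = W.detach().requires_grad_(True)
    # Inner: regression loss weighted by the current metric.
    inner = (((X @ W) - y) ** 2 * log_metric.exp()).mean()
    (grad_W,) = torch.autograd.grad(inner, W, create_graph=True)
    W_new = W - lr_inner * grad_W                # unrolled inner step
    # Outer: task loss after the step, backpropagated to the metric.
    outer = task_loss(X @ W_new, y)
    opt_metric.zero_grad(); outer.backward(); opt_metric.step()
    W = W_new                                    # carry the model forward
```

The metric ends up emphasizing the error dimensions the task cares about without changing the prediction model's optimum, matching the summary's claim.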
arXiv Detail & Related papers (2023-12-08T18:59:03Z)
- Forecasting Workload in Cloud Computing: Towards Uncertainty-Aware Predictions and Transfer Learning [1.5749416770494704]
We show that modelling the uncertainty of predictions has a positive impact on performance.
We investigate whether our models benefit from transfer learning across different domains.
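One common way to make forecasts uncertainty-aware is quantile regression with the pinball loss, sketched below; the cited paper's specific uncertainty model is not reproduced here, so this is only an illustrative stand-in.

```python
import torch

def pinball_loss(pred, target, quantile):
    """Pinball (quantile) loss: its minimizer is the requested conditional
    quantile, so several heads yield a calibrated uncertainty band."""
    diff = target - pred
    return torch.maximum(quantile * diff, (quantile - 1) * diff).mean()

quantiles = [0.1, 0.5, 0.9]                     # lower band, median, upper band
model = torch.nn.Linear(24, len(quantiles))     # 24 past steps -> 3 quantiles
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x, y = torch.randn(64, 24), torch.randn(64)     # toy workload windows
for _ in range(100):
    preds = model(x)
    loss = sum(pinball_loss(preds[:, i], y, q) for i, q in enumerate(quantiles))
    opt.zero_grad(); loss.backward(); opt.step()
```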
arXiv Detail & Related papers (2023-02-24T14:51:30Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
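Self-distillation as a further pre-training regularizer can be sketched as a distillation loss whose teacher is a frozen copy of the model from the previous round; the temperature and mixing weight below are illustrative choices, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits, labels,
                           alpha=0.5, T=2.0):
    """Supervised loss plus a KL term pulling the student toward a frozen
    copy of itself from the previous (pre-)training round."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.log_softmax(teacher_logits / T, dim=-1),
                  log_target=True, reduction='batchmean') * T * T
    return (1 - alpha) * ce + alpha * kl

student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)    # frozen earlier checkpoint's outputs
labels = torch.randint(0, 10, (8,))
self_distillation_loss(student_logits, teacher_logits, labels).backward()
```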
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)