Beyond MSE: Ordinal Cross-Entropy for Probabilistic Time Series Forecasting
- URL: http://arxiv.org/abs/2511.10200v1
- Date: Fri, 14 Nov 2025 01:38:28 GMT
- Title: Beyond MSE: Ordinal Cross-Entropy for Probabilistic Time Series Forecasting
- Authors: Jieting Wang, Huimei Shi, Feijiang Li, Xiaolei Shang,
- Abstract summary: Current deep learning-based forecasting models primarily employ Mean Squared Error (MSE) loss functions for regression modeling.<n>We propose OCE-TS, a novel ordinal classification approach for time series forecasting that replaces MSE with Ordinal Cross-Entropy (OCE) loss.<n>Using MSE and Mean Absolute Error (MAE) as evaluation metrics, the results demonstrate that OCE-TS consistently outperforms benchmark models.
- Score: 11.320830769077027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Time series forecasting is an important task that involves analyzing temporal dependencies and underlying patterns (such as trends, cyclicality, and seasonality) in historical data to predict future values or trends. Current deep learning-based forecasting models primarily employ Mean Squared Error (MSE) loss functions for regression modeling. Despite enabling direct value prediction, this method offers no uncertainty estimation and exhibits poor outlier robustness. To address these limitations, we propose OCE-TS, a novel ordinal classification approach for time series forecasting that replaces MSE with Ordinal Cross-Entropy (OCE) loss, preserving prediction order while quantifying uncertainty through probability output. Specifically, OCE-TS begins by discretizing observed values into ordered intervals and deriving their probabilities via a parametric distribution as supervision signals. Using a simple linear model, we then predict probability distributions for each timestep. The OCE loss is computed between the cumulative distributions of predicted and ground-truth probabilities, explicitly preserving ordinal relationships among forecasted values. Through theoretical analysis using influence functions, we establish that cross-entropy (CE) loss exhibits superior stability and outlier robustness compared to MSE loss. Empirically, we compared OCE-TS with five baseline models-Autoformer, DLinear, iTransformer, TimeXer, and TimeBridge-on seven public time series datasets. Using MSE and Mean Absolute Error (MAE) as evaluation metrics, the results demonstrate that OCE-TS consistently outperforms benchmark models. The code will be published.
Related papers
- Trend-Adjusted Time Series Models with an Application to Gold Price Forecasting [0.0]
Time series data play a critical role in various fields, including finance, healthcare, marketing, and engineering.<n>We propose the Trend-Adjusted Time Series (TATS) model, which adjusts the forecasted values based on the predicted trend.
arXiv Detail & Related papers (2026-01-19T04:09:53Z) - Predictive inference for time series: why is split conformal effective despite temporal dependence? [8.032656343027146]
Conformal prediction methods provide distribution-free coverage for any iid or exchangeable data distribution.<n>Using predictors with "memory" -- i.e., predictors that utilize past observations, such as autoregressive models -- further exacerbates this problem.<n>Our results bound the loss of coverage of these methods in terms of a new "switch coefficient", measuring the extent to which temporal dependence within the time series creates violations of exchangeability.
arXiv Detail & Related papers (2025-10-02T18:24:04Z) - Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series.<n>Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data.<n>This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy.<n>We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z) - RDIT: Residual-based Diffusion Implicit Models for Probabilistic Time Series Forecasting [4.140149411004857]
RDIT is a plug-and-play framework that combines point estimation and residual-based conditional diffusion with a bidirectional Mamba network.<n>We show that RDIT achieves lower CRPS, rapid inference, and improved coverage compared to strong baselines.
arXiv Detail & Related papers (2025-09-02T14:06:29Z) - Enhancing Transformer-Based Foundation Models for Time Series Forecasting via Bagging, Boosting and Statistical Ensembles [7.787518725874443]
Time series foundation models (TSFMs) have shown strong generalization and zero-shot capabilities for time series forecasting, anomaly detection, classification, and imputation.<n>This paper investigates a suite of statistical and ensemble-based enhancement techniques to improve robustness and accuracy.
arXiv Detail & Related papers (2025-08-18T04:06:26Z) - Error-quantified Conformal Inference for Time Series [55.11926160774831]
Uncertainty quantification in time series prediction is challenging due to the temporal dependence and distribution shift on sequential data.<n>We propose itError-quantified Conformal Inference (ECI) by smoothing the quantile loss function.<n>ECI can achieve valid miscoverage control and output tighter prediction sets than other baselines.
arXiv Detail & Related papers (2025-02-02T15:02:36Z) - Addressing Prediction Delays in Time Series Forecasting: A Continuous GRU Approach with Derivative Regularization [27.047129496488292]
Predictions succeeding ground-truth observations are not practically meaningful although their MSEs can be low.
We introduce a continuous-time gated recurrent unit (GRU) based on the neural ordinary differential equation (NODE) which can supervise explicit time-derivatives.
Our method outperforms in metrics such as MSE, Dynamic Time Warping (DTW) and Time Distortion Index (TDI)
arXiv Detail & Related papers (2024-06-29T05:36:04Z) - When Rigidity Hurts: Soft Consistency Regularization for Probabilistic
Hierarchical Time Series Forecasting [69.30930115236228]
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting.
Most methods focus on point predictions and do not provide well-calibrated probabilistic forecasts distributions.
We propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models forecast distribution of entire hierarchy.
arXiv Detail & Related papers (2023-10-17T20:30:16Z) - When Rigidity Hurts: Soft Consistency Regularization for Probabilistic
Hierarchical Time Series Forecasting [69.30930115236228]
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting.
Most methods focus on point predictions and do not provide well-calibrated probabilistic forecasts distributions.
We propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models forecast distribution of entire hierarchy.
arXiv Detail & Related papers (2022-06-16T06:13:53Z) - Uncertainty estimation of pedestrian future trajectory using Bayesian
approximation [137.00426219455116]
Under dynamic traffic scenarios, planning based on deterministic predictions is not trustworthy.
The authors propose to quantify uncertainty during forecasting using approximation which deterministic approaches fail to capture.
The effect of dropout weights and long-term prediction on future state uncertainty has been studied.
arXiv Detail & Related papers (2022-05-04T04:23:38Z) - Learning Interpretable Deep State Space Model for Probabilistic Time
Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of future based on its history.
We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks.
We show in experiments that our model produces accurate and sharp probabilistic forecasts.
arXiv Detail & Related papers (2021-01-31T06:49:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.