Related papers: Sequential Regression for Continuous Value Prediction using Residual Quantization

Sequential Regression for Continuous Value Prediction using Residual Quantization

URL: http://arxiv.org/abs/2602.23012v1
Date: Thu, 26 Feb 2026 13:52:54 GMT
Title: Sequential Regression for Continuous Value Prediction using Residual Quantization
Authors: Runpeng Cui, Zhipeng Sun, Chi Lu, Peng Jiang,
Abstract summary: Continuous value prediction plays a crucial role in industrial-scale recommendation systems.<n>Existing generative approaches rely on rigid parametric distribution assumptions.<n>We propose a residual quantization (RQ)-based sequence learning framework that represents target continuous values as a sum of ordered quantization codes.
Score: 8.96389388600604
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Continuous value prediction plays a crucial role in industrial-scale recommendation systems, including tasks such as predicting users' watch-time and estimating the gross merchandise value (GMV) in e-commerce transactions. However, it remains challenging due to the highly complex and long-tailed nature of the data distributions. Existing generative approaches rely on rigid parametric distribution assumptions, which fundamentally limits their performance when such assumptions misalign with real-world data. Overly simplified forms cannot adequately model real-world complexities, while more intricate assumptions often suffer from poor scalability and generalization. To address these challenges, we propose a residual quantization (RQ)-based sequence learning framework that represents target continuous values as a sum of ordered quantization codes, predicted recursively from coarse to fine granularity with diminishing quantization errors. We introduce a representation learning objective that aligns RQ code embedding space with the ordinal structure of target values, allowing the model to capture continuous representations for quantization codes and further improving prediction accuracy. We perform extensive evaluations on public benchmarks for lifetime value (LTV) and watch-time prediction, alongside a large-scale online experiment for GMV prediction on an industrial short-video recommendation platform. The results consistently show that our approach outperforms state-of-the-art methods, while demonstrating strong generalization across diverse continuous value prediction tasks in recommendation systems.

Related papers

Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning [39.920697401868885]
We propose to unlock the potential of decoding-based regression via Reinforcement Learning (RL)<n>We formulate the generation process as a Markov Decision Process, utilizing sequence-level rewards to enforce global numerical coherence.<n>Our analysis further reveals that RL significantly enhances sampling efficiency and predictive precision, establishing decoding-based regression as a robust and accurate paradigm for general-purpose numerical prediction.
arXiv Detail & Related papers (2025-12-06T18:57:38Z)
SimDiff: Simpler Yet Better Diffusion Model for Time Series Point Forecasting [8.141505251306622]
Diffusion models have recently shown promise in time series forecasting.<n>They often fail to achieve state-of-the-art point estimation performance.<n>We propose SimDiff, a single-stage, end-to-end framework for point estimation.
arXiv Detail & Related papers (2025-11-24T16:09:55Z)
How to model Human Actions distribution with Event Sequence Data [22.25731364559209]
We study the forecasting of the future distribution of events in human action sequences.<n>We find that a simple explicit distribution forecasting objective consistently surpasses complex implicit baselines.<n>This work provides a principled framework for selecting modeling strategies and offers practical guidance for building more accurate and robust forecasting systems.
arXiv Detail & Related papers (2025-10-07T12:24:54Z)
Uncertainty Quantification for Regression using Proper Scoring Rules [76.24649098854219]
We introduce a unified UQ framework for regression based on proper scoring rules, such as CRPS, logarithmic, squared error, and quadratic scores.<n>We derive closed-form expressions for the uncertainty measures under practical parametric assumptions and show how to estimate them using ensembles of models.<n>Our broad evaluation on synthetic and real-world regression datasets provides guidance for selecting reliable UQ measures.
arXiv Detail & Related papers (2025-09-30T17:52:12Z)
Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series.<n>Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data.<n>This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy.<n>We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z)
Beyond Model Ranking: Predictability-Aligned Evaluation for Time Series Forecasting [18.018179328110048]
We introduce a predictability-aligned diagnostic framework grounded in spectral coherence.<n>We provide the first systematic evidence of "predictability drift", demonstrating that a task's forecasting difficulty varies sharply over time.<n>Our evaluation reveals a key architectural trade-off: complex models are superior for low-predictability data, whereas linear models are highly effective on more predictable tasks.
arXiv Detail & Related papers (2025-09-27T02:56:06Z)
Error-quantified Conformal Inference for Time Series [55.11926160774831]
Uncertainty quantification in time series prediction is challenging due to the temporal dependence and distribution shift on sequential data.<n>We propose itError-quantified Conformal Inference (ECI) by smoothing the quantile loss function.<n>ECI can achieve valid miscoverage control and output tighter prediction sets than other baselines.
arXiv Detail & Related papers (2025-02-02T15:02:36Z)
Generative Regression Based Watch Time Prediction for Short-Video Recommendation [36.95095097454143]
Watch time prediction has emerged as a pivotal task in short video recommendation systems.<n>Recent studies have attempted to address these issues by converting the continuous watch time estimation into an ordinal regression task.<n>We propose a novel Generative Regression (GR) framework that reformulates WTP as a sequence generation task.
arXiv Detail & Related papers (2024-12-28T16:48:55Z)
Conditional Quantile Estimation for Uncertain Watch Time in Short-Video Recommendation [2.3166433227657186]
We propose Conditional Quantile Estimation (CQE) to model the entire conditional distribution of watch time.<n>CQE characterizes the complex watch-time distribution for each user-video pair, providing a flexible and comprehensive approach to understanding user behavior.
arXiv Detail & Related papers (2024-07-17T00:25:35Z)
Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores. We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z)
Stochastically forced ensemble dynamic mode decomposition for forecasting and analysis of near-periodic systems [65.44033635330604]
We introduce a novel load forecasting method in which observed dynamics are modeled as a forced linear system. We show that its use of intrinsic linear dynamics offers a number of desirable properties in terms of interpretability and parsimony. Results are presented for a test case using load data from an electrical grid.
arXiv Detail & Related papers (2020-10-08T20:25:52Z)
GenDICE: Generalized Offline Estimation of Stationary Values [108.17309783125398]
We show that effective estimation can still be achieved in important applications. Our approach is based on estimating a ratio that corrects for the discrepancy between the stationary and empirical distributions. The resulting algorithm, GenDICE, is straightforward and effective.
arXiv Detail & Related papers (2020-02-21T00:27:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.