A Probabilistic Framework for Temporal Distribution Generalization in Industry-Scale Recommender Systems
- URL: http://arxiv.org/abs/2511.21032v1
- Date: Wed, 26 Nov 2025 04:02:23 GMT
- Title: A Probabilistic Framework for Temporal Distribution Generalization in Industry-Scale Recommender Systems
- Authors: Yuxuan Zhu, Cong Fu, Yabo Ni, Anxiang Zeng, Yuan Fang
- Abstract summary: Temporal distribution shift erodes the long-term accuracy of recommender systems. We propose a probabilistic framework that integrates seamlessly into industry-scale incremental learning pipelines. Our method achieves superior temporal generalization, yielding a 2.33% uplift in GMV per user.
- Score: 14.592975643628188
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal distribution shift (TDS) erodes the long-term accuracy of recommender systems, yet industrial practice still relies on periodic incremental training, which struggles to capture both stable and transient patterns. Existing approaches such as invariant learning and self-supervised learning offer partial solutions but often suffer from unstable temporal generalization, representation collapse, or inefficient data utilization. To address these limitations, we propose ELBO$_\text{TDS}$, a probabilistic framework that integrates seamlessly into industry-scale incremental learning pipelines. First, we identify key shifting factors through statistical analysis of real-world production data and design a simple yet effective data augmentation strategy that resamples these time-varying factors to extend the training support. Second, to harness the benefits of this extended distribution while preventing representation collapse, we model the temporal recommendation scenario using a causal graph and derive a self-supervised variational objective, ELBO$_\text{TDS}$, grounded in the causal structure. Extensive experiments supported by both theoretical and empirical analysis demonstrate that our method achieves superior temporal generalization, yielding a 2.33\% uplift in GMV per user. The method has been successfully deployed in Shopee Product Search. Code is available at https://github.com/FuCongResearchSquad/ELBO4TDS.
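The augmentation step the abstract describes, resampling identified time-varying factors to extend the training support, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the factor names (`price`, `popularity`) and the permutation-based resampling are assumptions for the example.

```python
import numpy as np

def resample_temporal_factors(batch, factor_keys, rng):
    """Extend training support by resampling time-varying factors.

    Each listed factor is permuted independently across the batch,
    breaking its coupling with the snapshot's time period. The factor
    keys here are illustrative; the paper identifies its own shifting
    factors from production statistics.
    """
    augmented = {k: v.copy() for k, v in batch.items()}
    for key in factor_keys:
        perm = rng.permutation(len(augmented[key]))
        augmented[key] = augmented[key][perm]
    return augmented

rng = np.random.default_rng(0)
batch = {
    "price": np.array([1.0, 2.0, 3.0, 4.0]),
    "popularity": np.array([10.0, 20.0, 30.0, 40.0]),
    "user_id": np.array([1, 2, 3, 4]),  # non-temporal field, left intact
}
aug = resample_temporal_factors(batch, ["price", "popularity"], rng)
```

The marginal distribution of each factor is preserved while its joint with the rest of the example is broadened, which is one simple way to realize "extending the training support".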
Related papers
- GTS: Inference-Time Scaling of Latent Reasoning with a Learnable Gaussian Thought Sampler [54.10960908347221]
We model latent thought exploration as conditional sampling from learnable densities and instantiate this idea as a Gaussian Thought Sampler (GTS). GTS predicts context-dependent perturbation distributions over continuous reasoning states and is trained with GRPO-style policy optimization while keeping the backbone frozen.
arXiv Detail & Related papers (2026-02-15T09:57:47Z) - Variational Approach for Job Shop Scheduling [2.256375838037721]
This paper proposes a novel Variational Graph-to-Scheduler (VG2S) framework for solving the Job Shop Scheduling Problem (JSSP). The proposed method exhibits superior zero-shot generalization compared with state-of-the-art DRL baselines and traditional dispatching rules.
arXiv Detail & Related papers (2026-01-30T23:55:18Z) - EVEREST: An Evidential, Tail-Aware Transformer for Rare-Event Time-Series Forecasting [4.551615447454767]
EVEREST is a transformer-based architecture for probabilistic rare-event forecasting. It delivers calibrated predictions and tail-aware risk estimation, and is applicable to high-stakes domains such as industrial monitoring, weather, and satellite diagnostics.
arXiv Detail & Related papers (2026-01-26T23:15:20Z) - Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach [78.4812458793128]
We propose TACO, a test-time-scaling framework that applies a lightweight pseudo-count estimator as a high-fidelity verifier of action chunks. Our method resembles the classical anti-exploration principle in offline reinforcement learning (RL) and, being gradient-free, offers significant computational savings.
arXiv Detail & Related papers (2025-12-02T14:42:54Z) - Robust Probabilistic Load Forecasting for a Single Household: A Comparative Study from SARIMA to Transformers on the REFIT Dataset [0.0]
This paper tackles the challenge using the volatile REFIT household dataset. We first address this by conducting a rigorous comparative experiment to select a seasonal imputation method. We then systematically evaluate a hierarchy of models, progressing from classical baselines to machine learning. Our findings reveal that classical models fail to capture the data's non-linear, regime-switching behavior.
arXiv Detail & Related papers (2025-11-30T12:05:18Z) - Modeling Uncertainty Trends for Timely Retrieval in Dynamic RAG [35.96258615258145]
We introduce Entropy-Trend Constraint (ETC), a training-free method that determines optimal retrieval timing by modeling the dynamics of token-level uncertainty. ETC consistently outperforms strong baselines while reducing retrieval frequency. It is plug-and-play, model-agnostic, and readily integrable into existing decoding pipelines.
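The core decision ETC makes, triggering retrieval when token-level uncertainty is trending upward, can be sketched as below. The window size, threshold, and first-difference trend estimate are illustrative assumptions; ETC's actual trend model is specified in the paper.

```python
import math

def token_entropy(probs):
    """Shannon entropy of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def should_retrieve(entropy_history, window=3, slope_threshold=0.1):
    """Trigger retrieval when the recent trend of token-level entropy
    is rising, i.e. the model is becoming less certain.

    Uses a simple average-first-difference slope over the last `window`
    entropies; both hyperparameters are illustrative, not ETC's.
    """
    if len(entropy_history) < window:
        return False
    recent = entropy_history[-window:]
    slope = (recent[-1] - recent[0]) / (window - 1)
    return slope > slope_threshold

# Entropy climbing over recent tokens -> retrieve before decoding on.
history = [0.5, 0.6, 1.2, 1.9]
decision = should_retrieve(history)
```

Because the rule only reads quantities the decoder already produces, a check like this can be bolted onto an existing decoding loop without retraining, which is what "training-free" and "plug-and-play" refer to.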
arXiv Detail & Related papers (2025-11-13T05:28:02Z) - Sequential Data Augmentation for Generative Recommendation [54.765568804267645]
Generative recommendation plays a crucial role in personalized systems, predicting users' future interactions from their historical behavior sequences. Data augmentation, the process of constructing training data from user interaction histories, is a critical yet underexplored factor in training these models. We propose GenPAS, a principled framework that models augmentation as a sampling process and enables flexible control of the resulting training distribution. Our experiments on benchmark and industrial datasets demonstrate that GenPAS yields superior accuracy, data efficiency, and parameter efficiency compared to existing strategies.
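Viewing augmentation as a sampling process, as the GenPAS abstract describes, can be illustrated with a generic sketch: training pairs are drawn by sampling cut points in a user's interaction history. The uniform cut-point distribution and the function below are assumptions for the example, not GenPAS's bias-controlled sampling scheme.

```python
import random

def sample_training_pairs(history, num_samples, min_len, rng):
    """Construct (context, target) training pairs by sampling
    sub-sequences from a user's interaction history.

    Each sample picks a cut point: items before it form the context,
    the item at the cut is the prediction target. Controlling the
    cut-point distribution controls the training distribution; here it
    is uniform purely for illustration.
    """
    pairs = []
    for _ in range(num_samples):
        cut = rng.randrange(min_len, len(history))
        pairs.append((history[:cut], history[cut]))
    return pairs

rng = random.Random(0)
history = ["a", "b", "c", "d", "e"]
pairs = sample_training_pairs(history, num_samples=4, min_len=2, rng=rng)
```

The same history thus yields several training examples, which is where the data-efficiency gains of sampling-based augmentation come from.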
arXiv Detail & Related papers (2025-09-17T02:53:25Z) - SEVA: Leveraging Single-Step Ensemble of Vicinal Augmentations for Test-Time Adaptation [29.441669360316418]
Test-Time Adaptation (TTA) aims to enhance model robustness against distribution shifts through rapid model adaptation during inference. Augmentation strategies can effectively unleash the potential of reliable samples, but their rapidly growing computational cost impedes real-time application. We propose a novel TTA approach named Single-step Ensemble of Vicinal Augmentations (SEVA), which takes advantage of data augmentations without increasing the computational burden.
arXiv Detail & Related papers (2025-05-07T02:58:37Z) - Understanding the Limits of Deep Tabular Methods with Temporal Shift [28.738848567072004]
We introduce a plug-and-play temporal embedding method based on Fourier series expansion to learn and incorporate temporal patterns. Our experiments demonstrate that this temporal embedding, combined with the improved training protocol, provides a more effective and robust framework for learning from temporal data.
arXiv Detail & Related papers (2025-02-27T16:48:53Z) - Generative Regression Based Watch Time Prediction for Short-Video Recommendation [36.95095097454143]
Watch time prediction (WTP) has emerged as a pivotal task in short-video recommendation systems. Recent studies have attempted to address its challenges by converting continuous watch time estimation into an ordinal regression task. We propose a novel Generative Regression (GR) framework that reformulates WTP as a sequence generation task.
arXiv Detail & Related papers (2024-12-28T16:48:55Z) - Bridging SFT and DPO for Diffusion Model Alignment with Self-Sampling Preference Optimization [67.8738082040299]
Self-Sampling Preference Optimization (SSPO) is a new alignment method for post-training reinforcement learning. SSPO eliminates the need for paired data and reward models while retaining the training stability of SFT. SSPO surpasses all previous approaches on the text-to-image benchmarks and demonstrates outstanding performance on the text-to-video benchmarks.
arXiv Detail & Related papers (2024-10-07T17:56:53Z) - Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs [50.25683648762602]
We introduce Koopman VAE, a new generative framework that is based on a novel design for the model prior.
Inspired by Koopman theory, we represent the latent conditional prior dynamics using a linear map.
KoVAE outperforms state-of-the-art GAN and VAE methods across several challenging synthetic and real-world time series generation benchmarks.
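The Koopman-style design the KoVAE entry describes, representing latent prior dynamics with a linear map, can be sketched with a toy rollout. The fixed rotation matrix below is an illustrative assumption; in KoVAE the linear map is learned jointly with the VAE.

```python
import numpy as np

def rollout_linear_latent(z0, A, steps):
    """Roll out latent prior dynamics as a linear map z_{t+1} = A @ z_t,
    in the spirit of Koopman-style latent models.

    A is a fixed, illustrative matrix here; the point is that a single
    linear operator generates the whole latent trajectory.
    """
    zs = [z0]
    for _ in range(steps):
        zs.append(A @ zs[-1])
    return np.stack(zs)

# A rotation by pi/4 gives a stable oscillatory latent trajectory;
# eight steps complete one full revolution.
theta = np.pi / 4
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
traj = rollout_linear_latent(np.array([1.0, 0.0]), A, 8)
```

Linearity in the latent space is what makes such priors easy to analyze (eigenvalues of `A` determine stability and periodicity) while the VAE's nonlinear encoder and decoder retain expressive power.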
arXiv Detail & Related papers (2023-10-04T07:14:43Z) - Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior [51.60683890503293]
We propose a theoretical framework for studying behavior cloning of complex expert demonstrations using generative modeling.
We show that pure supervised cloning can generate trajectories matching the per-time step distribution of arbitrary expert trajectories.
arXiv Detail & Related papers (2023-07-27T04:27:26Z) - Towards Flexible Time-to-event Modeling: Optimizing Neural Networks via Rank Regression [17.684526928033065]
We introduce the Deep AFT Rank-regression model for Time-to-event prediction (DART).
This model uses an objective function based on Gehan's rank statistic, which is efficient and reliable for representation learning.
The proposed method is a semiparametric approach to AFT modeling that does not impose any distributional assumptions on the survival time distribution.
arXiv Detail & Related papers (2023-07-16T13:58:28Z) - Sample-Efficient Optimisation with Probabilistic Transformer Surrogates [66.98962321504085]
This paper investigates the feasibility of employing state-of-the-art probabilistic transformers in Bayesian optimisation.
We observe two drawbacks stemming from their training procedure and loss definition, hindering their direct deployment as proxies in black-box optimisation.
We introduce two components: 1) a BO-tailored training prior supporting non-uniformly distributed points, and 2) a novel approximate posterior regulariser trading-off accuracy and input sensitivity to filter favourable stationary points for improved predictive performance.
arXiv Detail & Related papers (2022-05-27T11:13:17Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.