Inference Time Feature Injection: A Lightweight Approach for Real-Time Recommendation Freshness
- URL: http://arxiv.org/abs/2512.14734v1
- Date: Thu, 11 Dec 2025 04:13:02 GMT
- Title: Inference Time Feature Injection: A Lightweight Approach for Real-Time Recommendation Freshness
- Authors: Qiang Chen, Venkatesh Ganapati Hegde, Hongfei Li,
- Abstract summary: We present a model-agnostic approach for intra-day personalization that selectively injects recent watch history at inference time without requiring model retraining. Our approach selectively overrides stale user features at inference time using the recent watch history, allowing the system to adapt instantly to evolving preferences.
- Score: 4.614925383918567
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Many recommender systems in long-form video streaming rely on batch-trained models and batch-updated features, where user features are updated daily and served statically throughout the day. While efficient, this approach fails to incorporate a user's most recent actions, often resulting in stale recommendations. In this work, we present a lightweight, model-agnostic approach for intra-day personalization that selectively injects recent watch history at inference time without requiring model retraining. Our approach selectively overrides stale user features at inference time using the recent watch history, allowing the system to adapt instantly to evolving preferences. By reducing the personalization feedback loop from daily to intra-day, we observed a statistically significant 0.47% increase in key user engagement metrics, which ranked among the most substantial engagement gains observed in recent experimentation cycles. To our knowledge, this is the first published evidence that intra-day personalization can drive meaningful impact in long-form video streaming services, providing a compelling alternative to full real-time architectures where model retraining is required.
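The override step described in the abstract can be pictured with a minimal sketch. The abstract does not specify the feature schema or the merge rule, so everything here is an assumption for illustration: the `UserFeatures` class, the `inject_recent_history` function, the `max_len` cap, and the choice to de-duplicate while keeping the most recent events first are all hypothetical. The model itself is untouched; only the features passed to it at inference time change.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserFeatures:
    """Batch-computed features for one user (hypothetical schema)."""
    user_id: str
    # Item ids, most recent first, as of the last daily batch update.
    watch_history: List[str]
    # Any other batch features; passed through unchanged.
    other: Dict[str, float] = field(default_factory=dict)


def inject_recent_history(
    batch: UserFeatures, recent: List[str], max_len: int = 5
) -> UserFeatures:
    """Return a copy of `batch` with its watch history refreshed.

    `recent` holds item ids watched since the last batch update,
    most recent first. Recent items are placed ahead of the stale
    history, duplicates are dropped (first occurrence wins), and the
    result is truncated to `max_len` entries.
    """
    merged: List[str] = []
    seen = set()
    for item in recent + batch.watch_history:
        if item not in seen:
            seen.add(item)
            merged.append(item)
    # Build a new object so the cached batch features stay intact.
    return UserFeatures(batch.user_id, merged[:max_len], dict(batch.other))


if __name__ == "__main__":
    stale = UserFeatures("u1", ["a", "b", "c"], {"tenure_years": 2.0})
    fresh = inject_recent_history(stale, ["d", "b"], max_len=4)
    print(fresh.watch_history)
```

In this sketch the refreshed features would be handed to the existing batch-trained model in place of the stale ones, which is one way to read "without requiring model retraining"; the actual production mechanism may differ.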
Related papers
- Discrete-event Tensor Factorization: Learning a Smooth Embedding for Continuous Domains [0.0]
This paper analyzes how time can be encoded in factorization-style recommendation models. By including absolute time as a feature, our models can learn varying user preferences and changing item perception over time.
arXiv Detail & Related papers (2025-08-06T08:54:57Z) - Pre-training for Recommendation Unlearning [14.514770044236375]
UnlearnRec is a model-agnostic pre-training paradigm that prepares systems for efficient unlearning operations. Our method delivers exceptional unlearning effectiveness while providing more than 10x speedup compared to retraining approaches.
arXiv Detail & Related papers (2025-05-28T17:57:11Z) - Slow Thinking for Sequential Recommendation [88.46598279655575]
We present a novel slow thinking recommendation model, named STREAM-Rec. Our approach is capable of analyzing historical user behavior, generating a multi-step, deliberative reasoning process, and delivering personalized recommendations. In particular, we focus on two key challenges: (1) identifying the suitable reasoning patterns in recommender systems, and (2) exploring how to effectively stimulate the reasoning capabilities of traditional recommenders.
arXiv Detail & Related papers (2025-04-13T15:53:30Z) - Interactive Visualization Recommendation with Hier-SUCB [52.11209329270573]
We propose an interactive personalized visualization recommendation (PVisRec) system that learns on user feedback from previous interactions. For more interactive and accurate recommendations, we propose Hier-SUCB, a contextual semi-bandit in the PVisRec setting.
arXiv Detail & Related papers (2025-02-05T17:14:45Z) - Modeling the Heterogeneous Duration of User Interest in Time-Dependent Recommendation: A Hidden Semi-Markov Approach [11.392605386729699]
We propose a hidden semi-Markov model to track the change of users' interests. This model allows for capturing the different durations of user stays in a (latent) interest state. We derive an algorithm to estimate the parameters and predict users' actions.
arXiv Detail & Related papers (2024-12-15T09:17:45Z) - Conditional Quantile Estimation for Uncertain Watch Time in Short-Video Recommendation [2.3166433227657186]
We propose Conditional Quantile Estimation (CQE) to model the entire conditional distribution of watch time. CQE characterizes the complex watch-time distribution for each user-video pair, providing a flexible and comprehensive approach to understanding user behavior.
arXiv Detail & Related papers (2024-07-17T00:25:35Z) - Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods involve independently scoring and selecting data in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z) - Towards Free Data Selection with General-Purpose Models [71.92151210413374]
A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets.
Current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly.
FreeSel bypasses the heavy batch selection process, achieving a significant improvement in efficiency and being 530x faster than existing active learning methods.
arXiv Detail & Related papers (2023-09-29T15:50:14Z) - Effective and Efficient Training for Sequential Recommendation using Recency Sampling [91.02268704681124]
We propose a novel Recency-based Sampling of Sequences training objective.
We show that the models enhanced with our method can achieve performance exceeding or very close to state-of-the-art BERT4Rec.
arXiv Detail & Related papers (2022-07-06T13:06:31Z) - PinnerFormer: Sequence Modeling for User Representation at Pinterest [60.335384724891746]
We introduce PinnerFormer, a user representation trained to predict a user's future long-term engagement.
Unlike prior approaches, we adapt our modeling to a batch infrastructure via our new dense all-action loss.
We show that by doing so, we significantly close the gap between batch user embeddings that are generated once a day and realtime user embeddings generated whenever a user takes an action.
arXiv Detail & Related papers (2022-05-09T18:26:51Z) - Incremental Learning for Personalized Recommender Systems [8.020546404087922]
We present an incremental learning solution to provide both the training efficiency and the model quality.
The solution is deployed in LinkedIn and directly applicable to industrial scale recommender systems.
arXiv Detail & Related papers (2021-08-13T04:21:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.