Generative Recommendation for Large-Scale Advertising
- URL: http://arxiv.org/abs/2602.22732v2
- Date: Wed, 04 Mar 2026 09:28:17 GMT
- Title: Generative Recommendation for Large-Scale Advertising
- Authors: Ben Xue, Dan Liu, Lixiang Wang, Mingjie Sun, Peng Wang, Pengfei Zhang, Shaoyun Shi, Tianyu Xu, Yunhao Sha, Zhiqiang Liu, Bo Kong, Bo Wang, Hang Yang, Jieting Xue, Junhao Wang, Shengyu Wang, Shuping Hui, Wencai Ye, Xiao Lin, Yongzhi Li, Yuhang Chen, Zhihui Yin, Quan Chen, Shiyang Wen, Wenjin Wu, Han Li, Guorui Zhou, Changcheng Li, Peng Jiang
- Abstract summary: We present a production-oriented generative recommender co-designed across architecture, learning, and serving. GR4AD has been fully deployed in the Kuaishou advertising system with over 400 million users.
- Score: 43.694084612630554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative recommendation has recently attracted widespread attention in industry due to its potential for scaling and stronger model capacity. However, deploying real-time generative recommendation in large-scale advertising requires designs beyond large-language-model (LLM)-style training and serving recipes. We present GR4AD (Generative Recommendation for ADvertising), a production-oriented generative recommender co-designed across architecture, learning, and serving. For tokenization, GR4AD proposes UA-SID (Unified Advertisement Semantic ID) to capture complicated business information. Furthermore, GR4AD introduces LazyAR, a lazy autoregressive decoder that relaxes layer-wise dependencies for short, multi-candidate generation, preserving effectiveness while reducing inference cost, which facilitates scaling under fixed serving budgets. To align optimization with business value, GR4AD employs VSL (Value-Aware Supervised Learning) and proposes RSPO (Ranking-Guided Softmax Preference Optimization), a ranking-aware, list-wise reinforcement learning algorithm that optimizes value-based rewards under list-level metrics for continual online updates. For online inference, we further propose dynamic beam serving, which adapts beam width across generation levels and online load to control compute. Large-scale online A/B tests show up to 4.2% ad revenue improvement over an existing DLRM-based stack, with consistent gains from both model scaling and inference-time scaling. GR4AD has been fully deployed in the Kuaishou advertising system with over 400 million users and achieves high-throughput real-time serving.
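The abstract describes dynamic beam serving only at a high level: beam width varies across generation levels and with online load. The following Python sketch illustrates that general idea and is not the paper's implementation. The function names (`beam_widths`, `dynamic_beam_search`, `toy_scorer`), the linear load-scaling rule, and the four-token vocabulary are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Beam = Tuple[Tuple[int, ...], float]  # (token prefix, cumulative log-probability)


def beam_widths(base_widths: Sequence[int], load: float, min_width: int = 1) -> List[int]:
    """Shrink each level's beam width as load rises (0.0 = idle, 1.0 = saturated).

    Hypothetical rule: widths scale linearly down to half their base value.
    """
    load = min(max(load, 0.0), 1.0)
    scale = 1.0 - 0.5 * load
    return [max(min_width, int(w * scale)) for w in base_widths]


def dynamic_beam_search(
    step_log_probs: Callable[[Tuple[int, ...], int], Sequence[float]],
    widths: Sequence[int],
) -> List[Beam]:
    """Beam search in which each generation level has its own beam width."""
    beams: List[Beam] = [((), 0.0)]
    for level, width in enumerate(widths):
        candidates: List[Beam] = []
        for prefix, score in beams:
            # Expand every surviving prefix by every token at this level.
            for token, log_p in enumerate(step_log_probs(prefix, level)):
                candidates.append((prefix + (token,), score + log_p))
        # Keep only the top `width` candidates for this level.
        candidates.sort(key=lambda beam: beam[1], reverse=True)
        beams = candidates[:width]
    return beams


def toy_scorer(prefix: Tuple[int, ...], level: int) -> List[float]:
    """Stand-in for a decoder: a fixed distribution over a 4-token vocabulary."""
    return [math.log(p) for p in (0.4, 0.3, 0.2, 0.1)]


if __name__ == "__main__":
    widths = beam_widths([8, 4, 2], load=0.5)
    for prefix, score in dynamic_beam_search(toy_scorer, widths):
        print(prefix, round(score, 4))
```

In this sketch, wider beams at early levels keep more coarse candidates alive, while the load factor trades candidate coverage for compute when traffic is high. How GR4AD actually sets widths is not stated in the abstract.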
Related papers
- Learning to Reflect and Correct: Towards Better Decoding Trajectories for Large-Scale Generative Recommendation [14.679550929790151]
Generative Recommendation (GR) has become a promising paradigm for large-scale recommendation systems. We propose a structured reflection-correction framework for GR that extends standard decoding into a Generation-Reflection-Correction (GRC) process. For efficient online serving, we propose an Entropy-Guided Reflection Scheduling (EGRS) strategy that dynamically allocates more correction budget to high-uncertainty decoding trajectories.
arXiv Detail & Related papers (2026-02-27T03:22:58Z)
- Bringing Reasoning to Generative Recommendation Through the Lens of Cascaded Ranking [107.09842504618369]
Generative Recommendation (GR) has become a promising end-to-end approach with high FLOPS utilization for resource-efficient recommendation. We show that current GR models suffer from a critical bias amplification issue, where token-level bias escalates as token generation progresses. To combat the bias amplification issue, it is crucial for GR to 1) incorporate more heterogeneous information, and 2) allocate greater computational resources at each token generation step.
arXiv Detail & Related papers (2026-02-03T16:10:54Z)
- WebAnchor: Anchoring Agent Planning to Stabilize Long-Horizon Web Reasoning [82.12501258760814]
Large Language Model (LLM)-based agents have shown strong capabilities in web information seeking. Plan anchor is where the first reasoning step disproportionately impacts downstream behavior in long-horizon web reasoning tasks. We propose Anchor-GRPO, a two-stage RL framework that decouples planning and execution.
arXiv Detail & Related papers (2026-01-06T16:36:40Z)
- COFFEE: COdesign Framework for Feature Enriched Embeddings in Ads-Ranking Systems [2.1182747626493885]
We present a novel framework for enhancing user-ad representations without increasing model inference or serving complexity. The proposed method can boost the area under curve (AUC) and the slope of scaling curves for ad-impression sources by 1.56 to 2 times.
arXiv Detail & Related papers (2026-01-06T08:29:12Z)
- GPR: Towards a Generative Pre-trained One-Model Paradigm for Large-Scale Advertising Recommendation [38.48999566011862]
We propose GPR (Generative Pre-trained Recommender), a one-model framework that redefines advertising recommendation as an end-to-end generative task. We introduce three key innovations spanning unified representation, network architecture, and training strategy. GPR has been fully deployed in the Tencent Weixin Channels advertising system, delivering significant improvements in key business metrics.
arXiv Detail & Related papers (2025-11-13T09:50:53Z)
- Multi-task Offline Reinforcement Learning for Online Advertising in Recommender Systems [54.709976343045824]
Current offline reinforcement learning (RL) methods face substantial challenges when applied to sparse advertising scenarios. We propose MTORL, a novel multi-task offline RL model that targets two key objectives. We employ multi-task learning to decode actions and rewards, simultaneously addressing channel recommendation and budget allocation.
arXiv Detail & Related papers (2025-06-29T05:05:13Z)
- MTGR: Industrial-Scale Generative Recommendation Framework in Meituan [32.12374665716164]
We propose MTGR (Meituan Generative Recommendation) to address this issue. MTGR achieves training and inference acceleration through user-level compression to ensure efficient scaling. This breakthrough was successfully deployed on Meituan, the world's largest food delivery platform.
arXiv Detail & Related papers (2025-05-24T11:47:28Z)
- Scaling New Frontiers: Insights into Large Recommendation Models [74.77410470984168]
Meta's generative recommendation model HSTU illustrates the scaling laws of recommendation systems by expanding parameters to thousands of billions. We conduct comprehensive ablation studies to explore the origins of these scaling laws. We offer insights into future directions for large recommendation models.
arXiv Detail & Related papers (2024-12-01T07:27:20Z)
- Continuous Input Embedding Size Search For Recommender Systems [60.89189829112067]
Continuous input embedding size search (CIESS) is a novel RL-based method that operates on a continuous search space with arbitrary embedding sizes to choose from. CIESS is also model-agnostic and hence generalizable to a variety of latent factor RSs. Experiments on two real-world datasets have shown state-of-the-art performance of CIESS under different memory budgets.
arXiv Detail & Related papers (2023-04-07T06:46:37Z)
- Deep Reinforcement Learning-Based Product Recommender for Online Advertising [1.7778609937758327]
This paper compares value-based and policy-based deep RL algorithms for designing recommender systems for online advertising.
The designed recommender systems aim at maximising the click-through rate (CTR) for the recommended items.
arXiv Detail & Related papers (2021-01-30T23:05:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.