HiGR: Efficient Generative Slate Recommendation via Hierarchical Planning and Multi-Objective Preference Alignment
- URL: http://arxiv.org/abs/2512.24787v1
- Date: Wed, 31 Dec 2025 11:16:24 GMT
- Title: HiGR: Efficient Generative Slate Recommendation via Hierarchical Planning and Multi-Objective Preference Alignment
- Authors: Yunsheng Pang, Zijian Liu, Yudong Li, Shaojie Zhu, Zijian Luo, Chenyun Yu, Sikai Wu, Shichen Shen, Cong Xu, Bin Wang, Kai Jiang, Hongyong Yu, Chengxiang Zhuo, Zang Li
- Abstract summary: HiGR is an efficient generative slate recommendation framework that integrates hierarchical planning with listwise preference alignment. Experiments on our large-scale commercial media platform demonstrate that HiGR delivers consistent improvements in both offline evaluations and online deployment.
- Score: 22.73838860623495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Slate recommendation, where users are presented with a ranked list of items simultaneously, is widely adopted in online platforms. Recent advances in generative models have shown promise in slate recommendation by modeling sequences of discrete semantic IDs autoregressively. However, existing autoregressive approaches suffer from semantically entangled item tokenization and inefficient sequential decoding that lacks holistic slate planning. To address these limitations, we propose HiGR, an efficient generative slate recommendation framework that integrates hierarchical planning with listwise preference alignment. First, we propose an auto-encoder utilizing residual quantization and contrastive constraints to tokenize items into semantically structured IDs for controllable generation. Second, HiGR decouples generation into a list-level planning stage for global slate intent, followed by an item-level decoding stage for specific item selection. Third, we introduce a listwise preference alignment objective to directly optimize slate quality using implicit user feedback. Experiments on our large-scale commercial media platform demonstrate that HiGR delivers consistent improvements in both offline evaluations and online deployment. Specifically, it outperforms state-of-the-art methods by over 10% in offline recommendation quality with a 5x inference speedup, while further achieving a 1.22% and 1.73% increase in Average Watch Time and Average Video Views in online A/B tests.
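The abstract mentions tokenizing items into semantically structured IDs via residual quantization. As a rough illustration only, the sketch below shows the greedy encoding step of plain residual quantization: at each level the nearest codeword to the current residual is chosen, its index becomes one token of the ID, and the codeword is subtracted. The function names, the toy codebooks, and the fixed (unlearned) codewords are all assumptions made for this example; HiGR's actual auto-encoder, its training procedure, and its contrastive constraints are not reproduced here.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_code(residual: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codeword closest to `residual` (squared L2)."""
    def sq_dist(code: Vector) -> float:
        return sum((r - c) ** 2 for r, c in zip(residual, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def residual_quantize(
    embedding: Vector, codebooks: Sequence[Sequence[Vector]]
) -> Tuple[List[int], List[float]]:
    """Greedily encode `embedding` as one codeword index per level.

    Returns the list of indices (the "semantic ID" in this toy setting)
    and the residual left after the last level.
    """
    residual = list(embedding)
    semantic_id: List[int] = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        semantic_id.append(idx)
        # Subtract the chosen codeword so the next level models what is left.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return semantic_id, residual


# Hypothetical hand-written codebooks (a real system would learn these).
TOY_CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],   # level 1: coarse codewords
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25]],  # level 2: finer corrections
]

if __name__ == "__main__":
    sid, leftover = residual_quantize([1.2, 1.0], TOY_CODEBOOKS)
    print("semantic ID:", sid)
    print("remaining residual:", leftover)
```

With these toy codebooks, the embedding `[1.2, 1.0]` maps to the ID `[1, 1]`: the first token picks the coarse codeword `[1.0, 1.0]`, and the second refines the leftover `[0.2, 0.0]` with `[0.25, 0.0]`. Items sharing a prefix therefore share a coarse region of the embedding space, which is the property that makes such IDs usable for coarse-to-fine generation.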
Related papers
- RankGR: Rank-Enhanced Generative Retrieval with Listwise Direct Preference Optimization in Recommendation [36.297513746770456]
We propose RankGR, a Generative Retrieval method that incorporates listwise direct preference optimization for recommendation. In IAP, we incorporate a novel listwise direct preference optimization strategy into GR, thus facilitating a more comprehensive understanding of the hierarchical user preferences. We implement several practical improvements in training and deployment, ultimately achieving a real-time system capable of handling nearly ten thousand requests per second.
arXiv Detail & Related papers (2026-02-09T12:13:43Z)
- SimGR: Escaping the Pitfalls of Generative Decoding in LLM-based Recommendation [68.00727783181289]
A core objective in recommender systems is to accurately model the distribution of user preferences over items to enable personalized recommendations. We observe that existing methods inevitably introduce systematic bias when estimating item-level preference distributions. We propose Simply Generative Recommendation (SimGR), a framework that directly models item-level preference distributions in a shared latent space.
arXiv Detail & Related papers (2026-02-08T07:26:52Z)
- Multimodal Generative Recommendation for Fusing Semantic and Collaborative Signals [17.608491612845306]
Sequential recommender systems rank relevant items by modeling a user's interaction history and computing the inner product between the resulting user representation and stored item embeddings. To avoid the significant memory overhead of storing large item sets, the generative recommendation paradigm instead models each item as a series of discrete semantic codes. These methods have yet to surpass traditional sequential recommenders on large item sets, limiting their adoption in the very scenarios they were designed to address. We propose MSCGRec, a Multimodal Semantic and Collaborative Generative Recommender.
arXiv Detail & Related papers (2026-02-03T16:39:35Z)
- Bringing Reasoning to Generative Recommendation Through the Lens of Cascaded Ranking [107.09842504618369]
Generative Recommendation (GR) has become a promising end-to-end approach with high FLOPS utilization for resource-efficient recommendation. We show that current GR models suffer from a critical bias amplification issue, where token-level bias escalates as token generation progresses. To combat the bias amplification issue, it is crucial for GR to 1) incorporate more heterogeneous information, and 2) allocate greater computational resources at each token generation step.
arXiv Detail & Related papers (2026-02-03T16:10:54Z)
- Unifying Ranking and Generation in Query Auto-Completion via Retrieval-Augmented Generation and Multi-Objective Alignment [8.610245271469267]
Query Auto-Completion (QAC) suggests query completions as users type, helping them articulate intent and reach results more efficiently. Traditional retrieve-and-rank pipelines have limited long-tail coverage and require extensive feature engineering. We present a unified framework that reformulates QAC as end-to-end list generation through Retrieval-Augmented Generation (RAG) and multi-objective Direct Preference Optimization (DPO).
arXiv Detail & Related papers (2026-02-01T05:15:07Z)
- Masked Diffusion Generative Recommendation [14.679550929790151]
Generative recommendation (GR) typically first quantizes continuous item embeddings into multi-level semantic IDs (SIDs). We propose MDGR, a Masked Diffusion Generative Recommendation framework that reshapes the GR pipeline from three perspectives: codebook, training, and inference.
arXiv Detail & Related papers (2026-01-27T11:39:02Z)
- PROMISE: Process Reward Models Unlock Test-Time Scaling Laws in Generative Recommendations [52.67948063133533]
Generative Recommendation has emerged as a promising paradigm, reformulating recommendation as a sequence-to-sequence generation task over hierarchical Semantic IDs. Existing methods suffer from a critical issue we term Semantic Drift, where errors in early, high-level tokens irreversibly divert the generation trajectory into irrelevant semantic subspaces. We propose Promise, a novel framework that integrates dense, step-by-step verification into generative models.
arXiv Detail & Related papers (2026-01-08T07:38:46Z)
- Listwise Preference Diffusion Optimization for User Behavior Trajectories Prediction [41.53271688465831]
We formulate User Behavior Trajectory Prediction (UBTP) as a new task setting that explicitly models long-term user preferences. We introduce Listwise Preference Diffusion Optimization (LPDO), a diffusion-based training framework that directly optimizes structured preferences over entire item sequences. To rigorously evaluate multi-step prediction quality, we propose the task-specific metric Sequential Match (SeqMatch), which measures exact trajectory agreement, and adopt Perplexity (PPL), which assesses probabilistic fidelity.
arXiv Detail & Related papers (2025-11-01T12:16:24Z)
- GReF: A Unified Generative Framework for Efficient Reranking via Ordered Multi-token Prediction [12.254397628788647]
Reranking plays a crucial role in modeling intra-list correlations among items. Recent research follows a two-stage (generator-evaluator) paradigm. We propose a Unified Generative Efficient Reranking Framework (GReF) to address the two primary challenges.
arXiv Detail & Related papers (2025-10-29T06:54:42Z)
- Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning [70.6126069527741]
ConvRec-R1 is a two-stage framework for end-to-end training of conversational recommender systems. In Stage 1, we construct a behavioral-cloning dataset with a Remap-Reflect-Adjust pipeline. In Stage 2, we propose Rank-GRPO, a principled extension of group relative policy optimization.
arXiv Detail & Related papers (2025-10-23T02:56:00Z)
- End-to-End Personalization: Unifying Recommender Systems with Large Language Models [0.0]
We propose a novel hybrid recommendation framework that combines Graph Attention Networks (GATs) with Large Language Models (LLMs). LLMs are first used to enrich user and item representations by generating semantically meaningful profiles based on metadata such as titles, genres, and overviews. We evaluate our model on benchmark datasets, including MovieLens 100k and 1M, where it consistently outperforms strong baselines.
arXiv Detail & Related papers (2025-08-02T22:46:50Z)
- Boosting Chart-to-Code Generation in MLLM via Dual Preference-Guided Refinement [16.22363384653305]
Multimodal Large Language Models (MLLMs) perform fine-grained visual parsing, precise code synthesis, and robust cross-modal reasoning. We propose a dual preference-guided refinement framework that combines a feedback-driven, dual-modality reward mechanism with iterative preference learning. Our framework significantly enhances the performance of general-purpose open-source MLLMs, enabling them to generate high-quality plotting code.
arXiv Detail & Related papers (2025-04-03T07:51:20Z)
- OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment [9.99840965933561]
We propose OneRec, which replaces the cascaded learning framework with a unified generative model. OneRec includes: 1) an encoder-decoder structure, which encodes the user's historical behavior sequences and gradually decodes the videos that the user may be interested in.
arXiv Detail & Related papers (2025-02-26T09:25:10Z)
- Efficient Inference for Large Language Model-based Generative Recommendation [78.38878421030522]
Large Language Model (LLM)-based generative recommendation has achieved notable success, yet its practical deployment is costly. Applying Speculative Decoding (SD) to generative recommendation presents unique challenges due to the requirement of generating top-K items. We propose an alignment framework named AtSpeed, which presents the AtSpeed-S optimization objective for top-K alignment under the strict top-K verification.
arXiv Detail & Related papers (2024-10-07T16:23:36Z)
- Unleash LLMs Potential for Recommendation by Coordinating Twin-Tower Dynamic Semantic Token Generator [60.07198935747619]
We propose Twin-Tower Dynamic Semantic Recommender (TTDS), the first generative RS to adopt a dynamic semantic index paradigm.
Specifically, we contrive, for the first time, a dynamic knowledge fusion framework that integrates a twin-tower semantic token generator into the LLM-based recommender.
The proposed TTDS recommender achieves an average improvement of 19.41% in Hit-Rate and 20.84% in NDCG compared with the leading baseline methods.
arXiv Detail & Related papers (2024-09-14T01:45:04Z)
- Generative Recommender with End-to-End Learnable Item Tokenization [51.82768744368208]
We introduce ETEGRec, a novel End-To-End Generative Recommender that unifies item tokenization and generative recommendation into a cohesive framework. ETEGRec consists of an item tokenizer and a generative recommender built on a dual encoder-decoder architecture. We develop an alternating optimization technique to ensure stable and efficient end-to-end training of the entire framework.
arXiv Detail & Related papers (2024-09-09T12:11:53Z)