RGAlign-Rec: Ranking-Guided Alignment for Latent Query Reasoning in Recommendation Systems
- URL: http://arxiv.org/abs/2602.12968v1
- Date: Fri, 13 Feb 2026 14:38:02 GMT
- Title: RGAlign-Rec: Ranking-Guided Alignment for Latent Query Reasoning in Recommendation Systems
- Authors: Junhua Liu, Yang Jihao, Cheng Chang, Kunrong LI, Bin Fu, Kwan Hui Lim
- Abstract summary: We propose RGAlign-Rec, a closed-loop alignment framework for proactive intent prediction. We also introduce Ranking-Guided Alignment (RGA), a multi-stage training paradigm. Our framework achieves a 0.12% gain in GAUC, leading to a significant 3.52% relative reduction in error rate, and a 0.56% improvement in Recall@3.
- Score: 25.34524038198569
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proactive intent prediction is a critical capability in modern e-commerce chatbots, enabling "zero-query" recommendations by anticipating user needs from behavioral and contextual signals. However, existing industrial systems face two fundamental challenges: (1) the semantic gap between discrete user features and the semantic intents within the chatbot's Knowledge Base, and (2) the objective misalignment between general-purpose LLM outputs and task-specific ranking utilities. To address these issues, we propose RGAlign-Rec, a closed-loop alignment framework that integrates an LLM-based semantic reasoner with a Query-Enhanced (QE) ranking model. We also introduce Ranking-Guided Alignment (RGA), a multi-stage training paradigm that utilizes downstream ranking signals as feedback to refine the LLM's latent reasoning. Extensive experiments on a large-scale industrial dataset from Shopee demonstrate that RGAlign-Rec achieves a 0.12% gain in GAUC, leading to a significant 3.52% relative reduction in error rate, and a 0.56% improvement in Recall@3. Online A/B testing further validates the cumulative effectiveness of our framework: the Query-Enhanced model (QE-Rec) initially yields a 0.98% improvement in CTR, while the subsequent Ranking-Guided Alignment stage contributes an additional 0.13% gain. These results indicate that ranking-aware alignment effectively synchronizes semantic reasoning with ranking objectives, significantly enhancing both prediction accuracy and service quality in real-world proactive recommendation systems.
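GAUC is the headline offline metric here; for readers unfamiliar with it, below is a minimal sketch of the usual industrial definition, per-user AUC averaged with per-user impression weights. The abstract does not spell out the exact weighting RGAlign-Rec uses, so treat this as an illustration rather than the authors' evaluation code.

```python
# Minimal GAUC (group AUC) sketch: per-user AUC, weighted by the number of
# impressions each user contributes. This is the common industrial convention;
# the paper's exact definition is not given in the abstract.
from collections import defaultdict
from sklearn.metrics import roc_auc_score

def gauc(user_ids, labels, scores):
    groups = defaultdict(lambda: ([], []))
    for u, y, s in zip(user_ids, labels, scores):
        groups[u][0].append(y)
        groups[u][1].append(s)

    weighted_auc, total_weight = 0.0, 0
    for ys, ss in groups.values():
        if len(set(ys)) < 2:          # skip users with only one class of label
            continue
        weighted_auc += len(ys) * roc_auc_score(ys, ss)
        total_weight += len(ys)
    return weighted_auc / total_weight

# Toy usage
print(gauc(["u1", "u1", "u1", "u2", "u2"],
           [1, 0, 0, 1, 0],
           [0.9, 0.2, 0.4, 0.7, 0.6]))
```

The reported 3.52% relative error-rate reduction is consistent with reading 1 - GAUC as the error rate (a 0.12-point GAUC gain against a baseline GAUC near 0.966 gives roughly that figure), but that mapping is an assumption, not something the abstract states.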
Related papers
- PRECTR-V2: Unified Relevance-CTR Framework with Cross-User Preference Mining, Exposure Bias Correction, and LLM-Distilled Encoder Optimization [6.17916814159778]
In search systems, effectively coordinating the two core objectives of search relevance matching and click-through rate (CTR) prediction is crucial. We propose PRECTR-V2, which mitigates the sparse-behavior problem of low-activity users by mining global relevance preferences. An LLM-distilled encoder replaces the frozen BERT module, enabling better adaptation to CTR fine-tuning and advancing beyond the traditional Emb+MLP paradigm.
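For context, the "Emb+MLP paradigm" mentioned above is the standard industrial CTR baseline: sparse ID features go through embedding tables, are concatenated, and feed a small MLP with a sigmoid output. A minimal PyTorch sketch of that baseline (not PRECTR-V2 itself; field sizes and dimensions are illustrative) follows.

```python
import torch
import torch.nn as nn

class EmbMLPCTR(nn.Module):
    """Standard Emb+MLP CTR baseline: embed sparse ID fields, concatenate, MLP, sigmoid."""
    def __init__(self, field_sizes, emb_dim=16, hidden=(128, 64)):
        super().__init__()
        self.embeddings = nn.ModuleList(nn.Embedding(n, emb_dim) for n in field_sizes)
        layers, in_dim = [], emb_dim * len(field_sizes)
        for h in hidden:
            layers += [nn.Linear(in_dim, h), nn.ReLU()]
            in_dim = h
        layers.append(nn.Linear(in_dim, 1))
        self.mlp = nn.Sequential(*layers)

    def forward(self, x):  # x: (batch, num_fields) of categorical feature IDs
        embs = [emb(x[:, i]) for i, emb in enumerate(self.embeddings)]
        return torch.sigmoid(self.mlp(torch.cat(embs, dim=-1))).squeeze(-1)

# Toy usage: 3 categorical fields (e.g. user id, item id, context slot)
model = EmbMLPCTR(field_sizes=[1000, 5000, 20])
ctr = model(torch.tensor([[3, 42, 7], [10, 999, 1]]))
print(ctr.shape)  # torch.Size([2])
```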
arXiv Detail & Related papers (2026-02-24T08:26:17Z) - CaliCausalRank: Calibrated Multi-Objective Ad Ranking with Robust Counterfactual Utility Optimization [9.601427882648116]
CaliCausalRank is a framework that integrates training-time scale calibration, constraint-based multi-objective optimization, and robust counterfactual utility estimation. Our approach treats score calibration as a first-class training objective rather than post-hoc processing, employs Lagrangian relaxation for constraint satisfaction, and uses variance-reduced counterfactual estimators for reliable offline evaluation.
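The Lagrangian relaxation named in the summary has a standard generic form: a constraint on a secondary objective becomes a penalty with a dual variable updated by gradient ascent. The sketch below shows that idea on a toy problem; the actual objectives, constraints, and update rules in CaliCausalRank are not given in the abstract, so the losses here are placeholders.

```python
import torch

# Toy setup: a linear scorer trained to minimize a primary (ranking) loss subject
# to a constraint on a secondary (calibration) loss, via Lagrangian relaxation:
#   min_theta primary(theta)  s.t.  secondary(theta) <= budget
# becomes   min_theta max_{lambda>=0} primary + lambda * (secondary - budget),
# with gradient descent on theta and gradient ascent on lambda.
torch.manual_seed(0)
w = torch.randn(4, requires_grad=True)
x, y = torch.randn(32, 4), torch.randint(0, 2, (32,)).float()
opt = torch.optim.SGD([w], lr=0.1)
lmbda, dual_lr, budget = 0.0, 0.05, 0.02

for step in range(200):
    p = torch.sigmoid(x @ w)
    primary = torch.nn.functional.binary_cross_entropy(p, y)   # stand-in ranking loss
    secondary = (p.mean() - y.mean()).abs()                    # stand-in calibration gap
    loss = primary + lmbda * (secondary - budget)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Dual ascent: raise lambda while the constraint is violated, clamp at zero.
    lmbda = max(0.0, lmbda + dual_lr * (secondary.item() - budget))

print(f"primary={primary.item():.3f}  calib_gap={secondary.item():.3f}  lambda={lmbda:.3f}")
```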
arXiv Detail & Related papers (2026-02-21T10:35:12Z) - Generative Reasoning Re-ranker [24.386586034456673]
Generative Reasoning Reranker (GR2) is an end-to-end framework with a three-stage training pipeline tailored for reranking. GR2 generates high-quality reasoning traces through carefully designed prompting and rejection sampling. Experiments on two real-world datasets demonstrate GR2's effectiveness.
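Rejection sampling for reasoning traces is, in its generic form, straightforward: sample several traces per input, keep only those whose final answer passes a check, and use the survivors as training data. A minimal sketch under those assumptions follows; `generate_traces` and `final_answer` are hypothetical stand-ins, not GR2's actual interfaces.

```python
import random

# Hypothetical stand-ins for an LLM sampler and an answer extractor.
def generate_traces(prompt, n=8):
    return [f"reasoning path {i} for: {prompt}" for i in range(n)]

def final_answer(trace):
    return random.choice(["A", "B"])          # placeholder answer parser

def rejection_sample(examples, n=8):
    """Keep only (prompt, trace) pairs whose final answer matches the gold label."""
    kept = []
    for prompt, gold in examples:
        for trace in generate_traces(prompt, n):
            if final_answer(trace) == gold:
                kept.append({"prompt": prompt, "trace": trace, "answer": gold})
    return kept

data = rejection_sample([("rank items for query q1", "A"), ("rank items for query q2", "B")])
print(len(data), "accepted traces")
```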
arXiv Detail & Related papers (2026-02-08T02:12:24Z) - Practical RAG Evaluation: A Rarity-Aware Set-Based Metric and Cost-Latency-Quality Trade-offs [0.0]
This paper addresses the guessing game in building production RAG systems: there is no standardized, reproducible way to build and audit golden sets. Rath-gs (MIT) is a lean golden-set pipeline with Plackett-Luce listwise refinement.
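Plackett-Luce listwise refinement, as named above, builds on a standard model: the probability of an observed ranking is the product of softmax probabilities of picking each item from the remaining pool. A small sketch of that likelihood (the generic model, not the paper's exact pipeline) follows.

```python
import torch

def plackett_luce_log_likelihood(scores, ranking):
    """
    Log-probability of an observed ranking under the Plackett-Luce model.
    scores:  (n,) tensor of item utilities
    ranking: list of item indices, best first
    P(ranking) = prod_i exp(s[r_i]) / sum_{j >= i} exp(s[r_j])
    """
    s = scores[torch.tensor(ranking)]
    # Log-sum-exp over the remaining pool at each position, computed right-to-left.
    tail_lse = torch.logcumsumexp(s.flip(0), dim=0).flip(0)
    return (s - tail_lse).sum()

scores = torch.tensor([2.0, 0.5, 1.0, -1.0])
print(plackett_luce_log_likelihood(scores, ranking=[0, 2, 1, 3]).item())
```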
arXiv Detail & Related papers (2025-11-12T18:49:21Z) - Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models [50.84995206660551]
We introduce Conditional advANtage estimatiON (CANON) to amplify the impact of a target metric without presuming its direction. CANON based on entropy consistently outperforms prior methods on both math reasoning and high-complexity logic tasks.
arXiv Detail & Related papers (2025-09-28T16:33:07Z) - RAG-Zeval: Towards Robust and Interpretable Evaluation on RAG Responses through End-to-End Rule-Guided Reasoning [64.46921169261852]
RAG-Zeval is a novel end-to-end framework that formulates faithfulness and correctness evaluation as a rule-guided reasoning task. Our approach trains evaluators with reinforcement learning, enabling compact models to generate comprehensive and sound assessments. Experiments demonstrate RAG-Zeval's superior performance, achieving the strongest correlation with human judgments.
arXiv Detail & Related papers (2025-05-28T14:55:33Z) - In-context Ranking Preference Optimization [65.5489745857577]
We propose an In-context Ranking Preference Optimization (IRPO) framework to optimize large language models (LLMs) based on ranking lists constructed during inference. We show IRPO outperforms standard DPO approaches in ranking performance, highlighting its effectiveness in aligning LLMs with direct in-context ranking preferences.
arXiv Detail & Related papers (2025-04-21T23:06:12Z) - Retrieval is Not Enough: Enhancing RAG Reasoning through Test-Time Critique and Optimization [58.390885294401066]
Retrieval-augmented generation (RAG) has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs). RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions. We propose AlignRAG, a novel iterative framework grounded in Critique-Driven Alignment (CDA). We introduce AlignRAG-auto, an autonomous variant that dynamically terminates refinement, removing the need to pre-specify the number of critique iterations.
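At a high level, the critique-driven loop described above can be sketched as: draft an answer from the retrieved evidence, ask a critic whether the answer is supported, revise if not, and stop once the critique passes (the "auto" variant's dynamic termination). The sketch below illustrates that control flow only; `draft`, `critique`, and `revise` are hypothetical placeholders, not AlignRAG's APIs.

```python
# Illustrative control flow for critique-driven refinement at test time.
# draft / critique / revise are hypothetical stand-ins for LLM calls.
def draft(query, evidence):
    return f"answer to '{query}' using {len(evidence)} passages"

def critique(answer, evidence):
    # Would return (is_supported, feedback) from a critic model.
    return True, "all claims grounded"

def revise(answer, feedback, evidence):
    return answer + f" [revised: {feedback}]"

def refine(query, evidence, max_iters=4):
    answer = draft(query, evidence)
    for _ in range(max_iters):
        ok, feedback = critique(answer, evidence)
        if ok:                     # dynamic termination: stop once the critique passes
            break
        answer = revise(answer, feedback, evidence)
    return answer

print(refine("why does RAG hallucinate?", ["passage 1", "passage 2"]))
```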
arXiv Detail & Related papers (2025-04-21T04:56:47Z) - Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning [76.50690734636477]
We introduce Rank-R1, a novel LLM-based reranker that performs reasoning over both the user query and candidate documents before performing the ranking task. Our experiments on the TREC DL and BRIGHT datasets show that Rank-R1 is highly effective, especially for complex queries.
arXiv Detail & Related papers (2025-03-08T03:14:26Z) - The Dual-use Dilemma in LLMs: Do Empowering Ethical Capacities Make a Degraded Utility? [54.18519360412294]
Large Language Models (LLMs) must balance rejecting harmful requests for safety against accommodating legitimate ones for utility. This paper presents a Direct Preference Optimization (DPO) based alignment framework that achieves better overall performance. We analyze experimental results obtained from testing DeepSeek-R1 on our benchmark and reveal the critical ethical concerns raised by this highly acclaimed model.
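The DPO objective this entry builds on is standard and worth stating: given a preferred and a dispreferred response, it pushes the policy's log-ratio against a frozen reference model to favor the preferred one. Below is a minimal sketch of that loss (generic DPO, not this paper's specific safety/utility construction).

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """
    Standard DPO loss on sequence log-probabilities:
    -log sigmoid( beta * [ (logpi_w - logref_w) - (logpi_l - logref_l) ] )
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(beta * margin).mean()

# Toy numbers: the policy already slightly prefers the chosen response relative to the reference.
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.5]))
print(loss.item())
```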
arXiv Detail & Related papers (2025-01-20T06:35:01Z) - PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead [24.611413814466978]
Large language models (LLMs) enhanced with retrieval-augmented generation (RAG) have introduced a new paradigm for web search.
Existing methods to enhance context awareness are often inefficient, incurring time or memory overhead during inference.
We propose Position-Embedding-Agnostic attention Re-weighting (PEAR) which enhances the context awareness of LLMs with zero inference overhead.
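The general mechanism of re-weighting attention heads at zero inference overhead can be illustrated generically: scale each head's output by a learned per-head coefficient that is frozen after training, so the forward-pass cost is unchanged. The sketch below shows that idea on a toy multi-head attention output; it illustrates head re-weighting in general, not PEAR's exact procedure.

```python
import torch
import torch.nn as nn

class ReweightedHeads(nn.Module):
    """Scale each attention head's output by a learned scalar coefficient.
    After training, the coefficients are fixed, so inference cost is unchanged."""
    def __init__(self, num_heads):
        super().__init__()
        self.coef = nn.Parameter(torch.ones(num_heads))   # one weight per head

    def forward(self, head_outputs):
        # head_outputs: (batch, num_heads, seq_len, head_dim)
        return head_outputs * self.coef.view(1, -1, 1, 1)

reweight = ReweightedHeads(num_heads=8)
out = reweight(torch.randn(2, 8, 16, 64))
print(out.shape)  # torch.Size([2, 8, 16, 64])
```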
arXiv Detail & Related papers (2024-09-29T15:40:54Z)