Related papers: Reasoning to Rank: An End-to-End Solution for Exploiting Large Language Models for Recommendation

Reasoning to Rank: An End-to-End Solution for Exploiting Large Language Models for Recommendation

URL: http://arxiv.org/abs/2602.12530v1
Date: Fri, 13 Feb 2026 02:22:48 GMT
Title: Reasoning to Rank: An End-to-End Solution for Exploiting Large Language Models for Recommendation
Authors: Kehan Zheng, Deyao Hong, Qian Li, Jun Zhang, Huan Yu, Jie Jiang, Hongning Wang,
Abstract summary: Reasoning to Rank is an end-to-end training framework that internalizes recommendation utility optimization into the learning of step-by-step reasoning in language models.<n>Our framework performs reasoning at the user-item level and employs reinforcement learning for end-to-end training of the language model.
Score: 44.51582748617213
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recommender systems are tasked to infer users' evolving preferences and rank items aligned with their intents, which calls for in-depth reasoning beyond pattern-based scoring. Recent efforts start to leverage large language models (LLMs) for recommendation, but how to effectively optimize the model for improved recommendation utility is still under explored. In this work, we propose Reasoning to Rank, an end-to-end training framework that internalizes recommendation utility optimization into the learning of step-by-step reasoning in LLMs. To avoid position bias in LLM reasoning and enable direct optimization of the reasoning process, our framework performs reasoning at the user-item level and employs reinforcement learning for end-to-end training of the LLM. Experiments on three Amazon datasets and a large-scale industrial dataset showed consistent gains over strong conventional and LLM-based solutions. Extensive in-depth analyses validate the necessity of the key components in the proposed framework and shed lights on the future developments of this line of work.

Related papers

USB-Rec: An Effective Framework for Improving Conversational Recommendation Capability of Large Language Model [14.628459519236658]
Large Language Models (LLMs) have been widely employed in Conversational Recommender Systems (CRSs)<n>In this work, we propose an integrated training-inference framework, User-Simulator-Based framework (USB-Rec)<n>Our method consistently outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2025-09-20T22:34:55Z)
Towards Comprehensible Recommendation with Large Language Model Fine-tuning [41.218487308635126]
We propose a novel Content Understanding from a Collaborative Perspective framework (CURec) for recommendation systems.<n>Curec generates collaborative-aligned content features for more comprehensive recommendations.<n>Experiments on public benchmarks demonstrate the superiority of CURec over existing methods.
arXiv Detail & Related papers (2025-08-11T03:55:31Z)
Evaluating Position Bias in Large Language Model Recommendations [3.430780143519032]
Large Language Models (LLMs) are being increasingly explored as general-purpose tools for recommendation tasks.<n>We show that LLM-based recommendation models suffer from position bias, where the order of candidate items in a prompt can disproportionately influence the recommendations produced by LLMs.<n>We introduce a new prompting strategy to mitigate the position bias of LLM recommendation models called Ranking via Iterative SElection.
arXiv Detail & Related papers (2025-08-04T03:30:26Z)
DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation [83.21140655248624]
Large language models (LLMs) have been introduced into recommender systems (RSs)<n>We propose DeepRec, a novel LLM-based RS that enables autonomous multi-turn interactions between LLMs and TRMs for deep exploration of the item space.<n> Experiments on public datasets demonstrate that DeepRec significantly outperforms both traditional and LLM-based baselines.
arXiv Detail & Related papers (2025-05-22T15:49:38Z)
A Survey of Direct Preference Optimization [103.59317151002693]
Large Language Models (LLMs) have demonstrated unprecedented generative capabilities.<n>Their alignment with human values remains critical for ensuring helpful and harmless deployments.<n>Direct Preference Optimization (DPO) has recently gained prominence as a streamlined alternative.
arXiv Detail & Related papers (2025-03-12T08:45:15Z)
Reason4Rec: Large Language Models for Recommendation with Deliberative User Preference Alignment [69.11529841118671]
We propose a new Deliberative Recommendation task, which incorporates explicit reasoning about user preferences as an additional alignment goal.<n>We then introduce the Reasoning-powered Recommender framework for deliberative user preference alignment.
arXiv Detail & Related papers (2025-02-04T07:17:54Z)
Direct Preference Optimization for LLM-Enhanced Recommendation Systems [33.54698201942643]
Large Language Models (LLMs) have exhibited remarkable performance across a wide range of domains.<n>We propose DPO4Rec, a framework that integrates DPO into LLM-enhanced recommendation systems.<n>Extensive experiments show that DPO4Rec significantly improves re-ranking performance over strong baselines.
arXiv Detail & Related papers (2024-10-08T11:42:37Z)
Tapping the Potential of Large Language Models as Recommender Systems: A Comprehensive Framework and Empirical Analysis [91.5632751731927]
Large Language Models such as ChatGPT have showcased remarkable abilities in solving general tasks.<n>We propose a general framework for utilizing LLMs in recommendation tasks, focusing on the capabilities of LLMs as recommenders.<n>We analyze the impact of public availability, tuning strategies, model architecture, parameter scale, and context length on recommendation results.
arXiv Detail & Related papers (2024-01-10T08:28:56Z)
A Survey on Large Language Models for Recommendation [77.91673633328148]
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP) This survey presents a taxonomy that categorizes these models into two major paradigms, respectively Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec)
arXiv Detail & Related papers (2023-05-31T13:51:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.