Reason-to-Recommend: Using Interaction-of-Thought Reasoning to Enhance LLM Recommendation
- URL: http://arxiv.org/abs/2506.05069v2
- Date: Mon, 09 Jun 2025 09:08:12 GMT
- Title: Reason-to-Recommend: Using Interaction-of-Thought Reasoning to Enhance LLM Recommendation
- Authors: Keyu Zhao, Fengli Xu, Yong Li
- Abstract summary: $\textbf{R2Rec}$ is a reasoning-enhanced recommendation framework. It samples interaction chains from the user-item graph and converts them into structured interaction-of-thoughts.
- Score: 9.282278040339138
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Driven by advances in Large Language Models (LLMs), integrating them into recommendation tasks has gained interest due to their strong semantic understanding and prompt flexibility. Prior work encoded user-item interactions or metadata into prompts for recommendations. In parallel, LLM reasoning, boosted by test-time scaling and reinforcement learning, has excelled in fields like mathematics and code, where reasoning traces and correctness signals are clear, enabling high performance and interpretability. However, directly applying these reasoning methods to recommendation is ineffective because user feedback is implicit and lacks reasoning supervision. To address this, we propose $\textbf{R2Rec}$, a reasoning-enhanced recommendation framework that samples interaction chains from the user-item graph and converts them into structured interaction-of-thoughts via a progressive masked prompting strategy, with each thought representing stepwise reasoning grounded in interaction context. This allows LLMs to simulate step-by-step decision-making based on implicit patterns. We design a two-stage training pipeline: supervised fine-tuning teaches basic reasoning from high-quality traces, and reinforcement learning refines reasoning via reward signals, alleviating sparse explicit supervision. Experiments on three real-world datasets show R2Rec outperforms classical and LLM-based baselines with an average $\textbf{10.48\%}$ improvement in HitRatio@1 and $\textbf{131.81\%}$ gain over the original LLM. Furthermore, the explicit reasoning chains enhance interpretability by revealing the decision process. Our code is available at: https://anonymous.4open.science/r/R2Rec-7C5D.
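The abstract's pipeline is concrete enough to sketch. Below is a minimal Python illustration, not the authors' released code: the bipartite-graph layout, the prompt wording, the helper names (`sample_interaction_chain`, `progressive_masked_prompts`, `hit_reward`), and the 0/1 reward are all assumptions inferred from the abstract.

```python
import random

def sample_interaction_chain(graph, user, length=3):
    """Random walk over a user-item bipartite graph given as
    {node: [neighbors]}, starting from a user node."""
    chain, node = [user], user
    for _ in range(length):
        neighbors = graph.get(node, [])
        if not neighbors:
            break
        node = random.choice(neighbors)
        chain.append(node)
    return chain

def progressive_masked_prompts(chain):
    """Mask the chain one hop at a time: each prompt shows the visible
    prefix and asks the LLM to reason about the masked next interaction
    (one 'thought' per step); the masked node is the supervision target."""
    pairs = []
    for k in range(1, len(chain)):
        prompt = (f"Interaction chain so far: {' -> '.join(chain[:k])}. "
                  f"Step {k}: which node is interacted with next, and why?")
        pairs.append((prompt, chain[k]))
    return pairs

def hit_reward(predicted_item, held_out_item):
    """Sparse reward for the RL refinement stage: 1 if the final
    recommendation matches the held-out interaction, else 0."""
    return 1.0 if predicted_item == held_out_item else 0.0

# Toy bipartite graph: users u*, items i*.
graph = {"u1": ["i1", "i2"], "i1": ["u1", "u2"], "u2": ["i1", "i3"],
         "i2": ["u1"], "i3": ["u2"]}
for prompt, target in progressive_masked_prompts(
        sample_interaction_chain(graph, "u1")):
    print(prompt, "| target:", target)
```

In the two-stage pipeline described above, such prompts paired with high-quality reasoning traces would drive the supervised fine-tuning stage, while `hit_reward` stands in for the correctness signal used during reinforcement learning.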
Related papers
- Perceptual Decoupling for Scalable Multi-modal Reasoning via Reward-Optimized Captioning [78.17782197231325]
We propose a reasoning-guided reinforcement learning strategy that aligns the extractor's captioning behavior with the reasoning objective. Experiments on multi-modal math and science benchmarks show that the proposed RACRO method achieves state-of-the-art average performance.
arXiv Detail & Related papers (2025-06-05T02:28:07Z)
- Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models [26.401130750061323]
Chain-of-thought (CoT) is expected to universally improve the capabilities of large language models (LLMs). We propose RAIF, a systematic method to boost LLMs in dealing with complex instructions by incentivizing reasoning for test-time compute scaling. We address the shallow, non-essential nature of reasoning under complex instructions via sample-wise contrast for superior CoT enforcement.
arXiv Detail & Related papers (2025-06-02T08:11:44Z)
- Reinforced Latent Reasoning for LLM-based Recommendation [83.18146814163308]
Large Language Models (LLMs) have demonstrated impressive reasoning capabilities in complex problem-solving tasks. Existing methods typically rely on fine-tuning with explicit chain-of-thought (CoT) data. In this work, we explore an alternative approach that shifts from explicit CoT reasoning to compact, information-dense latent reasoning.
arXiv Detail & Related papers (2025-05-25T11:03:45Z)
- $\text{R}^2\text{ec}$: Towards Large Recommender Models with Reasoning [50.291998724376654]
We propose $\text{R}^2\text{ec}$, a unified large recommender model with intrinsic reasoning capabilities. RecPO is a corresponding reinforcement learning framework that optimizes both the reasoning and recommendation capabilities of $\text{R}^2\text{ec}$ simultaneously in a single policy update. Experiments on three datasets with various baselines verify the effectiveness of $\text{R}^2\text{ec}$, showing relative improvements of 68.67% in Hit@5 and 45.21% in NDCG@20.
arXiv Detail & Related papers (2025-05-22T17:55:43Z)
- LARES: Latent Reasoning for Sequential Recommendation [96.26996622771593]
We present LARES, a novel and scalable LAtent REasoning framework for Sequential recommendation. Our proposed approach employs a recurrent architecture that allows flexible expansion of reasoning depth without increasing parameter complexity (see the sketch after this entry). Our framework exhibits seamless compatibility with existing advanced models, further improving their recommendation performance.
arXiv Detail & Related papers (2025-05-22T16:22:54Z)
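As a rough illustration of the "reasoning depth without extra parameters" claim, here is a hypothetical weight-tied block in PyTorch; the class name, the use of a Transformer encoder layer, and all dimensions are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class RecurrentLatentReasoner(nn.Module):
    """One shared block applied `steps` times: reasoning depth grows
    with `steps` while the parameter count stays constant."""
    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)

    def forward(self, h: torch.Tensor, steps: int = 4) -> torch.Tensor:
        # h: (batch, seq_len, d_model) latent states from a sequential
        # recommender backbone; each pass is one latent reasoning step.
        for _ in range(steps):
            h = self.block(h)
        return h

# Refine a toy batch of user-history states before scoring candidates.
refined = RecurrentLatentReasoner()(torch.randn(2, 10, 256), steps=6)
```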
- Dual Reasoning: A GNN-LLM Collaborative Framework for Knowledge Graph Question Answering [38.31983923708175]
We propose Dual-Reasoning (DualR), a novel framework that integrates an external system based on a Graph Neural Network (GNN) for explicit reasoning on Knowledge Graphs (KGs). We show that DualR achieves state-of-the-art performance while maintaining high efficiency and interpretability.
arXiv Detail & Related papers (2024-06-03T09:38:28Z)
- FiDeLiS: Faithful Reasoning in Large Language Model for Knowledge Graph Question Answering [46.41364317172677]
Large Language Models (LLMs) are often challenged by generating erroneous or hallucinated responses. We propose a unified framework, FiDeLiS, designed to improve the factuality of LLM responses by anchoring answers to verifiable reasoning steps retrieved from Knowledge Graphs. As a training-free framework, our method not only improves performance but also enhances factuality and interpretability across different benchmarks.
arXiv Detail & Related papers (2024-05-22T17:56:53Z)
- DRDT: Dynamic Reflection with Divergent Thinking for LLM-based Sequential Recommendation [53.62727171363384]
We introduce a novel reasoning principle: Dynamic Reflection with Divergent Thinking.
Our methodology is dynamic reflection, a process that emulates human learning through probing, critiquing, and reflecting (a loop of this kind is sketched after this entry).
We evaluate our approach on three datasets using six pre-trained LLMs.
arXiv Detail & Related papers (2023-12-18T16:41:22Z)
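A plausible shape for such a probe-critique-reflect loop, with `llm` standing in for any text-completion callable and all prompt wording invented for illustration:

```python
def recommend_with_reflection(llm, history, candidates, rounds=2):
    """Iteratively propose, critique, and revise a recommendation."""
    answer = llm(f"User history: {history}\nCandidates: {candidates}\n"
                 "Recommend one item and explain your reasoning.")
    for _ in range(rounds):
        # Probe and critique the current answer for overlooked interests.
        critique = llm(f"History: {history}\nProposed answer: {answer}\n"
                       "What user interests or candidates did this overlook?")
        # Reflect on the critique and revise the recommendation.
        answer = llm(f"History: {history}\nCandidates: {candidates}\n"
                     f"Previous answer: {answer}\nCritique: {critique}\n"
                     "Give a revised recommendation with reasoning.")
    return answer

# Usage with any LLM wrapper, e.g.:
# answer = recommend_with_reflection(my_llm, ["item_a", "item_b"],
#                                    ["item_c", "item_d"])
```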
- Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs).
Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process (see the sketch after this entry).
We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
arXiv Detail & Related papers (2023-09-12T14:36:23Z)
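Re-reading reduces to a one-line prompt transformation; the exact template below is an assumption based on the abstract, not necessarily the paper's wording.

```python
def re2_prompt(question: str) -> str:
    """Present the question twice before eliciting the answer."""
    return (f"Q: {question}\n"
            f"Read the question again: {question}\n"
            "A: Let's think step by step.")

print(re2_prompt("If a train travels 60 km in 45 minutes, "
                 "what is its average speed in km/h?"))
```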