Towards Next-Generation Recommender Systems: A Benchmark for Personalized Recommendation Assistant with LLMs
- URL: http://arxiv.org/abs/2503.09382v1
- Date: Wed, 12 Mar 2025 13:28:23 GMT
- Title: Towards Next-Generation Recommender Systems: A Benchmark for Personalized Recommendation Assistant with LLMs
- Authors: Jiani Huang, Shijie Wang, Liang-bo Ning, Wenqi Fan, Shuaiqiang Wang, Dawei Yin, Qing Li,
- Abstract summary: Large language models (LLMs) have revolutionized the foundational architecture of RecSys. Most existing studies rely on fixed task-specific prompt templates to generate recommendations. This is because commonly used datasets lack high-quality textual user queries that reflect real-world recommendation scenarios. We introduce RecBench+, a new dataset benchmark designed to assess LLMs' ability to handle intricate user recommendation needs.
- Score: 38.83854553636802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recommender systems (RecSys) are widely used across various modern digital platforms and have garnered significant attention. Traditional recommender systems usually focus only on fixed and simple recommendation scenarios, making it difficult to generalize to new and unseen recommendation tasks in an interactive paradigm. Recently, the advancement of large language models (LLMs) has revolutionized the foundational architecture of RecSys, driving their evolution into more intelligent and interactive personalized recommendation assistants. However, most existing studies rely on fixed task-specific prompt templates to generate recommendations and evaluate the performance of personalized assistants, which limits comprehensive assessment of their capabilities. This is because commonly used datasets lack high-quality textual user queries that reflect real-world recommendation scenarios, making them unsuitable for evaluating LLM-based personalized recommendation assistants. To address this gap, we introduce RecBench+, a new dataset benchmark designed to assess LLMs' ability to handle intricate user recommendation needs in the era of LLMs. RecBench+ encompasses a diverse set of queries that span both hard conditions and soft preferences, with varying difficulty levels. We evaluated commonly used LLMs on RecBench+ and uncovered the following findings: 1) LLMs demonstrate preliminary abilities to act as recommendation assistants; 2) LLMs are better at handling queries with explicitly stated conditions, while facing challenges with queries that require reasoning or contain misleading information. Our dataset has been released at https://github.com/jiani-huang/RecBench.git.
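Since the abstract describes RecBench+ as pairing free-form user queries (hard conditions and soft preferences, at varying difficulty levels) with recommendation targets, the minimal sketch below illustrates how such queries might be fed to an LLM and scored per query type. The field names (query_text, condition_type, candidate_items, ground_truth_items), the prompt wording, and the hit-rate metric are illustrative assumptions, not the benchmark's actual schema or evaluation protocol; see https://github.com/jiani-huang/RecBench for the released data.

```python
# Illustrative sketch only: the record schema and prompt format below are
# assumptions, not the actual RecBench+ specification.
import json
from typing import Callable, Dict, List


def build_prompt(query_text: str, candidate_items: List[str]) -> str:
    """Turn a free-form user query and a candidate pool into a recommendation prompt."""
    numbered = "\n".join(f"{i + 1}. {item}" for i, item in enumerate(candidate_items))
    return (
        "You are a personalized recommendation assistant.\n"
        f"User request: {query_text}\n"
        f"Candidate items:\n{numbered}\n"
        "Return the titles of the items that best satisfy the request, one per line."
    )


def hit_rate(recommended: List[str], ground_truth: List[str]) -> float:
    """Fraction of ground-truth items that appear among the model's recommendations."""
    if not ground_truth:
        return 0.0
    return sum(1 for item in ground_truth if item in recommended) / len(ground_truth)


def evaluate(records: List[Dict], llm: Callable[[str], str]) -> Dict[str, float]:
    """Average hit rate per query type (e.g. hard condition vs. soft preference)."""
    scores_by_type: Dict[str, List[float]] = {}
    for rec in records:
        prompt = build_prompt(rec["query_text"], rec["candidate_items"])
        answer = [line.strip() for line in llm(prompt).splitlines() if line.strip()]
        score = hit_rate(answer, rec["ground_truth_items"])
        scores_by_type.setdefault(rec["condition_type"], []).append(score)
    return {qtype: sum(s) / len(s) for qtype, s in scores_by_type.items()}


if __name__ == "__main__":
    # Toy record in the assumed schema; a real run would load the released dataset.
    records = [
        {
            "query_text": "A 1980s sci-fi movie with practical effects, nothing too violent.",
            "condition_type": "hard_condition",
            "candidate_items": ["Blade Runner", "The Terminator", "E.T.", "Alien"],
            "ground_truth_items": ["E.T."],
        }
    ]

    def mock_llm(prompt: str) -> str:
        # Stand-in for an actual LLM call (e.g. an API client).
        return "E.T."

    print(json.dumps(evaluate(records, mock_llm), indent=2))
```

Averaging scores per condition_type mirrors the paper's finding that explicitly stated (hard) conditions are handled better than queries requiring reasoning; a real harness would swap in an actual LLM client and the dataset's true fields.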
Related papers
- From Prompting to Alignment: A Generative Framework for Query Recommendation [36.541332088115105]
We propose a Generative Query Recommendation (GQR) framework that aligns query generation with user preference.
Specifically, we unify diverse query recommendation tasks under a universal prompt framework.
We also present a CTR-alignment framework, which involves training a query-wise CTR predictor as a process reward model.
arXiv Detail & Related papers (2025-04-14T13:21:29Z) - Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning [57.28766250993726]
This work explores adapting to dynamic user interests without any model updates.
Existing Large Language Model (LLM)-based recommenders often lose the in-context learning ability during recommendation tuning.
We propose RecICL, which customizes recommendation-specific in-context learning for real-time recommendations.
arXiv Detail & Related papers (2024-10-30T15:48:36Z) - Direct Preference Optimization for LLM-Enhanced Recommendation Systems [33.54698201942643]
Large Language Models (LLMs) have exhibited remarkable performance across a wide range of domains.
We propose DPO4Rec, a framework that integrates DPO into LLM-enhanced recommendation systems.
Extensive experiments show that DPO4Rec significantly improves re-ranking performance over strong baselines.
arXiv Detail & Related papers (2024-10-08T11:42:37Z) - Enhancing High-order Interaction Awareness in LLM-based Recommender Model [3.7623606729515133]
This paper presents an enhanced LLM-based recommender (ELMRec).
We enhance whole-word embeddings to substantially improve LLMs' interpretation of graph-constructed interactions for recommendations.
Our ELMRec outperforms state-of-the-art (SOTA) methods in both direct and sequential recommendations.
arXiv Detail & Related papers (2024-09-30T06:07:12Z) - Laser: Parameter-Efficient LLM Bi-Tuning for Sequential Recommendation with Collaborative Information [76.62949982303532]
We propose a parameter-efficient Large Language Model Bi-Tuning framework for sequential recommendation with collaborative information (Laser).
In our Laser, the prefix is utilized to incorporate user-item collaborative information and adapt the LLM to the recommendation task, while the suffix converts the output embeddings of the LLM from the language space to the recommendation space for the follow-up item recommendation.
M-Former is a lightweight MoE-based querying transformer that uses a set of query experts to integrate diverse user-specific collaborative information encoded by frozen ID-based sequential recommender systems.
arXiv Detail & Related papers (2024-09-03T04:55:03Z) - Beyond Inter-Item Relations: Dynamic Adaption for Enhancing LLM-Based Sequential Recommendation [83.87767101732351]
Sequential recommender systems (SRS) predict the next items that users may prefer based on user historical interaction sequences.
Inspired by the rise of large language models (LLMs) in various AI applications, there is a surge of work on LLM-based SRS.
We propose DARec, a sequential recommendation model built on top of coarse-grained adaptation for capturing inter-item relations.
arXiv Detail & Related papers (2024-08-14T10:03:40Z) - LLMRS: Unlocking Potentials of LLM-Based Recommender Systems for Software Purchase [0.6597195879147557]
Large Language Models (LLMs) offer promising results for analyzing user queries.
We propose LLMRS, an LLM-based zero-shot recommender system that employs a pre-trained LLM to encode user reviews into a review score and generate user-tailored recommendations.
arXiv Detail & Related papers (2024-01-12T16:33:17Z) - LLMRec: Benchmarking Large Language Models on Recommendation Task [54.48899723591296]
The application of Large Language Models (LLMs) in the recommendation domain has not been thoroughly investigated.
We benchmark several popular off-the-shelf LLMs on five recommendation tasks, including rating prediction, sequential recommendation, direct recommendation, explanation generation, and review summarization.
The benchmark results indicate that LLMs displayed only moderate proficiency in accuracy-based tasks such as sequential and direct recommendation.
arXiv Detail & Related papers (2023-08-23T16:32:54Z) - ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation [43.270424225285105]
We focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks.
We propose Retrieval-enhanced Large Language models (ReLLa) for recommendation tasks in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-08-22T02:25:04Z) - GenRec: Large Language Model for Generative Recommendation [41.22833600362077]
This paper presents an innovative approach to recommendation systems using large language models (LLMs) based on text data.
GenRec uses the LLM's understanding ability to interpret context, learn user preferences, and generate relevant recommendations.
Our research underscores the potential of LLM-based generative recommendation in revolutionizing the domain of recommendation systems.
arXiv Detail & Related papers (2023-07-02T02:37:07Z) - A Survey on Large Language Models for Recommendation [77.91673633328148]
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP).
This survey presents a taxonomy that categorizes these models into two major paradigms: Discriminative LLMs for Recommendation (DLLM4Rec) and Generative LLMs for Recommendation (GLLM4Rec).
arXiv Detail & Related papers (2023-05-31T13:51:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.