AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems
- URL: http://arxiv.org/abs/2505.19623v2
- Date: Wed, 28 May 2025 14:32:56 GMT
- Title: AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems
- Authors: Yu Shang, Peijie Liu, Yuwei Yan, Zijing Wu, Leheng Sheng, Yuanqing Yu, Chumeng Jiang, An Zhang, Fengli Xu, Yu Wang, Min Zhang, Yong Li
- Abstract summary: Agentic recommender systems are powered by Large Language Models (LLMs). LLMs' advanced reasoning and role-playing capabilities enable autonomous, adaptive decision-making. The field currently lacks standardized evaluation protocols to assess these methods.
- Score: 17.329692234349768
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The emergence of agentic recommender systems powered by Large Language Models (LLMs) represents a paradigm shift in personalized recommendations, leveraging LLMs' advanced reasoning and role-playing capabilities to enable autonomous, adaptive decision-making. Unlike traditional recommendation approaches, agentic recommender systems can dynamically gather and interpret user-item interactions from complex environments, generating robust recommendation strategies that generalize across diverse scenarios. However, the field currently lacks standardized evaluation protocols to systematically assess these methods. To address this critical gap, we propose: (1) an interactive textual recommendation simulator incorporating rich user and item metadata and three typical evaluation scenarios (classic, evolving-interest, and cold-start recommendation tasks); (2) a unified modular framework for developing and studying agentic recommender systems; and (3) the first comprehensive benchmark comparing 10 classical and agentic recommendation methods. Our findings demonstrate the superiority of agentic systems and establish actionable design guidelines for their core components. The benchmark environment has been rigorously validated through an open challenge and remains publicly available with a continuously maintained leaderboard (https://tsinghua-fib-lab.github.io/AgentSocietyChallenge/pages/overview.html), fostering ongoing community engagement and reproducible research. The benchmark is available at: https://huggingface.co/datasets/SGJQovo/AgentRecBench.
Related papers
- A Survey on LLM-powered Agents for Recommender Systems [16.463945811669245]
Large Language Model (LLM)-powered agents offer a promising approach by enabling natural language interactions and interpretable reasoning. This survey provides a systematic review of the emerging applications of LLM-powered agents in recommender systems.
arXiv Detail & Related papers (2025-02-14T09:57:07Z)
- Preference Discerning with LLM-Enhanced Generative Retrieval [28.309905847867178]
We propose a new paradigm, which we term preference discerning. In preference discerning, we explicitly condition a generative sequential recommendation system on user preferences within its context. We generate user preferences using Large Language Models (LLMs) based on user reviews and item-specific data.
arXiv Detail & Related papers (2024-12-11T18:26:55Z)
- Generative Recommender with End-to-End Learnable Item Tokenization [51.82768744368208]
We introduce ETEGRec, a novel End-To-End Generative Recommender that unifies item tokenization and generative recommendation into a cohesive framework. ETEGRec consists of an item tokenizer and a generative recommender built on a dual encoder-decoder architecture. We develop an alternating optimization technique to ensure stable and efficient end-to-end training of the entire framework.
arXiv Detail & Related papers (2024-09-09T12:11:53Z)
- LANE: Logic Alignment of Non-tuning Large Language Models and Online Recommendation Systems for Explainable Reason Generation [15.972926854420619]
Leveraging large language models (LLMs) offers new opportunities for comprehensive recommendation logic generation.
Fine-tuning LLM models for recommendation tasks incurs high computational costs and alignment issues with existing systems.
In this work, the proposed strategy, LANE, aligns LLMs with online recommendation systems without additional LLM tuning.
arXiv Detail & Related papers (2024-07-03T06:20:31Z)
- Enhancing Sequential Recommender with Large Language Models for Joint Video and Comment Recommendation [77.42486522565295]
We propose a novel recommendation approach called LSVCR to jointly perform personalized video and comment recommendation. Our approach comprises two key components: a sequential recommendation (SR) model and a supplemental large language model (LLM) recommender. In particular, we attain a cumulative gain of 4.13% in comment watch time.
arXiv Detail & Related papers (2024-03-20T13:14:29Z)
- On Generative Agents in Recommendation [58.42840923200071]
Agent4Rec is a user simulator in recommendation based on Large Language Models.
Each agent interacts with personalized recommender models in a page-by-page manner.
arXiv Detail & Related papers (2023-10-16T06:41:16Z)
- MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation [61.45986275328629]
We propose MISSRec, a multi-modal pre-training and transfer learning framework for sequential recommendation.
On the user side, we design a Transformer-based encoder-decoder model, where the contextual encoder learns to capture the sequence-level multi-modal user interests.
On the candidate item side, we adopt a dynamic fusion module to produce user-adaptive item representation.
arXiv Detail & Related papers (2023-08-22T04:06:56Z)
- ClusterSeq: Enhancing Sequential Recommender Systems with Clustering based Meta-Learning [3.168790535780547]
ClusterSeq is a Meta-Learning Clustering-Based Sequential Recommender System.
It exploits dynamic information in the user sequence to enhance item prediction accuracy, even in the absence of side information.
Our proposed approach achieves a substantial improvement of 16-39% in Mean Reciprocal Rank (MRR).
arXiv Detail & Related papers (2023-07-25T18:53:24Z)
- Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates.
To that end, we create a semantically meaningful tuple of codewords to serve as a Semantic ID for each item.
We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z)
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on this approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)