Fugu-MT Paper Translation (Abstract): DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation

Paper overview: DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation

arxiv url: http://arxiv.org/abs/2505.16810v1
Date: Thu, 22 May 2025 15:49:38 GMT
Status: Translation complete
Last updated in system: 2025-05-23 17:12:48.41623
Title: DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation
Title (reference translation): DeepRec: A Deep Dive Into the Item Space via Large Language Model Based Recommendation
Authors: Bowen Zheng, Xiaolei Wang, Enze Liu, Xi Wang, Lu Hongyu, Yu Chen, Wayne Xin Zhao, Ji-Rong Wen
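Abstract summary: Large language models (LLMs) have been introduced into recommender systems (RSs). This paper proposes DeepRec, a novel RS that enables autonomous multi-turn interactions between LLMs and traditional recommendation models (TRMs). In experiments on public datasets, DeepRec significantly outperforms both traditional and LLM-based baselines.
Reference score (attention score computed independently by this site): 83.21140655248624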
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, large language models (LLMs) have been introduced into recommender systems (RSs), either to enhance traditional recommendation models (TRMs) or to serve as recommendation backbones. However, existing LLM-based RSs often do not fully exploit the complementary advantages of LLMs (e.g., world knowledge and reasoning) and TRMs (e.g., recommendation-specific knowledge and efficiency) to fully explore the item space. To address this, we propose DeepRec, a novel LLM-based RS that enables autonomous multi-turn interactions between LLMs and TRMs for deep exploration of the item space. In each interaction turn, LLMs reason over user preferences and interact with TRMs to retrieve candidate items. After multi-turn interactions, LLMs rank the retrieved items to generate the final recommendations. We adopt reinforcement learning (RL) based optimization and propose novel designs from three aspects: recommendation model based data rollout, recommendation-oriented hierarchical rewards, and a two-stage RL training strategy. For data rollout, we introduce a preference-aware TRM, with which LLMs interact to construct trajectory data. For rewards, we design a hierarchical reward function that involves both process-level and outcome-level rewards to optimize the interaction process and recommendation performance, respectively. For RL training, we develop a two-stage training strategy, where the first stage aims to guide LLMs to interact with TRMs and the second stage focuses on performance improvement. Experiments on public datasets demonstrate that DeepRec significantly outperforms both traditional and LLM-based baselines, offering a new paradigm for deep exploration in recommendation systems.
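Abstract (reference translation): Recently, large language models (LLMs) have been introduced into recommender systems (RSs), either to enhance traditional recommendation models (TRMs) or to serve as recommendation backbones. However, existing LLM-based RSs often do not fully exploit the complementary advantages of LLMs (e.g., world knowledge and reasoning) and TRMs. To address this, the paper proposes DeepRec, a novel LLM-based RS that enables autonomous multi-turn interactions between LLMs and TRMs. In each interaction turn, LLMs reason over user preferences and interact with TRMs to retrieve candidate items. After the multi-turn interactions, LLMs rank the retrieved items to generate the final recommendations. The paper adopts reinforcement learning (RL) based optimization and proposes novel designs from three aspects: recommendation model based data rollout, recommendation-oriented hierarchical rewards, and a two-stage RL training strategy. For data rollout, it introduces a preference-aware TRM with which LLMs interact to construct trajectory data. For rewards, it designs a hierarchical reward function involving both process-level and outcome-level rewards, which optimize the interaction process and recommendation performance, respectively. For RL training, it develops a two-stage training strategy in which the first stage guides LLMs to interact with TRMs and the second stage focuses on performance improvement. In experiments on public datasets, DeepRec significantly outperforms both traditional and LLM-based baselines, offering a new paradigm for deep exploration in recommender systems.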
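
Illustrative sketch (not from the paper): the abstract describes a loop in which an LLM reasons over user preferences, queries a TRM for candidates over several turns, and then ranks what was retrieved, with a reward that combines a process-level and an outcome-level term. The toy Python below shows only the shape of that loop. Everything in it is an assumption made for illustration: the catalogue, the tag-counting stand-in for the LLM, the tag-matching stand-in for the TRM, the function names, and the reward weights. The abstract does not specify the actual prompts, retrieval model, or reward definitions.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical toy catalogue (item -> tags); not taken from the paper.
CATALOG: Dict[str, Set[str]] = {
    "space_opera_novel": {"sci-fi", "book"},
    "robot_movie": {"sci-fi", "movie"},
    "cooking_show": {"food", "tv"},
    "mars_documentary": {"sci-fi", "documentary"},
    "pasta_cookbook": {"food", "book"},
}


def tag_counts(history: List[str]) -> Counter:
    """Count how often each tag appears in the user's interaction history."""
    return Counter(tag for item in history for tag in CATALOG[item])


def toy_llm_query(history: List[str], turn: int) -> str:
    """Stand-in for LLM reasoning: use the turn-th most frequent tag as the query."""
    counts = tag_counts(history)
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return ordered[turn % len(ordered)] if ordered else ""


def toy_trm_retrieve(query: str, exclude: List[str], k: int = 2) -> List[str]:
    """Stand-in for a TRM: return up to k unseen items carrying the queried tag."""
    hits = [i for i, tags in CATALOG.items() if query in tags and i not in exclude]
    return hits[:k]


def multi_turn_recommend(
    history: List[str], max_turns: int = 2
) -> Tuple[List[str], List[List[str]]]:
    """Run the query/retrieve loop, then rank the pooled candidates."""
    turn_results: List[List[str]] = []
    candidates: List[str] = []
    for turn in range(max_turns):
        query = toy_llm_query(history, turn)
        retrieved = toy_trm_retrieve(query, exclude=history)
        turn_results.append(retrieved)
        for item in retrieved:
            if item not in candidates:  # de-duplicate, keep first-seen order
                candidates.append(item)
    # Final ranking step: score each candidate by tag overlap with the history.
    counts = tag_counts(history)
    ranked = sorted(candidates, key=lambda i: -sum(counts[t] for t in CATALOG[i]))
    return ranked, turn_results


def hierarchical_reward(
    turn_results: List[List[str]],
    ranked: List[str],
    target: str,
    w_process: float = 0.5,  # assumed weight, chosen arbitrarily
) -> float:
    """Assumed reward shape: weighted process-level term plus outcome-level term."""
    # Process level: fraction of turns whose TRM call returned at least one item.
    process = sum(1 for r in turn_results if r) / max(len(turn_results), 1)
    # Outcome level: reciprocal rank of the held-out target item (0 if missing).
    outcome = 1.0 / (ranked.index(target) + 1) if target in ranked else 0.0
    return w_process * process + outcome


if __name__ == "__main__":
    ranked, turns = multi_turn_recommend(["robot_movie", "space_opera_novel"])
    print(ranked)
    print(hierarchical_reward(turns, ranked, target="mars_documentary"))
```

In a real system the two stand-in functions would be replaced by an LLM call and a trained recommendation model, and the reward would feed an RL optimizer. The sketch only makes the control flow and the two reward levels concrete.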


Related papers