Fugu-MT Paper Translation (Abstract): Retrieval Augmented Conversational Recommendation with Reinforcement Learning

Paper overview: Retrieval Augmented Conversational Recommendation with Reinforcement Learning

arxiv url: http://arxiv.org/abs/2604.04457v1
Date: Mon, 06 Apr 2026 06:08:03 GMT
Status: Translation complete
Last updated in system: 2026-04-07 15:49:19.113325
Title: Retrieval Augmented Conversational Recommendation with Reinforcement Learning
Title (reference translation): Retrieval-Augmented Conversational Recommendation with Reinforcement Learning
Authors: Zhenrui Yue, Honglei Zhuang, Zhen Qin, Zhankui He, Huimin Zeng, Julian McAuley, Dong Wang
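Abstract summary: Large language models (LLMs) exhibit enhanced capabilities in language understanding and generation. This paper introduces RAR, a novel two-stage retrieval-augmented conversational recommendation framework.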
Reference score (independently computed attention score): 43.40183980321253
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) exhibit enhanced capabilities in language understanding and generation. By utilizing their embedded knowledge, LLMs are increasingly used as conversational recommender systems (CRS), achieving improved performance across diverse scenarios. However, existing LLM-based methods rely on pretrained knowledge without external retrieval mechanisms for novel items. Additionally, the lack of a unified corpus poses challenges for integrating retrieval augmentation into CRS. Motivated by these challenges, we present RAR, a novel two-stage retrieval augmented conversational recommendation framework that aligns retrieval and generation to enhance both performance and factuality. To support this framework and provide a unified corpus, we construct a large-scale movie corpus, comprising over 300k movies with rich metadata, such as titles, casts and plot summaries. Leveraging this data, our primary contribution is RAR, the first framework to depart from standard two-stage CRS by dynamically bridging retrieval and generation. First, a retriever model generates candidate items based on user history; in the subsequent stage, an LLM refines the recommendations by incorporating conversational context with retrieved results. In addition, we introduce a novel reinforcement learning (RL) method that leverages LLM feedback to iteratively update the retriever. By creating a collaborative feedback loop that reinforces sampled candidate sets with higher ranking metrics, RAR effectively mitigates the misalignment between the retrieval and generation stages. Furthermore, grounding the LLM in factual metadata allows our RL-driven approach to capture subtle user intentions and generate context-aware recommendations with reduced hallucinations. We validate our approach through extensive experiments on multiple benchmarks, where RAR consistently outperforms state-of-the-art baseline methods.
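Abstract (reference translation): Large language models (LLMs) exhibit enhanced capabilities in language understanding and generation. By drawing on their embedded knowledge, LLMs are increasingly used as conversational recommender systems (CRS) and achieve improved performance across diverse scenarios. However, existing LLM-based methods rely on pretrained knowledge and have no external retrieval mechanism for novel items. In addition, the lack of a unified corpus makes it difficult to integrate retrieval augmentation into CRS. Motivated by these challenges, RAR is a novel two-stage retrieval-augmented conversational recommendation framework that aligns retrieval and generation to improve both performance and factuality. To support this framework and provide a unified corpus, the authors construct a large-scale movie corpus of over 300k movies with rich metadata such as titles, casts, and plot summaries. Leveraging this data, RAR is the first framework to depart from standard two-stage CRS by dynamically bridging retrieval and generation. First, a retriever model generates candidate items based on user history; in the subsequent stage, an LLM refines the recommendations by combining the conversational context with the retrieved results. The authors also propose a novel reinforcement learning (RL) method that uses LLM feedback to iteratively update the retriever. By creating a collaborative feedback loop that reinforces sampled candidate sets with higher ranking metrics, RAR effectively mitigates the misalignment between the retrieval and generation stages. Furthermore, grounding the LLM in factual metadata allows the RL-driven approach to capture subtle user intentions and generate context-aware recommendations with fewer hallucinations. The approach is validated through extensive experiments on multiple benchmarks, where RAR consistently outperforms state-of-the-art baseline methods.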
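
The feedback loop described in the abstract (a retriever samples a candidate set, an LLM reranks it, and candidate sets that score higher on a ranking metric are reinforced) can be illustrated with a toy policy-gradient sketch. This is not the authors' implementation: the abstract does not specify the RL algorithm, the reward, or the retriever architecture. Everything below is an assumption made for illustration, including the per-item logit "retriever", the Plackett-Luce sampling, the reciprocal-rank reward, the moving-average baseline, and the hypothetical `mock_llm_rerank` function that stands in for a real LLM.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_candidates(logits, k, rng):
    """Sample k distinct items (Plackett-Luce, an assumed sampling scheme).

    Returns the chosen items and the gradient of the log-probability of
    that sample with respect to the logits.
    """
    remaining = list(range(len(logits)))
    chosen = []
    grad = [0.0] * len(logits)
    for _ in range(k):
        probs = softmax([logits[i] for i in remaining])
        pick = rng.choices(remaining, weights=probs, k=1)[0]
        # d/d logit of log p(pick) = 1[i == pick] - p_i over remaining items
        for i, p in zip(remaining, probs):
            grad[i] -= p
        grad[pick] += 1.0
        chosen.append(pick)
        remaining.remove(pick)
    return chosen, grad


def mock_llm_rerank(candidates, relevance):
    """Hypothetical stand-in for the LLM stage: sort by a hidden relevance."""
    return sorted(candidates, key=lambda i: relevance[i], reverse=True)


def reciprocal_rank(ranked, target):
    """Ranking metric used as the reward (an assumption, not from the paper)."""
    return 1.0 / (ranked.index(target) + 1) if target in ranked else 0.0


def train_retriever(num_items=20, target=7, k=5, steps=300, lr=0.5, seed=0):
    """REINFORCE-style update of the toy retriever from reranker feedback."""
    rng = random.Random(seed)
    logits = [0.0] * num_items  # toy retriever: one learnable logit per item
    relevance = [-abs(i - target) for i in range(num_items)]
    baseline = 0.0  # moving-average baseline to reduce variance
    for _ in range(steps):
        candidates, grad = sample_candidates(logits, k, rng)
        ranked = mock_llm_rerank(candidates, relevance)
        reward = reciprocal_rank(ranked, target)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # Reinforce candidate sets whose reward is above the baseline.
        logits = [w + lr * advantage * g for w, g in zip(logits, grad)]
    return logits


if __name__ == "__main__":
    final_logits = train_retriever()
    print("P(target) under the trained retriever:", softmax(final_logits)[7])
```

In this toy setup the retriever only learns to put the target item into the candidate set, because the mock reranker always ranks a retrieved target first. A real system would replace the per-item logits with a learned retriever conditioned on user history, and the mock reranker with an LLM that reads the conversation and the retrieved metadata.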

Related papers