Fugu-MT Paper Translation (Abstract): LLMRec: Benchmarking Large Language Models on Recommendation Task

Paper overview: LLMRec: Benchmarking Large Language Models on Recommendation Task

arxiv url: http://arxiv.org/abs/2308.12241v1
Date: Wed, 23 Aug 2023 16:32:54 GMT
Status: Translation complete
System update date: 2023-08-24 13:25:19.793152
Title: LLMRec: Benchmarking Large Language Models on Recommendation Task
Title (reference translation): LLMRec: Benchmarking Large Language Models on Recommendation Tasks
Authors: Junling Liu, Chao Liu, Peilin Zhou, Qichen Ye, Dading Chong, Kang Zhou, Yueqi Xie, Yuwei Cao, Shoujin Wang, Chenyu You, Philip S.Yu
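Abstract summary: The application of Large Language Models (LLMs) in the recommendation domain has not been thoroughly investigated. We benchmark off-the-shelf LLMs on five recommendation tasks: rating prediction, sequential recommendation, direct recommendation, explanation generation, and review summarization. The benchmark results indicate that LLMs show only moderate proficiency in accuracy-based tasks such as sequential and direct recommendation.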
Reference score (independently computed attention score): 54.48899723591296
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, the fast development of Large Language Models (LLMs) such as ChatGPT has significantly advanced NLP tasks by enhancing the capabilities of conversational models. However, the application of LLMs in the recommendation domain has not been thoroughly investigated. To bridge this gap, we propose LLMRec, an LLM-based recommender system designed for benchmarking LLMs on various recommendation tasks. Specifically, we benchmark several popular off-the-shelf LLMs, such as ChatGPT, LLaMA, and ChatGLM, on five recommendation tasks: rating prediction, sequential recommendation, direct recommendation, explanation generation, and review summarization. Furthermore, we investigate the effectiveness of supervised finetuning to improve LLMs' instruction compliance ability. The benchmark results indicate that LLMs displayed only moderate proficiency in accuracy-based tasks such as sequential and direct recommendation. However, they demonstrated comparable performance to state-of-the-art methods in explainability-based tasks. We also conduct qualitative evaluations to further assess the quality of the content generated by different models, and the results show that LLMs can truly understand the provided information and generate clearer and more reasonable results. We hope that this benchmark will inspire researchers to delve deeper into the potential of LLMs in enhancing recommendation performance. Our code, processed data, and benchmark results are available at https://github.com/williamliujl/LLMRec.
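Abstract (reference translation): In recent years, the rapid development of large language models (LLMs) such as ChatGPT has greatly advanced NLP tasks by enhancing the capabilities of conversational models. However, the application of LLMs in the recommendation domain has not been sufficiently studied. To fill this gap, we propose LLMRec, a recommender system for benchmarking LLMs on a variety of recommendation tasks. Specifically, we benchmark popular off-the-shelf LLMs such as ChatGPT, LLaMA, and ChatGLM on five recommendation tasks: rating prediction, sequential recommendation, direct recommendation, explanation generation, and review summarization. We also examine the effectiveness of supervised fine-tuning for improving the instruction compliance ability of LLMs. The results show that LLMs exhibit only moderate proficiency in accuracy-based tasks such as sequential and direct recommendation. However, they show performance comparable to state-of-the-art methods on explainability-based tasks. We also carry out qualitative evaluations of the quality of the content generated by different models, which show that LLMs can truly understand the provided information and produce clearer and more reasonable results. We hope this benchmark will inspire researchers to dig deeper into the potential of LLMs for improving recommendation performance. Our code, processed data, and benchmark results are available at https://github.com/williamliujl/LLMRec.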
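Note: the abstract distinguishes "accuracy-based" tasks (sequential and direct recommendation) from explainability-based ones but does not name the metrics used. As a rough illustration only, the sketch below scores one ranked recommendation list with Hit Ratio@k and NDCG@k, two metrics commonly used for top-k recommendation. The choice of metrics, the function names, and the item IDs are all assumptions made for this example and are not taken from the LLMRec code.

```python
import math


def hit_ratio_at_k(ranked_items, target, k):
    """Return 1.0 if the held-out target item is in the top-k list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """NDCG@k with a single relevant item: 1 / log2(rank + 1), or 0.0 if missed."""
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item == target:
            return 1.0 / math.log2(rank + 1)
    return 0.0


# Hypothetical ranked list parsed from a model's answer, best item first.
ranked = ["item_7", "item_3", "item_9", "item_1"]
target = "item_3"  # hypothetical held-out next item for this user

hr = hit_ratio_at_k(ranked, target, k=3)
ndcg = ndcg_at_k(ranked, target, k=3)
print(f"HR@3={hr:.1f} NDCG@3={ndcg:.4f}")
```

In a benchmark setting these per-user scores would typically be averaged over all test users; a model that often ranks the target outside the top k would score low on both.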

Related papers