Fugu-MT Paper Translation (Summary): Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation

Paper summary: Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation

arxiv url: http://arxiv.org/abs/2307.11019v2
Date: Sun, 23 Jul 2023 16:52:59 GMT
Status: Translation complete
Last updated in system: 2023-07-25 11:12:53.047536
Title: Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation
Title (reference translation): An examination of the factual knowledge boundary of large language models through retrieval augmentation
Authors: Ruiyang Ren, Yuhao Wang, Yingqi Qu, Wayne Xin Zhao, Jing Liu, Hao Tian, Hua Wu, Ji-Rong Wen, Haifeng Wang
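Abstract summary: We show that large language models (LLMs) have unwavering confidence in their ability to respond to questions. Retrieval augmentation proves to be an effective approach for enhancing LLMs' awareness of their knowledge boundaries. We also find that LLMs tend to rely on the provided retrieval results when formulating answers.
Reference score (independently computed attention score): 91.30946119104111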
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Knowledge-intensive tasks (e.g., open-domain question answering (QA)) require a substantial amount of factual knowledge and often rely on external information for assistance. Recently, large language models (LLMs) (e.g., ChatGPT), have demonstrated impressive prowess in solving a wide range of tasks with world knowledge, including knowledge-intensive tasks. However, it remains unclear how well LLMs are able to perceive their factual knowledge boundaries, particularly how they behave when incorporating retrieval augmentation. In this study, we present an initial analysis of the factual knowledge boundaries of LLMs and how retrieval augmentation affects LLMs on open-domain QA. Specially, we focus on three primary research questions and analyze them by examining QA performance, priori judgement and posteriori judgement of LLMs. We show evidence that LLMs possess unwavering confidence in their capabilities to respond to questions and the accuracy of their responses. Furthermore, retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries, thereby improving their judgemental abilities. Additionally, we also find that LLMs have a propensity to rely on the provided retrieval results when formulating answers, while the quality of these results significantly impacts their reliance. The code to reproduce this work is available at https://github.com/RUCAIBox/LLM-Knowledge-Boundary.
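Abstract (reference translation): Knowledge-intensive tasks (e.g., open-domain question answering (QA)) require a substantial amount of factual knowledge and often rely on external information for assistance. Recent large language models (e.g., ChatGPT) have shown impressive ability in solving a wide range of tasks with world knowledge, including knowledge-intensive tasks. However, it remains unclear how well LLMs can perceive their factual knowledge boundaries, and in particular how they behave when retrieval augmentation is incorporated. In this study, we present an initial analysis of the factual knowledge boundaries of LLMs and of how retrieval augmentation affects LLMs on open-domain QA. Specifically, we focus on three primary research questions and analyze them through QA performance, priori judgement, and posteriori judgement. We show evidence that LLMs have unwavering confidence in their ability to answer questions and in the accuracy of their answers. Furthermore, retrieval augmentation proves to be an effective approach for enhancing LLMs' awareness of their knowledge boundaries, thereby improving their judgemental abilities. In addition, we find that LLMs tend to rely on the provided retrieval results when formulating answers, and that the quality of these results significantly affects this reliance. The code to reproduce this work is available at https://github.com/RUCAIBox/LLM-Knowledge-Boundary.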

Related papers

Investigating Context-Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style [13.968658352075334]
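We investigate how memory strength and evidence presentation affect the receptiveness of large language models to external evidence. The results suggest that LLMs are more likely to rely on internal memory when memory strength is high. These findings contribute to improving retrieval augmentation and context-aware LLMs.
Paper reference translation (metadata) (2024-09-17T07:44:06Z)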
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models [51.72963030032491]
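Knowledge documents for large language models (LLMs) may conflict with the LLMs' memory because of outdated or incorrect knowledge. We construct KNOT, a new dataset for knowledge conflict resolution.
Paper reference translation (metadata) (2024-04-04T16:40:11Z)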
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
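This paper proposes SlimPLM, a new collaborative approach that uses a slim proxy model to detect missing knowledge in large language models (LLMs). It employs a proxy model with far fewer parameters and takes its answers as heuristic answers. The heuristic answers are used to predict the knowledge required to answer the user's question, as well as the known and unknown knowledge within the LLM.
Paper reference translation (metadata) (2024-02-19T11:11:08Z)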
When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation [66.01754585188739]
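Large language models (LLMs) have been found to have difficulty knowing that they lack certain knowledge. Retrieval Augmentation (RA) has been widely studied as a way to mitigate LLM hallucinations. This paper proposes several methods to enhance LLMs' awareness of their knowledge boundaries.
Paper reference translation (metadata) (2024-02-18T04:57:19Z)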
RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge [69.79676144482792]
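This study aims to evaluate the ability of LLMs to identify reliable information in external knowledge. The benchmark consists of two tasks: question answering and text generation.
Paper reference translation (metadata) (2023-11-14T13:24:19Z)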
Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism [0.0]
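Large language models (LLMs) have demonstrated impressive language understanding and generation capabilities. These models are not flawless, however, and often generate responses that contain errors or misinformation. This paper proposes a refusal mechanism that instructs LLMs to refuse to answer difficult questions in order to avoid errors.
Paper reference translation (metadata) (2023-11-02T07:20:49Z)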
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity [61.54815512469125]
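This survey addresses the crucial issue of factuality in large language models (LLMs). As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
Paper reference translation (metadata) (2023-10-11T14:18:03Z)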
"Merge Conflicts!" Exploring the Impacts of External Distractors to Parametric Knowledge Graphs [15.660128743249611]
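Large language models (LLMs) acquire extensive knowledge during pre-training, known as parametric knowledge. LLMs inevitably require external knowledge during interactions with users. How do LLMs respond when external knowledge interferes with their parametric knowledge?
Paper reference translation (metadata) (2023-09-15T17:47:59Z)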

The related papers list is generated automatically from the titles and abstracts of papers on this site.
