Fugu-MT Paper Translation (Abstract): Context versus Prior Knowledge in Language Models

Paper overview: Context versus Prior Knowledge in Language Models

arxiv url: http://arxiv.org/abs/2404.04633v1
Date: Sat, 6 Apr 2024 13:46:53 GMT
Status: Translation complete
Last updated in system: 2024-04-09 20:19:42.374932
Title: Context versus Prior Knowledge in Language Models
Authors: Kevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell
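Abstract summary: Language models often need to integrate prior knowledge learned during pretraining with new information presented in context. This paper proposes two mutual information-based metrics for measuring a model's dependence on a context and on its prior about an entity.
Reference score (attention score computed independently by this site): 49.17879668110546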
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to higher exposure in the training corpus, and be more easily persuaded by some contexts than others. To formalize this problem, we propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity: first, the persuasion score of a given context represents how much a model depends on the context in its decision, and second, the susceptibility score of a given entity represents how much the model can be swayed away from its original answer distribution about an entity. Following well-established measurement modeling methods, we empirically test for the validity and reliability of these metrics. Finally, we explore and find a relationship between the scores and the model's expected familiarity with an entity, and provide two use cases to illustrate their benefits.
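The abstract describes the two metrics only in words. The following is a minimal Python sketch of one way mutual information-based scores of this kind could be computed, not the paper's definitions. It assumes (hypothetically) that the persuasion score of a context is the KL divergence between the model's answer distribution given that context and its answer distribution averaged over all contexts, and that the susceptibility score of an entity is the mutual information between context and answer, i.e. the probability-weighted mean of those divergences. All function names and the toy probabilities are illustrative assumptions.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def marginal_answer_distribution(conditionals, context_probs):
    """Average the per-context answer distributions, weighted by context probability."""
    num_answers = len(conditionals[0])
    return [
        sum(w * dist[a] for w, dist in zip(context_probs, conditionals))
        for a in range(num_answers)
    ]


def persuasion_scores(conditionals, context_probs):
    """Assumed persuasion score: divergence of each context's answer
    distribution from the context-averaged answer distribution."""
    marginal = marginal_answer_distribution(conditionals, context_probs)
    return [kl_divergence(dist, marginal) for dist in conditionals]


def susceptibility_score(conditionals, context_probs):
    """Assumed susceptibility score: mutual information between context and
    answer, computed as the weighted mean of the persuasion scores."""
    scores = persuasion_scores(conditionals, context_probs)
    return sum(w * s for w, s in zip(context_probs, scores))


if __name__ == "__main__":
    # Toy example (made-up numbers): answer distributions over two candidate
    # answers for one query about one entity, under two different contexts.
    conditionals = [[0.9, 0.1], [0.1, 0.9]]
    context_probs = [0.5, 0.5]
    print("persuasion scores:", persuasion_scores(conditionals, context_probs))
    print("susceptibility score:", susceptibility_score(conditionals, context_probs))
```

Under these assumptions, an entity whose answer distribution is the same in every context gets a susceptibility of zero, and a context that moves the answer distribution further from the average gets a higher persuasion score.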

Related papers

Pointwise Mutual Information as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
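We show that the pointwise mutual information between a context and a question is an effective gauge of language model performance. We propose two methods that use the pointwise mutual information between a document and a question.
Reference translation (metadata) (2024-11-12T13:14:09Z)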
Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
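We focus on the trustworthiness of language models with respect to retrieval augmentation. Retrieval-augmented language models are considered to have an inherent ability to supply responses based on both contextual and parametric knowledge. Inspired by the alignment of language models with human preferences, we take a first step toward aligning retrieval-augmented language models to a state in which they rely only on external evidence.
Reference translation (metadata) (2024-10-22T09:25:21Z)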
Estimating Knowledge in Large Language Models Without Generating a Single Token [12.913172023910203]
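Current methods for evaluating knowledge in large language models (LLMs) query the model and then evaluate its generated responses. In this work, we ask whether the evaluation can be done before the model has generated any text. Experiments with a variety of LLMs show that KEEN, a simple probe trained on internal subject representations, succeeds at both tasks.
Reference translation (metadata) (2024-06-18T14:45:50Z)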
Lost in the Middle: How Language Models Use Long Contexts [88.78803442320246]
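We analyze the performance of language models on two tasks. We find that performance degrades significantly when the position of the relevant information is changed. Our analysis gives a better understanding of how language models use their input context and provides new evaluation protocols for future long-context language models.
Reference translation (metadata) (2023-07-06T17:54:11Z)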
The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources in Natural Language Understanding Systems [87.3207729953778]
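We evaluate state-of-the-art coreference resolution models on the dataset. Several models struggle to reason on the fly over knowledge observed at both pretraining time and inference time. Even the best-performing models appear to have difficulty reliably integrating knowledge presented only at inference time.
Reference translation (metadata) (2022-12-15T23:26:54Z)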
Large Language Models with Controllable Working Memory [64.71038763708161]
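Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP). What further sets these models apart is the massive amount of world knowledge they internalize during pretraining. How a model's world knowledge interacts with factual information presented in context remains underexplored.
Reference translation (metadata) (2022-11-09T18:58:29Z)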
Representing Knowledge by Spans: A Knowledge-Enhanced Model for Information Extraction [7.077412533545456]
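We propose a pretrained model that learns representations of both entities and relations simultaneously. By encoding spans efficiently with a span module, our model can represent entities and their relations while requiring fewer parameters than existing models.
Reference translation (metadata) (2022-08-20T07:32:25Z)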
Multi-Modal Subjective Context Modelling and Recognition [19.80579219657159]
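We propose a novel ontological context model that captures five dimensions: time, location, activity, social relations, and object. Initial context recognition experiments on real-world data hint at the promise of our model.
Reference translation (metadata) (2020-11-19T05:42:03Z)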
How Far are We from Effective Context Modeling? An Exploratory Study on Semantic Parsing in Context [59.13515950353125]
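We present a grammar-based semantic parser and adapt typical context modeling methods on top of it. We evaluate 13 context modeling methods on two large cross-domain datasets.
Reference translation (metadata) (2020-02-03T11:28:10Z)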

The related papers list is generated automatically from the titles and abstracts of the papers on this site.
