Fugu-MT Paper Translation (Abstract): Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

Paper overview: Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

arxiv url: http://arxiv.org/abs/2410.10796v1
Date: Tue, 22 Oct 2024 17:35:03 GMT
Status: Translation complete
Last updated in system: 2024-10-29 19:34:54.136487
Title: Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance
Authors: Sachin Goyal, Christina Baek, J. Zico Kolter, Aditi Raghunathan
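Abstract summary: This work tries to understand the underlying reason for poor context reliance, especially after instruction tuning. During instruction tuning, context reliance initially increases as expected, but then gradually decreases as instruction finetuning progresses. The authors tie this phenomenon to examples in the instruction finetuning data mixture where the input context provides information that is already present in the model's parametric knowledge.
Reference score (attention score computed independently by this site): 68.56701216210617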
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large language models are instruction-finetuned to enhance their ability to follow user instructions and process the input context. However, even state-of-the-art models often struggle to follow the instruction, especially when the input context is not aligned with the model's parametric knowledge. This manifests as various failures, such as hallucinations where the responses are outdated, biased or contain unverified facts. In this work, we try to understand the underlying reason for this poor context reliance, especially after instruction tuning. We observe an intriguing phenomenon: during instruction tuning, the context reliance initially increases as expected, but then gradually decreases as instruction finetuning progresses. We call this phenomenon context-parametric inversion and observe it across multiple general-purpose instruction tuning datasets like TULU, Alpaca and Ultrachat, as well as model families such as Llama, Mistral and Pythia. In a simple theoretical setup, we isolate why context-parametric inversion occurs along the gradient descent trajectory of instruction finetuning. We tie this phenomenon to examples in the instruction finetuning data mixture where the input context provides information that is already present in the model's parametric knowledge. Our analysis suggests natural mitigation strategies that provide some limited gains, while also validating our theoretical insights. We hope that our work serves as a starting point in addressing this failure mode in a staple part of LLM training.

Related papers

On the generalization of language models from in-context learning and finetuning: a controlled study [36.384796130439035]
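We show that in-context learning in language models exhibits different inductive biases and can, in some cases, generalize better than fine-tuning. We propose a method that improves generalization from fine-tuning by adding in-context inferences to the fine-tuning data. These results have implications for understanding the inductive biases of different modes of learning in language models.
Reference translation of paper (metadata) (2025-05-01T17:02:27Z)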
On the Loss of Context-awareness in General Instruction Fine-tuning [101.03941308894191]
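We investigate the loss of context awareness after supervised fine-tuning. We find that the performance decline is associated with a bias toward different roles learned during conversational instruction fine-tuning. We propose a metric to identify context-dependent examples in general instruction fine-tuning datasets.
Reference translation of paper (metadata) (2024-11-05T00:16:01Z)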
Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods [69.36397993451742]
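This work introduces Context-aware Prompt Tuning (CPT), a method that draws on ICL, PT, and adversarial attacks. It modifies specific context tokens, taking into account the unique structure of the input and output formats. Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing the loss rather than maximizing it.
Reference translation of paper (metadata) (2024-10-22T17:45:47Z)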
Information Guided Regularization for Fine-tuning Language Models [11.831883526217942]
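We argue that a more surgical approach to regularization is needed for smoother transfer learning. We devise a novel approach to improving model regularization and downstream generalization.
Reference translation of paper (metadata) (2024-06-20T05:18:37Z)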
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction [75.25114727856861]
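Large language models (LLMs) tend to degrade in the later stages of the supervised fine-tuning process. We introduce a simple disperse-then-merge framework to address this issue. On a series of standard knowledge and reasoning benchmarks, our framework outperforms various sophisticated methods such as data curation and training regularization.
Reference translation of paper (metadata) (2024-05-22T08:18:19Z)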
Studying Large Language Model Behaviors Under Context-Memory Conflicts With Real Documents [54.953320616069654]
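Retrieval-augmented generation (RAG) mitigates many problems of fully parametric language models. In RAG, the model's knowledge can be updated from documents provided in the context. We propose a framework for studying context-memory knowledge conflicts in a realistic setting.
Reference translation of paper (metadata) (2024-04-24T17:59:36Z)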
Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
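We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing. Unlike existing techniques, it can integrate knowledge from multiple edits and respond correctly to inputs that are syntactically similar but semantically unrelated.
Reference translation of paper (metadata) (2024-03-26T06:57:23Z)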
R-Tuning: Instructing Large Language Models to Say `I Don't Know' [66.11375475253007]
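Large language models (LLMs) have revolutionized many domains with impressive performance, but they still face challenges. Previous instruction tuning methods force the model to complete a sentence regardless of whether it actually has the relevant knowledge. We propose a new approach called Refusal-Aware Instruction Tuning (R-Tuning). Experimental results show that R-Tuning effectively improves a model's ability to answer known questions and to refrain from answering unknown ones.
Reference translation of paper (metadata) (2023-11-16T08:45:44Z)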
Influence Tuning: Demoting Spurious Correlations via Instance Attribution and Instance-Driven Updates [26.527311287924995]
In a controlled setup, influence tuning can help deconfound the model from spurious patterns in the data.
Reference translation of paper (metadata) (2021-10-07T06:59:46Z)

The related-papers list is generated automatically from the titles and abstracts of papers on this site.