Fugu-MT Paper Translation (Abstract): From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning

Paper overview: From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning

arxiv url: http://arxiv.org/abs/2503.05919v1
Date: Fri, 07 Mar 2025 20:35:31 GMT
Status: Translation complete
Last updated in system: 2025-03-11 20:09:44.051962
Title: From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning
Title (reference translation): From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning
Authors: Eric Zhao, Pranjal Awasthi, Nika Haghtalab
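Abstract summary: Finetuning provides a scalable and cost-effective means of customizing language models for specific tasks or response styles. In contrast, the conventional wisdom is that injecting knowledge via finetuning results in brittle performance and poor generalization. We conduct a large-scale experimental study of finetuning the frontier Gemini v1.5 model family on a spectrum of datasets.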
Reference score (independently computed attention measure): 40.141932069582204
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Finetuning provides a scalable and cost-effective means of customizing language models for specific tasks or response styles, with greater reliability than prompting or in-context learning. In contrast, the conventional wisdom is that injecting knowledge via finetuning results in brittle performance and poor generalization. We argue that the dichotomy of "task customization" (e.g., instruction tuning) and "knowledge injection" (e.g., teaching new facts) is a distinction without a difference. We instead identify concrete factors that explain the heterogeneous effectiveness observed with finetuning. To this end, we conduct a large-scale experimental study of finetuning the frontier Gemini v1.5 model family on a spectrum of datasets that are artificially engineered to interpolate between the strengths and failure modes of finetuning. Our findings indicate that question-answer training data formats provide much stronger knowledge generalization than document/article-style training data, numerical information can be harder for finetuning to retain than categorical information, and models struggle to apply finetuned knowledge during multi-step reasoning even when trained on similar examples -- all factors that render "knowledge injection" to be especially difficult, even after controlling for considerations like data augmentation and information volume. On the other hand, our findings also indicate that it is not fundamentally more difficult to finetune information about a real-world event than information about what a model's writing style should be.
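Abstract (reference translation): Finetuning provides a scalable and cost-effective means of customizing language models for specific tasks or response styles. In contrast, the conventional wisdom is that injecting knowledge via finetuning results in brittle performance and poor generalization. We argue that the dichotomy of "task customization" (e.g., instruction tuning) and "knowledge injection" (e.g., teaching new facts) is a distinction without a difference. We instead identify concrete factors that explain the heterogeneous effectiveness observed with finetuning. To this end, we conduct a large-scale experimental study of finetuning the frontier Gemini v1.5 model family on a spectrum of datasets artificially engineered to interpolate between the strengths and failure modes of finetuning. Our findings indicate that question-answer training data formats provide much stronger knowledge generalization than document/article-style training data, and that numerical information can be harder for finetuning to retain than categorical information. On the other hand, our findings also indicate that it is not fundamentally more difficult to finetune information about a real-world event than information about what a model's writing style should be.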

Related papers