Fugu-MT Paper Translation (Abstract): Boosting Instruction Following at Scale

Paper summary: Boosting Instruction Following at Scale

arxiv url: http://arxiv.org/abs/2510.14842v1
Date: Thu, 16 Oct 2025 16:15:58 GMT
Status: Translation complete
System update date: 2025-10-17 21:15:14.943212
Title: Boosting Instruction Following at Scale
Authors: Ben Elder, Evelyn Duesterwald, Vinod Muthusamy
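Abstract summary: Instruction Boosting improves the instruction following rate by up to 7 points for two instructions and up to 4 points for ten instructions. The paper also analyzes the commonly observed trend that performance degrades as more instructions are added.
Reference score (independently computed attention measure): 4.400551468585969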
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: A typical approach developers follow to influence an LLM's behavior in an application is through careful manipulation of the prompt, such as by adding or modifying instructions. However, merely adding more instructions provides little assurance that they will actually be followed. We introduce Instruction Boosting as a post-generation method to increase the reliability of LLM prompt instructions. We show that Instruction Boosting improves the instruction following rate by up to 7 points for two instructions and up to 4 points for ten instructions. To demonstrate these results we introduce SCALEDIF, a benchmark with a scaled instruction volume of up to ten instructions per data sample. We also present an analysis of the commonly observed trend that performance degrades as more instructions are added. We show that an important factor contributing to this trend is the degree of tension and conflict that arises as the number of instructions is increased. We contribute a quantitative conflict scoring tool that explains the observed performance trends and provides feedback to developers on the impact that additional prompt instructions have on a model's performance.
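The abstract does not spell out how Instruction Boosting works internally, so the following is only a minimal sketch of a generic post-generation check-and-revise loop, not the authors' method. It assumes each instruction can be verified by a programmatic checker, and that the instruction following rate is the fraction of instructions a response satisfies. The names `following_rate`, `boost`, and `toy_revise` are hypothetical, and `toy_revise` stands in for whatever rewriting step (for example, a second LLM call) a real system would use.

```python
from typing import Callable, Dict, List

# Hypothetical checker type: maps a response to True if the instruction is met.
Checker = Callable[[str], bool]
# Hypothetical reviser type: rewrites a response given the names of failed instructions.
Reviser = Callable[[str, List[str]], str]


def following_rate(response: str, checkers: Dict[str, Checker]) -> float:
    """Fraction of instructions that the response satisfies (0.0 to 1.0)."""
    if not checkers:
        return 1.0
    passed = sum(1 for check in checkers.values() if check(response))
    return passed / len(checkers)


def boost(response: str, checkers: Dict[str, Checker],
          revise: Reviser, max_rounds: int = 2) -> str:
    """Post-generation pass: try to repair violated instructions.

    A revision is kept only if it raises the following rate, so the
    loop never returns a response that is worse than the draft.
    """
    best = response
    for _ in range(max_rounds):
        failed = [name for name, check in checkers.items() if not check(best)]
        if not failed:
            break
        candidate = revise(best, failed)
        if following_rate(candidate, checkers) > following_rate(best, checkers):
            best = candidate
        else:
            break
    return best


# Toy setup: two instructions with simple string checks (illustrative only).
checkers: Dict[str, Checker] = {
    "lowercase": lambda r: r == r.lower(),
    "ends_with_period": lambda r: r.endswith("."),
}


def toy_revise(response: str, failed: List[str]) -> str:
    # Stand-in for a model-based rewrite; applies trivial string fixes.
    if "lowercase" in failed:
        response = response.lower()
    if "ends_with_period" in failed:
        response = response.rstrip() + "."
    return response


draft = "Hello World"
boosted = boost(draft, checkers, toy_revise)
print(following_rate(draft, checkers), following_rate(boosted, checkers))
```

Accepting a revision only when the rate improves is a design choice of this sketch: with many instructions, fixing one can break another, which is the kind of tension between instructions the abstract identifies as a driver of degraded performance.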
