Fugu-MT paper translation (abstract): A model of errors in transformers

Paper abstract: A model of errors in transformers

arxiv url: http://arxiv.org/abs/2601.14175v1
Date: Tue, 20 Jan 2026 17:27:03 GMT
Status: Translation complete
System update date: 2026-01-21 22:47:23.431216
Title: A model of errors in transformers
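Title (reference translation): An error model for transformers
Authors: Suvrat Raju, Praneeth Netrapalli
Abstract summary: We study the error rate of LLMs on tasks that require a deterministic output and repetitive processing of tokens drawn from a small set of alternatives. We argue that incorrect predictions arise when small errors in the attention mechanism accumulate to cross a threshold. We show how to construct prompts to reduce the error rate.
Reference score (independently computed interest level): 14.482123927397135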
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study the error rate of LLMs on tasks like arithmetic that require a deterministic output, and repetitive processing of tokens drawn from a small set of alternatives. We argue that incorrect predictions arise when small errors in the attention mechanism accumulate to cross a threshold, and use this insight to derive a quantitative two-parameter relationship between the accuracy and the complexity of the task. The two parameters vary with the prompt and the model; they can be interpreted in terms of an elementary noise rate, and the number of plausible erroneous tokens that can be predicted. Our analysis is inspired by an "effective field theory" perspective: the LLM's many raw parameters can be reorganized into just two parameters that govern the error rate. We perform extensive empirical tests, using Gemini 2.5 Flash, Gemini 2.5 Pro and DeepSeek R1, and find excellent agreement between the predicted and observed accuracy for a variety of tasks, although we also identify deviations in some cases. Our model provides an alternative to suggestions that errors made by LLMs on long repetitive tasks indicate the "collapse of reasoning", or an inability to express "compositional" functions. Finally, we show how to construct prompts to reduce the error rate.
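Abstract (reference translation): We study the error rate of LLMs on tasks such as arithmetic that require a deterministic output and repetitive processing of tokens drawn from a small set of alternatives. We argue that incorrect predictions arise when small errors in the attention mechanism accumulate to cross a threshold, and we use this insight to derive a quantitative two-parameter relationship between the accuracy and the complexity of the task. The two parameters vary with the prompt and the model, and can be interpreted in terms of an elementary noise rate and the number of plausible erroneous tokens that can be predicted. The LLM's many raw parameters can be reorganized into just two parameters that govern the error rate. We perform extensive empirical tests using Gemini 2.5 Flash, Gemini 2.5 Pro and DeepSeek R1, and find excellent agreement between the predicted and observed accuracy on a variety of tasks. Our model provides an alternative to suggestions that errors made by LLMs on long repetitive tasks indicate a "collapse of reasoning" or an inability to express "compositional" functions. Finally, we show how to construct prompts to reduce the error rate.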
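
Illustrative note (not from the paper): the abstract's central claim is that a prediction goes wrong when many small errors accumulate past a threshold. The sketch below is a toy version of that idea only. It assumes independent Gaussian noise of standard deviation `sigma` per processing step and counts a run as correct when the summed noise after `n_steps` stays within `threshold`. The function names, the Gaussian assumption, and the parameters `sigma` and `threshold` are all hypothetical; the paper's actual two-parameter formula is not stated in the abstract and is not reproduced here.

```python
import math
import random


def toy_accuracy_closed_form(n_steps, sigma, threshold):
    """Toy accuracy: P(|S| < threshold), where S is the sum of n_steps
    independent N(0, sigma^2) terms, so S ~ N(0, n_steps * sigma^2).
    This is an assumed illustration, not the paper's formula."""
    return math.erf(threshold / (sigma * math.sqrt(2.0 * n_steps)))


def toy_accuracy_monte_carlo(n_steps, sigma, threshold, trials=20000, seed=0):
    """Estimate the same probability by simulating the accumulated noise."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        # Accumulate one small random error per processing step.
        total = sum(rng.gauss(0.0, sigma) for _ in range(n_steps))
        # The run counts as correct if the final total stays under the threshold.
        if abs(total) < threshold:
            correct += 1
    return correct / trials


if __name__ == "__main__":
    # Accuracy falls as the task gets longer (more steps, more accumulated noise).
    for n in (10, 25, 100, 400):
        exact = toy_accuracy_closed_form(n, sigma=0.1, threshold=1.0)
        simulated = toy_accuracy_monte_carlo(n, sigma=0.1, threshold=1.0)
        print(f"n_steps={n:4d}  closed_form={exact:.4f}  monte_carlo={simulated:.4f}")
```

In this toy setting accuracy depends on `threshold / (sigma * sqrt(n_steps))`, so it degrades smoothly as task length grows. This matches the qualitative picture in the abstract without any claim about the paper's quantitative fit.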

Related papers