Fugu-MT Paper Translation (Abstract): Transformers Can Learn Rules They've Never Seen: Proof of Computation Beyond Interpolation

Paper overview: Transformers Can Learn Rules They've Never Seen: Proof of Computation Beyond Interpolation

arxiv url: http://arxiv.org/abs/2603.17019v1
Date: Tue, 17 Mar 2026 18:02:28 GMT
Status: Translation complete
Last updated in system: 2026-03-19 18:32:57.335785
Title: Transformers Can Learn Rules They've Never Seen: Proof of Computation Beyond Interpolation
Title (reference translation): Transformers can learn rules they have never seen: a proof of computation beyond interpolation
Authors: Andy Gray
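Abstract summary: We test a strong interpolation-only hypothesis in two controlled settings. Experiment 1 uses a cellular automaton with a pure XOR transition rule. Experiment 2 studies symbolic operator chains over integers with one operator pair held out.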
Reference score (independently computed attention metric): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A central question in the LLM debate is whether transformers can infer rules absent from training, or whether apparent generalisation reduces to similarity-based interpolation over observed examples. We test a strong interpolation-only hypothesis in two controlled settings: one where interpolation is ruled out by construction and proof, and one where success requires emitting intermediate symbolic derivations rather than only final answers. In Experiment 1, we use a cellular automaton with a pure XOR transition rule and remove specific local input patterns from training; since XOR is linearly inseparable, each held-out pattern's nearest neighbours have the opposite label, so similarity-based predictors fail on the held-out region. Yet a two-layer transformer recovers the rule (best 100%; 47/60 converged runs), and circuit extraction identifies XOR computation. Performance depends on multi-step constraint propagation: without unrolling, accuracy matches output bias (63.1%), while soft unrolling reaches 96.7%. In Experiment 2, we study symbolic operator chains over integers with one operator pair held out; the model must emit intermediate steps and a final answer in a proof-like format. Across all 49 holdout pairs, the transformer exceeds every interpolation baseline (mean 41.8%, up to 78.6%; mean KRR 4.3%; KNN and MLP score 0% on every pair), while removing intermediate-step supervision degrades performance. Together with a construction showing that a standard transformer block can implement exact local Boolean rules, these results provide an existence proof that transformers can learn rule structure not directly observed in training and express it explicitly, ruling out the strongest architectural form of interpolation-only accounts: that transformers cannot in principle discover and communicate unseen rules, while leaving open when such behaviour arises in large-scale language training.
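Abstract (reference translation): A central question in the LLM debate is whether transformers can infer rules that are absent from training, or whether apparent generalisation reduces to similarity-based interpolation over observed examples. We test a strong interpolation-only hypothesis in two controlled settings: one in which interpolation is ruled out by construction and proof, and one in which success requires emitting intermediate symbolic derivations and not only final answers. In Experiment 1, we use a cellular automaton with a pure XOR transition rule and remove specific local input patterns from training; because XOR is linearly inseparable, the nearest neighbours of each held-out pattern carry the opposite label, so similarity-based predictors fail on the held-out region. Nevertheless, a two-layer transformer recovers the rule (best 100%; 47/60 converged runs), and circuit extraction identifies XOR computation. Performance depends on multi-step constraint propagation: without unrolling, accuracy matches the output bias (63.1%), while soft unrolling reaches 96.7%. In Experiment 2, we study symbolic operator chains over integers with one operator pair held out; the model must emit intermediate steps and a final answer in a proof-like format. Across all 49 holdout pairs, the transformer exceeds every interpolation baseline (mean 41.8%, up to 78.6%; mean KRR 4.3%; KNN and MLP score 0% on every pair), while removing intermediate-step supervision degrades performance. Together with a construction showing that a standard transformer block can implement exact local Boolean rules, these results provide an existence proof that transformers can learn rule structure not directly observed in training and express it explicitly. This rules out the strongest architectural form of interpolation-only accounts, namely that transformers cannot in principle discover and communicate unseen rules, while leaving open when such behaviour arises in large-scale language training.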
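The abstract's claim that each held-out pattern's nearest neighbours carry the opposite label can be checked on a toy case. The sketch below is illustrative only and is not the paper's code: it assumes a 3-cell neighbourhood whose next state is the parity of its bits, and it uses a Hamming-distance nearest-neighbour vote as a stand-in for a similarity-based predictor. The names `xor_rule`, `nearest_neighbour_predict`, and `held_out_errors` are hypothetical.

```python
from itertools import product


def xor_rule(pattern):
    """Parity (XOR) of all bits in a local neighbourhood (assumed rule)."""
    result = 0
    for bit in pattern:
        result ^= bit
    return result


def hamming(a, b):
    """Number of positions at which two bit patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbour_predict(query, train):
    """Majority vote among training patterns at minimal Hamming distance."""
    best = min(hamming(query, p) for p in train)
    votes = [label for p, label in train.items() if hamming(query, p) == best]
    return int(sum(votes) * 2 > len(votes))


def held_out_errors(width=3):
    """Hold out each pattern in turn and count nearest-neighbour mistakes."""
    patterns = list(product((0, 1), repeat=width))
    errors = 0
    for held_out in patterns:
        # Train on every pattern except the held-out one.
        train = {p: xor_rule(p) for p in patterns if p != held_out}
        if nearest_neighbour_predict(held_out, train) != xor_rule(held_out):
            errors += 1
    return errors, len(patterns)


if __name__ == "__main__":
    wrong, total = held_out_errors()
    print(f"1-NN wrong on {wrong} of {total} held-out patterns")
```

Flipping a single bit always flips the parity, so every pattern at Hamming distance 1 from the held-out one has the opposite label, and the vote is wrong for all 8 patterns. This only illustrates why similarity-based prediction fails by construction; it says nothing about how the transformer in the paper recovers the rule.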
