Fugu-MT paper translation (abstract): Transformer See, Transformer Do: Copying as an Intermediate Step in Learning Analogical Reasoning

Paper overview: Transformer See, Transformer Do: Copying as an Intermediate Step in Learning Analogical Reasoning

arxiv url: http://arxiv.org/abs/2604.06501v1
Date: Tue, 07 Apr 2026 22:15:56 GMT
Status: Translation complete
Last updated in system: 2026-04-09 17:30:51.261483
Title: Transformer See, Transformer Do: Copying as an Intermediate Step in Learning Analogical Reasoning
Authors: Philipp Hellwig, Willem Zuidema, Claire E. Stevenson, Martha Lewis
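Abstract summary: The authors train transformers on an analogical reasoning task using Meta-Learning for Compositionality (MLC). They find that letter-string analogies become learnable when the models are guided to attend to the most informative problem elements. When generalizing to new alphabets, their 3-layer encoder-decoder model outperforms most frontier models.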
Reference score (attention metric computed independently by this site): 2.1424510747711314
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Analogical reasoning is a hallmark of human intelligence, enabling us to solve new problems by transferring knowledge from one situation to another. Yet, developing artificial intelligence systems capable of robust human-like analogical reasoning has proven difficult. In this work, we train transformers using Meta-Learning for Compositionality (MLC) on an analogical reasoning task (letter-string analogies) and assess their generalization capabilities. We find that letter-string analogies become learnable when guiding the models to attend to the most informative problem elements induced by including copying tasks in the training data. Furthermore, generalization to new alphabets becomes better when models are trained with more heterogeneous datasets, where our 3-layer encoder-decoder model outperforms most frontier models. The MLC approach also enables some generalization to compositions of trained transformations, but not to completely novel transformations. To understand how the model operates, we identify an algorithm that approximates the model's computations. We verify this using interpretability analyses and show that the model can be steered precisely according to expectations derived from the algorithm. Finally, we discuss implications of our findings for generalization capabilities of larger models and parallels to human analogical reasoning.
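To make the task concrete, the following is a minimal sketch of what a letter-string analogy, a copying task, and a "new alphabet" variant could look like. It is not the authors' data format or code: the function names, the dictionary layout, the choice of the classic successor transformation (abc → abd), and the made-up symbol ordering `novel` are all assumptions for illustration.

```python
# Illustrative sketch only; names and data layout are hypothetical,
# not taken from the paper.

def successor(s: str, alphabet: str) -> str:
    """Replace the last symbol of s with the next symbol in `alphabet`.

    Assumes the last symbol is not the final symbol of the alphabet.
    """
    idx = alphabet.index(s[-1])
    return s[:-1] + alphabet[idx + 1]


def copy_string(s: str, alphabet: str) -> str:
    """Copying task: the output is identical to the input."""
    return s


def make_problem(transform, alphabet: str, source: str, target: str) -> dict:
    """Build one analogy 'source -> source_out, target -> ?' with its answer."""
    return {
        "source": source,
        "source_out": transform(source, alphabet),
        "target": target,
        "answer": transform(target, alphabet),
    }


latin = "abcdefghijklmnopqrstuvwxyz"

# Classic letter-string analogy: abc -> abd, ijk -> ?
analogy = make_problem(successor, latin, "abc", "ijk")

# Copying task over the same strings: abc -> abc, ijk -> ?
copying = make_problem(copy_string, latin, "abc", "ijk")

# Same transformation over a hypothetical unfamiliar symbol ordering.
novel = "qwertyuiop"
novel_analogy = make_problem(successor, novel, "qwe", "tyu")

print(analogy)
print(copying)
print(novel_analogy)
```

In this sketch the "new alphabet" case keeps the transformation fixed and changes only the symbol ordering, which is one plausible reading of the generalization setting the abstract describes.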


Related papers