Fugu-MT Paper Translation (Abstract): A mathematical theory of balancing relational generalization and memorization

Paper overview: A mathematical theory of balancing relational generalization and memorization

arxiv url: http://arxiv.org/abs/2605.22972v1
Date: Thu, 21 May 2026 19:04:19 GMT
Status: Translation complete
Last updated in system: 2026-05-25 17:29:20.066754
Title: A mathematical theory of balancing relational generalization and memorization
Title (reference translation): A mathematical theory of the balance between relational generalization and memorization
Authors: Luke Cheng, Samuel Lippl
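Abstract summary: We argue that a lack of task paradigms has hindered the study of this essential ability. We analytically characterize the behavior of a simple, theoretically tractable model of neural network learning. These models can balance relational generalization and memorization.
Reference score (independently computed attention score): 4.297070083645049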
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Humans, animals, and modern machine learning models exhibit impressive abilities to learn complex behaviors and generalize these behaviors to unseen situations. This ability requires us to learn rules and regularities that allow for such generalizations. At the same time, in most complex environments, any rule will have its exceptions. How do learning systems balance between learning general regularities and memorizing exceptions? We argue that a lack of task paradigms has hindered the study of this essential ability. To address this gap, we introduce a novel task, transitive inference with exceptions, that tests for relational generalization and memorization of an exception to the relational rule. We then analytically characterize the behavior of a simple, theoretically tractable model of neural network learning (kernel ridge regression) across a broad family of representations and task parameters. We find that these models can balance between relational generalization and memorization, but unlike for transitive inference without an exception, successful generalization is sensitive to the specific representational geometry. We explain why this task is more challenging mechanistically by drawing on our analytical theory. Finally, we validate our theoretical insights in pretrained language models that are finetuned on ordered relations, finding that these models successfully generalize according to the transitive rule, but also make the kinds of systematic mistakes predicted by our theory. Overall, our theory shows how learning systems can balance between relational generalization and memorization, explains how this can go wrong, and emphasizes the need for new task paradigms designed to probe this ability.
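Abstract (reference translation): Humans, animals, and modern machine learning models show an impressive ability to learn complex behaviors and to generalize those behaviors to unseen situations. This ability requires learning the rules and regularities that make such generalization possible. At the same time, in most complex environments every rule has exceptions. How do learning systems strike a balance between learning general regularities and memorizing exceptions? We argue that a lack of task paradigms has hindered the study of this essential ability. To address this gap, we introduce a new task, transitive inference with exceptions, which tests both relational generalization and memorization of an exception to the relational rule. We then analytically characterize the behavior of a simple, theoretically tractable model of neural network learning (kernel ridge regression) across a broad family of representations and task parameters. These models can balance relational generalization and memorization, but unlike in transitive inference without an exception, successful generalization is sensitive to the specific representational geometry. Drawing on our analytical theory, we explain mechanistically why this task is more challenging. Finally, we validate our theoretical insights in pretrained language models finetuned on ordered relations, finding that these models successfully generalize according to the transitive rule but also make the kinds of systematic mistakes our theory predicts. Overall, our theory shows how learning systems can balance relational generalization and memorization, explains how this can go wrong, and emphasizes the need for new task paradigms designed to probe this ability.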
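The abstract describes the task and model only in words, so the following is a minimal sketch of what kernel ridge regression on a transitive-inference task with one exception could look like. It is an illustration under stated assumptions, not the authors' setup: the number of items, the choice of exception pair, the label convention, the kernel (an additive item-overlap term plus a conjunctive term weighted by a hypothetical parameter `conj`), and the ridge value are all invented here for the example.

```python
import numpy as np

def build_task(n_items=7, exception=(1, 5)):
    """Training set: adjacent pairs in both orders, plus one exception pair.

    Assumed label convention: +1 if the first item precedes the second in
    the order 0 < 1 < ... < n_items-1, else -1. The exception pair gets the
    opposite label to what the transitive rule implies.
    """
    pairs, labels = [], []
    for i in range(n_items - 1):
        pairs += [(i, i + 1), (i + 1, i)]
        labels += [1.0, -1.0]
    a, b = exception  # assumed non-adjacent with a < b
    pairs += [(a, b), (b, a)]
    labels += [-1.0, 1.0]  # reversed relative to the rule
    return pairs, np.array(labels)

def heldout_pairs(n_items, train_pairs):
    """All ordered pairs of distinct items that were not trained on."""
    seen = set(train_pairs)
    return [(a, b) for a in range(n_items) for b in range(n_items)
            if a != b and (a, b) not in seen]

def kernel(P, Q, conj=0.5):
    """Assumed kernel: shared first item + shared second item + conj * same pair."""
    P, Q = np.asarray(P), np.asarray(Q)
    same_first = (P[:, None, 0] == Q[None, :, 0]).astype(float)
    same_second = (P[:, None, 1] == Q[None, :, 1]).astype(float)
    return same_first + same_second + conj * same_first * same_second

def fit_predict(train_pairs, y, test_pairs, ridge=1e-6, conj=0.5):
    """Standard kernel ridge regression: solve (K + ridge*I) alpha = y."""
    K = kernel(train_pairs, train_pairs, conj)
    alpha = np.linalg.solve(K + ridge * np.eye(len(y)), y)
    return kernel(test_pairs, train_pairs, conj) @ alpha

if __name__ == "__main__":
    n = 7
    train, y = build_task(n)
    test = heldout_pairs(n, train)
    pred = fit_predict(train, y, test)
    rule = np.array([1.0 if a < b else -1.0 for a, b in test])
    # Fraction of held-out pairs whose predicted sign follows the transitive rule
    print("rule-consistent fraction:", np.mean(np.sign(pred) == rule))
```

Varying `conj` changes how much the representation treats each pair as a unique conjunction rather than a sum of its items, which is one simple way to explore the sensitivity to representational geometry that the abstract mentions.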

Related papers