Fugu-MT paper translation (abstract): Bearing Syntactic Fruit with Stack-Augmented Neural Networks

Paper overview: Bearing Syntactic Fruit with Stack-Augmented Neural Networks

arxiv url: http://arxiv.org/abs/2511.03547v1
Date: Wed, 05 Nov 2025 15:30:58 GMT
Status: translation complete
Last updated in system: 2025-11-06 18:19:32.461715
Title: Bearing Syntactic Fruit with Stack-Augmented Neural Networks
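Title (reference translation): Bearing Syntactic Fruit with Stack-Augmented Neural Networks
Authors: Brian DuSell, Ryan Cotterell
Abstract summary: The paper presents evidence that stack-augmented neural networks may be more accurate models of human language acquisition than standard architectures. It also proposes a modification to the stack RNN architecture that improves hierarchical generalization.
Reference score (attention score computed independently by the site): 59.49467149799849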
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Any finite set of training data is consistent with an infinite number of hypothetical algorithms that could have generated it. Studies have shown that when human children learn language, they consistently favor hypotheses based on hierarchical syntactic rules without ever encountering disambiguating examples. A recent line of work has inquired as to whether common neural network architectures share this bias, finding that they do so only under special conditions: when syntactically supervised, when pre-trained on massive corpora, or when trained long past convergence. In this paper, we demonstrate, for the first time, neural network architectures that are able to generalize in human-like fashion without any of the aforementioned requirements: stack-augmented neural networks. We test three base architectures (transformer, simple RNN, LSTM) augmented with two styles of stack: the superposition stack of Joulin & Mikolov (2015) and a nondeterministic generalization of it proposed by DuSell & Chiang (2023). We find that transformers with nondeterministic stacks generalize best out of these architectures on a classical question formation task. We also propose a modification to the stack RNN architecture that improves hierarchical generalization. These results suggest that stack-augmented neural networks may be more accurate models of human language acquisition than standard architectures, serving as useful objects of psycholinguistic study. Our code is publicly available.
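Abstract (reference translation): Any finite set of training data is consistent with infinitely many hypothetical algorithms that could have generated it. Studies show that children learning language consistently favor hypotheses based on hierarchical syntactic rules without ever encountering disambiguating examples. A recent line of work asks whether common neural network architectures share this bias and finds that they do so only under special conditions: with syntactic supervision, with pre-training on massive corpora, or with training long past convergence. This paper demonstrates, for the first time, neural network architectures that generalize in a human-like fashion without any of these requirements: stack-augmented neural networks. Three base architectures (transformer, simple RNN, LSTM) are tested, each augmented with two styles of stack: the superposition stack of Joulin & Mikolov (2015) and a nondeterministic generalization of it proposed by DuSell & Chiang (2023). Transformers with nondeterministic stacks generalize best among these architectures on a classical question formation task. The paper also proposes a modification to the stack RNN architecture that improves hierarchical generalization. These results suggest that stack-augmented neural networks may be more accurate models of human language acquisition than standard architectures and may serve as useful objects of psycholinguistic study. The code is publicly available.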
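Illustrative note: the abstract names the superposition stack of Joulin & Mikolov (2015) without describing it. The sketch below shows the general idea of such a differentiable stack under simplifying assumptions: a fixed-depth stack of scalar cells, with push, pop, and no-op weights supplied by hand. In an actual stack-augmented network these weights and the pushed value would be computed from the controller's hidden state. The function name `superposition_stack_step` and its parameters are hypothetical, and this is not the authors' released code.

```python
def superposition_stack_step(stack, push_value, a_push, a_pop, a_noop):
    """Apply one soft update to a fixed-depth stack (index 0 is the top).

    The new stack is a weighted mix ("superposition") of the three stacks
    that would result from pushing, popping, or doing nothing.
    The weights a_push, a_pop, a_noop are assumed non-negative and to sum to 1.
    """
    depth = len(stack)
    new_stack = []
    for i in range(depth):
        # After a push, cell i holds what used to be one cell above it
        # (or the newly pushed value at the top).
        after_push = push_value if i == 0 else stack[i - 1]
        # After a pop, cell i holds what used to be one cell below it
        # (or 0.0 when nothing is below; the bottom is padded with zeros).
        after_pop = stack[i + 1] if i + 1 < depth else 0.0
        # After a no-op, cell i is unchanged.
        after_noop = stack[i]
        new_stack.append(a_push * after_push + a_pop * after_pop + a_noop * after_noop)
    return new_stack


# Hard actions (weight 1.0 on one action) behave like an ordinary stack.
s = [0.0, 0.0, 0.0]
s = superposition_stack_step(s, 1.0, a_push=1.0, a_pop=0.0, a_noop=0.0)  # push 1.0
s = superposition_stack_step(s, 2.0, a_push=1.0, a_pop=0.0, a_noop=0.0)  # push 2.0
hard_result = s

# A soft action blends the pushed and popped versions of the stack.
soft_result = superposition_stack_step(s, 4.0, a_push=0.5, a_pop=0.5, a_noop=0.0)
```

With weight 1.0 on a single action the update reduces to an ordinary stack operation; fractional weights give a blend that stays differentiable with respect to the weights, which is what allows a network to learn stack behavior by gradient descent.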
