Fugu-MT Paper Translation (Abstract): Exploring Depth Generalization in Large Language Models for Solving Recursive Logic Tasks

Paper overview: Exploring Depth Generalization in Large Language Models for Solving Recursive Logic Tasks

arxiv url: http://arxiv.org/abs/2512.02677v1
Date: Tue, 02 Dec 2025 12:04:51 GMT
Status: Translation complete
Last updated in system: 2025-12-03 21:04:45.852344
Title: Exploring Depth Generalization in Large Language Models for Solving Recursive Logic Tasks
Authors: Zhiyuan He,
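Abstract summary: We show that transformer architectures struggle with problems involving deeper recursion than encountered during training. This limitation stems from their inability to maintain stack-like behavior. We develop a looped locate-and-replace pipeline that decomposes problems into manageable subcomponents.
Reference score (independently computed attention level): 1.0378456753266476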
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models have demonstrated remarkable capabilities across many tasks, yet face significant challenges when dealing with recursive reasoning problems, those requiring the resolution of nested hierarchical structures. While prior research has extensively studied length generalization (a model's ability to handle longer sequences than seen during training), we investigate a distinct and underexplored limitation: depth generalization. Here, depth refers to the number of nested levels in a hierarchical problem, such as the layers of parentheses in a mathematical expression or the nesting of logical clauses in a Boolean formula. Our work reveals that standard transformer architectures struggle with problems involving deeper recursion than encountered during training, even when they perform well on longer but non-nested sequences. This limitation stems from their inability to maintain stack-like behavior, the capacity to track and resolve multiple levels of nested dependencies. Through systematic analysis, we demonstrate how this architectural constraint leads to rapid performance decay as the depth of the recursion increases. To address this challenge, we develop a novel looped locate-and-replace pipeline that decomposes recursive problems into manageable subcomponents. The approach employs two specialized models: a locator that identifies solvable subexpressions and a replacer that evaluates these components while preserving the overall structure. We evaluated this method in three carefully designed domains: Boolean algebra, recursive arithmetic, and propositional logic, each with a controllable depth of recursion. We show that our method effectively alleviates the performance decay when tested on out-of-distribution recursion depth.
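The looped locate-and-replace idea described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: in the paper the locator and replacer are trained models, whereas here both are hand-written rules, and the token format (`T`, `F`, `not`, `and`, `or`, space-separated parentheses) and all function names are hypothetical. The sketch shows nesting depth as defined in the abstract and how each loop iteration resolves one innermost subexpression while the rest of the structure is left untouched.

```python
import re

# Hypothetical stand-ins: the paper uses two trained models (a locator and a
# replacer). Here both are simple rules so the looping control flow is visible.
INNERMOST = re.compile(r"\([^()]*\)")  # a parenthesized span with no nested parens


def nesting_depth(expr: str) -> int:
    """Return the maximum parenthesis nesting level of expr."""
    depth = max_depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth


def eval_flat(expr: str) -> str:
    """Evaluate a parenthesis-free expression over T/F with not, and, or."""
    # Resolve 'not' first, scanning right to left so chained negations work.
    stack = []
    for tok in reversed(expr.split()):
        if tok == "not":
            operand = stack.pop()
            stack.append("F" if operand == "T" else "T")
        else:
            stack.append(tok)
    flat = " ".join(reversed(stack))
    # 'and' binds tighter than 'or': split into or-groups of and-terms.
    groups = flat.split(" or ")
    value = any(all(t == "T" for t in g.split(" and ")) for g in groups)
    return "T" if value else "F"


def locate(expr: str):
    """Locator stand-in: span of the first innermost subexpression, or None."""
    match = INNERMOST.search(expr)
    return match.span() if match else None


def replace(expr: str, span) -> str:
    """Replacer stand-in: evaluate the located span, keep everything else."""
    start, end = span
    value = eval_flat(expr[start + 1:end - 1])  # strip the outer parentheses
    return expr[:start] + value + expr[end:]


def solve(expr: str):
    """Loop locate -> replace until no parentheses remain; return (value, steps)."""
    steps = 0
    while (span := locate(expr)) is not None:
        expr = replace(expr, span)
        steps += 1
    return eval_flat(expr), steps


if __name__ == "__main__":
    example = "not ( ( T and F ) or ( not F and T ) )"
    print(nesting_depth(example), solve(example))
```

Because each iteration only ever handles a flat, innermost subexpression, the per-step difficulty in this sketch does not grow with nesting depth; only the number of loop iterations does.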

Related papers