Fugu-MT Paper Translation (Abstract): Context-Free Recognition with Transformers

Paper summary: Context-Free Recognition with Transformers

arxiv url: http://arxiv.org/abs/2601.01754v1
Date: Mon, 05 Jan 2026 03:14:23 GMT
Status: translation complete
Last updated in system: 2026-01-06 16:25:22.748073
Title: Context-Free Recognition with Transformers
Title (reference translation): Context-Free Recognition with Transformers
Authors: Selim Jerad, Anej Svete, Sophie Hao, Ryan Cotterell, William Merrill
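Abstract summary: We show that looped transformers with $\mathcal{O}(\log n)$ looping layers and $\mathcal{O}(n^6)$ padding tokens can recognize all CFLs. We empirically validate our results and show that looping helps on a language that requires logarithmic depth.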
Reference score (independently computed attention score): 57.46376097734401
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transformers excel on tasks that process well-formed inputs according to some grammar, such as natural language and code. However, it remains unclear how they can process grammatical syntax. In fact, under standard complexity conjectures, standard transformers cannot recognize context-free languages (CFLs), a canonical formalism to describe syntax, or even regular languages, a subclass of CFLs (Merrill et al., 2022). Merrill & Sabharwal (2024) show that $\mathcal{O}(\log n)$ looping layers (w.r.t. input length $n$) allow transformers to recognize regular languages, but the question of context-free recognition remained open. In this work, we show that looped transformers with $\mathcal{O}(\log n)$ looping layers and $\mathcal{O}(n^6)$ padding tokens can recognize all CFLs. However, training and inference with $\mathcal{O}(n^6)$ padding tokens is potentially impractical. Fortunately, we show that, for natural subclasses such as unambiguous CFLs, the recognition problem on transformers becomes more tractable, requiring $\mathcal{O}(n^3)$ padding. We empirically validate our results and show that looping helps on a language that provably requires logarithmic depth. Overall, our results shed light on the intricacy of CFL recognition by transformers: While general recognition may require an intractable amount of padding, natural constraints such as unambiguity yield efficient recognition algorithms.
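Abstract (reference translation): Transformers excel at tasks that process well-formed inputs according to some grammar, such as natural language and code. However, it remains unclear how they can process grammatical syntax. In fact, under standard complexity conjectures, standard transformers cannot recognize context-free languages (CFLs). Merrill & Sabharwal (2024) show that $\mathcal{O}(\log n)$ looping layers (w.r.t. input length $n$) allow transformers to recognize regular languages, but the question of context-free recognition remained open. In this work, we show that looped transformers with $\mathcal{O}(\log n)$ looping layers and $\mathcal{O}(n^6)$ padding tokens can recognize all CFLs. However, training and inference with $\mathcal{O}(n^6)$ padding tokens is potentially impractical. Fortunately, for natural subclasses such as unambiguous CFLs, the recognition problem on transformers becomes more tractable, requiring $\mathcal{O}(n^3)$ padding. We empirically validate our results and show that looping helps on a language that requires logarithmic depth. While general recognition may require an intractable amount of padding, natural constraints such as unambiguity yield efficient recognition algorithms.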
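
For readers unfamiliar with the recognition problem the abstract refers to, the following is a minimal sketch of the classic sequential CYK algorithm, which decides whether a string belongs to a CFL given in Chomsky normal form. It is not the looped-transformer construction from the paper; the toy grammar (non-empty balanced parentheses) and the names `TERMINAL_RULES`, `BINARY_RULES`, and `cyk_recognize` are assumptions chosen purely for illustration.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form for non-empty balanced
# parentheses: S -> S S | L R | L T,  T -> S R,  L -> "(",  R -> ")".
TERMINAL_RULES = {"(": {"L"}, ")": {"R"}}
BINARY_RULES = {
    ("S", "S"): {"S"},
    ("L", "R"): {"S"},
    ("L", "T"): {"S"},
    ("S", "R"): {"T"},
}


def cyk_recognize(word, start="S"):
    """Return True if `word` is derivable from `start` in the toy grammar."""
    n = len(word)
    if n == 0:
        return False  # this toy grammar does not derive the empty string
    # table[i][j] holds the nonterminals that derive word[i : j + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[i][i] = set(TERMINAL_RULES.get(symbol, ()))
    # Fill longer spans from shorter ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return start in table[0][n - 1]
```

For example, `cyk_recognize("(())")` accepts and `cyk_recognize("(()")` rejects. The table has $\mathcal{O}(n^2)$ cells and each cell tries $\mathcal{O}(n)$ split points, so this sequential sketch runs in $\mathcal{O}(n^3)$ time for a fixed grammar.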

Related papers