Fugu-MT paper translation (abstract): Probing Length Generalization in Mamba via Image Reconstruction

Paper overview: Probing Length Generalization in Mamba via Image Reconstruction

arxiv url: http://arxiv.org/abs/2603.12499v1
Date: Thu, 12 Mar 2026 22:32:27 GMT
Status: Translation complete
Last updated in system: 2026-03-16 17:38:11.795639
Title: Probing Length Generalization in Mamba via Image Reconstruction
Title (reference translation): Probing Length Generalization in Mamba via Image Reconstruction
Authors: Jan Rathjens, Robin Schiewer, Laurenz Wiskott, Anand Subramoney
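Abstract summary: Mamba's performance can degrade when inference sequence lengths exceed those seen during training. By analyzing reconstructions at different stages of sequence processing, we reveal that Mamba qualitatively adapts to the distribution of sequence lengths encountered during training. We propose a length-adaptive Mamba that improves performance across training sequence lengths.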
Reference score (independently computed attention score): 0.434964016971127
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Mamba has attracted widespread interest as a general-purpose sequence model due to its low computational complexity and competitive performance relative to transformers. However, its performance can degrade when inference sequence lengths exceed those seen during training. We study this phenomenon using a controlled vision task in which Mamba reconstructs images from sequences of image patches. By analyzing reconstructions at different stages of sequence processing, we reveal that Mamba qualitatively adapts its behavior to the distribution of sequence lengths encountered during training, resulting in strategies that fail to generalize beyond this range. To support our analysis, we introduce a length-adaptive variant of Mamba that improves performance across training sequence lengths. Our results provide an intuitive perspective on length generalization in Mamba and suggest directions for improving the architecture.
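Abstract (reference translation): Mamba has attracted wide interest as a general-purpose sequence model because of its low computational complexity and its competitive performance compared with transformers. However, its performance can degrade when the inference sequence length exceeds the lengths seen during training. We study this phenomenon with a controlled vision task in which Mamba reconstructs images from sequences of image patches. By analyzing reconstructions at different stages of sequence processing, we show that Mamba qualitatively adapts to the distribution of sequence lengths encountered during training and, as a result, arrives at strategies that do not generalize beyond this range. To support the analysis, we introduce a length-adaptive variant of Mamba that improves performance across training sequence lengths. The results offer an intuitive perspective on length generalization in Mamba and suggest directions for improving the architecture.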

Related papers