Fugu-MT paper translation (abstract): Compressing Observation History into Agent Memory: Distilling Transformers into Recurrent Transformers

Paper overview: Compressing Observation History into Agent Memory: Distilling Transformers into Recurrent Transformers

arxiv url: http://arxiv.org/abs/2606.21562v1
Date: Fri, 19 Jun 2026 15:58:36 GMT
Status: Translation complete
Last updated in system: 2026-06-25 12:42:59.989776
Title: Compressing Observation History into Agent Memory: Distilling Transformers into Recurrent Transformers
Title (reference translation): Compressing observation history into agent memory: distilling transformers into recurrent transformers
Authors: Philippe Weinzaepfel, Christian Wolf, Bülent Mert Sariyildiz, Guillaume Bono, Gianluca Monaci
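Abstract summary: We focus on long-horizon streaming vision and robotics applications such as map-free pose estimation. We argue that the performance gap between recurrent transformers and full-history transformers stems not from architectural limitations but from differences in how these models compress past information. We show that the proposed approach makes it possible to train a recurrent latent robotic memory with linear-time complexity while substantially narrowing the performance gap to full-history transformers.
Reference score (independently computed attention metric): 25.583827940095322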
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transformers are AI's workhorse with strong performance in modeling sequential data, but their computational cost becomes prohibitive when processing long sequences. We target long-horizon streaming vision and robotics applications like map-free pose estimation, where it is particularly impractical to store and maintain a history of observations. Recurrent Transformers address this limitation by maintaining a fixed-size memory, but their performance lags behind that of transformers operating over the full observation history. We argue that this gap does not stem from architectural limitations, but from differences in how these models learn to compress past information. Without access to an observation history, recurrent models must explicitly decide what to retain in memory at each step, a significantly harder learning problem. In this work, we propose a distillation approach that transfers the compression strategy of a classical full-history transformer to a recurrent variant. We enable this by designing a teacher model that explicitly compresses its observation history into a fixed-size bottleneck representation. By directly supervising the student's memory with this bottleneck representation, we align the two compression mechanisms. We show that this approach makes it possible to train a recurrent latent robotic memory with linear-time complexity while substantially narrowing the performance gap to full-history transformers.
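Abstract (reference translation): Transformers are the workhorse of AI, with strong performance in modeling sequential data, but their computational cost becomes prohibitive when processing long sequences. We target long-horizon streaming vision and robotics applications such as map-free pose estimation, where storing and maintaining a history of observations is particularly impractical. Recurrent transformers address this limitation by maintaining a fixed-size memory, but their performance lags behind that of transformers operating over the full observation history. We argue that this gap stems not from architectural limitations but from differences in how these models compress past information. Without access to the observation history, a recurrent model must explicitly decide at each step what to retain in memory. In this work, we propose a distillation approach that transfers the compression strategy of a classical full-history transformer to a recurrent transformer. We achieve this by designing a teacher model that explicitly compresses its observation history into a fixed-size bottleneck representation. By directly supervising the student's memory with this bottleneck representation, we align the two compression mechanisms. We show that the proposed approach makes it possible to train a recurrent latent robotic memory with linear-time complexity while substantially narrowing the performance gap to full-history transformers.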
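
The abstract's central idea, supervising a recurrent student's fixed-size memory with a full-history teacher's bottleneck, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the paper's architecture: the teacher is replaced by mean pooling over the history, the student by a gain-based running update, and the names `teacher_bottleneck`, `student_update`, and `memory_distillation_loss` are hypothetical.

```python
from typing import List

Vector = List[float]


def teacher_bottleneck(history: List[Vector]) -> Vector:
    """Hypothetical teacher: reads the full history, emits a fixed-size vector.

    Mean pooling is only a placeholder for a transformer bottleneck.
    """
    dim = len(history[0])
    return [sum(obs[i] for obs in history) / len(history) for i in range(dim)]


def student_update(memory: Vector, obs: Vector, gain: float) -> Vector:
    """Hypothetical student: updates memory from the current observation only.

    Constant work per step, so a full sequence costs linear time.
    """
    return [m + gain * (o - m) for m, o in zip(memory, obs)]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def memory_distillation_loss(observations: List[Vector], gains: List[float]) -> float:
    """Average per-step MSE between student memory and teacher bottleneck.

    The teacher is recomputed on each prefix of the history; only the
    student runs without access to past observations.
    """
    memory = [0.0] * len(observations[0])
    total = 0.0
    for step, (obs, gain) in enumerate(zip(observations, gains), start=1):
        memory = student_update(memory, obs, gain)
        target = teacher_bottleneck(observations[:step])
        total += mse(memory, target)
    return total / len(observations)


observations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Gains of 1/t make the student an exact running mean, matching this toy teacher.
aligned_loss = memory_distillation_loss(observations, [1.0, 1.0 / 2.0, 1.0 / 3.0])

# A fixed gain compresses the history differently, so the loss is positive.
misaligned_loss = memory_distillation_loss(observations, [0.5, 0.5, 0.5])
```

In this toy setting the loss vanishes only when the student's update rule reproduces the teacher's compression. In an actual training setup the student's update would be learned, and a loss of this kind would presumably be combined with a task loss; the abstract does not specify those details.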

Related papers