Fugu-MT Paper Translation (Abstract): Memory Retrieval in Visuomotor Policies for Long-Horizon Robot Control

Paper overview: Memory Retrieval in Visuomotor Policies for Long-Horizon Robot Control

arxiv url: http://arxiv.org/abs/2606.25136v1
Date: Tue, 23 Jun 2026 20:07:23 GMT
Status: translation complete
Last updated in system: 2026-06-25 17:05:30.135936
Title: Memory Retrieval in Visuomotor Policies for Long-Horizon Robot Control
Title (reference translation): Memory Retrieval in Visuomotor Policies for Long-Horizon Robot Control
Authors: Rutav Shah, Yisu Li, Femi Bello, Yuke Zhu, Roberto Martín-Martín
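Abstract summary: General-purpose robots operating in partially observable environments, such as homes, require memory to support autonomy. This paper introduces HALO, a visuomotor policy with an attention-based memory retrieval mechanism for long-horizon control.
Reference score (independently computed attention measure): 33.5619212312672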
License: http://creativecommons.org/licenses/by/4.0/
Abstract: General-purpose robots operating in partially observable environments, such as homes, require memory to support autonomy. They must recall diverse information from the past, such as where objects were placed, which tasks a human partner has completed, and when an appliance was turned on. Achieving this versatility requires a general memory retrieval mechanism. Transformer architectures that use attention over long contexts for memory retrieval provide a promising approach, as they learn retrieval from data rather than relying on task-specific or hand-designed rules. However, directly incorporating them into imitation learning from offline data introduces two key challenges: (1) the policy may learn spurious correlations between past information and predicted actions, and (2) errors accumulate in memory due to prediction inaccuracies and their compounding interactions with the environment, causing model drift and cascading failures. To address both challenges, we introduce HALO, a visuomotor policy with an attention-based memory retrieval mechanism for long-horizon control. First, to suppress spurious correlations, HALO distills vision-language model (VLM) priors into the policy. It generates memory-dependent question-answer pairs from demonstration trajectories and trains jointly with a video question-answering objective, steering retrieval toward task-relevant information. Second, to reduce the impact of accumulated errors in memory during closed-loop control, HALO uses sparse attention that restricts retrieval to only the most relevant parts of the history. Together, these components enable more reliable long-horizon control by guiding the policy to retrieve task-relevant information from up to eight minutes of past experience. Project website: https://robin-lab.cs.utexas.edu/HALO
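Abstract (reference translation): General-purpose robots that operate in partially observable environments, such as homes, need memory to support autonomy. They must recall a variety of past information, such as where objects were placed, which tasks a human partner has completed, and when an appliance was turned on. Achieving this versatility requires a general memory retrieval mechanism. Transformer architectures that use attention over long contexts for memory retrieval offer a promising approach, because they learn retrieval from data instead of relying on task-specific or hand-designed rules. However, incorporating them directly into imitation learning from offline data raises two key challenges: (1) the policy may learn spurious correlations between past information and predicted actions, and (2) errors accumulate in memory through prediction inaccuracies and their compounding interactions with the environment, causing model drift and cascading failures. To address both challenges, the authors introduce HALO, a visuomotor policy with an attention-based memory retrieval mechanism for long-horizon control. First, to suppress spurious correlations, HALO distills vision-language model (VLM) priors into the policy. It generates memory-dependent question-answer pairs from demonstration trajectories and trains jointly with a video question-answering objective, steering retrieval toward task-relevant information. Second, to reduce the impact of errors accumulated in memory during closed-loop control, HALO uses sparse attention that restricts retrieval to only the most relevant parts of the history. Together, these components enable more reliable long-horizon control by guiding the policy to retrieve task-relevant information from up to eight minutes of past experience. Project website: https://robin-lab.cs.utexas.edu/HALO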
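
The abstract says only that HALO's sparse attention restricts retrieval to the most relevant parts of the history; it does not say how those parts are selected. As a rough illustration of the general idea, the sketch below assumes a plain top-k rule over scaled dot-product scores. The function name `topk_sparse_attention`, the parameter `k`, and the toy history are all hypothetical and are not taken from the paper.

```python
import math

def topk_sparse_attention(query, keys, values, k):
    """Attend only to the k history entries whose keys best match the query.

    Generic top-k sparse attention, assumed here for illustration only;
    it is not HALO's actual selection rule.
    """
    if not keys:
        raise ValueError("history is empty")
    d = len(query)
    # Scaled dot-product score for every stored history entry.
    scores = [sum(q * x for q, x in zip(query, key)) / math.sqrt(d) for key in keys]
    # Indices of the k highest-scoring entries; all others are masked out.
    k = max(1, min(k, len(scores)))
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only (max subtracted for stability).
    m = max(scores[i] for i in top)
    exp = {i: math.exp(scores[i] - m) for i in top}
    z = sum(exp.values())
    weights = [exp[i] / z if i in exp else 0.0 for i in range(len(scores))]
    # Weighted sum of the values belonging to the selected entries.
    dim = len(values[0])
    retrieved = [sum(weights[i] * values[i][j] for i in top) for j in range(dim)]
    return retrieved, weights

# Toy history of four entries (hypothetical 2-D keys, 1-D values).
history_keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
history_values = [[1.0], [2.0], [3.0], [4.0]]
retrieved, weights = topk_sparse_attention([1.0, 0.0], history_keys, history_values, k=2)
```

With `k=2`, only the first and third history entries receive non-zero weight, so an erroneous entry elsewhere in the history cannot influence the retrieved vector. That is the intuition behind limiting retrieval in order to contain accumulated errors.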

