Fugu-MT paper translation (abstract): From Local Corrections to Generalized Skills: Improving Neuro-Symbolic Policies with MEMO

Paper overview: From Local Corrections to Generalized Skills: Improving Neuro-Symbolic Policies with MEMO

arxiv url: http://arxiv.org/abs/2603.04560v1
Date: Wed, 04 Mar 2026 19:44:55 GMT
Status: translation complete
Last updated in system: 2026-03-23 08:17:41.911481
Title: From Local Corrections to Generalized Skills: Improving Neuro-Symbolic Policies with MEMO
Authors: Benjamin A. Christie, Yinlong Dai, Mohammad Bararjanianbahnamiri, Simon Stepputtis, Dylan P. Losey
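Abstract summary: Memory Enhanced Manipulation (MEMO) builds and maintains a skillbook gathered from human feedback and task successes. MEMO retrieves relevant text and code from this skillbook, so that the robot's policy can generate new skills while reasoning over multi-task human feedback.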
Reference score (independently computed attention score): 9.795234317898002
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent works use a neuro-symbolic framework for general manipulation policies. The advantage of this framework is that, by applying off-the-shelf vision and language models, the robot can break complex tasks down into semantic subtasks. However, the fundamental bottleneck is that the robot needs skills to ground these subtasks into embodied motions. Skills can take many forms (e.g., trajectory snippets, motion primitives, coded functions), but regardless of their form, skills act as a constraint. The high-level policy can only ground its language reasoning through the available skills; if the robot cannot generate the right skill for the current task, its policy will fail. We propose to address this limitation, and dynamically expand the robot's skills, by leveraging user feedback. When a robot fails, humans can intuitively explain what went wrong (e.g., "no, go higher"). While a simple approach is to recall this exact text the next time the robot faces a similar situation, we hypothesize that by collecting, clustering, and re-phrasing natural language corrections across multiple users and tasks, we can synthesize more general text guidance and coded skill templates. Applying this hypothesis, we develop Memory Enhanced Manipulation (MEMO). MEMO builds and maintains a retrieval-augmented skillbook gathered from human feedback and task successes. At run time, MEMO retrieves relevant text and code from this skillbook, enabling the robot's policy to generate new skills while reasoning over multi-task human feedback. Our experiments demonstrate that using MEMO to aggregate local feedback into general skill templates enables generalization to novel tasks where existing baselines fall short. See supplemental material here: https://collab.me.vt.edu/memo
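The run-time retrieval step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how MEMO stores or retrieves entries, so the names `Skillbook`, `SkillEntry`, and `retrieve`, the example guidance strings, and the bag-of-words cosine similarity are all hypothetical stand-ins.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class SkillEntry:
    """One skillbook record (hypothetical layout, not from the paper)."""
    guidance: str  # generalized text guidance distilled from corrections
    code: str      # coded skill template, kept here as source text


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Skillbook:
    """Toy retrieval store; similarity measure is an assumption."""

    def __init__(self):
        self.entries = []

    def add(self, guidance, code):
        self.entries.append(SkillEntry(guidance, code))

    def retrieve(self, task, k=1):
        """Return the k entries whose guidance best matches the task text."""
        query = Counter(tokenize(task))
        ranked = sorted(
            self.entries,
            key=lambda e: cosine(query, Counter(tokenize(e.guidance))),
            reverse=True,
        )
        return ranked[:k]


book = Skillbook()
book.add(
    "When placing an object on a shelf, raise the gripper above the shelf "
    "edge before moving forward.",
    "def place_on_shelf(robot, height_margin): ...",
)
book.add(
    "When opening a drawer, grasp the handle and pull straight back.",
    "def open_drawer(robot, pull_distance): ...",
)

# Retrieved text and code would then be passed to the policy as context.
top = book.retrieve("place the cup on the top shelf")[0]
print(top.guidance)
print(top.code)
```

A real system would likely replace the word-count similarity with learned text embeddings; the sketch only shows the shape of a lookup that returns both text guidance and a code template for a new task.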


Related papers