Fugu-MT Paper Translation (Abstract): Streaming Flow Policy: Simplifying diffusion/flow-matching policies by treating action trajectories as flow trajectories

Paper overview: Streaming Flow Policy: Simplifying diffusion/flow-matching policies by treating action trajectories as flow trajectories

arxiv url: http://arxiv.org/abs/2505.21851v1
Date: Wed, 28 May 2025 00:48:19 GMT
Status: Translation complete
Last updated in system: 2025-05-29 17:35:50.343399
Title: Streaming Flow Policy: Simplifying diffusion/flow-matching policies by treating action trajectories as flow trajectories
Title (reference translation): Streaming Flow Policy: Simplifying diffusion/flow-matching policies by treating action trajectories as flow trajectories
Authors: Sunshine Jiang, Xiaolin Fang, Nicholas Roy, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Siddharth Ancha
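Abstract summary: We simplify diffusion/flow policies by treating action trajectories as flow trajectories. Our algorithm samples from a narrow Gaussian around the last action. It then incrementally integrates a velocity field learned via flow matching to produce a sequence of actions that constitute a single trajectory.
Reference score (independently computed attention score): 40.67946168216781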
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in diffusion/flow-matching policies have enabled imitation learning of complex, multi-modal action trajectories. However, they are computationally expensive because they sample a trajectory of trajectories: a diffusion/flow trajectory of action trajectories. They discard intermediate action trajectories, and must wait for the sampling process to complete before any actions can be executed on the robot. We simplify diffusion/flow policies by treating action trajectories as flow trajectories. Instead of starting from pure noise, our algorithm samples from a narrow Gaussian around the last action. Then, it incrementally integrates a velocity field learned via flow matching to produce a sequence of actions that constitute a single trajectory. This enables actions to be streamed to the robot on-the-fly during the flow sampling process, and is well-suited for receding horizon policy execution. Despite streaming, our method retains the ability to model multi-modal behavior. We train flows that stabilize around demonstration trajectories to reduce distribution shift and improve imitation learning performance. Streaming flow policy outperforms prior methods while enabling faster policy execution and tighter sensorimotor loops for learning-based robot control. Project website: https://streaming-flow-policy.github.io/
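Abstract (reference translation): Recent advances in diffusion/flow-matching policies have made imitation learning of complex, multi-modal action trajectories possible. However, these policies are computationally expensive because they sample a trajectory of trajectories, that is, a diffusion/flow trajectory of action trajectories. They discard the intermediate action trajectories and must wait until the sampling process finishes before any action can be executed on the robot. We simplify diffusion/flow policies by treating action trajectories as flow trajectories. Instead of starting from pure noise, our algorithm samples from a narrow Gaussian around the last action. It then incrementally integrates a velocity field learned via flow matching to produce a sequence of actions that constitute a single trajectory. This allows actions to be streamed to the robot on the fly during the flow sampling process, and it is well suited to receding horizon policy execution. Despite streaming, our method retains the ability to model multi-modal behavior. We train flows that stabilize around demonstration trajectories, which reduces distribution shift and improves imitation learning performance. Streaming flow policy outperforms prior methods while enabling faster policy execution and tighter sensorimotor loops for learning-based robot control. Project website: https://streaming-flow-policy.github.io/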
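
The sampling loop described in the abstract (start near the last action, integrate a velocity field step by step, hand over each intermediate state as an action) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: `toy_velocity` is a hand-written stand-in for the learned, observation-conditioned velocity field, `demo` is a made-up 2-D demonstration curve, and the forward Euler integrator, `gain`, `sigma`, and `num_steps` are all assumptions chosen for the example.

```python
import numpy as np


def demo(t):
    """Toy demonstration trajectory (assumption): a 2-D curve for t in [0, 1]."""
    return np.array([np.sin(np.pi * t), t])


def demo_rate(t):
    """Time derivative of the toy demonstration trajectory."""
    return np.array([np.pi * np.cos(np.pi * t), 1.0])


def toy_velocity(action, t, gain=5.0):
    """Hypothetical stand-in for a learned velocity field.

    It follows the demonstration and pulls the action back toward it,
    mimicking a flow that stabilizes around a demonstration trajectory.
    """
    return demo_rate(t) - gain * (action - demo(t))


def stream_actions(last_action, velocity, num_steps=50, sigma=0.01, rng=None):
    """Yield actions one at a time while integrating the velocity field."""
    rng = np.random.default_rng() if rng is None else rng
    # Start from a narrow Gaussian around the last action, not from pure noise.
    action = last_action + sigma * rng.standard_normal(last_action.shape)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        # Forward Euler step; the choice of integrator is an assumption.
        action = action + dt * velocity(action, k * dt)
        # Each intermediate state is itself an action that can be executed.
        yield action.copy()


# Usage: a controller would consume each action as soon as it is yielded.
actions = list(
    stream_actions(demo(0.0), toy_velocity, rng=np.random.default_rng(0))
)
```

Because `stream_actions` is a generator, the caller receives the first action after a single integration step rather than after the whole trajectory has been sampled, which is the property the abstract highlights for receding horizon execution.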


Related papers