Fugu-MT Paper Translation (Summary): Dense-Jump Flow Matching with Non-Uniform Time Scheduling for Robotic Policies: Mitigating Multi-Step Inference Degradation

Paper summary: Dense-Jump Flow Matching with Non-Uniform Time Scheduling for Robotic Policies: Mitigating Multi-Step Inference Degradation

arxiv url: http://arxiv.org/abs/2509.13574v1
Date: Tue, 16 Sep 2025 22:28:27 GMT
Status: Translation complete
Last updated in system: 2025-09-18 18:41:50.665124
Title: Dense-Jump Flow Matching with Non-Uniform Time Scheduling for Robotic Policies: Mitigating Multi-Step Inference Degradation
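Title (reference translation): Dense-Jump Flow Matching with Non-Uniform Time Scheduling for Robotic Policies: Mitigating Multi-Step Inference Degradation
Authors: Zidong Chen, Zihao Guo, Peng Wang, ThankGod Itua Egbe, Yan Lyu, Chenghao Qian
Abstract summary: Flow matching has emerged as a competitive framework for learning high-quality generative policies in robotics. Increasing the number of integration steps at inference counter-intuitively and universally degrades policy performance. This work proposes a new policy that uses non-uniform time scheduling (e.g., U-shaped) during training.
Reference score (independently computed attention score): 9.24627229208295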
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Flow matching has emerged as a competitive framework for learning high-quality generative policies in robotics; however, we find that generalisation arises and saturates early along the flow trajectory, in accordance with recent findings in the literature. We further observe that increasing the number of Euler integration steps during inference counter-intuitively and universally degrades policy performance. We attribute this to (i) additional, uniformly spaced integration steps oversampling the late-time region, thereby constraining actions towards the training trajectories and reducing generalisation; and (ii) the learned velocity field becoming non-Lipschitz as integration time approaches 1, causing instability. To address these issues, we propose a novel policy that utilises non-uniform time scheduling (e.g., U-shaped) during training, which emphasises both early and late temporal stages to regularise policy training, and a dense-jump integration schedule at inference, which replaces the multi-step integration beyond a jump point with a single integration step, to avoid the unstable region near integration time 1. Essentially, our policy is an efficient one-step learner that still pushes forward performance through multi-step integration, yielding up to 23.7% performance gains over state-of-the-art baselines across diverse robotic tasks.
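Abstract (reference translation): Flow matching has emerged as a competitive framework for learning high-quality generative policies in robotics; however, in line with recent findings in the literature, generalisation is found to arise and saturate early along the flow trajectory. Furthermore, increasing the number of Euler integration steps during inference counter-intuitively and universally degrades policy performance. This is attributed to (i) additional, uniformly spaced integration steps oversampling the late-time region, which constrains actions towards the training trajectories and reduces generalisation, and (ii) the learned velocity field becoming non-Lipschitz as integration time approaches 1. To address these issues, the authors propose a new policy that uses non-uniform time scheduling (e.g., U-shaped) during training, emphasising both the early and late stages of policy training, together with a dense-jump integration schedule at inference. Essentially, the policy is an efficient one-step learner that still improves performance through multi-step integration, yielding performance gains of up to 23.7% over state-of-the-art baselines across diverse robotic tasks.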
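
The two ideas named in the abstract, U-shaped time sampling during training and a dense-jump schedule at inference, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the sampling distribution, the jump point, or the network, so everything here is an assumption chosen for illustration: the Beta(alpha, alpha) sampler with alpha < 1 (one common U-shaped density), the function names `sample_u_shaped_times`, `dense_jump_grid`, and `euler_integrate`, the values `n_dense=4` and `t_jump=0.6`, and the toy velocity field standing in for a learned policy.

```python
import numpy as np


def sample_u_shaped_times(n, alpha=0.5, rng=None):
    """Draw n training times in [0, 1] from Beta(alpha, alpha).

    Assumption: for alpha < 1 this density is U-shaped, so samples
    concentrate near t = 0 and t = 1. The paper may use another shape.
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.beta(alpha, alpha, size=n)


def dense_jump_grid(n_dense, t_jump):
    """Build a time grid with n_dense uniform steps on [0, t_jump],
    followed by one single step from t_jump to 1."""
    dense = np.linspace(0.0, t_jump, n_dense + 1)
    return np.append(dense, 1.0)


def euler_integrate(velocity, x0, grid):
    """Explicit Euler integration of dx/dt = velocity(x, t) along grid."""
    x = np.asarray(x0, dtype=float)
    for t0, t1 in zip(grid[:-1], grid[1:]):
        # The velocity is evaluated at the left endpoint only,
        # so it is never queried at t = 1.
        x = x + (t1 - t0) * velocity(x, t0)
    return x


# Toy stand-in for a learned velocity field (hypothetical): it points
# from x to a fixed target and grows without bound as t approaches 1.
target = np.array([1.0, -2.0])


def toy_velocity(x, t):
    return (target - x) / (1.0 - t)


times = sample_u_shaped_times(10_000, rng=np.random.default_rng(0))
grid = dense_jump_grid(n_dense=4, t_jump=0.6)
action = euler_integrate(toy_velocity, np.zeros(2), grid)
```

In this sketch the last grid interval spans the whole region from `t_jump` to 1, so the integrator takes exactly one step there instead of several small ones, which is the behaviour the abstract describes for the dense-jump schedule. How the jump point should be chosen is not stated in the abstract and is left as a free parameter here.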

Related papers