Fugu-MT Paper Translation (Abstract): FODMP: Fast One-Step Diffusion of Movement Primitives Generation for Time-Dependent Robot Actions

Paper abstract: FODMP: Fast One-Step Diffusion of Movement Primitives Generation for Time-Dependent Robot Actions

arxiv url: http://arxiv.org/abs/2603.24806v1
Date: Wed, 25 Mar 2026 20:38:42 GMT
Status: Translation complete
Last updated in system: 2026-03-27 20:52:47.987366
Title: FODMP: Fast One-Step Diffusion of Movement Primitives Generation for Time-Dependent Robot Actions
Title (reference translation): FODMP: Fast One-Step Diffusion for Generating Movement Primitives for Time-Dependent Robot Actions
Authors: Xirui Shi, Arya Ebrahimi, Yi Hu, Jun Jin
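Abstract summary: Diffusion models are increasingly used for robot learning, but current designs face a clear trade-off. This paper proposes FODMP, a new framework that distills diffusion models into the ProDMPs trajectory parameter space and generates motion using a single-step decoder.
Reference score (attention score computed independently by this site): 8.898076552253583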
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion models are increasingly used for robot learning, but current designs face a clear trade-off. Action-chunking diffusion policies like ManiCM are fast to run, yet they only predict short segments of motion. This makes them reactive, but unable to capture time-dependent motion primitives, such as following a spring-damper-like behavior with built-in dynamic profiles of acceleration and deceleration. Recently, Movement Primitive Diffusion (MPD) partially addresses this limitation by parameterizing full trajectories using Probabilistic Dynamic Movement Primitives (ProDMPs), thereby enabling the generation of temporally structured motions. Nevertheless, MPD integrates the motion decoder directly into a multi-step diffusion process, resulting in prohibitively high inference latency that limits its applicability in real-time control settings. We propose FODMP (Fast One-step Diffusion of Movement Primitives), a new framework that distills diffusion models into the ProDMPs trajectory parameter space and generates motion using a single-step decoder. FODMP retains the temporal structure of movement primitives while eliminating the inference bottleneck through single-step consistency distillation. This enables robots to execute time-dependent primitives at high inference speed, suitable for closed-loop vision-based control. On standard manipulation benchmarks (MetaWorld, ManiSkill), FODMP runs up to 10 times faster than MPD and 7 times faster than action-chunking diffusion policies, while matching or exceeding their success rates. Beyond speed, by generating fast acceleration-deceleration motion primitives, FODMP allows the robot to intercept and securely catch a fast-flying ball, whereas action-chunking diffusion policy and MPD respond too slowly for real-time interception.
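Abstract (reference translation): Diffusion models are increasingly used in robot learning, but current designs face a clear trade-off. Action-chunking diffusion policies such as ManiCM run fast, yet they predict only short segments of motion. They are therefore reactive, but they cannot capture time-dependent movement primitives, such as spring-damper-like behavior with built-in acceleration and deceleration profiles. Movement Primitive Diffusion (MPD) recently addressed part of this limitation by parameterizing full trajectories with Probabilistic Dynamic Movement Primitives (ProDMPs), which enables temporally structured motion generation. However, MPD integrates the motion decoder directly into a multi-step diffusion process, and the resulting inference latency is too high for real-time control. This paper proposes FODMP (Fast One-step Diffusion of Movement Primitives), a framework that distills diffusion models into the ProDMPs trajectory parameter space and generates motion with a single-step decoder. FODMP keeps the temporal structure of movement primitives while removing the inference bottleneck through single-step consistency distillation. Robots can thus execute time-dependent primitives at an inference speed suitable for closed-loop, vision-based control. On standard manipulation benchmarks (MetaWorld, ManiSkill), FODMP runs up to 10 times faster than MPD and 7 times faster than action-chunking diffusion policies, while matching or exceeding their success rates. Beyond speed, by generating fast acceleration-deceleration motion primitives, FODMP lets the robot intercept and securely catch a fast-flying ball, whereas the action-chunking diffusion policy and MPD respond too slowly for real-time interception.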
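The abstract describes generating movement-primitive parameters in a single step and decoding them into a full trajectory. The following Python sketch illustrates that data flow only, under loud assumptions: the decoder is a simple ProMP-style linear combination of normalized Gaussian basis functions, not the ProDMP formulation used in the paper, and `one_step_generator` is a hypothetical linear stand-in for a distilled one-step network. All function names, shapes, and parameters are illustrative and do not come from the paper.

```python
import numpy as np


def rbf_basis(num_steps: int, num_basis: int) -> np.ndarray:
    """Normalized Gaussian basis functions over the phase t in [0, 1].

    Returns an array of shape (num_steps, num_basis) whose rows sum to 1.
    """
    t = np.linspace(0.0, 1.0, num_steps)[:, None]
    centers = np.linspace(0.0, 1.0, num_basis)[None, :]
    width = 1.0 / num_basis  # assumed bandwidth, chosen only for illustration
    phi = np.exp(-0.5 * ((t - centers) / width) ** 2)
    return phi / phi.sum(axis=1, keepdims=True)


def decode_trajectory(weights: np.ndarray, num_steps: int = 100) -> np.ndarray:
    """Map weights of shape (num_basis, dof) to a trajectory (num_steps, dof)."""
    return rbf_basis(num_steps, weights.shape[0]) @ weights


def one_step_generator(observation, noise, w_obs, w_noise, num_basis, dof):
    """Hypothetical stand-in for a distilled network.

    A single forward pass maps (observation, noise) to primitive weights,
    in contrast to a multi-step denoising loop.
    """
    flat = observation @ w_obs + noise @ w_noise
    return flat.reshape(num_basis, dof)


rng = np.random.default_rng(0)
obs_dim, num_basis, dof = 4, 8, 2

# Random placeholder parameters; a real model would be trained.
w_obs = rng.normal(size=(obs_dim, num_basis * dof))
w_noise = 0.1 * rng.normal(size=(num_basis * dof, num_basis * dof))

observation = rng.normal(size=obs_dim)
noise = rng.normal(size=num_basis * dof)

# One network call yields the weights; one matrix product yields the trajectory.
weights = one_step_generator(observation, noise, w_obs, w_noise, num_basis, dof)
trajectory = decode_trajectory(weights, num_steps=100)
print(trajectory.shape)
```

In this sketch the whole trajectory comes from one generator call followed by one matrix product, which is the intuition behind avoiding a decoder call at every denoising step. The timing and success-rate figures in the abstract belong to the authors' system and are not reproduced by this toy example.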

Related papers

From Flow to One Step: Real-Time Multi-Modal Trajectory Policies via Implicit Maximum Likelihood Estimation-based Distribution Distillation [18.70033095161235]
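We propose a framework that uses Implicit Maximum Likelihood Estimation (IMLE) to distill a conditional flow-matching expert into a fast single-step student. A bidirectional Chamfer distance provides a set-level objective that promotes both mode coverage and fidelity. A unified perception encoder further integrates multi-view RGB, depth, point clouds, and proprioception into a geometry-aware representation.
Paper reference translation (metadata) (2026-03-10T09:30:05Z)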
Causal Motion Diffusion Models for Autoregressive Motion Generation [19.61051102039212]
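Causal Motion Diffusion Models (CMDM) are a unified framework for autoregressive motion generation. CMDM builds on a Motion-Language-Aligned Causal VAE (MAC-VAE), which encodes motion sequences into temporally causal latent representations. Experiments on HumanML3D and SnapMoGen show that CMDM outperforms existing diffusion and autoregressive models in both semantic fidelity and temporal smoothness.
Paper reference translation (metadata) (2026-02-26T03:58:25Z)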
OMP: One-step Meanflow Policy with Directional Alignment [26.114675928221974]
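The One-step Meanflow Policy (OMP) is designed for high-fidelity, real-time manipulation. Experiments on the Adroit and Meta-World benchmarks show that OMP outperforms state-of-the-art methods in success rate and trajectory accuracy.
Paper reference translation (metadata) (2025-12-22T12:45:35Z)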
Characterizing Motion Encoding in Video Diffusion Timesteps [50.13907856401258]
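This work studies how motion is encoded across video diffusion timesteps, using the trade-off between appearance editing and motion preservation. It identifies an early motion-dominant regime followed by an appearance-dominant regime, and derives the boundary of the motion-dominant regime in timestep space.
Paper reference translation (metadata) (2025-12-18T21:20:54Z)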
Diffusion Model-based Activity Completion for AI Motion Capture from Videos [2.9271399793140076]
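Current AI motion capture methods, like conventional motion capture, rely entirely on the observed video sequence. This paper proposes a diffusion-model-based motion completion method that generates complementary human motion sequences. By introducing a gate module and a position-time embedding module, it achieves competitive results on the Human3.6M dataset.
Paper reference translation (metadata) (2025-05-27T05:04:50Z)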
FRMD: Fast Robot Motion Diffusion with Consistency-Distilled Movement Primitives for Smooth Action Generation [3.7351623987275873]
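This work proposes Fast Robot Motion Diffusion (FRMD), a method for generating smooth, temporally consistent robot motions. The method integrates movement primitives (MPs) with consistency models to achieve efficient single-step trajectory generation. The results show that FRMD generates trajectories faster and more smoothly while achieving high success rates.
Paper reference translation (metadata) (2025-03-03T20:56:39Z)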
EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation [57.539634387672656]
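Current state-of-the-art generative diffusion models achieve impressive results but struggle to generate quickly without sacrificing quality. We propose the Efficient Motion Diffusion Model (EMDM) for fast, high-quality human motion generation.
Paper reference translation (metadata) (2023-12-04T18:58:38Z)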
StreamYOLO: Real-time Object Detection for Streaming Perception [84.2559631820007]
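We equip the model with the ability to predict the future, which significantly improves streaming perception results. This paper also considers driving scenes at multiple velocities and proposes Velocity-awared streaming AP (VsAP). The method achieves state-of-the-art performance on the Argoverse-HD dataset and improves sAP and VsAP by 4.7% and 8.2%, respectively.
Paper reference translation (metadata) (2022-07-21T12:03:02Z)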
PAN: Towards Fast Action Recognition via Learning Persistence of Appearance [60.75488333935592]
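Most state-of-the-art methods rely heavily on dense optical flow as the motion representation. This paper sheds light on fast action recognition by removing the reliance on optical flow. We design a novel motion cue called Persistence of Appearance (PA). In contrast to optical flow, PA focuses on distilling motion information at boundaries.
Paper reference translation (metadata) (2020-08-08T07:09:54Z)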

The related papers list is generated automatically from the titles and abstracts of papers on this site.

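The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use.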