Fugu-MT Paper Translation (Abstract): Training-Free Imitation Learning with Closed-Form Diffusion Policies

Paper abstract: Training-Free Imitation Learning with Closed-Form Diffusion Policies

arxiv url: http://arxiv.org/abs/2606.01238v1
Date: Sun, 31 May 2026 13:40:47 GMT
Status: Translation complete
Last updated in system: 2026-06-02 21:34:29.462996
Title: Training-Free Imitation Learning with Closed-Form Diffusion Policies
Title (reference translation): Training-Free Imitation Learning with Closed-Form Diffusion Policies
Authors: Raghav Mishra, Ian R. Manchester
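Abstract summary: We introduce Closed-Form Diffusion Policies (CFDP), a class of training-free diffusion-based policies for imitation learning. In hardware experiments, we deploy CFDP with real-time inference on a mobile CPU and show that it can perform imitation directly from the dataset in milliseconds. We also show that closed-form diffusion policies act as a primitive that enables data-driven inference-time editing of pre-trained neural diffusion policies.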
Reference score (independently computed attention measure): 3.151184728006369
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While diffusion-based policies have impressive performance and expressivity, their long offline training slows down the data collection and policy deployment loop. We introduce Closed-Form Diffusion Policies, a class of training-free diffusion-based policies for imitation learning using the closed-form score derived from the demonstration dataset. We deploy CFDP with real-time inference with a mobile CPU in hardware experiments, showing it can successfully perform imitation directly from the dataset in milliseconds and with faster inference than neural diffusion policies. In experiments on imitation learning benchmarks, we show that CFDP is competitive against neural baselines that require hours of training, providing a favorable tradeoff between training time and performance. Finally, we show how closed-form diffusion policies act as a composable primitive that enables data-driven inference-time editing of pre-trained neural diffusion policies, including policy guidance and novel demonstration augmentation.
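Abstract (reference translation): Diffusion-based policies offer strong performance and expressivity, but their long offline training slows the data collection and policy deployment loop. We introduce Closed-Form Diffusion Policies (CFDP), a class of training-free diffusion-based policies for imitation learning that use the closed-form score derived from the demonstration dataset. In hardware experiments, we deploy CFDP with real-time inference on a mobile CPU and show that it can imitate directly from the dataset in milliseconds, with faster inference than neural diffusion policies. In experiments on imitation learning benchmarks, CFDP is competitive with neural baselines that require hours of training, giving a favorable tradeoff between training time and performance. Finally, we show how closed-form diffusion policies act as a composable primitive that enables data-driven inference-time editing of pre-trained neural diffusion policies.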
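
Illustrative note: the abstract refers to a "closed-form score derived from the demonstration dataset" but does not state the formula. The sketch below shows one standard reading, offered only as an assumption: if each demonstration action x_i is noised as alpha * x_i + sigma * eps, the noised data distribution is a Gaussian mixture whose score is a softmax-weighted average over the dataset, so no network has to be trained. The function name `closed_form_score` and the parameters `alpha` and `sigma` are hypothetical, and how CFDP conditions on observations or schedules noise is not covered here.

```python
import math

def closed_form_score(x, data, alpha, sigma):
    """Score of p(x) = (1/N) * sum_i Normal(x; alpha * x_i, sigma^2 * I).

    Assumption: this is the textbook score of a Gaussian-smoothed empirical
    distribution, not necessarily the exact formulation used by CFDP.
    Returns (score, weights), where weights are the softmax responsibilities.
    """
    # Log-weight of each data point: -||x - alpha * x_i||^2 / (2 * sigma^2).
    logits = []
    for xi in data:
        d2 = sum((xj - alpha * xij) ** 2 for xj, xij in zip(x, xi))
        logits.append(-d2 / (2.0 * sigma ** 2))

    # Numerically stable softmax over the dataset.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    weights = [e / z for e in exps]

    # Score = (weighted mean of alpha * x_i, minus x) / sigma^2.
    score = []
    for j in range(len(x)):
        mean_j = sum(w * alpha * xi[j] for w, xi in zip(weights, data))
        score.append((mean_j - x[j]) / sigma ** 2)
    return score, weights


# Usage: a toy "dataset" of two 2-D actions and one query point.
demo_actions = [[1.0, 0.0], [0.0, 1.0]]
score, weights = closed_form_score([0.9, 0.1], demo_actions, alpha=1.0, sigma=0.5)
```

The cost of each evaluation grows linearly with the number of demonstrations, which is the usual trade for skipping training in this kind of estimator.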


Related papers