Fugu-MT Paper Translation (Abstract): DiRecT: Safe Diffusion-Based Planning via Receding-Horizon Denoising

Paper overview: DiRecT: Safe Diffusion-Based Planning via Receding-Horizon Denoising

arxiv url: http://arxiv.org/abs/2606.15359v1
Date: Sat, 13 Jun 2026 15:41:26 GMT
Status: Translation complete
Last updated in system: 2026-06-16 16:21:33.455513
Title: DiRecT: Safe Diffusion-Based Planning via Receding-Horizon Denoising
Title (reference translation): DiRecT: Safe Diffusion-Based Planning via Receding-Horizon Denoising
Authors: Paolo Giaretta, Zeyang Li, Navid Azizan
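Abstract summary: We introduce DiRecT, a training-free algorithm for constrained sampling from diffusion models via stochastic optimal control (SOC). Inspired by model predictive control, we derive a receding-horizon surrogate for the otherwise intractable constrained SOC formulation. Experiments show that DiRecT substantially improves deployment safety and task performance over existing diffusion-based planning baselines.
Reference score (independently computed attention score): 4.249024052507976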
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion models have emerged as powerful tools for planning and control by learning multimodal distributions over actions and trajectories. Yet reliable inference-time safety enforcement remains a key barrier to their deployment in safety-critical tasks. Existing approaches typically project each denoising iterate onto the feasible set, even though constraints are defined only on the final clean trajectory. Enforcing feasibility on noisy intermediate samples can therefore overconstrain the sampling dynamics, substantially degrading sample quality. To address this limitation, we introduce DiRecT (Diffusion-based planning via Receding-horizon denoising with Terminal constraints), a training-free algorithm for constrained sampling from diffusion models via stochastic optimal control (SOC). DiRecT enforces constraints only on the final clean sample, avoiding unnecessary restrictions on the intermediate denoising dynamics. Inspired by model predictive control, we derive a principled receding-horizon surrogate for the otherwise intractable constrained SOC formulation, yielding an efficient algorithm that cleanly separates stochastic denoising from constraint satisfaction, progressively steering samples toward feasible final trajectories without distorting the learned diffusion dynamics. Furthermore, DiRecT is highly flexible: it can leverage off-the-shelf or domain-specific optimizers, incorporate priors over environment dynamics, and optimize additional soft rewards. Extensive experiments on safe planning benchmarks demonstrate that DiRecT substantially improves deployment safety and task performance over existing diffusion-based planning baselines.
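Abstract (reference translation): Diffusion models have emerged as powerful tools for planning and control by learning multimodal distributions over actions and trajectories. However, reliable safety enforcement at inference time remains a key barrier to deploying them in safety-critical tasks. Existing approaches typically project each denoising iterate onto the feasible set, even though the constraints are defined only on the final clean trajectory. Enforcing feasibility on noisy intermediate samples can therefore overconstrain the sampling dynamics and substantially degrade sample quality. To address this limitation, we introduce DiRecT (Diffusion-based planning via Receding-horizon denoising with Terminal constraints), a training-free algorithm for constrained sampling from diffusion models via stochastic optimal control (SOC). DiRecT enforces constraints only on the final clean sample, avoiding unnecessary restrictions on the intermediate denoising dynamics. Inspired by model predictive control, we derive a principled receding-horizon surrogate for the otherwise intractable constrained SOC formulation, yielding an efficient algorithm that cleanly separates stochastic denoising from constraint satisfaction and progressively steers samples toward feasible final trajectories without distorting the learned diffusion dynamics. Furthermore, DiRecT is highly flexible: it can leverage off-the-shelf or domain-specific optimizers, incorporate priors over environment dynamics, and optimize additional soft rewards. Extensive experiments on safe planning benchmarks show that DiRecT substantially improves deployment safety and task performance over existing diffusion-based planning baselines.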
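
The abstract contrasts two places where a constraint can be applied during denoising: on every noisy iterate, or only on the clean sample. The toy sketch below illustrates that distinction and nothing more. It is not the DiRecT algorithm: it contains no SOC formulation and no receding-horizon surrogate. Everything in it is an assumption made for illustration: the Gaussian "learned" distribution (`MU`, `S2`), the half-space constraint `x[0] >= C`, the noise schedule `ALPHA_BAR`, the deterministic DDIM-style update, and the helper names `project`, `predict_clean`, and `sample`.

```python
import numpy as np

# All values below are hypothetical and chosen only for illustration.
MU, S2 = np.array([0.0, 0.0]), 0.25        # toy data distribution N(MU, S2 * I)
C = 0.5                                    # feasible set: x[0] >= C
ALPHA_BAR = np.linspace(0.999, 0.01, 50)   # toy schedule, index 0 = least noisy


def project(x):
    """Euclidean projection onto the half-space {x : x[0] >= C}."""
    y = x.copy()
    y[:, 0] = np.maximum(y[:, 0], C)
    return y


def predict_clean(x_t, ab):
    """Posterior mean E[x0 | x_t]; analytic for Gaussian data.

    Stands in for a trained denoiser, which a real planner would use.
    """
    gain = np.sqrt(ab) * S2 / (ab * S2 + 1.0 - ab)
    return MU + gain * (x_t - np.sqrt(ab) * MU)


def sample(n, mode, seed=0):
    """Deterministic DDIM-style sampling with one of three constraint modes.

    mode="none":        unconstrained sampling
    mode="per_iterate": project every noisy iterate onto the feasible set
    mode="terminal":    project only the predicted clean sample
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 2))
    for t in range(len(ALPHA_BAR) - 1, -1, -1):
        ab = ALPHA_BAR[t]
        ab_prev = ALPHA_BAR[t - 1] if t > 0 else 1.0
        x0_hat = predict_clean(x, ab)
        if mode == "terminal":
            # Constraint acts on the clean-sample estimate, not on x itself.
            x0_hat = project(x0_hat)
        eps_hat = (x - np.sqrt(ab) * x0_hat) / np.sqrt(1.0 - ab)
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat
        if mode == "per_iterate":
            # Constraint acts on the noisy intermediate sample.
            x = project(x)
    return x
```

Calling `sample(256, "per_iterate")` and `sample(256, "terminal")` with the same seed gives two feasible sample sets whose first coordinates can be compared against `sample(256, "none")`. How much each variant distorts the unconstrained distribution depends entirely on the toy settings above, so this sketch says nothing about the paper's reported results.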


Related papers