Fugu-MT Paper Translation (Abstract): Tube Diffusion Policy: Reactive Visual-Tactile Policy Learning for Contact-rich Manipulation

Paper overview: Tube Diffusion Policy: Reactive Visual-Tactile Policy Learning for Contact-rich Manipulation

arxiv url: http://arxiv.org/abs/2604.23609v1
Date: Sun, 26 Apr 2026 08:48:26 GMT
Status: Translation complete
System update date: 2026-04-28 17:12:07.460675
Title: Tube Diffusion Policy: Reactive Visual-Tactile Policy Learning for Contact-rich Manipulation
Title (reference translation): Tube Diffusion Policy: Reactive Visual-Tactile Policy Learning for Contact-Rich Manipulation
Authors: Teng Xue, Alberto Rigo, Bingjian Huang, Jiayi Shen, Zhengtong Xu, Nick Colonnese, Amirhossein H. Memar,
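Abstract summary: Tube Diffusion Policy (TDP) is a novel visual-tactile policy learning framework that bridges imitation learning with tube-based feedback control. TDP learns an observation-conditioned feedback flow around nominal action chunks, forming an action tube that enables fast, adaptive reactions during execution.
Reference score (independently computed attention metric): 11.359539466233137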
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Contact-rich manipulation is central to many everyday human activities, requiring continuous adaptation to contact uncertainty and external disturbances through multi-modal perception, particularly vision and tactile feedback. While imitation learning has shown strong potential for learning complex manipulation behaviors, most existing approaches rely on action chunking, which fundamentally limits their ability to react to unforeseen observations during execution. This limitation becomes especially critical in contact-rich scenarios, where physical uncertainty and high-frequency tactile feedback demand rapid, reactive control. To address this challenge, we propose Tube Diffusion Policy (TDP), a novel reactive visual-tactile policy learning framework that bridges diffusion-based imitation learning with tube-based feedback control. By leveraging the expressive power of generative models, TDP learns an observation-conditioned feedback flow around nominal action chunks, forming an action tube that enables fast and adaptive reactions during execution. We evaluate TDP on the widely used Push-T benchmark and three additional challenging visual-tactile dexterous manipulation tasks. Across all benchmarks, TDP consistently outperforms state-of-the-art imitation learning baselines. Two real-world experiments further validate its robust reactivity under contact uncertainty and external disturbances. Moreover, the step-wise correction mechanism enabled by action tube significantly reduces the required denoising steps, making TDP well suited for real-time, high-frequency feedback control in contact-rich manipulation.
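Abstract (reference translation): Contact-rich manipulation is central to many everyday human activities and requires continuous adaptation to contact uncertainty and external disturbances. Imitation learning has shown strong potential for learning complex manipulation behaviors, but most existing approaches rely on action chunking, which fundamentally limits their ability to react to unforeseen observations during execution. This limitation becomes especially critical in contact-rich scenarios, where physical uncertainty and high-frequency tactile feedback demand rapid, reactive control. To address this challenge, we propose TDP (Tube Diffusion Policy), a reactive visual-tactile policy learning framework that bridges diffusion-based imitation learning with tube-based feedback control. By leveraging the expressive power of generative models, TDP learns an observation-conditioned feedback flow around nominal action chunks, forming an action tube that enables fast and adaptive reactions during execution. We evaluate TDP on the widely used Push-T benchmark and on three additional challenging visual-tactile dexterous manipulation tasks. Across all benchmarks, TDP consistently outperforms state-of-the-art imitation learning baselines. Two real-world experiments further validate its robust reactivity under contact uncertainty and external disturbances. Moreover, the step-wise correction mechanism enabled by the action tube significantly reduces the number of denoising steps required, making TDP well suited to real-time, high-frequency feedback control in contact-rich manipulation.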
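
Illustrative sketch (not from the paper): the abstract contrasts open-loop execution of an action chunk with step-wise correction around a nominal chunk. The toy Python example below shows that contrast on a 1-D point, assuming a hand-written linear gain in place of TDP's learned, observation-conditioned feedback flow. The names `nominal_chunk`, `rollout`, and `gain`, and all numbers, are hypothetical and chosen only for illustration.

```python
def nominal_chunk(x0, goal, horizon):
    """Hypothetical planner: a chunk of equal position increments toward the goal."""
    step = (goal - x0) / horizon
    return [step] * horizon


def rollout(x0, goal, horizon, disturbances, gain):
    """Execute one action chunk on a 1-D point.

    gain == 0.0 reproduces open-loop chunk execution.
    gain > 0.0 adds a per-step correction toward the nominal trajectory,
    i.e. u = u_nominal + gain * (x_nominal - x_observed), which is the
    classic linear form of tube-style feedback (an assumption here, standing
    in for a learned correction).
    """
    x = x0          # actual (observed) state
    x_nom = x0      # state predicted by the nominal chunk alone
    for k, a_nom in enumerate(nominal_chunk(x0, goal, horizon)):
        correction = gain * (x_nom - x)
        x = x + a_nom + correction + disturbances.get(k, 0.0)
        x_nom = x_nom + a_nom
    return x


# A single push of +0.5 at step 2 of an 8-step chunk from 0.0 to 1.0.
push = {2: 0.5}
open_loop = rollout(0.0, 1.0, 8, push, gain=0.0)
corrected = rollout(0.0, 1.0, 8, push, gain=0.5)
print(open_loop, corrected)
```

With `gain=0.0` the disturbance persists to the end of the chunk, while with a positive gain the deviation from the nominal trajectory shrinks at every remaining step. This only illustrates why per-step feedback helps between chunk replans; it says nothing about how TDP parameterizes or trains its correction.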

Related papers