Fugu-MT paper translation (abstract): Dream-Tac: A Unified Tactile World Action Model for Contact-Rich Robot Manipulation

Paper summary: Dream-Tac: A Unified Tactile World Action Model for Contact-Rich Robot Manipulation

arxiv url: http://arxiv.org/abs/2606.08737v1
Date: Sun, 07 Jun 2026 17:18:23 GMT
Status: translation complete
System update date: 2026-06-09 14:42:06.42234
Title: Dream-Tac: A Unified Tactile World Action Model for Contact-Rich Robot Manipulation
Title (reference translation): Dream-Tac: A Unified Tactile World Action Model for Contact-Rich Robot Manipulation
Authors: Yunfan Lou, Yifan Ye, Yankai Fu, Jun Cen, Xiaowei Chi, Yaoxu Lyu, Peidong Jia, Sirui Han, Zhihe Lu, Shanghang Zhang,
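Abstract summary: This work proposes Dream-Tac, a unified Tactile-World Action Model. Specifically, it introduces (i) contact-gated visuotactile fusion to selectively integrate tactile signals and (ii) a contact-aware attention bias to better regulate cross-modal interactions. Across six contact-rich manipulation tasks, Dream-Tac improves action accuracy by 31.7% on average, demonstrating the effectiveness of unified visuotactile world modeling.
Reference score (independently computed attention measure): 40.81290792381617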
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: World action models inherit the predictive capability of world models, enabling action generation to be guided by anticipated future observations. However, they rely primarily on vision and often fail in contact-rich manipulation, where critical cues arise from physical interaction. In this paper, we propose Dream-Tac, a unified Tactile-World Action Model that jointly models actions, future visual observations, and tactile dynamics. Specifically, Dream-Tac introduces (i) contact-gated visuotactile fusion to selectively integrate tactile signals and (ii) a contact-aware attention bias to better regulate cross-modal interactions during manipulation. To support real-time deployment, we further design a dual-level acceleration strategy, reformulating the contact-aware bias to preserve the fused attention path during training and introducing cache-based diffusion acceleration at inference, achieving up to 2.9$\times$ faster training and 1.8$\times$ faster inference. Across six contact-rich manipulation tasks, Dream-Tac improves action accuracy by 31.7\% on average, demonstrating the effectiveness of unified visuotactile world modeling. Code is available at https://github.com/LYFCLOUDFAN/Dream-Tac.
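Abstract (reference translation): World action models inherit the predictive capability of world models, allowing action generation to be guided by anticipated future observations. However, they rely mainly on vision and often fail in contact-rich manipulation, where critical cues arise from physical interaction. This paper proposes Dream-Tac, a unified Tactile-World Action Model. Specifically, Dream-Tac introduces (i) contact-gated visuotactile fusion to selectively integrate tactile signals and (ii) a contact-aware attention bias to better regulate cross-modal interactions during manipulation. To support real-time deployment, we further design a two-level acceleration strategy that reformulates the contact-aware bias to preserve the fused attention path during training and introduces cache-based diffusion acceleration at inference, achieving up to 2.9$\times$ faster training and 1.8$\times$ faster inference. Across six contact-rich manipulation tasks, Dream-Tac improves action accuracy by 31.7\% on average, demonstrating the effectiveness of unified visuotactile world modeling.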
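
The abstract names two mechanisms, contact-gated fusion and a contact-aware attention bias, without giving their formulas. The following is a minimal illustrative sketch of what such mechanisms could look like, not the authors' implementation. Every name here (`contact_gated_fusion`, `attention_with_contact_bias`, `contact_logit`, `bias_scale`) is hypothetical, and the specific choices are assumptions: a sigmoid gate, additive fusion, and a bias of `bias_scale * (2 * gate - 1)` on tactile keys.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def contact_gated_fusion(visual, tactile, contact_logit):
    """Add tactile features to visual ones, scaled by a contact gate in (0, 1).

    Assumption: the gate is a sigmoid of a scalar contact estimate.
    """
    gate = sigmoid(contact_logit)
    fused = [v + gate * t for v, t in zip(visual, tactile)]
    return fused, gate


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attention_with_contact_bias(query, keys, values, is_tactile, gate, bias_scale=2.0):
    """Single-query scaled dot-product attention with an additive bias.

    Assumption: tactile keys receive +bias_scale when the gate is near 1
    (contact) and -bias_scale when it is near 0 (no contact).
    """
    d = len(query)
    bias = bias_scale * (2.0 * gate - 1.0)
    logits = []
    for key, tact in zip(keys, is_tactile):
        score = sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        logits.append(score + (bias if tact else 0.0))
    weights = softmax(logits)
    dim = len(values[0])
    out = [sum(w * val[j] for w, val in zip(weights, values)) for j in range(dim)]
    return out, weights
```

In this sketch the bias is added to the attention logits before the softmax, so it shifts attention mass toward tactile tokens during contact and away from them otherwise, without changing the attention computation itself.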

Related papers