Fugu-MT Paper Translation (Abstract): ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching

Paper overview: ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching

arxiv url: http://arxiv.org/abs/2605.11048v1
Date: Mon, 11 May 2026 13:27:00 GMT
Status: Translation complete
Last updated in system: 2026-05-13 21:48:56.329554
Title: ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching
Title (reference translation): ForceFlow: Learning Feel and Action through Contact-Driven Flow Matching
Authors: Shuoheng Zhang, Yifu Yuan, Hongyao Tang, Yan Zheng, Qiaojun Yu, Pengyi Li, Guowei Huang, Helong Huang, Xingyue Quan, Jianye Hao
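Abstract summary: ForceFlow is a force-aware reactive framework built upon flow matching. It achieves a 37% success rate improvement over the strong baseline ForceVLA. It also shows accurate force signal prediction and superior performance in contact force self-regulation.
Reference score (attention score computed independently by this site): 53.30290192030814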
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing imitation learning methods enable robots to interact autonomously with the physical environment. However, contact-rich manipulation tasks remain a significant challenge due to complex contact dynamics that demand high-precision force feedback and control. Although recent efforts have attempted to integrate force/torque sensing into policies, how to build a simple yet effective framework that achieves robust generalization under multimodal observations remains an open question. In this paper, we propose ForceFlow, a force-aware reactive framework built upon flow matching. For contact-stage policy design, we investigate force signal fusion mechanisms and adopt an asymmetric multimodal fusion architecture that treats force as a global regulatory signal, combined with a joint prediction paradigm that enhances the policy's understanding of instantaneous force and historical information, thereby achieving deep coupling between force and motion. For task-level hierarchical decomposition, we divide manipulation into a vision-dominant approach stage (VLM-based pointing for target localization) and a touch-dominant interaction stage (force-driven contact execution), with a Vision-to-Force (V2F) handover mechanism that explicitly decouples spatial generalization from contact regulation. Experimental results across six real-world contact-rich tasks demonstrate that ForceFlow achieves a 37% success rate improvement over the strong baseline ForceVLA while maintaining significantly lower cost. Moreover, ForceFlow exhibits accurate force signal prediction and demonstrates superior performance in contact force self-regulation and zero-shot out-of-distribution (OOD) generalization.
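Abstract (reference translation): Existing imitation learning methods allow robots to interact autonomously with the physical environment. However, contact-rich manipulation tasks remain a major challenge because their complex contact dynamics require high-precision force feedback and control. Although recent work has tried to integrate force/torque sensing into policies, how to build a simple and effective framework that generalizes robustly under multimodal observations is still an open problem. This paper proposes ForceFlow, a force-aware reactive framework based on flow matching. For contact-stage policy design, it examines force signal fusion mechanisms and combines an asymmetric multimodal fusion architecture, which treats force as a global regulatory signal, with a joint prediction paradigm that improves the policy's understanding of instantaneous force and historical information, yielding a deep coupling between force and motion. For task-level hierarchical decomposition, manipulation is split into a vision-dominant approach stage (VLM-based pointing for target localization) and a touch-dominant interaction stage (force-driven contact execution), linked by a Vision-to-Force (V2F) handover mechanism that explicitly separates spatial generalization from contact regulation. Experiments on six real-world contact-rich tasks show that ForceFlow improves the success rate by 37% over the strong baseline ForceVLA while keeping cost significantly lower. ForceFlow also predicts force signals accurately and performs better in contact force self-regulation and zero-shot out-of-distribution (OOD) generalization.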
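The abstract names flow matching as the basis of the policy but gives no equations. As a rough illustration only, the following sketch shows generic conditional flow matching with a straight-line path from Gaussian noise to a target action, followed by Euler integration at inference time. It is not the authors' implementation: the straight-line path, the function names, and the `predict_velocity` and `condition` arguments (standing in for a learned network and a fused vision/force observation) are all assumptions made for this example.

```python
import random


def flow_matching_pair(action, noise, t):
    """Point on the straight-line path from noise (t=0) to action (t=1),
    plus the constant velocity of that path (the regression target)."""
    x_t = [(1.0 - t) * n + t * a for n, a in zip(noise, action)]
    target_velocity = [a - n for n, a in zip(noise, action)]
    return x_t, target_velocity


def flow_matching_loss(predict_velocity, action, condition, rng):
    """One-sample estimate of the flow matching regression loss.

    predict_velocity(x_t, t, condition) is a hypothetical stand-in for a
    learned velocity network; condition stands in for the observation.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in action]
    t = rng.random()  # uniform in [0, 1)
    x_t, target = flow_matching_pair(action, noise, t)
    predicted = predict_velocity(x_t, t, condition)
    return sum((p - v) ** 2 for p, v in zip(predicted, target)) / len(action)


def euler_sample(predict_velocity, condition, dim, rng, steps=10):
    """Generate an action by integrating dx/dt = v(x, t, condition)
    from t=0 (noise) to t=1 with fixed-step Euler updates."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    dt = 1.0 / steps
    for i in range(steps):
        v = predict_velocity(x, i * dt, condition)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In a real policy, `predict_velocity` would be a neural network trained by minimizing this loss over demonstration data. The joint prediction described in the abstract would presumably extend the predicted vector to cover force signals as well as actions, but the abstract does not say how.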

Related papers