Fugu-MT Paper Translation (Abstract): Action-to-Action Flow Matching

Paper overview: Action-to-Action Flow Matching

arxiv url: http://arxiv.org/abs/2602.07322v1
Date: Sat, 07 Feb 2026 02:39:49 GMT
Status: Translation complete
Last updated in system: 2026-02-10 20:26:24.571431
Title: Action-to-Action Flow Matching
Title (reference translation): Action-to-Action Flow Matching
Authors: Jindou Jia, Gen Li, Xiangyu Chen, Tuo An, Yuxuan Hu, Jingliang Li, Xinying Guo, Jianfei Yang
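Abstract summary: Diffusion-based policies have recently achieved remarkable success in robotics by formulating action prediction as a conditional denoising process. This paper proposes Action-to-Action flow matching (A2A). A2A enables high-quality action generation in a single inference step (0.56 ms latency), shows superior robustness to visual perturbations, and generalizes to unseen configurations.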
Reference score (independently computed attention metric): 25.301629044539325
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion-based policies have recently achieved remarkable success in robotics by formulating action prediction as a conditional denoising process. However, the standard practice of sampling from random Gaussian noise often requires multiple iterative steps to produce clean actions, leading to high inference latency that incurs a major bottleneck for real-time control. In this paper, we challenge the necessity of uninformed noise sampling and propose Action-to-Action flow matching (A2A), a novel policy paradigm that shifts from random sampling to initialization informed by the previous action. Unlike existing methods that treat proprioceptive action feedback as static conditions, A2A leverages historical proprioceptive sequences, embedding them into a high-dimensional latent space as the starting point for action generation. This design bypasses costly iterative denoising while effectively capturing the robot's physical dynamics and temporal continuity. Extensive experiments demonstrate that A2A exhibits high training efficiency, fast inference speed, and improved generalization. Notably, A2A enables high-quality action generation in as few as a single inference step (0.56 ms latency), and exhibits superior robustness to visual perturbations and enhanced generalization to unseen configurations. Lastly, we also extend A2A to video generation, demonstrating its broader versatility in temporal modeling. Project site: https://lorenzo-0-0.github.io/A2A_Flow_Matching.
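Abstract (reference translation): Diffusion-based policies have recently achieved remarkable success in robotics by formulating action prediction as a conditional denoising process. However, the standard practice of sampling from random Gaussian noise often requires multiple iterative steps to produce clean actions. This paper challenges the necessity of uninformed noise sampling and proposes A2A, a new policy paradigm that shifts from random sampling to initialization informed by the previous action. Unlike existing methods that treat proprioceptive action feedback as static conditions, A2A uses historical proprioceptive sequences, embedding them into a high-dimensional latent space as the starting point for action generation. This design bypasses costly iterative denoising while effectively capturing the robot's physical dynamics and temporal continuity. Extensive experiments show that A2A offers high training efficiency, fast inference, and improved generalization. Notably, A2A enables high-quality action generation in a single inference step (0.56 ms latency), is more robust to visual perturbations, and generalizes to unseen configurations. Finally, A2A is extended to video generation, demonstrating its broader versatility in temporal modeling. Project site: https://lorenzo-0-0.github.io/A2A_Flow_Matching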
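The core idea in the abstract, starting the flow from the previous action instead of from Gaussian noise, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a straight-line flow-matching path, a linear velocity model fitted by least squares, and synthetic 2-D "actions" from a slow rotation. It uses the raw previous action as the start point, and omits both the latent embedding of proprioceptive history and any visual conditioning. All function names (`make_pairs`, `fit_velocity`, `one_step`) are hypothetical.

```python
import numpy as np

def make_pairs(rng, n=2000):
    """Toy (previous action, next action) pairs from a slow 2-D rotation (assumed dynamics)."""
    theta = 0.1
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    prev = rng.normal(size=(n, 2))
    nxt = prev @ A.T + 0.01 * rng.normal(size=(n, 2))
    return prev, nxt

def features(x, t):
    """Inputs to the linear velocity model: [x, t, 1]."""
    return np.concatenate([x, t, np.ones_like(t)], axis=1)

def fit_velocity(prev, nxt, rng):
    """Least-squares fit of v(x_t, t) to the flow-matching target x1 - x0."""
    t = rng.uniform(size=(prev.shape[0], 1))
    x_t = (1.0 - t) * prev + t * nxt   # straight-line path from x0 (previous) to x1 (next)
    target = nxt - prev                # constant velocity along that path
    W, *_ = np.linalg.lstsq(features(x_t, t), target, rcond=None)
    return W

def one_step(W, x0):
    """Single Euler step from t=0 to t=1, starting at the previous action x0."""
    t0 = np.zeros((x0.shape[0], 1))
    return x0 + features(x0, t0) @ W

rng = np.random.default_rng(0)
train_prev, train_next = make_pairs(rng)
W = fit_velocity(train_prev, train_next, rng)

test_prev, test_next = make_pairs(rng, n=500)
pred = one_step(W, test_prev)
err_flow = np.mean(np.linalg.norm(pred - test_next, axis=1))
err_hold = np.mean(np.linalg.norm(test_prev - test_next, axis=1))
print(f"one-step flow error: {err_flow:.4f}, hold-previous-action error: {err_hold:.4f}")
```

Because the start point is already close to the target, the path to cover is short and nearly straight, which is the intuition for why a single integration step can be enough in this toy setting. The 0.56 ms latency quoted in the abstract refers to the authors' system, not to this sketch.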

Related papers