Fugu-MT Paper Translation (Summary): RoboForge: Physically Optimized Text-guided Whole-Body Locomotion for Humanoids

Paper summary: RoboForge: Physically Optimized Text-guided Whole-Body Locomotion for Humanoids

arxiv url: http://arxiv.org/abs/2603.17927v2
Date: Thu, 19 Mar 2026 08:42:09 GMT
Status: Translation complete
Last updated in system: 2026-03-21 18:33:56.957145
Title: RoboForge: Physically Optimized Text-guided Whole-Body Locomotion for Humanoids
Authors: Xichen Yuan, Zhe Li, Bofan Lyu, Kuangji Zuo, Yanshuo Lu, Gen Li, Jianfei Yang,
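Abstract summary: We propose a unified latent-driven framework that bridges natural language and whole-body locomotion. Our framework provides a practical path toward deploying text-guided humanoid intelligence.
Reference score (independently computed attention measure): 20.796118584632904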
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While generative models have become effective at producing human-like motions from text, transferring these motions to humanoid robots for physical execution remains challenging. Existing pipelines are often limited by retargeting, where kinematic quality is undermined by physical infeasibility, contact-transition errors, and the high cost of real-world dynamical data. We present a unified latent-driven framework that bridges natural language and whole-body humanoid locomotion through a retarget-free, physics-optimized pipeline. Rather than treating generation and control as separate stages, our key insight is to couple them bidirectionally under physical constraints. We introduce a Physical Plausibility Optimization (PP-Opt) module as the coupling interface. In the forward direction, PP-Opt refines a teacher-student distillation policy with a plausibility-centric reward to suppress artifacts such as floating, skating, and penetration. In the backward direction, it converts reward-optimized simulation rollouts into high-quality explicit motion data, which is used to fine-tune the motion generator toward a more physically plausible latent distribution. This bidirectional design forms a self-improving cycle: the generator learns a physically grounded latent space, while the controller learns to execute latent-conditioned behaviors with dynamical integrity. Extensive experiments on the Unitree G1 humanoid show that our bidirectional optimization improves tracking accuracy and success rates. Across IsaacLab and MuJoCo, the implicit latent-driven pipeline consistently outperforms conventional explicit retargeting baselines in both precision and stability. By coupling diffusion-based motion generation with physical plausibility optimization, our framework provides a practical path toward deployable text-guided humanoid intelligence.
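The abstract names three artifacts that the plausibility-centric reward is meant to suppress: floating, skating, and penetration. It does not give the reward's actual form, so the following is only a minimal illustrative sketch of how such a penalty could be assembled from per-foot state. Everything in it is an assumption made for illustration and not taken from the paper: the `FootState` fields, the function names, the weights, and the `float_tol` threshold are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class FootState:
    """Hypothetical per-foot state; field names are illustrative only."""
    height: float      # sole height above the ground plane in metres (negative = below ground)
    vx: float          # horizontal velocity components in m/s
    vy: float
    in_contact: bool   # whether the foot is supposed to be in ground contact


def plausibility_penalty(feet, float_tol=0.02, w_float=1.0, w_skate=1.0, w_pen=1.0):
    """Sum of assumed penalties for penetration, floating, and skating."""
    penalty = 0.0
    for foot in feet:
        # Penetration: the foot is below the ground plane.
        if foot.height < 0.0:
            penalty += w_pen * (-foot.height)
        if foot.in_contact:
            # Floating: the foot should be in contact but hovers above the tolerance.
            if foot.height > float_tol:
                penalty += w_float * (foot.height - float_tol)
            # Skating: the foot slides horizontally while it should be planted.
            penalty += w_skate * hypot(foot.vx, foot.vy)
    return penalty


def plausibility_reward(feet, **weights):
    """Reward is the negated penalty, so a clean contact scores zero."""
    return -plausibility_penalty(feet, **weights)
```

In this sketch a firmly planted foot contributes nothing, while a foot that slides, hovers, or sinks below the ground lowers the reward in proportion to the size of the violation. A real controller would combine a term like this with tracking and regularization rewards, which the abstract does not detail.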
