Fugu-MT Paper Translation (Summary): TEXEDO : Test Time Scaling for Controller-aware Language-conditioned Humanoid Motion Generation

Paper summary: TEXEDO : Test Time Scaling for Controller-aware Language-conditioned Humanoid Motion Generation

arxiv url: http://arxiv.org/abs/2606.22998v1
Date: Mon, 22 Jun 2026 08:14:35 GMT
Status: Translation complete
Last updated in system: 2026-06-24 16:10:15.171298
Title: TEXEDO : Test Time Scaling for Controller-aware Language-conditioned Humanoid Motion Generation
Authors: Jianuo Cao, Yuxin Chen, Yuzhen Song, Masayoshi Tomizuka, Chenran Li, Thomas Tian
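Abstract summary: We introduce TEXEDO, a test-time scaling framework for humanoid motion generation that improves motion quality without requiring a stronger underlying generator. We show that it consistently improves both tracking fidelity and text alignment.
Reference score (independently computed attention score): 48.72797830886802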
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Text-conditioned motion generation is a promising interface for programming humanoid robots, yet current generators are often trained on human motion datasets retargeted to robot morphologies. Although such data provides rich semantic and kinematic priors, it fails to capture the nuances of whole-body tracking controllers, including balance, contact dynamics, actuation limits, and controller-specific failure modes. As a result, generated motions can be semantically plausible but difficult or impossible for the robot to execute. We introduce TEXEDO, a test-time scaling framework for humanoid motion generation that improves motion quality without requiring a stronger underlying generator. Given a text prompt, TEXEDO samples multiple candidate motions from a pretrained text-conditioned generator and selects the best motion that is both executable and task-aligned. The reward model combines a dynamic feasibility verifier, distilled from whole-body tracking rollouts to predict physical executability, with a semantic alignment verifier that measures text-motion alignment in a learned co-embedding space. Our pipeline treats dynamic feasibility as a hard constraint and semantic alignment as the selection objective within the feasible set. Through large-scale simulation studies and real-world deployment on a Unitree G1 humanoid robot, we show that TEXEDO consistently improves both tracking fidelity and text alignment. These results demonstrate that grounded verification is an effective path toward deployable language-guided humanoid motion generation. Project website: https://jianuocao.github.io/TEXEDO/
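The selection rule described in the abstract (sample several candidates, treat dynamic feasibility as a hard constraint, then pick the most text-aligned motion within the feasible set) can be sketched as a best-of-N filter. This is a minimal illustration, not the authors' implementation: the function name `select_motion`, the scoring callables, the `threshold` value, and the choice to return `None` when no candidate is feasible are all assumptions introduced here.

```python
from typing import Callable, Optional, Sequence, TypeVar

Motion = TypeVar("Motion")


def select_motion(
    candidates: Sequence[Motion],
    feasibility: Callable[[Motion], float],
    alignment: Callable[[Motion], float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> Optional[Motion]:
    """Keep candidates that pass the feasibility check, then maximize alignment."""
    # Hard constraint: discard every candidate the verifier scores below the cutoff.
    feasible = [m for m in candidates if feasibility(m) >= threshold]
    if not feasible:
        # Assumption: the caller decides what to do when nothing is feasible.
        return None
    # Selection objective: highest text-motion alignment within the feasible set.
    return max(feasible, key=alignment)


# Toy candidates as (name, feasibility score, alignment score); values are made up.
demo = [("a", 0.9, 0.2), ("b", 0.3, 0.95), ("c", 0.8, 0.7)]
best = select_motion(demo, feasibility=lambda m: m[1], alignment=lambda m: m[2])
print(best)  # ('c', 0.8, 0.7)
```

In the toy run, candidate "b" has the highest alignment score but is rejected by the feasibility filter, so "c" is selected. In practice the two callables would wrap the learned feasibility and alignment verifiers the abstract mentions.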


Related papers