Fugu-MT Paper Translation (Abstract): QuadFM: Foundational Text-Driven Quadruped Motion Dataset for Generation and Control

Paper abstract: QuadFM: Foundational Text-Driven Quadruped Motion Dataset for Generation and Control

arxiv url: http://arxiv.org/abs/2603.24021v1
Date: Wed, 25 Mar 2026 07:30:25 GMT
Status: Translation complete
Last updated in system: 2026-03-26 21:06:11.183017
Title: QuadFM: Foundational Text-Driven Quadruped Motion Dataset for Generation and Control
Title (reference translation): QuadFM: A Foundational Text-Driven Quadruped Motion Dataset for Generation and Control
Authors: Li Gao, Fuzhi Yang, Jianhui Chen, Liu Liu, Yao Zheng, Yang Cai, Ziqiao Li,
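Abstract summary: We introduce QuadFM, a large-scale, ultra-high-fidelity dataset for text-to-motion generation and general motion control. QuadFM contains 11,784 curated motion clips spanning locomotion, interactive, and emotion-expressive behaviors. We propose Gen2Control RL, a unified framework that jointly trains a general motion controller and a text-to-motion generator.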
Reference score (independently computed attention score): 18.78068897227934
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite significant advances in quadrupedal robotics, a critical gap persists in foundational motion resources that holistically integrate diverse locomotion, emotionally expressive behaviors, and rich language semantics, all of which are essential for agile, intuitive human-robot interaction. Current quadruped motion datasets are limited to a few mocap primitives (e.g., walk, trot, sit) and lack diverse behaviors with rich language grounding. To bridge this gap, we introduce Quadruped Foundational Motion (QuadFM), the first large-scale, ultra-high-fidelity dataset designed for text-to-motion generation and general motion control. QuadFM contains 11,784 curated motion clips spanning locomotion, interactive, and emotion-expressive behaviors (e.g., dancing, stretching, peeing), each with three-layer annotation (fine-grained action labels, interaction scenarios, and natural language commands), totaling 35,352 descriptions to support language-conditioned understanding and command execution. We further propose Gen2Control RL, a unified framework that jointly trains a general motion controller and a text-to-motion generator, enabling efficient end-to-end inference on edge hardware. On a real quadruped robot with an NVIDIA Orin, our system achieves real-time motion synthesis (<500 ms latency). Simulation and real-world results show realistic, diverse motions while maintaining robust physical interaction. The dataset will be released at https://github.com/GaoLii/QuadFM.
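Abstract (reference translation): Despite significant advances in quadruped robotics, a critical gap persists in foundational motion resources that bring together diverse locomotion, emotionally expressive behaviors, and the rich language semantics needed for agile, intuitive human-robot interaction. Current quadruped motion datasets are limited to a few mocap primitives (e.g., walk, trot, sit) and lack diverse behaviors with rich language grounding. To fill this gap, we introduce Quadruped Foundational Motion (QuadFM), the first large-scale, ultra-high-fidelity dataset designed for text-to-motion generation and general motion control. QuadFM contains 11,784 curated motion clips spanning locomotion, interactive, and emotion-expressive behaviors (e.g., dancing, stretching, peeing). Each clip carries three layers of annotation (fine-grained action labels, interaction scenarios, and natural language commands), for a total of 35,352 descriptions that support language-conditioned understanding and command execution. We further propose Gen2Control RL, a unified framework that jointly trains a general motion controller and a text-to-motion generator. On a real quadruped robot equipped with an NVIDIA Orin, the system achieves real-time motion synthesis (<500 ms latency). Simulation and real-world results show realistic, diverse motions while maintaining robust physical interaction. The dataset will be released at https://github.com/GaoLii/QuadFM.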
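
The dataset figures quoted in the abstract are internally consistent: 11,784 clips with one description per annotation layer gives 11,784 x 3 = 35,352 descriptions. The Python sketch below checks that arithmetic and shows one way a per-clip record could be laid out. The class name `ClipAnnotation`, its field names, and the helper `expected_description_count` are assumptions made for illustration only; the abstract does not specify the released file format or schema.

```python
from dataclasses import dataclass

# Figures quoted in the abstract.
NUM_CLIPS = 11_784
NUM_DESCRIPTIONS = 35_352

# The three annotation layers named in the abstract.
# The identifier spellings are assumptions, not the dataset's actual keys.
ANNOTATION_LAYERS = ("action_label", "interaction_scenario", "language_command")


@dataclass(frozen=True)
class ClipAnnotation:
    """Hypothetical record holding one text description per annotation layer."""

    clip_id: str
    action_label: str
    interaction_scenario: str
    language_command: str

    def descriptions(self) -> list[str]:
        """Return the three layer descriptions in a fixed order."""
        return [self.action_label, self.interaction_scenario, self.language_command]


def expected_description_count(num_clips: int, layers: int = len(ANNOTATION_LAYERS)) -> int:
    """Total descriptions when every clip has exactly one description per layer."""
    return num_clips * layers


if __name__ == "__main__":
    total = expected_description_count(NUM_CLIPS)
    print(total, total == NUM_DESCRIPTIONS)
```

Keeping the layers as separate fields, rather than a single free-text caption, would let a text-to-motion model be conditioned on any one layer independently; whether the released dataset is organized this way is not stated in the abstract.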

Related papers