Fugu-MT Paper Translation (Abstract): DanceCrafter: Fine-Grained Text-Driven Controllable Dance Generation via Choreographic Syntax

Paper overview: DanceCrafter: Fine-Grained Text-Driven Controllable Dance Generation via Choreographic Syntax

arxiv url: http://arxiv.org/abs/2604.18648v2
Date: Mon, 27 Apr 2026 07:52:59 GMT
Status: Translation complete
Last updated in system: 2026-04-28 17:12:06.906449
Title: DanceCrafter: Fine-Grained Text-Driven Controllable Dance Generation via Choreographic Syntax
Title (reference translation): DanceCrafter: Fine-grained, text-driven, controllable dance generation via Choreographic Syntax
Authors: Hang Yuan, Xiaolin Hu, Yan Wan, Menglin Gao, Wenzhe Yu, Cong Huang, Fei Xu, Qing Li, Christina Dan Wang, Zhou Yu, Kai Chen
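Abstract summary: Text-driven controllable dance generation remains under-explored. Characterizing dance is difficult because of its intricate spatial dynamics, strong directionality, and the highly decoupled movements of distinct body parts. The authors propose Choreographic Syntax, a novel theoretical framework with a tailored annotation system.
Reference score (independently computed attention score): 39.51425337194989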
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-driven controllable dance generation remains under-explored, primarily due to the severe scarcity of high-quality datasets and the inherent difficulty of articulating complex choreographies. Characterizing dance is particularly challenging owing to its intricate spatial dynamics, strong directionality, and the highly decoupled movements of distinct body parts. To overcome these bottlenecks, we bridge principles from dance studies, human anatomy, and biomechanics to propose Choreographic Syntax, a novel theoretical framework with a tailored annotation system. Grounded in this syntax, we combine professional dance archives with high-fidelity motion capture data to construct DanceFlow, the most fine-grained dance dataset to date. It encompasses 41 hours of high-quality motions paired with 6.34 million words of detailed descriptions. At the model level, we introduce DanceCrafter, a tailored motion transformer built upon the Momentum Human Rig. To circumvent optimization instabilities, we construct a continuous manifold motion representation paired with a hybrid normalization strategy. Furthermore, we design an anatomy-aware loss to explicitly regulate the decoupled nature of body parts. Together, these adaptations empower DanceCrafter to achieve the high-fidelity and stable generation of complex dance sequences. Extensive evaluations and user studies demonstrate our state-of-the-art performance in motion quality, fine-grained controllability, and generation naturalness.
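Abstract (reference translation): Text-driven controllable dance generation remains under-explored, mainly because high-quality datasets are scarce and complex choreography is hard to articulate. Characterizing dance is especially difficult because of its intricate spatial dynamics, strong directionality, and the highly decoupled movements of distinct body parts. To overcome these bottlenecks, we bridge principles from dance studies, human anatomy, and biomechanics and propose Choreographic Syntax, a novel theoretical framework with a tailored annotation system. Building on this syntax, we combine professional dance archives with high-fidelity motion capture data to construct DanceFlow, the most fine-grained dance dataset to date. It pairs 41 hours of high-quality motion with 6.34 million words of detailed descriptions. At the model level, we introduce DanceCrafter, a tailored motion transformer built on the Momentum Human Rig. To avoid optimization instabilities, we construct a continuous manifold motion representation combined with a hybrid normalization strategy. In addition, we design an anatomy-aware loss that explicitly regulates the decoupled nature of body parts. With these adaptations, DanceCrafter achieves high-fidelity, stable generation of complex dance sequences. Extensive evaluations and user studies demonstrate state-of-the-art performance in motion quality, fine-grained controllability, and naturalness of generation.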


Related papers