Fugu-MT Paper Translation (Abstract): Zero-Shot Reconstruction of Animatable 3D Avatars with Cloth Dynamics from a Single Image

Paper overview: Zero-Shot Reconstruction of Animatable 3D Avatars with Cloth Dynamics from a Single Image

arxiv url: http://arxiv.org/abs/2603.14772v1
Date: Mon, 16 Mar 2026 03:12:27 GMT
Status: Translation complete
Last updated in system: 2026-03-17 16:19:36.023484
Title: Zero-Shot Reconstruction of Animatable 3D Avatars with Cloth Dynamics from a Single Image
Title (reference translation): Zero-shot reconstruction of animatable 3D avatars with clothing dynamics from a single image
Authors: Joohyun Kwon, Geonhee Sim, Gyeongsik Moon
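Abstract summary: This paper describes DynaAvatar, a framework that reconstructs 3D avatars with motion-dependent cloth dynamics from a single image. Trained on large-scale multi-person motion datasets, DynaAvatar employs a Transformer-based feed-forward architecture. Experiments show that DynaAvatar produces visually rich and generalizable animations, outperforming prior methods.
Reference score (independently calculated attention score): 16.574138459960505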
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Existing single-image 3D human avatar methods primarily rely on rigid joint transformations, limiting their ability to model realistic cloth dynamics. We present DynaAvatar, a zero-shot framework that reconstructs animatable 3D human avatars with motion-dependent cloth dynamics from a single image. Trained on large-scale multi-person motion datasets, DynaAvatar employs a Transformer-based feed-forward architecture that directly predicts dynamic 3D Gaussian deformations without subject-specific optimization. To overcome the scarcity of dynamic captures, we introduce a static-to-dynamic knowledge transfer strategy: a Transformer pretrained on large-scale static captures provides strong geometric and appearance priors, which are efficiently adapted to motion-dependent deformations through lightweight LoRA fine-tuning on dynamic captures. We further propose the DynaFlow loss, an optical flow-guided objective that provides reliable motion-direction geometric cues for cloth dynamics in rendered space. Finally, we reannotate the missing or noisy SMPL-X fittings in existing dynamic capture datasets, as most public dynamic capture datasets contain incomplete or unreliable fittings that are unsuitable for training high-quality 3D avatar reconstruction models. Experiments demonstrate that DynaAvatar produces visually rich and generalizable animations, outperforming prior methods.
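Abstract (reference translation): Existing single-image 3D human avatar methods rely mainly on rigid joint transformations, which limits their ability to model realistic cloth dynamics. We propose DynaAvatar, a zero-shot framework that reconstructs animatable 3D avatars with motion-dependent cloth dynamics from a single image. Trained on large-scale multi-person motion datasets, DynaAvatar uses a Transformer-based feed-forward architecture to directly predict dynamic 3D Gaussian deformations without subject-specific optimization. To address the scarcity of dynamic captures, we introduce a static-to-dynamic knowledge transfer strategy: a Transformer pretrained on large-scale static captures provides strong geometric and appearance priors, which are adapted efficiently to motion-dependent deformations through lightweight LoRA fine-tuning on dynamic captures. We also propose the DynaFlow loss, an optical flow-guided objective that provides reliable motion-direction geometric cues for cloth dynamics in rendered space. Finally, we reannotate the missing or noisy SMPL-X fittings in existing dynamic capture datasets, because most public dynamic capture datasets contain incomplete or unreliable fittings that are unsuitable for training high-quality 3D avatar reconstruction models. Experiments show that DynaAvatar produces visually rich and generalizable animations, outperforming prior methods.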

Related papers