Fugu-MT Paper Translation (Abstract): Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Paper overview: Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

arxiv url: http://arxiv.org/abs/2403.14781v2
Date: Sat, 1 Jun 2024 08:27:23 GMT
Status: translation complete
Last updated in system: 2024-06-04 15:47:26.981458
Title: Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Title (reference translation): Champ: Controllable and Consistent Human Image Animation via 3D Parametric Guidance
Authors: Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Qingkun Su, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu
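Abstract summary: This paper proposes a methodology for human image animation that leverages a 3D human parametric model within a latent diffusion framework. By representing the 3D human parametric model as the motion guidance, parametric shape alignment of the human body can be performed between the reference image and the source video motion. The proposed method shows superior generalization capabilities on the proposed in-the-wild dataset.
Reference score (independently computed attention score): 25.346255905155424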
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this study, we introduce a methodology for human image animation by leveraging a 3D human parametric model within a latent diffusion framework to enhance shape alignment and motion guidance in current human generative techniques. The methodology utilizes the SMPL (Skinned Multi-Person Linear) model as the 3D human parametric model to establish a unified representation of body shape and pose. This facilitates the accurate capture of intricate human geometry and motion characteristics from source videos. Specifically, we incorporate rendered depth images, normal maps, and semantic maps obtained from SMPL sequences, alongside skeleton-based motion guidance, to enrich the conditions to the latent diffusion model with comprehensive 3D shape and detailed pose attributes. A multi-layer motion fusion module, integrating self-attention mechanisms, is employed to fuse the shape and motion latent representations in the spatial domain. By representing the 3D human parametric model as the motion guidance, we can perform parametric shape alignment of the human body between the reference image and the source video motion. Experimental evaluations conducted on benchmark datasets demonstrate the methodology's superior ability to generate high-quality human animations that accurately capture both pose and shape variations. Furthermore, our approach also exhibits superior generalization capabilities on the proposed in-the-wild dataset. Project page: https://fudan-generative-vision.github.io/champ.
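Abstract (reference translation): This study proposes a methodology for human image animation that leverages a 3D human parametric model within a latent diffusion framework, strengthening shape alignment and motion guidance in current human generative techniques. The method uses the SMPL (Skinned Multi-Person Linear) model as the 3D human parametric model to establish a unified representation of body shape and pose. This makes it possible to accurately capture intricate human geometry and motion characteristics from source videos. Specifically, rendered depth images, normal maps, and semantic maps obtained from SMPL sequences are incorporated together with skeleton-based motion guidance, enriching the conditions given to the latent diffusion model with comprehensive 3D shape and detailed pose attributes. A multi-layer motion fusion module that integrates self-attention mechanisms fuses the shape and motion latent representations in the spatial domain. By representing the 3D human parametric model as the motion guidance, parametric shape alignment of the human body can be performed between the reference image and the source video motion. Experimental evaluations on benchmark datasets show the method's superior ability to generate high-quality human animations that accurately capture both pose and shape variations. The method also shows superior generalization capabilities on the proposed in-the-wild dataset. Project page: https://fudan-generative-vision.github.io/champ.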
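
The parametric shape alignment mentioned in the abstract can be illustrated with a minimal sketch. It assumes one plausible reading: the SMPL shape coefficients estimated from the reference image replace those of every driving frame, while each frame keeps its own pose. The `SmplParams` container and the `align_shape` helper are hypothetical names introduced here for illustration and are not taken from the Champ code base. The sizes (10 shape coefficients, 72 pose values) follow the standard SMPL parameterization.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

NUM_SHAPE = 10  # standard SMPL shape coefficients (betas)
NUM_POSE = 72   # 24 joints x 3 axis-angle values, global orientation included


@dataclass(frozen=True)
class SmplParams:
    """Hypothetical container for the SMPL parameters of one frame."""
    betas: Tuple[float, ...]  # body shape coefficients
    pose: Tuple[float, ...]   # per-joint axis-angle rotations


def align_shape(reference: SmplParams,
                driving: Sequence[SmplParams]) -> List[SmplParams]:
    """Keep each driving frame's pose but swap in the reference body shape.

    This is an assumed, simplified form of parametric shape alignment; the
    aligned parameters would then be rendered into depth, normal, and
    semantic maps by an SMPL renderer, which is out of scope here.
    """
    if len(reference.betas) != NUM_SHAPE:
        raise ValueError("reference must carry NUM_SHAPE shape coefficients")
    aligned = []
    for frame in driving:
        if len(frame.pose) != NUM_POSE:
            raise ValueError("each driving frame must carry NUM_POSE pose values")
        aligned.append(SmplParams(betas=reference.betas, pose=frame.pose))
    return aligned


# Toy data: a reference body shape and a three-frame driving sequence
# whose performer has a different body shape.
reference = SmplParams(betas=(0.5,) * NUM_SHAPE, pose=(0.0,) * NUM_POSE)
driving = [
    SmplParams(betas=(-1.0,) * NUM_SHAPE, pose=(0.01 * t,) * NUM_POSE)
    for t in range(3)
]
aligned = align_shape(reference, driving)
```

In this sketch the driving performer's body shape never reaches the guidance signal, so the conditions would describe the reference person's proportions performing the driving motion.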

Related papers

MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance [23.69067438843687]
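This paper proposes a video face reenactment method that integrates a 3D face parametric model into a latent diffusion framework. By using the 3D face parametric model as motion guidance, the method enables parametric alignment of face identity between the reference image and the motion captured from the driving video.
Paper reference translation (metadata) (2025-04-30T10:30:46Z)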
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses [57.17501809717155]
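This work proposes DreamDance, a novel method for animating human images using only skeleton pose sequences as conditional inputs. The key insight is that human images naturally exhibit multiple levels of correlation. The authors construct the TikTok-Dance5K dataset, comprising 5K high-quality dance videos with detailed frame annotations.
Paper reference translation (metadata) (2024-11-30T08:42:13Z)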
Bundle Adjusted Gaussian Avatars Deblurring [31.718130377229482]
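This work proposes a 3D-aware, physics-oriented model of blur formation caused by human movement, together with a corresponding 3D human motion model that resolves the ambiguities found in motion-induced blurry images. Benchmarks for this task are established with a synthetic dataset derived from existing multi-view captures and a real-captured dataset acquired with a 360-degree synchronous hybrid-exposure camera system.
Paper reference translation (metadata) (2024-11-24T10:03:24Z)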
Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation [32.30055363306321]
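This work proposes a paradigm for seamlessly unifying different human pose and shape-related tasks and datasets. The formulation centers on the ability to query any arbitrary point of the human volume, both at training and at test time. Differently annotated data sources, including meshes, 2D/3D skeletons, and dense pose, can be exploited naturally without converting between them.
Paper reference translation (metadata) (2024-07-10T10:44:18Z)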
En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data [36.51674664590734]
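This work proposes En3D, a generative scheme for sculpting high-quality 3D human avatars. Unlike prior work that relies on scarce 3D datasets or on limited 2D collections with imbalanced viewing angles, it aims to develop a zero-shot 3D generative scheme capable of producing 3D humans.
Paper reference translation (metadata) (2024-01-02T12:06:31Z)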
GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians [51.46168990249278]
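This work proposes an efficient approach for creating realistic human avatars with dynamic 3D appearances from a single video. GaussianAvatar is validated on both public datasets and a collected dataset.
Paper reference translation (metadata) (2023-12-04T18:55:45Z)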
Unsupervised 3D Pose Estimation with Non-Rigid Structure-from-Motion Modeling [83.76377808476039]
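This work proposes a method for modeling human pose deformations and designs an accompanying diffusion-based motion prior. The task of reconstructing 3D human skeletons in motion is divided into estimating a 3D reference skeleton and a frame-by-frame skeleton deformation. A mixed spatial-temporal NRSfMformer simultaneously estimates the 3D reference skeleton and the skeleton deformation of each frame from a sequence of 2D observations.
Paper reference translation (metadata) (2023-08-18T16:41:57Z)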
LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [78.17425779503047]
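This paper proposes a novel implicit representation for the human body. It is fully differentiable and optimizable, with disentangled shape and pose latent spaces. The model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
Paper reference translation (metadata) (2021-11-30T04:10:57Z)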
Human Performance Capture from Monocular Video in the Wild [50.34917313325813]
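This work proposes a method for capturing dynamic 3D human shape from monocular video featuring challenging body poses. The method outperforms state-of-the-art methods on the in-the-wild 3DPW video dataset.
Paper reference translation (metadata) (2021-11-29T16:32:41Z)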
HuMoR: 3D Human Motion Model for Robust Pose Estimation [100.55369985297797]
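HuMoR is a 3D human motion model for robust estimation of temporal pose and shape. It introduces a conditional variational autoencoder that learns a distribution over the change in pose at each step of a motion sequence. The model is shown to generalize to diverse motions and body shapes after training on a large motion capture dataset.
Paper reference translation (metadata) (2021-05-10T21:04:55Z)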
S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling [103.65625425020129]
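The pedestrian's shape, pose, and skinning weights are represented as neural implicit functions learned directly from data. The approach is demonstrated on various datasets, and its reconstructions outperform existing state-of-the-art methods.
Paper reference translation (metadata) (2021-01-17T02:16:56Z)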

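The related-papers list is generated automatically from the titles and abstracts of the papers on this site.

This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any outcome arising from its use (including all information and translations).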