Fugu-MT paper translation (abstract): Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

Paper summary: Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

arxiv url: http://arxiv.org/abs/2312.13763v2
Date: Wed, 3 Jan 2024 09:40:56 GMT
Status: Translation complete
Last updated in system: 2024-01-04 16:07:59.649411
Title: Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
Title (reference translation): Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
Authors: Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, Karsten Kreis
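Abstract summary: We focus on the underexplored text-to-4D setting and synthesize dynamic, animated 3D objects. We combine text-to-image, text-to-video, and 3D-aware multiview diffusion models to provide feedback during 4D object optimization.
Reference score (independently computed attention measure): 94.07744207257653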
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-guided diffusion models have revolutionized image and video generation and have also been successfully used for optimization-based 3D object synthesis. Here, we instead focus on the underexplored text-to-4D setting and synthesize dynamic, animated 3D objects using score distillation methods with an additional temporal dimension. Compared to previous work, we pursue a novel compositional generation-based approach, and combine text-to-image, text-to-video, and 3D-aware multiview diffusion models to provide feedback during 4D object optimization, thereby simultaneously enforcing temporal consistency, high-quality visual appearance and realistic geometry. Our method, called Align Your Gaussians (AYG), leverages dynamic 3D Gaussian Splatting with deformation fields as 4D representation. Crucial to AYG is a novel method to regularize the distribution of the moving 3D Gaussians and thereby stabilize the optimization and induce motion. We also propose a motion amplification mechanism as well as a new autoregressive synthesis scheme to generate and combine multiple 4D sequences for longer generation. These techniques allow us to synthesize vivid dynamic scenes, outperform previous work qualitatively and quantitatively and achieve state-of-the-art text-to-4D performance. Due to the Gaussian 4D representation, different 4D animations can be seamlessly combined, as we demonstrate. AYG opens up promising avenues for animation, simulation and digital content creation as well as synthetic data generation.
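Abstract (reference translation): Text-guided diffusion models have revolutionized image and video generation and have also been used successfully for optimization-based 3D object synthesis. In this work, we focus on the underexplored text-to-4D setting and synthesize dynamic, animated 3D objects using score distillation methods with an added temporal dimension. In contrast to previous work, we take a compositional generation-based approach that combines text-to-image, text-to-video, and 3D-aware multiview diffusion models to provide feedback during 4D object optimization, enforcing temporal consistency, high-quality visual appearance, and realistic geometry at the same time. Our method, called Align Your Gaussians (AYG), uses dynamic 3D Gaussian Splatting with deformation fields as its 4D representation. Central to AYG is a novel method that regularizes the distribution of the moving 3D Gaussians, thereby stabilizing the optimization and inducing motion. We also propose a motion amplification mechanism and a new autoregressive synthesis scheme that generates and combines multiple 4D sequences for longer generation. These techniques make it possible to synthesize vivid dynamic scenes, outperform previous work both qualitatively and quantitatively, and achieve state-of-the-art text-to-4D performance. Because of the Gaussian 4D representation, different 4D animations can be combined seamlessly. AYG opens up promising avenues for animation, simulation, digital content creation, and synthetic data generation.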

List of related papers