Fugu-MT Paper Translation (Abstract): CoCo4D: Comprehensive and Complex 4D Scene Generation

Paper overview: CoCo4D: Comprehensive and Complex 4D Scene Generation

arxiv url: http://arxiv.org/abs/2506.19798v1
Date: Tue, 24 Jun 2025 17:05:44 GMT
Status: Translation complete
Last updated in system: 2025-06-25 19:48:23.73807
Title: CoCo4D: Comprehensive and Complex 4D Scene Generation
Authors: Junwei Zhou, Xueting Li, Lu Qi, Ming-Hsuan Yang
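Abstract summary: Existing 4D synthesis methods focus mainly on object-level generation or on dynamic scene synthesis with limited novel views. The paper proposes a framework, CoCo4D, for generating detailed dynamic 4D scenes from text prompts.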
Reference score (independently computed attention score): 61.25279122171029
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing 4D synthesis methods primarily focus on object-level generation or dynamic scene synthesis with limited novel views, restricting their ability to generate multi-view consistent and immersive dynamic 4D scenes. To address these constraints, we propose a framework (dubbed as CoCo4D) for generating detailed dynamic 4D scenes from text prompts, with the option to include images. Our method leverages the crucial observation that articulated motion typically characterizes foreground objects, whereas background alterations are less pronounced. Consequently, CoCo4D divides 4D scene synthesis into two responsibilities: modeling the dynamic foreground and creating the evolving background, both directed by a reference motion sequence. Given a text prompt and an optional reference image, CoCo4D first generates an initial motion sequence utilizing video diffusion models. This motion sequence then guides the synthesis of both the dynamic foreground object and the background using a novel progressive outpainting scheme. To ensure seamless integration of the moving foreground object within the dynamic background, CoCo4D optimizes a parametric trajectory for the foreground, resulting in realistic and coherent blending. Extensive experiments show that CoCo4D achieves comparable or superior performance in 4D scene generation compared to existing methods, demonstrating its effectiveness and efficiency. More results are presented on our website https://colezwhy.github.io/coco4d/.

Related papers