Fugu-MT Paper Translation (Abstract): Realistic and Controllable 3D Gaussian-Guided Object Editing for Driving Video Generation

Paper abstract: Realistic and Controllable 3D Gaussian-Guided Object Editing for Driving Video Generation

arxiv url: http://arxiv.org/abs/2508.20471v1
Date: Thu, 28 Aug 2025 06:39:53 GMT
Status: Translation complete
Last updated in system: 2025-08-29 18:12:02.091583
Title: Realistic and Controllable 3D Gaussian-Guided Object Editing for Driving Video Generation
Title (reference translation): Realistic and Controllable 3D Gaussian-Guided Object Editing for Driving Video Generation
Authors: Jiusi Li, Jackson Jiang, Jinyu Miao, Miao Long, Tuopu Wen, Peijin Jia, Shengxiang Liu, Chunlei Yu, Maolin Liu, Yuzhan Cai, Kun Jiang, Mengmeng Yang, Diange Yang
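Abstract summary: G^2Editor is a framework designed for photorealistic and precise object editing in driving videos. It uses a scene-level 3D bounding box layout to reconstruct occluded areas of non-target objects. Experiments show that G^2Editor effectively supports object repositioning, insertion, and deletion within a unified framework.
Reference score (independently computed attention measure): 12.982001613987315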
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Corner cases are crucial for training and validating autonomous driving systems, yet collecting them from the real world is often costly and hazardous. Editing objects within captured sensor data offers an effective alternative for generating diverse scenarios, commonly achieved through 3D Gaussian Splatting or image generative models. However, these approaches often suffer from limited visual fidelity or imprecise pose control. To address these issues, we propose G^2Editor, a framework designed for photorealistic and precise object editing in driving videos. Our method leverages a 3D Gaussian representation of the edited object as a dense prior, injected into the denoising process to ensure accurate pose control and spatial consistency. A scene-level 3D bounding box layout is employed to reconstruct occluded areas of non-target objects. Furthermore, to guide the appearance details of the edited object, we incorporate hierarchical fine-grained features as additional conditions during generation. Experiments on the Waymo Open Dataset demonstrate that G^2Editor effectively supports object repositioning, insertion, and deletion within a unified framework, outperforming existing methods in both pose controllability and visual quality, while also benefiting downstream data-driven tasks.
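Abstract (reference translation): Corner cases are essential for training and validating autonomous driving systems, but collecting them from the real world is often costly and dangerous. Editing objects within captured sensor data offers an effective alternative for generating diverse scenarios, commonly done with 3D Gaussian Splatting or image generative models. However, these approaches often suffer from limited visual fidelity or imprecise pose control. To address these issues, we propose G^2Editor, a framework for photorealistic and precise object editing in driving videos. The method uses a 3D Gaussian representation of the edited object as a dense prior, injected into the denoising process, to ensure accurate pose control and spatial consistency. A scene-level 3D bounding box layout is used to reconstruct occluded areas of non-target objects. In addition, hierarchical fine-grained features are incorporated as extra conditions during generation to guide the appearance details of the edited object. Experiments on the Waymo Open Dataset show that G^2Editor effectively supports object repositioning, insertion, and deletion within a unified framework, outperforms existing methods in both pose controllability and visual quality, and also benefits downstream data-driven tasks.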

Related papers