Fugu-MT Paper Translation (Abstract): Towards Scalable and Consistent 3D Editing

Paper abstract: Towards Scalable and Consistent 3D Editing

arxiv url: http://arxiv.org/abs/2510.02994v1
Date: Fri, 03 Oct 2025 13:34:55 GMT
Status: Translation complete
Last updated in system: 2025-10-06 16:35:52.39939
Title: Towards Scalable and Consistent 3D Editing
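Title (reference translation): Towards Scalable and Consistent 3D Editing
Authors: Ruihao Xia, Yang Tang, Pan Zhou
Abstract summary: 3D editing has wide applications in immersive content creation, digital entertainment, and AR/VR. Unlike 2D editing, it remains challenging because it requires cross-view consistency, structural fidelity, and fine-grained controllability. We introduce 3DEditVerse, the largest 3D editing benchmark to date. On the model side, we propose 3DEditFormer, a 3D-structure-preserving conditional transformer.
Reference score (independently computed attention score): 32.16698854719098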
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: 3D editing - the task of locally modifying the geometry or appearance of a 3D asset - has wide applications in immersive content creation, digital entertainment, and AR/VR. However, unlike 2D editing, it remains challenging due to the need for cross-view consistency, structural fidelity, and fine-grained controllability. Existing approaches are often slow, prone to geometric distortions, or dependent on manual and accurate 3D masks that are error-prone and impractical. To address these challenges, we advance both the data and model fronts. On the data side, we introduce 3DEditVerse, the largest paired 3D editing benchmark to date, comprising 116,309 high-quality training pairs and 1,500 curated test pairs. Built through complementary pipelines of pose-driven geometric edits and foundation model-guided appearance edits, 3DEditVerse ensures edit locality, multi-view consistency, and semantic alignment. On the model side, we propose 3DEditFormer, a 3D-structure-preserving conditional transformer. By enhancing image-to-3D generation with dual-guidance attention and time-adaptive gating, 3DEditFormer disentangles editable regions from preserved structure, enabling precise and consistent edits without requiring auxiliary 3D masks. Extensive experiments demonstrate that our framework outperforms state-of-the-art baselines both quantitatively and qualitatively, establishing a new standard for practical and scalable 3D editing. Dataset and code will be released. Project: https://www.lv-lab.org/3DEditFormer/
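Abstract (reference translation): 3D editing - the task of locally modifying the shape or appearance of a 3D asset - has wide applications in immersive content creation, digital entertainment, and AR/VR. However, unlike 2D editing, it remains challenging because it requires cross-view consistency, structural fidelity, and fine-grained controllability. Existing approaches are often slow, prone to geometric distortions, or dependent on manual, accurate 3D masks. To address these challenges, we advance both the data and the model. On the data side, we introduce 3DEditVerse, the largest 3D editing benchmark to date, comprising 116,309 high-quality training pairs and 1,500 curated test pairs. Built through complementary pipelines of pose-driven geometric edits and foundation-model-guided appearance edits, 3DEditVerse ensures edit locality, multi-view consistency, and semantic alignment. On the model side, we propose 3DEditFormer, a 3D-structure-preserving conditional transformer. By enhancing image-to-3D generation with dual-guidance attention and time-adaptive gating, 3DEditFormer separates editable regions from preserved structure, enabling precise and consistent edits without requiring auxiliary 3D masks. Extensive experiments show that our framework outperforms state-of-the-art baselines both quantitatively and qualitatively, establishing a new standard for practical and scalable 3D editing. The dataset and code will be released. Project: https://www.lv-lab.org/3DEditFormer/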

Related papers