EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting
- URL: http://arxiv.org/abs/2412.11520v1
- Date: Mon, 16 Dec 2024 07:56:04 GMT
- Title: EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting
- Authors: Dong In Lee, Hyeongcheol Park, Jiyoung Seo, Eunbyung Park, Hyunje Park, Ha Dam Baek, Shin Sangheon, Sangmin Kim, Sangpil Kim
- Abstract summary: We propose EditSplat, a novel 3D editing framework that integrates Multi-view Fusion Guidance (MFG) and Attention-Guided Trimming (AGT).
Our MFG ensures multi-view consistency by incorporating essential multi-view information into the diffusion process, leveraging classifier-free guidance from the text-to-image diffusion model and the geometric properties of 3DGS.
Our AGT leverages the explicit representation of 3DGS to selectively prune and optimize 3D Gaussians, enhancing optimization efficiency and enabling precise, semantically rich local edits.
- Score: 3.9006270555948133
- Abstract: Recent advancements in 3D editing have highlighted the potential of text-driven methods in real-time, user-friendly AR/VR applications. However, current methods rely on 2D diffusion models without adequately considering multi-view information, resulting in multi-view inconsistency. While 3D Gaussian Splatting (3DGS) significantly improves rendering quality and speed, its 3D editing process encounters difficulties with inefficient optimization, as pre-trained Gaussians retain excessive source information, hindering optimization. To address these limitations, we propose EditSplat, a novel 3D editing framework that integrates Multi-view Fusion Guidance (MFG) and Attention-Guided Trimming (AGT). Our MFG ensures multi-view consistency by incorporating essential multi-view information into the diffusion process, leveraging classifier-free guidance from the text-to-image diffusion model and the geometric properties of 3DGS. Additionally, our AGT leverages the explicit representation of 3DGS to selectively prune and optimize 3D Gaussians, enhancing optimization efficiency and enabling precise, semantically rich local edits. Through extensive qualitative and quantitative evaluations, EditSplat achieves superior multi-view consistency and editing quality over existing methods, significantly enhancing overall efficiency.
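To make the two components more concrete, below is a minimal PyTorch-style sketch of how Multi-view Fusion Guidance and Attention-Guided Trimming could slot into an editing loop. It assumes an InstructPix2Pix-style denoiser conditioned on both a text prompt and a reference image, plus a hypothetical per-Gaussian relevance score lifted from 2D cross-attention maps; the function names, signatures, and guidance layout are illustrative assumptions, not the authors' actual implementation.

```python
import torch

def mfg_noise_prediction(eps_model, z_t, t, text_emb, null_text,
                         fused_ref, null_ref, s_text=7.5, s_ref=1.5):
    # Multi-view Fusion Guidance (sketch). `fused_ref` stands in for a
    # reference image assembled by reprojecting renders from neighboring
    # cameras into the current view using the 3D Gaussians' geometry.
    # The guidance layout mirrors InstructPix2Pix-style classifier-free
    # guidance with separate image and text scales.
    eps_uncond = eps_model(z_t, t, null_text, null_ref)   # unconditional
    eps_ref    = eps_model(z_t, t, null_text, fused_ref)  # image-conditioned
    eps_full   = eps_model(z_t, t, text_emb,  fused_ref)  # image + text
    return (eps_uncond
            + s_ref  * (eps_ref  - eps_uncond)   # pull toward multi-view evidence
            + s_text * (eps_full - eps_ref))     # pull toward the edit prompt

def attention_guided_trim(relevance, keep_quantile=0.7):
    # Attention-Guided Trimming (sketch). `relevance` is a hypothetical
    # per-Gaussian score obtained by splatting the prompt's 2D cross-attention
    # maps back onto the 3D Gaussians; low-relevance Gaussians are pruned or
    # frozen so optimization concentrates on the edited region.
    threshold = torch.quantile(relevance, 1.0 - keep_quantile)
    return relevance >= threshold  # boolean mask of Gaussians to keep optimizing
```

In a full pipeline, the returned mask would gate which Gaussians receive gradients during 3DGS optimization, while the fused noise prediction would replace standard single-view guidance at each diffusion step.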
Related papers
- 3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement [66.8116563135326]
We present 3DEnhancer, which employs a multi-view latent diffusion model to enhance coarse 3D inputs while preserving multi-view consistency.
Unlike existing video-based approaches, our model supports seamless multi-view enhancement with improved coherence across diverse viewing angles.
arXiv Detail & Related papers (2024-12-24T17:36:34Z) - MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields [73.49548565633123]
Radiance fields represented by 3D Gaussians excel at synthesizing novel views, offering both high training efficiency and fast rendering.
Existing methods often incorporate depth priors from dense estimation networks but overlook the inherent multi-view consistency in input images.
We propose a view synthesis framework based on 3D Gaussian Splatting, named MCGS, enabling scene reconstruction from sparse input views.
arXiv Detail & Related papers (2024-10-15T08:39:05Z) - MVGS: Multi-view-regulated Gaussian Splatting for Novel View Synthesis [22.80370814838661]
Recent works in volume rendering, e.g., NeRF and 3D Gaussian Splatting (3DGS), significantly advance rendering quality and efficiency.
We propose a new 3DGS optimization method embodying four key novel contributions.
arXiv Detail & Related papers (2024-10-02T23:48:31Z) - 3D Gaussian Editing with A Single Image [19.662680524312027]
We introduce a novel single-image-driven 3D scene editing approach based on 3D Gaussian Splatting.
Our method learns to optimize the 3D Gaussians to align with an edited version of the image rendered from a user-specified viewpoint.
Experiments show the effectiveness of our method in handling geometric details, long-range deformation, and non-rigid deformation.
arXiv Detail & Related papers (2024-08-14T13:17:42Z) - TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation [35.951718189386845]
We propose a progressive 3D editing strategy that ensures multi-view consistency via a Trajectory-Anchored Scheme (TAS).
TAS facilitates a tightly coupled iterative process between 2D view editing and 3D updating, preventing the error accumulation that arises from the text-to-image process.
We present a tuning-free View-Consistent Attention Control (VCAC) module that leverages cross-view semantic and geometric references from the source branch to yield aligned views from the target branch during the editing of 2D views.
arXiv Detail & Related papers (2024-07-02T08:06:58Z) - DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing [72.54566271694654]
We consider the problem of editing 3D objects and scenes based on open-ended language instructions.
A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process.
This process is often inefficient due to the need for iterative updates of costly 3D representations.
arXiv Detail & Related papers (2024-04-29T17:59:30Z) - View-Consistent 3D Editing with Gaussian Splatting [50.6460814430094]
View-consistent Editing (VcEdit) is a novel framework that seamlessly incorporates 3DGS into image editing processes.
By incorporating consistency modules into an iterative pattern, VcEdit proficiently resolves the issue of multi-view inconsistency.
arXiv Detail & Related papers (2024-03-18T15:22:09Z) - Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction [89.53963284958037]
We propose a novel motion-aware enhancement framework for dynamic scene reconstruction.
Specifically, we first establish a correspondence between 3D Gaussian movements and pixel-level flow.
For the prevalent deformation-based paradigm, which poses a harder optimization problem, a transient-aware deformation auxiliary module is proposed.
arXiv Detail & Related papers (2024-03-18T03:46:26Z) - Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models [83.97844535389073]
A major obstacle hindering the widespread adoption of 3D content editing is its time-intensive processing.
We propose that by incorporating correspondence regularization into diffusion models, the process of 3D editing can be significantly accelerated (a rough sketch of this idea follows the list below).
In most scenarios, our proposed technique brings a 10× speed-up compared to the baseline method and completes the editing of a 3D scene in 2 minutes with comparable quality.
arXiv Detail & Related papers (2023-12-13T23:27:17Z)
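As a rough illustration of the correspondence idea in Efficient-NeRF2NeRF (the loss form, names, and tensor layout here are assumptions, not the paper's implementation), one could warp the latent of one view into another using precomputed pixel correspondences and penalize disagreement during denoising:

```python
import torch
import torch.nn.functional as F

def correspondence_regularizer(lat_a, lat_b, grid_ab, valid_mask):
    # lat_a, lat_b: (1, C, H, W) latents of two views being edited in parallel.
    # grid_ab: (1, H, W, 2) correspondences mapping view-a pixels into view b,
    #          expressed as normalized grid coordinates in [-1, 1]
    #          (precomputed from scene geometry or matching; not shown here).
    # valid_mask: (1, 1, H, W) marking pixels with a reliable correspondence.
    lat_b_warped = F.grid_sample(lat_b, grid_ab, align_corners=True)
    return (valid_mask * (lat_a - lat_b_warped) ** 2).mean()
```

Adding such a term to the per-step editing objective encourages the 2D edits to agree across views, which is what would let the 3D update converge in fewer iterations.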