ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing
- URL: http://arxiv.org/abs/2411.05006v1
- Date: Thu, 07 Nov 2024 18:59:54 GMT
- Title: ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing
- Authors: Jun-Kun Chen, Yu-Xiong Wang,
- Abstract summary: ProEdit is a framework for high-quality 3D scene editing guided by diffusion distillation.
Our framework controls the size of FOS and reduces inconsistency by decomposing the overall editing task into several subtasks.
ProEdit achieves state-of-the-art results in various scenes and challenging editing tasks.
- Score: 33.42456524414643
- License:
- Abstract: This paper proposes ProEdit - a simple yet effective framework for high-quality 3D scene editing guided by diffusion distillation in a novel progressive manner. Inspired by the crucial observation that multi-view inconsistency in scene editing is rooted in the diffusion model's large feasible output space (FOS), our framework controls the size of FOS and reduces inconsistency by decomposing the overall editing task into several subtasks, which are then executed progressively on the scene. Within this framework, we design a difficulty-aware subtask decomposition scheduler and an adaptive 3D Gaussian splatting (3DGS) training strategy, ensuring high quality and efficiency in performing each subtask. Extensive evaluation shows that our ProEdit achieves state-of-the-art results in various scenes and challenging editing tasks, all through a simple framework without any expensive or sophisticated add-ons like distillation losses, components, or training procedures. Notably, ProEdit also provides a new way to control, preview, and select the "aggressivity" of editing operation during the editing process.
Related papers
- Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing [60.730661748555214]
We introduce textbfTask-textbfOriented textbfDiffusion textbfInversion (textbfTODInv), a novel framework that inverts and edits real images tailored to specific editing tasks.
ToDInv seamlessly integrates inversion and editing through reciprocal optimization, ensuring both high fidelity and precise editability.
arXiv Detail & Related papers (2024-08-23T22:16:34Z) - TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation [35.951718189386845]
We propose a progressive 3D editing strategy that ensures multi-view consistency via a Trajectory-Anchored Scheme (TAS)
TAS facilitates a tightly coupled iterative process between 2D view editing and 3D updating, preventing error accumulation yielded from text-to-image process.
We present a tuning-free View-Consistent Attention Control (VCAC) module that leverages cross-view semantic and geometric reference from the source branch to yield aligned views from the target branch during the editing of 2D views.
arXiv Detail & Related papers (2024-07-02T08:06:58Z) - View-Consistent 3D Editing with Gaussian Splatting [50.6460814430094]
View-consistent Editing (VcEdit) is a novel framework that seamlessly incorporates 3DGS into image editing processes.
By incorporating consistency modules into an iterative pattern, VcEdit proficiently resolves the issue of multi-view inconsistency.
arXiv Detail & Related papers (2024-03-18T15:22:09Z) - Free-Editor: Zero-shot Text-driven 3D Scene Editing [8.966537479017951]
Training a diffusion model specifically for 3D scene editing is challenging due to the scarcity of large-scale datasets.
We introduce a novel, training-free 3D scene editing technique called textscFree-Editor, which enables users to edit 3D scenes without the need for model retraining.
Our method effectively addresses the issue of multi-view style inconsistency found in state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2023-12-21T08:40:57Z) - Plasticine3D: 3D Non-Rigid Editing with Text Guidance by Multi-View Embedding Optimization [21.8454418337306]
We propose Plasticine3D, a novel text-guided controlled 3D editing pipeline that can perform 3D non-rigid editing.
Our work divides the editing process into a geometry editing stage and a texture editing stage to achieve separate control of structure and appearance.
For the purpose of fine-grained control, we propose Embedding-Fusion (EF) to blend the original characteristics with the editing objectives in the embedding space.
arXiv Detail & Related papers (2023-12-15T09:01:54Z) - SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds [73.91114735118298]
Shap-Editor is a novel feed-forward 3D editing framework.
We demonstrate that direct 3D editing in this space is possible and efficient by building a feed-forward editor network.
arXiv Detail & Related papers (2023-12-14T18:59:06Z) - Customize your NeRF: Adaptive Source Driven 3D Scene Editing via
Local-Global Iterative Training [61.984277261016146]
We propose a CustomNeRF model that unifies a text description or a reference image as the editing prompt.
To tackle the first challenge, we propose a Local-Global Iterative Editing (LGIE) training scheme that alternates between foreground region editing and full-image editing.
For the second challenge, we also design a class-guided regularization that exploits class priors within the generation model to alleviate the inconsistency problem.
arXiv Detail & Related papers (2023-12-04T06:25:06Z) - Emu Edit: Precise Image Editing via Recognition and Generation Tasks [62.95717180730946]
We present Emu Edit, a multi-task image editing model which sets state-of-the-art results in instruction-based image editing.
We train it to multi-task across an unprecedented range of tasks, such as region-based editing, free-form editing, and Computer Vision tasks.
We show that Emu Edit can generalize to new tasks, such as image inpainting, super-resolution, and compositions of editing tasks, with just a few labeled examples.
arXiv Detail & Related papers (2023-11-16T18:55:58Z) - Editing 3D Scenes via Text Prompts without Retraining [80.57814031701744]
DN2N is a text-driven editing method that allows for the direct acquisition of a NeRF model with universal editing capabilities.
Our method employs off-the-shelf text-based editing models of 2D images to modify the 3D scene images.
Our method achieves multiple editing types, including but not limited to appearance editing, weather transition, material changing, and style transfer.
arXiv Detail & Related papers (2023-09-10T02:31:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.