SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds
- URL: http://arxiv.org/abs/2312.09246v1
- Date: Thu, 14 Dec 2023 18:59:06 GMT
- Title: SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds
- Authors: Minghao Chen, Junyu Xie, Iro Laina, Andrea Vedaldi
- Abstract summary: Shap-Editor is a novel feed-forward 3D editing framework.
We demonstrate that direct 3D editing in the latent space of Shap-E is possible and efficient by building a feed-forward editor network.
- Score: 73.91114735118298
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel feed-forward 3D editing framework called Shap-Editor.
Prior research on editing 3D objects primarily concentrated on editing
individual objects by leveraging off-the-shelf 2D image editing networks. This
is achieved via a process called distillation, which transfers knowledge from
the 2D network to 3D assets. Distillation necessitates at least tens of minutes
per asset to attain satisfactory editing results, and is thus not very
practical. In contrast, we ask whether 3D editing can be carried out directly
by a feed-forward network, eschewing test-time optimisation. In particular, we
hypothesise that editing can be greatly simplified by first encoding 3D objects
in a suitable latent space. We validate this hypothesis by building upon the
latent space of Shap-E. We demonstrate that direct 3D editing in this space is
possible and efficient by building a feed-forward editor network that only
requires approximately one second per edit. Our experiments show that
Shap-Editor generalises well to both in-distribution and out-of-distribution 3D
assets with different prompts, exhibiting comparable performance with methods
that carry out test-time optimisation for each edited instance.
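To make the feed-forward idea concrete, below is a minimal, illustrative PyTorch sketch of an editor network that maps a source latent and an encoded text instruction to an edited latent in a single forward pass. All names and dimensions here (LatentEditor, a 1024-dimensional latent, a 512-dimensional instruction embedding, the residual design) are assumptions made for illustration only; they are not Shap-Editor's actual architecture, which operates on Shap-E latents.

import torch
import torch.nn as nn

class LatentEditor(nn.Module):
    """Illustrative editor: (source latent, instruction embedding) -> edited latent."""
    def __init__(self, latent_dim: int = 1024, text_dim: int = 512, hidden: int = 2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + text_dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, latent: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # Predict a residual so regions untouched by the edit stay close to the source latent.
        delta = self.net(torch.cat([latent, text_emb], dim=-1))
        return latent + delta

# Usage with random stand-ins for an object latent and an encoded instruction:
editor = LatentEditor()
src_latent = torch.randn(1, 1024)   # placeholder for an encoded 3D-object latent
instruction = torch.randn(1, 512)   # placeholder for a text-instruction embedding
with torch.no_grad():
    edited_latent = editor(src_latent, instruction)  # one forward pass, no per-asset optimisation

Because the edit is a single network evaluation rather than a per-asset optimisation loop, inference cost is on the order of one forward pass, which is what enables the roughly one-second-per-edit figure quoted in the abstract.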
Related papers
- DragScene: Interactive 3D Scene Editing with Single-view Drag Instructions [9.31257776760014]
3D editing has shown remarkable capability in editing scenes based on various instructions.
Existing methods struggle with achieving intuitive, localized editing.
We introduce DragScene, a framework that integrates drag-style editing with diverse 3D representations.
arXiv Detail & Related papers (2024-12-18T07:02:01Z)
- PrEditor3D: Fast and Precise 3D Shape Editing [100.09112677669376]
We propose a training-free approach to 3D editing that enables the editing of a single shape within a few minutes.
The edited 3D mesh aligns well with the prompts, and remains identical for regions that are not intended to be altered.
arXiv Detail & Related papers (2024-12-09T15:44:47Z)
- Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images [72.70883914827687]
Tailor3D is a novel pipeline that creates customized 3D assets from editable dual-side images.
It provides a user-friendly, efficient solution for editing 3D assets, with each editing step taking only seconds to complete.
arXiv Detail & Related papers (2024-07-08T17:59:55Z)
- DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation [57.406031264184584]
DragGaussian is a 3D object drag-editing framework based on 3D Gaussian Splatting.
Our contributions include the introduction of a new task, the development of DragGaussian for interactive point-based 3D editing, and comprehensive validation of its effectiveness through qualitative and quantitative experiments.
arXiv Detail & Related papers (2024-05-09T14:34:05Z)
- DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing [72.54566271694654]
We consider the problem of editing 3D objects and scenes based on open-ended language instructions.
A common approach to this problem is to use a 2D image generator or editor to guide the 3D editing process.
This process is often inefficient due to the need for iterative updates of costly 3D representations.
arXiv Detail & Related papers (2024-04-29T17:59:30Z)
- Real-time 3D-aware Portrait Editing from a Single Image [111.27169315556444]
3DPE can edit a face image following given prompts, like reference images or text descriptions.
A lightweight module is distilled from a 3D portrait generator and a text-to-image model.
arXiv Detail & Related papers (2024-02-21T18:36:26Z)
- Plasticine3D: 3D Non-Rigid Editing with Text Guidance by Multi-View Embedding Optimization [21.8454418337306]
We propose Plasticine3D, a novel text-guided controlled 3D editing pipeline that can perform 3D non-rigid editing.
Our work divides the editing process into a geometry editing stage and a texture editing stage to achieve separate control of structure and appearance.
For the purpose of fine-grained control, we propose Embedding-Fusion (EF) to blend the original characteristics with the editing objectives in the embedding space.
arXiv Detail & Related papers (2023-12-15T09:01:54Z)
- SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field [37.8162035179377]
We present a novel semantic-driven NeRF editing approach, which enables users to edit a neural radiance field with a single image.
To achieve this goal, we propose a prior-guided editing field to encode fine-grained geometric and texture editing in 3D space.
Our method achieves photo-realistic 3D editing using only a single edited image, pushing the bound of semantic-driven editing in 3D real-world scenes.
arXiv Detail & Related papers (2023-03-23T13:58:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.