3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
- URL: http://arxiv.org/abs/2405.18424v1
- Date: Tue, 28 May 2024 17:59:01 GMT
- Title: 3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
- Authors: Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang,
- Abstract summary: Existing methods solely focus on either 2D individual object or 3D global scene editing.
We propose 3DitScene, a novel and unified scene editing framework.
It enables seamless editing from 2D to 3D, allowing precise control over scene composition and individual objects.
- Score: 100.94916668527544
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Scene image editing is crucial for entertainment, photography, and advertising design. Existing methods solely focus on either 2D individual object or 3D global scene editing. This results in a lack of a unified approach to effectively control and manipulate scenes at the 3D level with different levels of granularity. In this work, we propose 3DitScene, a novel and unified scene editing framework leveraging language-guided disentangled Gaussian Splatting that enables seamless editing from 2D to 3D, allowing precise control over scene composition and individual objects. We first incorporate 3D Gaussians that are refined through generative priors and optimization techniques. Language features from CLIP then introduce semantics into 3D geometry for object disentanglement. With the disentangled Gaussians, 3DitScene allows for manipulation at both the global and individual levels, revolutionizing creative expression and empowering control over scenes and objects. Experimental results demonstrate the effectiveness and versatility of 3DitScene in scene image editing. Code and online demo can be found at our project homepage: https://zqh0253.github.io/3DitScene/.
Related papers
- Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts [76.73043724587679]
We propose a dialogue-based 3D scene editing approach, termed CE3D.
Hash-Atlas represents 3D scene views, which transfers the editing of 3D scenes onto 2D atlas images.
Results demonstrate that CE3D effectively integrates multiple visual models to achieve diverse editing visual effects.
arXiv Detail & Related papers (2024-07-09T13:24:42Z) - GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting [52.150502668874495]
We present GALA3D, generative 3D GAussians with LAyout-guided control, for effective compositional text-to-3D generation.
GALA3D is a user-friendly, end-to-end framework for state-of-the-art scene-level 3D content generation and controllable editing.
arXiv Detail & Related papers (2024-02-11T13:40:08Z) - BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D
Scene Generation [96.58789785954409]
We propose a practical and efficient 3D representation that incorporates an equivariant radiance field with the guidance of a bird's-eye view map.
We produce large-scale, even infinite-scale, 3D scenes via synthesizing local scenes and then stitching them with smooth consistency.
arXiv Detail & Related papers (2023-12-04T18:56:10Z) - GaussianEditor: Swift and Controllable 3D Editing with Gaussian
Splatting [66.08674785436612]
3D editing plays a crucial role in many areas such as gaming and virtual reality.
Traditional 3D editing methods, which rely on representations like meshes and point clouds, often fall short in realistically depicting complex scenes.
Our paper presents GaussianEditor, an innovative and efficient 3D editing algorithm based on Gaussian Splatting (GS), a novel 3D representation.
arXiv Detail & Related papers (2023-11-24T14:46:59Z) - 3D-GOI: 3D GAN Omni-Inversion for Multifaceted and Multi-object Editing [25.442924637806676]
We propose a 3D editing framework, 3D-GOI, to enable multifaceted editing of affine information on multiple objects.
3D-GOI realizes the complex editing function by inverting the abundance of attribute codes controlled by GIRAFFE, a renowned 3D GAN.
arXiv Detail & Related papers (2023-11-18T09:55:56Z) - OBJECT 3DIT: Language-guided 3D-aware Image Editing [27.696507467754877]
Existing image editing tools disregard the underlying 3D geometry from which the image is projected.
We formulate the newt ask of language-guided 3D-aware editing, where objects in an image should be edited according to a language instruction in context of the underlying 3D scene.
We release OBJECT: a dataset consisting of 400K editing examples created from procedurally generated 3D scenes.
Our models show impressive abilities to understand the 3D composition of entire scenes, factoring in surrounding objects, surfaces, lighting conditions, shadows, and physically-plausible object configurations.
arXiv Detail & Related papers (2023-07-20T17:53:46Z) - Blocks2World: Controlling Realistic Scenes with Editable Primitives [5.541644538483947]
We present Blocks2World, a novel method for 3D scene rendering and editing.
Our technique begins by extracting 3D parallelepipeds from various objects in a given scene using convex decomposition.
The next stage involves training a conditioned model that learns to generate images from the 2D-rendered convex primitives.
arXiv Detail & Related papers (2023-07-07T21:38:50Z) - DisCoScene: Spatially Disentangled Generative Radiance Fields for
Controllable 3D-aware Scene Synthesis [90.32352050266104]
DisCoScene is a 3Daware generative model for high-quality and controllable scene synthesis.
It disentangles the whole scene into object-centric generative fields by learning on only 2D images with the global-local discrimination.
We demonstrate state-of-the-art performance on many scene datasets, including the challenging outdoor dataset.
arXiv Detail & Related papers (2022-12-22T18:59:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.