Related papers: 3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting

URL: http://arxiv.org/abs/2405.18424v1
Date: Tue, 28 May 2024 17:59:01 GMT
Title: 3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
Authors: Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang
Abstract summary: Existing methods focus solely on either 2D individual-object editing or 3D global-scene editing. We propose 3DitScene, a novel and unified scene editing framework. It enables seamless editing from 2D to 3D, allowing precise control over scene composition and individual objects.
Score: 100.94916668527544
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Scene image editing is crucial for entertainment, photography, and advertising design. Existing methods focus solely on either 2D individual-object editing or 3D global-scene editing. This results in a lack of a unified approach to effectively control and manipulate scenes at the 3D level with different levels of granularity. In this work, we propose 3DitScene, a novel and unified scene editing framework leveraging language-guided disentangled Gaussian Splatting that enables seamless editing from 2D to 3D, allowing precise control over scene composition and individual objects. We first incorporate 3D Gaussians that are refined through generative priors and optimization techniques. Language features from CLIP then introduce semantics into 3D geometry for object disentanglement. With the disentangled Gaussians, 3DitScene allows for manipulation at both the global and individual levels, revolutionizing creative expression and empowering control over scenes and objects. Experimental results demonstrate the effectiveness and versatility of 3DitScene in scene image editing. Code and an online demo can be found at our project homepage: https://zqh0253.github.io/3DitScene/.
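
Illustrative sketch (not the authors' code): the abstract describes attaching language features to 3D Gaussians so that objects can be separated and then edited individually. The toy Python below shows one plausible reading of that idea: select the Gaussians whose feature vector is close to a text query embedding, then move only those. The 2-D feature vectors, the 0.8 threshold, and the function names select_gaussians and translate are all assumptions made for illustration; a real system would use CLIP embeddings and full Gaussian parameters.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_gaussians(features, text_embedding, threshold=0.8):
    """Indices of Gaussians whose feature matches the text query (hypothetical helper)."""
    return [
        i for i, feat in enumerate(features)
        if cosine(feat, text_embedding) >= threshold
    ]


def translate(centers, indices, offset):
    """Return new centers where only the selected Gaussians are shifted by offset."""
    chosen = set(indices)
    return [
        tuple(c + d for c, d in zip(center, offset)) if i in chosen else tuple(center)
        for i, center in enumerate(centers)
    ]


# Toy scene: two Gaussians belong to one object, the third to the background.
centers = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (2.0, 0.0, 3.0)]
features = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]  # stand-ins for language features
query = (1.0, 0.0)                                # stand-in for a text embedding

selected = select_gaussians(features, query)
moved = translate(centers, selected, (0.0, 0.5, 0.0))
```

In this toy setup the first two Gaussians are selected and shifted along the y axis while the background Gaussian stays in place, which mirrors the object-level versus scene-level control the abstract refers to.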

Related papers

DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation [19.817968922757007]
We present DreamScene, an end-to-end framework for high-quality and editable 3D scene generation from text or dialogue. To ensure global consistency, DreamScene employs a progressive camera sampling strategy tailored to both indoor and outdoor settings. Experiments demonstrate that DreamScene surpasses prior methods in quality, consistency, and flexibility.
arXiv Detail & Related papers (2025-07-18T14:45:54Z)
3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting [31.98493679748211]
We propose 3DSceneEditor, a fully 3D-based paradigm for real-time, precise editing of 3D scenes using Gaussian Splatting. Unlike conventional methods, 3DSceneEditor operates through a streamlined 3D pipeline, enabling direct manipulation of Gaussians for efficient, high-quality edits.
arXiv Detail & Related papers (2024-12-02T15:03:55Z)
Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop [32.92038804110175]
Scene Copilot is a framework combining large language models (LLMs) with a procedural 3D scene generator. Scene Codex is designed to translate textual user input into commands understandable by the 3D scene generator. BlenderGPT provides users with an intuitive and direct way to precisely control the generated 3D scene and the final output video.
arXiv Detail & Related papers (2024-11-26T19:21:57Z)
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing [114.14164860467227]
We propose EditRoom, a framework capable of executing a variety of layout edits through natural language commands. Specifically, EditRoom leverages Large Language Models (LLMs) for command planning and generates target scenes. We have developed an automatic pipeline to augment existing 3D scene datasets and introduced EditRoom-DB, a large-scale dataset with 83k editing pairs.
arXiv Detail & Related papers (2024-10-03T17:42:24Z)
SIn-NeRF2NeRF: Editing 3D Scenes with Instructions through Segmentation and Inpainting [0.3119157043062931]
Instruct-NeRF2NeRF (in2n) is a promising method that enables editing of 3D scenes composed of Neural Radiance Field (NeRF) using text prompts. In this project, we enable geometrical changes of objects within the 3D scene by selectively editing the object after separating it from the scene.
arXiv Detail & Related papers (2024-08-23T02:20:42Z)
Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts [76.73043724587679]
We propose a dialogue-based 3D scene editing approach, termed CE3D. Hash-Atlas represents 3D scene views, which transfers the editing of 3D scenes onto 2D atlas images. Results demonstrate that CE3D effectively integrates multiple visual models to achieve diverse editing visual effects.
arXiv Detail & Related papers (2024-07-09T13:24:42Z)
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting [52.150502668874495]
We present GALA3D, generative 3D GAussians with LAyout-guided control, for effective compositional text-to-3D generation. GALA3D is a user-friendly, end-to-end framework for state-of-the-art scene-level 3D content generation and controllable editing.
arXiv Detail & Related papers (2024-02-11T13:40:08Z)
GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting [66.08674785436612]
3D editing plays a crucial role in many areas such as gaming and virtual reality. Traditional 3D editing methods, which rely on representations like meshes and point clouds, often fall short in realistically depicting complex scenes. Our paper presents GaussianEditor, an innovative and efficient 3D editing algorithm based on Gaussian Splatting (GS), a novel 3D representation.
arXiv Detail & Related papers (2023-11-24T14:46:59Z)
OBJECT 3DIT: Language-guided 3D-aware Image Editing [27.696507467754877]
Existing image editing tools disregard the underlying 3D geometry from which the image is projected. We formulate the new task of language-guided 3D-aware editing, where objects in an image should be edited according to a language instruction in the context of the underlying 3D scene. We release OBJECT: a dataset consisting of 400K editing examples created from procedurally generated 3D scenes. Our models show impressive abilities to understand the 3D composition of entire scenes, factoring in surrounding objects, surfaces, lighting conditions, shadows, and physically-plausible object configurations.
arXiv Detail & Related papers (2023-07-20T17:53:46Z)
DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis [90.32352050266104]
DisCoScene is a 3D-aware generative model for high-quality and controllable scene synthesis. It disentangles the whole scene into object-centric generative fields by learning on only 2D images with global-local discrimination. We demonstrate state-of-the-art performance on many scene datasets, including the challenging outdoor dataset.
arXiv Detail & Related papers (2022-12-22T18:59:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.