Related papers: SceneTeller: Language-to-3D Scene Generation

URL: http://arxiv.org/abs/2407.20727v1
Date: Tue, 30 Jul 2024 10:45:28 GMT
Title: SceneTeller: Language-to-3D Scene Generation
Authors: Başak Melis Öcal, Maxim Tatarchenko, Sezer Karaoglu, Theo Gevers
Abstract summary: Given a prompt in natural language describing the object placement in the room, our method produces a high-quality 3D scene corresponding to it. Our turnkey pipeline produces state-of-the-art 3D scenes, while being easy to use even for novices.
Score: 15.209079637302905
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Designing high-quality indoor 3D scenes is important in many practical applications, such as room planning or game development. Conventionally, this has been a time-consuming process which requires both artistic skill and familiarity with professional software, making it hardly accessible for layman users. However, recent advances in generative AI have established a solid foundation for democratizing 3D design. In this paper, we propose a pioneering approach for text-based 3D room design. Given a prompt in natural language describing the object placement in the room, our method produces a high-quality 3D scene corresponding to it. With an additional text prompt, users can change the appearance of the entire scene or of individual objects in it. Built using in-context learning, CAD model retrieval and 3D-Gaussian-Splatting-based stylization, our turnkey pipeline produces state-of-the-art 3D scenes, while being easy to use even for novices. Our project page is available at https://sceneteller.github.io/.

Related papers

ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary [37.41274496314127]
ArtiScene is a training-free automated pipeline for scene design. It generates 2D images from a scene description, then extracts the shape and appearance of objects to create 3D models. It outperforms state-of-the-art benchmarks by a large margin in layout and aesthetic quality according to quantitative metrics.
arXiv Detail & Related papers (2025-05-31T23:03:54Z)
Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint [61.25279122171029]
We present a framework that allows controllable and compositional 3D generation from text prompts. Our approach leverages 2D layouts as a blueprint to facilitate precise and plausible control over 3D generation.
arXiv Detail & Related papers (2024-10-20T13:41:50Z)
SceneCraft: Layout-Guided 3D Scene Generation [29.713491313796084]
SceneCraft is a novel method for generating detailed indoor scenes that adhere to textual descriptions and spatial layout preferences. Our method significantly outperforms existing approaches in complex indoor scene generation with diverse textures, consistent geometry, and realistic visual quality.
arXiv Detail & Related papers (2024-10-11T17:59:58Z)
Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches [50.51643519253066]
3D Content Generation is at the heart of many computer graphics applications, including video gaming, film-making, virtual and augmented reality, etc. This paper proposes a novel deep-learning based approach for automatically generating interactive and playable 3D game scenes.
arXiv Detail & Related papers (2024-08-08T16:27:37Z)
Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts [76.73043724587679]
We propose a dialogue-based 3D scene editing approach, termed CE3D. Hash-Atlas represents 3D scene views, transferring the editing of 3D scenes onto 2D atlas images. Results demonstrate that CE3D effectively integrates multiple visual models to achieve diverse visual editing effects.
arXiv Detail & Related papers (2024-07-09T13:24:42Z)
SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets. We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes [56.727745047799246]
3D scene understanding has gained significant attention due to its wide range of applications. This paper presents Chat-3D, which combines the 3D visual perceptual ability of pre-trained 3D representations and the impressive reasoning and conversation capabilities of advanced LLMs.
arXiv Detail & Related papers (2023-08-17T03:52:15Z)
Aladdin: Zero-Shot Hallucination of Stylized 3D Assets from Abstract Scene Descriptions [0.19116784879310023]
We present a system to generate stylized assets for 3D scenes described by a short phrase. It is robust to open-world concepts in a way that traditional methods trained on limited data are not, giving more creative freedom to the 3D artist.
arXiv Detail & Related papers (2023-06-09T19:24:39Z)
CLIP-Guided Vision-Language Pre-training for Question Answering in 3D Scenes [68.61199623705096]
We design a novel 3D pre-training Vision-Language method that helps a model learn semantically meaningful and transferable 3D scene point cloud representations. We inject the representational power of the popular CLIP model into our 3D encoder by aligning the encoded 3D scene features with the corresponding 2D image and text embeddings. We evaluate our model's 3D world reasoning capability on the downstream task of 3D Visual Question Answering.
arXiv Detail & Related papers (2023-04-12T16:52:29Z)
Learning 3D Scene Priors with 2D Supervision [37.79852635415233]
We propose a new method to learn 3D scene priors of layout and shape without requiring any 3D ground truth. Our method represents a 3D scene as a latent vector, from which we can progressively decode to a sequence of objects characterized by their class categories. Experiments on 3D-FRONT and ScanNet show that our method outperforms state of the art in single-view reconstruction.
arXiv Detail & Related papers (2022-11-25T15:03:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.