Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms
- URL: http://arxiv.org/abs/2112.05644v1
- Date: Fri, 10 Dec 2021 16:17:01 GMT
- Title: Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms
- Authors: Kai Wang, Xianghao Xu, Leon Lei, Selena Ling, Natalie Lindsay, Angel X. Chang, Manolis Savva, Daniel Ritchie
- Abstract summary: We propose the task of generating novel 3D floor plans from existing 3D rooms, and design two representative pipelines: one uses available 2D floor plans to guide selection and deformation of 3D rooms; the other learns to retrieve a set of compatible 3D rooms and combine them into novel layouts.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Realistic 3D indoor scene datasets have enabled significant recent progress
in computer vision, scene understanding, autonomous navigation, and 3D
reconstruction. But the scale, diversity, and customizability of existing
datasets are limited, and it is time-consuming and expensive to scan and
annotate more. Fortunately, combinatorics is on our side: there are enough
individual rooms in existing 3D scene datasets, if there were but a way to
recombine them into new layouts. In this paper, we propose the task of
generating novel 3D floor plans from existing 3D rooms. We identify three
sub-tasks of this problem: generation of 2D layout, retrieval of compatible 3D
rooms, and deformation of 3D rooms to fit the layout. We then discuss different
strategies for solving the problem, and design two representative pipelines:
one uses available 2D floor plans to guide selection and deformation of 3D
rooms; the other learns to retrieve a set of compatible 3D rooms and combine
them into novel layouts. We design a set of metrics that evaluate the generated
results with respect to each of the three subtasks and show that different
methods trade off performance on these subtasks. Finally, we survey downstream
tasks that benefit from generated 3D scenes and discuss strategies in selecting
the methods most appropriate for the demands of these tasks.
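
The abstract's three-sub-task decomposition (2D layout generation, retrieval of compatible 3D rooms, deformation of 3D rooms to fit the layout) reads naturally as a pipeline. The sketch below illustrates that structure in Python; all type and function names (Layout2D, Room3D, shape_distance, deform_to_fit) and the naive vertex-count matching heuristic are hypothetical stand-ins for illustration, not the paper's actual method.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple

# Hypothetical container types; the paper does not define these names.
@dataclass
class Layout2D:
    room_polygons: List[List[Tuple[float, float]]]  # one 2D polygon per room slot

@dataclass
class Room3D:
    mesh: Any                           # 3D geometry (walls, floor, furniture)
    floor_polygon: List[Tuple[float, float]]  # 2D footprint used for matching

def shape_distance(poly_a, poly_b):
    # Placeholder compatibility score: vertex-count mismatch. A real system
    # would compare footprint shapes, e.g. via rasterized-mask overlap.
    return abs(len(poly_a) - len(poly_b))

def deform_to_fit(room: Room3D, target_polygon) -> Room3D:
    # Placeholder deformation: a real system would warp the room geometry
    # (and its contents) so the floor matches target_polygon exactly.
    return Room3D(mesh=room.mesh, floor_polygon=target_polygon)

def generate_floor_plan(layout: Layout2D,
                        room_database: List[Room3D]) -> List[Room3D]:
    """Assemble a 3D floor plan from existing 3D rooms, sub-task by sub-task."""
    scene = []
    for target_polygon in layout.room_polygons:
        # Sub-task 2: retrieve the most compatible 3D room for this slot.
        candidate = min(
            room_database,
            key=lambda r: shape_distance(r.floor_polygon, target_polygon),
        )
        # Sub-task 3: deform the retrieved room to fit the target layout.
        scene.append(deform_to_fit(candidate, target_polygon))
    return scene

# Sub-task 1 (obtaining `layout`) is elided: it may come from an existing
# 2D floor-plan dataset or from a learned generative model.
```

In these terms, the paper's two pipelines differ mainly in sub-task 1: the first takes layouts from existing 2D floor-plan data, while the second learns to retrieve compatible rooms and combine them into novel layouts jointly.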
Related papers
- Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint
We present a framework that allows controllable and compositional 3D generation from text prompts.
Our approach leverages 2D layouts as a blueprint to facilitate precise and plausible control over 3D generation.
arXiv Detail & Related papers (2024-10-20T13:41:50Z)
- ControlRoom3D: Room Generation using Semantic Proxy Rooms
We present ControlRoom3D, a novel method to generate high-quality room meshes.
Central to our approach is a user-defined 3D semantic proxy room that outlines a rough room layout.
When rendered to 2D, this 3D representation provides valuable geometric and semantic information to control powerful 2D models.
arXiv Detail & Related papers (2023-12-08T17:55:44Z)
- Uni3D: Exploring Unified 3D Representation at Scale
We present Uni3D, a 3D foundation model to explore the unified 3D representation at scale.
Uni3D uses a 2D-initialized ViT, pretrained end-to-end, to align 3D point cloud features with image-text aligned features.
We show that the strong Uni3D representation also enables applications such as 3D painting and retrieval in the wild.
arXiv Detail & Related papers (2023-10-10T16:49:21Z)
- 3D-LLM: Injecting the 3D World into Large Language Models
Large language models (LLMs) and Vision-Language Models (VLMs) have proven to excel at multiple tasks, such as commonsense reasoning.
We propose to inject the 3D world into large language models and introduce a new family of 3D-LLMs.
Specifically, 3D-LLMs can take 3D point clouds and their features as input and perform a diverse set of 3D-related tasks.
arXiv Detail & Related papers (2023-07-24T17:59:02Z)
- Multi-CLIP: Contrastive Vision-Language Pre-training for Question Answering tasks in 3D Scenes
Training models to apply common-sense linguistic knowledge and visual concepts from 2D images to 3D scene understanding is a promising direction that researchers have only recently started to explore.
We propose a novel 3D pre-training Vision-Language method, namely Multi-CLIP, that enables a model to learn language-grounded and transferable 3D scene point cloud representations.
arXiv Detail & Related papers (2023-06-04T11:08:53Z)
- SGAligner : 3D Scene Alignment with Scene Graphs
Building 3D scene graphs has emerged as a topic in scene representation for several embodied AI applications.
We focus on the fundamental problem of aligning pairs of 3D scene graphs whose overlap can range from zero to partial.
We propose SGAligner, the first method for aligning pairs of 3D scene graphs that is robust to in-the-wild scenarios.
arXiv Detail & Related papers (2023-04-28T14:39:22Z)
- Learning 3D Scene Priors with 2D Supervision
We propose a new method to learn 3D scene priors of layout and shape without requiring any 3D ground truth.
Our method represents a 3D scene as a latent vector, from which we progressively decode a sequence of objects characterized by their class categories.
Experiments on 3D-FRONT and ScanNet show that our method outperforms the state of the art in single-view reconstruction.
arXiv Detail & Related papers (2022-11-25T15:03:32Z)